1. 14 Aug, 2009 10 commits
    • x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA · 4518e6a0
      Tejun Heo committed
      Embedding percpu first chunk allocator can now handle very sparse unit
      mapping.  Use embedding allocator instead of lpage for 64bit NUMA.
      This removes extra TLB pressure and the need to do complex and fragile
      dancing when changing page attributes.
      
      For 32bit, using very sparse unit mapping isn't a good idea because
      the vmalloc space is very constrained.  32bit NUMA machines aren't
      exactly the focus of optimization and it isn't very clear whether
      lpage performs better than page.  Use page first chunk allocator for
      32bit NUMAs.
      
      As this leaves setup_pcpu_*() functions pretty much empty, fold them
      into setup_per_cpu_areas().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andi Kleen <andi@firstfloor.org>
      4518e6a0
    • percpu: update embedding first chunk allocator to handle sparse units · c8826dd5
      Tejun Heo committed
      Now that percpu core can handle very sparse units, given that vmalloc
      space is large enough, embedding first chunk allocator can use any
      memory to build the first chunk.  This patch teaches
      pcpu_embed_first_chunk() about distances between cpus and to use
      alloc/free callbacks to allocate node specific areas for each group
      and use them for the first chunk.
      
      This brings the benefits of embedding allocator to NUMA configurations
      - no extra TLB pressure with the flexibility of unified dynamic
      allocator and no need to restructure arch code to build memory layout
      suitable for percpu.  With units put into atom_size aligned groups
      according to cpu distances, using large page for dynamic chunks is
      also easily possible with falling back to regular pages if large
      allocation fails.
      
      Embedding allocator users are converted to specify NULL
      cpu_distance_fn, so this patch doesn't cause any visible behavior
      difference.  Following patches will convert them.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c8826dd5
    • percpu: add pcpu_unit_offsets[] · fb435d52
      Tejun Heo committed
      Currently units are mapped sequentially into address space.  This
      patch adds pcpu_unit_offsets[] which allows units to be mapped to
      arbitrary offsets from the chunk base address.  This is necessary to
      allow sparse embedding which might need to allocate address
      ranges and memory areas which aren't aligned to unit size but
      allocation atom size (page or large page size).  This also simplifies
      things a bit by removing the need to calculate offset from unit
      number.
      
      With this change, there's no need for the arch code to know
      pcpu_unit_size.  Update pcpu_setup_first_chunk() and first chunk
      allocators to return regular 0 or -errno return code instead of unit
      size or -errno.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      fb435d52
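As a rough illustration of the idea in fb435d52 (not the kernel's code): with sequential mapping a unit's address is derived from its number, while an offsets table lets each unit sit at an arbitrary distance from the chunk base. The function names and the sample values below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sequential layout: unit N starts at base + N * unit_size. */
static uintptr_t demo_unit_addr_sequential(uintptr_t base, size_t unit_size,
                                           unsigned int unit)
{
    return base + (uintptr_t)unit * unit_size;
}

/* Sparse layout: each unit carries its own offset from the chunk base,
 * so no multiplication by a unit size is needed. */
static uintptr_t demo_unit_addr_sparse(uintptr_t base,
                                       const size_t *unit_offsets,
                                       unsigned int unit)
{
    return base + unit_offsets[unit];
}
```

With a hypothetical table such as `{0, 0x200000, 0x10000}`, unit 1 can sit far from unit 0 while unit 2 sits close to it, which a single unit size cannot express.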
    • percpu: introduce pcpu_alloc_info and pcpu_group_info · fd1e8a1f
      Tejun Heo committed
      Till now, non-linear cpu->unit map was expressed using an integer
      array which maps each cpu to a unit and used only by lpage allocator.
      Although how many units have been placed in a single contiguous area
      (group) is known while building unit_map, the information is lost when
      the result is recorded into the unit_map array.  For lpage allocator,
      as all allocations are done by lpages and whether two adjacent lpages
      are in the same group or not is irrelevant, this didn't cause any
      problem.  Non-linear cpu->unit mapping will be used for sparse
      embedding and this grouping information is necessary for that.
      
      This patch introduces pcpu_alloc_info which contains all the
      information necessary for initializing percpu allocator.
      pcpu_alloc_info contains array of pcpu_group_info which describes how
      units are grouped and mapped to cpus.  pcpu_group_info also has
      base_offset field to specify its offset from the chunk's base address.
      pcpu_build_alloc_info() initializes this field as if all groups are
      allocated back-to-back as is currently done but this will be used to
      sparsely place groups.
      
      pcpu_alloc_info is a rather complex data structure which contains a
      flexible array which in turn points to nested cpu_map arrays.
      
      * pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to
        help dealing with pcpu_alloc_info.
      
      * pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info,
        generalized and renamed to pcpu_build_alloc_info().
        @cpu_distance_fn may be NULL indicating that all cpus are of
        LOCAL_DISTANCE.
      
      * pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info,
        generalized and renamed to pcpu_dump_alloc_info().  It now also
        prints which group each alloc unit belongs to.
      
      * pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the
        separate parameters.  All first chunk allocators are updated to use
        pcpu_build_alloc_info() to build alloc_info and call
        pcpu_setup_first_chunk() with it.  This has the side effect of
        packing units for sparse possible cpus.  i.e. if cpus 0, 2 and 4 are
        possible, they'll be assigned unit 0, 1 and 2 instead of 0, 2 and 4.
      
      * x86 setup_pcpu_lpage() is updated to deal with alloc_info.
      
      * sparc64 setup_per_cpu_areas() is updated to build alloc_info.
      
      Although the changes made by this patch are pretty pervasive, it
      doesn't cause any behavior difference other than packing of sparse
      cpus.  It mostly changes how information is passed among
      initialization functions and makes room for more flexibility.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      fd1e8a1f
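The packing side effect described in fd1e8a1f (cpus 0, 2 and 4 receiving units 0, 1 and 2) can be sketched as follows. This is a simplified model that ignores grouping by cpu distance; `demo_pack_units` and `DEMO_NO_UNIT` are hypothetical names, not kernel symbols.

```c
#include <stddef.h>

#define DEMO_NO_UNIT (-1)

/*
 * Assign consecutive unit numbers to the CPUs marked possible.
 * possible[cpu] != 0 means the CPU is possible; unit_of[cpu] receives
 * the packed unit number, or DEMO_NO_UNIT for CPUs that are not possible.
 * Returns the number of units handed out.
 */
static int demo_pack_units(const int *possible, int *unit_of, size_t nr_cpu_ids)
{
    int next_unit = 0;

    for (size_t cpu = 0; cpu < nr_cpu_ids; cpu++)
        unit_of[cpu] = possible[cpu] ? next_unit++ : DEMO_NO_UNIT;

    return next_unit;
}
```

Calling it with a possible map of `{1, 0, 1, 0, 1}` hands out three units, one per possible cpu, with no units reserved for the holes.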
    • percpu: add @align to pcpu_fc_alloc_fn_t · 3cbc8565
      Tejun Heo committed
      pcpu_fc_alloc_fn_t is about to see more interesting usage, add @align
      parameter.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      3cbc8565
    • percpu: drop @static_size from first chunk allocators · 9a773769
      Tejun Heo committed
      First chunk allocators assume percpu areas have been linked using one
      of PERCPU_*() macros and depend on __per_cpu_load symbol defined by
      those macros, so there isn't much point in passing in static area size
      explicitly when it can be easily calculated from __per_cpu_start and
      __per_cpu_end.  Drop @static_size from all percpu first chunk
      allocators and helpers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      9a773769
    • percpu: generalize first chunk allocator selection · f58dc01b
      Tejun Heo committed
      Now that all first chunk allocators are in mm/percpu.c, it makes sense
      to generalize the percpu_alloc kernel parameter.  Define PCPU_FC_*
      and set pcpu_chosen_fc using early_param() in mm/percpu.c.  Arch code
      can use the set value to determine which first chunk allocator to use.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      f58dc01b
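A user-space sketch of what parsing a percpu_alloc= style argument into an allocator choice might look like. The enum values stand in for the PCPU_FC_* constants named in f58dc01b, and the accepted strings ("embed", "page", "lpage") are assumptions based on the allocator names used in these commits; the real handler is registered with early_param() and is not reproduced here.

```c
#include <string.h>

/* Hypothetical stand-ins for the PCPU_FC_* values named in the commit. */
enum demo_fc {
    DEMO_FC_AUTO,
    DEMO_FC_EMBED,
    DEMO_FC_PAGE,
    DEMO_FC_LPAGE,
};

/*
 * Map an argument string to an allocator choice.
 * Unknown or missing strings fall back to DEMO_FC_AUTO.
 */
static enum demo_fc demo_parse_percpu_alloc(const char *arg)
{
    if (!arg)
        return DEMO_FC_AUTO;
    if (strcmp(arg, "embed") == 0)
        return DEMO_FC_EMBED;
    if (strcmp(arg, "page") == 0)
        return DEMO_FC_PAGE;
    if (strcmp(arg, "lpage") == 0)
        return DEMO_FC_LPAGE;
    return DEMO_FC_AUTO;
}
```

Arch code would then consult the stored choice when deciding which first chunk allocator to call, falling back to its own default when the choice is "auto".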
    • percpu: build first chunk allocators selectively · 08fc4580
      Tejun Heo committed
      There's no need to build unused first chunk allocators in.  Define
      CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them
      selectively.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      08fc4580
    • percpu: rename 4k first chunk allocator to page · 00ae4064
      Tejun Heo committed
      Page size isn't always 4k depending on arch and configuration.  Rename
      4k first chunk allocator to page.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David Howells <dhowells@redhat.com>
      00ae4064
    • percpu, sparc64: fix sparse possible cpu map handling · 74d46d6b
      Tejun Heo committed
      percpu code has been assuming num_possible_cpus() == nr_cpu_ids which
      is incorrect if cpu_possible_map contains holes.  This causes percpu
      code to access beyond allocated memories and vmalloc areas.  On a
      sparc64 machine with cpus 0 and 2 (u60), this triggers the following
      warning or fails boot.
      
       WARNING: at /devel/tj/os/work/mm/vmalloc.c:106 vmap_page_range_noflush+0x1f0/0x240()
       Modules linked in:
       Call Trace:
        [00000000004b17d0] vmap_page_range_noflush+0x1f0/0x240
        [00000000004b1840] map_vm_area+0x20/0x60
        [00000000004b1950] __vmalloc_area_node+0xd0/0x160
        [0000000000593434] deflate_init+0x14/0xe0
        [0000000000583b94] __crypto_alloc_tfm+0xd4/0x1e0
        [00000000005844f0] crypto_alloc_base+0x50/0xa0
        [000000000058b898] alg_test_comp+0x18/0x80
        [000000000058dad4] alg_test+0x54/0x180
        [000000000058af00] cryptomgr_test+0x40/0x60
        [0000000000473098] kthread+0x58/0x80
        [000000000042b590] kernel_thread+0x30/0x60
        [0000000000472fd0] kthreadd+0xf0/0x160
       ---[ end trace 429b268a213317ba ]---
      
      This patch fixes generic percpu functions and sparc64
      setup_per_cpu_areas() so that they handle sparse cpu_possible_map
      properly.
      
      Please note that on x86, cpu_possible_map doesn't contain holes and
      thus num_possible_cpus() == nr_cpu_ids and this patch doesn't cause
      any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      74d46d6b
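To see why the assumption fixed in 74d46d6b breaks on a sparse map, the sketch below models the possible-cpu map as a 64-bit mask and computes the two quantities separately. The helpers are hypothetical analogues of num_possible_cpus() and nr_cpu_ids, not the kernel's implementations.

```c
#include <stdint.h>

/* Number of CPUs set in the mask (analogous to num_possible_cpus()). */
static unsigned int demo_num_possible(uint64_t possible_mask)
{
    unsigned int count = 0;

    /* Clear the lowest set bit on each pass. */
    for (; possible_mask; possible_mask &= possible_mask - 1)
        count++;
    return count;
}

/* Highest set CPU id plus one (analogous to nr_cpu_ids). */
static unsigned int demo_nr_cpu_ids(uint64_t possible_mask)
{
    unsigned int ids = 0;

    for (; possible_mask; possible_mask >>= 1)
        ids++;
    return ids;
}
```

For the machine in the report (cpus 0 and 2, mask 0x5) the count is 2 but the id range is 3, so an array sized by the count and indexed by cpu id is overrun when cpu 2 is accessed.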
  2. 13 Aug, 2009 3 commits
  3. 12 Aug, 2009 9 commits
  4. 11 Aug, 2009 6 commits
    • x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flag · fbd8b181
      Kevin Winchester committed
      Due to an erratum with certain AMD Athlon 64 processors, the
      BIOS may need to force enable the LAHF_LM capability.
      Unfortunately, in at least one case, the BIOS does this even
      for processors that do not support the functionality.
      
      Add a specific check that will clear the feature bit for
      processors known not to support the LAHF/SAHF instructions.
      Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
      Acked-by: Borislav Petkov <petkovbb@googlemail.com>
      LKML-Reference: <4A80A5AD.2000209@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fbd8b181
    • perf_counter, x86: Fix generic cache events on P6-mobile CPUs · f64ccccb
      Ingo Molnar committed
      Johannes Stezenbach reported that 'perf stat' does not count
      cache-miss and cache-references events on his Pentium-M based
      laptop.
      
      This is because we left them blank in p6_perfmon_event_map[],
      fill them in.
      Reported-by: Johannes Stezenbach <js@sig21.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f64ccccb
    • perf_counter, x86: Fix lapic printk message · 3c581a7f
      Ingo Molnar committed
      Instead of this garbled bootup on UP Pentium-M systems:
      
      [    0.015048] Performance Counters:
      [    0.016004] no Local APIC, try rebooting with lapicno PMU driver, software counters only.
      
      Print:
      
      [    0.015050] Performance Counters:
      [    0.016004] no APIC, boot with the "lapic" boot parameter to force-enable it.
      [    0.017003] no PMU driver, software counters only.
      
      Cf: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3c581a7f
    • x86, mce: therm_throt - change when we print messages · 0d01f314
      Dmitry Torokhov committed
      My Latitude d630 seems to be handling thermal events in SMI by
      lowering the max frequency of the CPU till it cools down but
      still leaks the "everything is normal" events.
      
      This spams the console, and does so with high-priority printks.
      
      Adjust therm_throt driver to only print messages about the fact
      that temperature returned to normal when leaving the
      throttling state.
      
      Also lower the severity of "back to normal" message from
      KERN_CRIT to KERN_INFO.
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <20090810051513.0558F526EC9@mailhub.coreip.homeip.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0d01f314
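The behavior change in 0d01f314 amounts to reporting "back to normal" only on a transition out of the throttled state. A minimal sketch of that edge-triggered logic follows. All names are hypothetical, and treating repeated "hot" events as silent is a simplification of this sketch; the commit message does not describe how the driver handles them.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state: whether the CPU was throttled at the last event. */
struct demo_therm_state {
    bool throttled;
};

enum demo_msg {
    DEMO_MSG_NONE,      /* nothing to print */
    DEMO_MSG_THROTTLED, /* would be logged at high severity */
    DEMO_MSG_NORMAL,    /* would be logged at informational severity */
};

/*
 * Decide what to report for a thermal event.
 * now_hot is true for an over-temperature event and false for an
 * "everything is normal" event.
 */
static enum demo_msg demo_therm_event(struct demo_therm_state *st, bool now_hot)
{
    bool was_hot = st->throttled;

    st->throttled = now_hot;
    if (now_hot)
        return was_hot ? DEMO_MSG_NONE : DEMO_MSG_THROTTLED;
    /* Stray "normal" events while not throttled are dropped. */
    return was_hot ? DEMO_MSG_NORMAL : DEMO_MSG_NONE;
}
```

With this shape, firmware that keeps leaking "normal" events while the CPU is not throttled produces no output at all.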
    • x86: Add reboot quirk for every 5 series MacBook/Pro · 3e03bbea
      Shunichi Fuji committed
      Reboot does not work on my MacBook Pro 13 inch (MacBookPro5,5)
      too. It seems all unibody MacBook and MacBookPro require
      PCI reboot handling, I guess.
      
      Following model/machine ID list shows unibody MacBook/Pro have
      the 5 series of model number:
      
         http://www.everymac.com/systems/by_capability/macs-by-machine-model-machine-id.html
      Signed-off-by: Shunichi Fuji <palglowr@gmail.com>
      Cc: Ozan Çağlayan <ozan@pardus.org.tr>
      LKML-Reference: <30046e3b0908101134p6487ddbftd8776e4ddef204be@mail.gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3e03bbea
    • x86: Fix serialization in pit_expect_msb() · b6e61eef
      Linus Torvalds committed
      Wei Chong Tan reported a fast-PIT-calibration corner-case:
      
      | pit_expect_msb() is vulnerable to SMI disturbance corner case
      | in some platforms which causes /proc/cpuinfo to show wrong
      | CPU MHz value when quick_pit_calibrate() jumps to success
      | section.
      
      I think that the real issue isn't even an SMI - but the fact
      that in the very last iteration of the loop, there's no
      serializing instruction _after_ the last 'rdtsc'. So even in
      the absence of SMI's, we do have a situation where the cycle
      counter was read without proper serialization.
      
      The last check should be done outside the outer loop, since
      _inside_ the outer loop, we'll be testing that the PIT has
      the right MSB value in the next iteration.
      
      So only the _last_ iteration is special, because that's the one
      that will not check the PIT MSB value any more, and because the
      final 'get_cycles()' isn't serialized.
      
      In other words:
      
       - I'd like to move the PIT MSB check to after the last
         iteration, rather than in every iteration
      
       - I think we should comment on the fact that it's also a
         serializing instruction and so 'fences in' the TSC read.
      
      Here's a suggested replacement.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
      Tested-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
      LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b6e61eef
  5. 10 Aug, 2009 2 commits
  6. 09 Aug, 2009 2 commits
  7. 08 Aug, 2009 3 commits
    • x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci · 498cdbfb
      Ozan Çağlayan committed
      MacBookPro5,1 is not able to reboot unless reboot=pci is set.
      This patch forces it through a DMI quirk specific to this
      device.
      Signed-off-by: Ozan Çağlayan <ozan@pardus.org.tr>
      LKML-Reference: <1249403971-6543-1-git-send-email-ozan@pardus.org.tr>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      498cdbfb
    • x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus · 087d7e56
      Yinghai Lu committed
      Found a system where x2apic reports an MSI-X irq initialization
      failure:
      
      [  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
      [  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
      [  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
      [  302.894386] igbvf 0000:81:10.4: enabling bus mastering
      [  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
      [  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.940367]   alloc irq_desc for 265 on node 4
      [  302.956874]   alloc kstat_irqs on node 4
      [  302.959452] alloc irq_2_iommu on node 0
      [  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
      [  302.977778]   alloc irq_desc for 266 on node 4
      [  302.980347]   alloc kstat_irqs on node 4
      [  302.995312] free_memtype request 0xefb28000-0xefb29000
      [  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.
      
      ... it turns out that when trying to enable MSI-X,
      __assign_irq_vector(new, cfg_new, apic->target_cpus()) can not
      get vector because for x2apic target-cpus returns cpumask_of(0)
      
      Update that to online_mask like xapic.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4A785AFF.3050902@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      087d7e56
    • USB: musb: fix the nop registration for OMAP3EVM · e8e2ff46
      Gupta, Ajay Kumar committed
      OMAP3EVM uses ISP1504 phy which doesn't require any programming and
      thus has to use NOP otg transceiver.
      
      Cleanups being done:
      	- Remove unwanted code in usb-musb.c file
      	- Register NOP in OMAP3EVM board file using
      	  usb_nop_xceiv_register().
      	- Select NOP_USB_XCEIV for OMAP3EVM boards.
      	- Don't enable TWL4030_USB in omap3_evm_defconfig
      Signed-off-by: Ajay Kumar Gupta <ajay.gupta@ti.com>
      Signed-off-by: Eino-Ville Talvala <talvala@stanford.edu>
      Acked-by: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      e8e2ff46
  8. 07 Aug, 2009 2 commits
  9. 06 Aug, 2009 3 commits