1. 14 Aug, 2009 10 commits
    • x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA · 4518e6a0
      Tejun Heo committed
      Embedding percpu first chunk allocator can now handle very sparse unit
      mapping.  Use embedding allocator instead of lpage for 64bit NUMA.
      This removes extra TLB pressure and the need to do complex and fragile
      dancing when changing page attributes.
      
      For 32bit, using very sparse unit mapping isn't a good idea because
      the vmalloc space is very constrained.  32bit NUMA machines aren't
      exactly the focus of optimization and it isn't very clear whether
      lpage performs better than page.  Use page first chunk allocator for
      32bit NUMAs.
      
      As this leaves setup_pcpu_*() functions pretty much empty, fold them
      into setup_per_cpu_areas().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Andi Kleen <andi@firstfloor.org>
    • percpu: update embedding first chunk allocator to handle sparse units · c8826dd5
      Tejun Heo committed
      Now that percpu core can handle very sparse units, given that vmalloc
      space is large enough, embedding first chunk allocator can use any
      memory to build the first chunk.  This patch teaches
      pcpu_embed_first_chunk() about distances between cpus and to use
      alloc/free callbacks to allocate node specific areas for each group
      and use them for the first chunk.
      
      This brings the benefits of embedding allocator to NUMA configurations
      - no extra TLB pressure with the flexibility of unified dynamic
      allocator and no need to restructure arch code to build memory layout
      suitable for percpu.  With units put into atom_size aligned groups
      according to cpu distances, using large page for dynamic chunks is
      also easily possible with falling back to regular pages if large
      allocation fails.
      
      Embedding allocator users are converted to specify NULL
      cpu_distance_fn, so this patch doesn't cause any visible behavior
      difference.  Following patches will convert them.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: add pcpu_unit_offsets[] · fb435d52
      Tejun Heo committed
      Currently units are mapped sequentially into address space.  This
      patch adds pcpu_unit_offsets[] which allows units to be mapped to
      arbitrary offsets from the chunk base address.  This is necessary to
      allow sparse embedding which might need to allocate address
      ranges and memory areas which aren't aligned to unit size but
      allocation atom size (page or large page size).  This also simplifies
      things a bit by removing the need to calculate offset from unit
      number.
      
      With this change, there's no need for the arch code to know
      pcpu_unit_size.  Update pcpu_setup_first_chunk() and first chunk
      allocators to return regular 0 or -errno return code instead of unit
      size or -errno.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
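The difference between sequential unit mapping and an offset table, as described in the pcpu_unit_offsets[] commit above, can be sketched in user-space C. This is only an illustration under stated assumptions: the function names and parameters are hypothetical and are not taken from mm/percpu.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sequential layout (hypothetical helper): a unit's address is
 * computed from its unit number and a fixed unit size.
 */
static size_t unit_addr_sequential(size_t base, size_t unit_size,
                                   unsigned int unit)
{
    return base + (size_t)unit * unit_size;
}

/*
 * Offset-table layout (hypothetical helper): each unit carries its own
 * offset from the chunk base, so units need not be evenly spaced.
 */
static size_t unit_addr_offsets(size_t base, const size_t *unit_offsets,
                                unsigned int nr_units, unsigned int unit)
{
    assert(unit < nr_units);
    return base + unit_offsets[unit];
}
```

With the offset table, a caller no longer needs to know the unit size to find a unit, which matches the commit's remark that arch code no longer needs pcpu_unit_size.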
    • percpu: introduce pcpu_alloc_info and pcpu_group_info · fd1e8a1f
      Tejun Heo committed
      Till now, non-linear cpu->unit map was expressed using an integer
      array which maps each cpu to a unit and used only by lpage allocator.
      Although how many units have been placed in a single contiguous area
      (group) is known while building unit_map, the information is lost when
      the result is recorded into the unit_map array.  For lpage allocator,
      as all allocations are done by lpages and whether two adjacent lpages
      are in the same group or not is irrelevant, this didn't cause any
      problem.  Non-linear cpu->unit mapping will be used for sparse
      embedding and this grouping information is necessary for that.
      
      This patch introduces pcpu_alloc_info which contains all the
      information necessary for initializing percpu allocator.
      pcpu_alloc_info contains array of pcpu_group_info which describes how
      units are grouped and mapped to cpus.  pcpu_group_info also has
      base_offset field to specify its offset from the chunk's base address.
      pcpu_build_alloc_info() initializes this field as if all groups are
      allocated back-to-back as is currently done but this will be used to
      sparsely place groups.
      
      pcpu_alloc_info is a rather complex data structure which contains a
      flexible array which in turn points to nested cpu_map arrays.
      
      * pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to
        help dealing with pcpu_alloc_info.
      
      * pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info,
        generalized and renamed to pcpu_build_alloc_info().
        @cpu_distance_fn may be NULL indicating that all cpus are of
        LOCAL_DISTANCE.
      
      * pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info,
        generalized and renamed to pcpu_dump_alloc_info().  It now also
        prints which group each alloc unit belongs to.
      
      * pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the
        separate parameters.  All first chunk allocators are updated to use
        pcpu_build_alloc_info() to build alloc_info and call
        pcpu_setup_first_chunk() with it.  This has the side effect of
        packing units for sparse possible cpus.  ie. if cpus 0, 2 and 4 are
        possible, they'll be assigned unit 0, 1 and 2 instead of 0, 2 and 4.
      
      * x86 setup_pcpu_lpage() is updated to deal with alloc_info.
      
      * sparc64 setup_per_cpu_areas() is updated to build alloc_info.
      
      Although the changes made by this patch are pretty pervasive, it
      doesn't cause any behavior difference other than packing of sparse
      cpus.  It mostly changes how information is passed among
      initialization functions and makes room for more flexibility.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
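The packing side effect mentioned in the commit above (possible cpus 0, 2 and 4 being assigned units 0, 1 and 2) can be sketched in user-space C. This is a minimal illustration, not the kernel's pcpu_build_alloc_info(); pack_units() and its parameters are hypothetical names, and the cpu map is modelled as a plain bitmask.

```c
#include <assert.h>

/*
 * Hypothetical helper: assign consecutive unit numbers to the cpus set
 * in possible_mask.  unit_of_cpu[cpu] receives the unit number, or -1
 * when the cpu is not possible.  Returns the number of units assigned.
 */
static int pack_units(unsigned long possible_mask, int nr_cpu_ids,
                      int *unit_of_cpu)
{
    int cpu, unit = 0;

    assert(nr_cpu_ids >= 0 &&
           nr_cpu_ids <= (int)(8 * sizeof(possible_mask)));

    for (cpu = 0; cpu < nr_cpu_ids; cpu++) {
        if (possible_mask & (1UL << cpu))
            unit_of_cpu[cpu] = unit++;  /* next free unit */
        else
            unit_of_cpu[cpu] = -1;      /* hole in the map */
    }
    return unit;
}
```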
    • percpu: add @align to pcpu_fc_alloc_fn_t · 3cbc8565
      Tejun Heo committed
      pcpu_fc_alloc_fn_t is about to see more interesting usage, add @align
      parameter.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: drop @static_size from first chunk allocators · 9a773769
      Tejun Heo committed
      First chunk allocators assume percpu areas have been linked using one
      of PERCPU_*() macros and depend on __per_cpu_load symbol defined by
      those macros, so there isn't much point in passing in static area size
      explicitly when it can be easily calculated from __per_cpu_start and
      __per_cpu_end.  Drop @static_size from all percpu first chunk
      allocators and helpers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: generalize first chunk allocator selection · f58dc01b
      Tejun Heo committed
      Now that all first chunk allocators are in mm/percpu.c, it makes sense
      to generalize the percpu_alloc kernel parameter.  Define PCPU_FC_*
      and set pcpu_chosen_fc using early_param() in mm/percpu.c.  Arch code
      can use the set value to determine which first chunk allocator to use.
      Signed-off-by: Tejun Heo <tj@kernel.org>
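How a boot parameter string might be turned into a chosen allocator can be sketched in user-space C. The commit only names PCPU_FC_*, pcpu_chosen_fc, percpu_alloc and early_param(); the specific enumerators, the accepted strings and the return convention below are assumptions made for illustration, and the early_param() registration itself is not modelled.

```c
#include <assert.h>
#include <string.h>

/* Assumed enumerators; the commit only says "PCPU_FC_*". */
enum pcpu_fc {
    PCPU_FC_AUTO,
    PCPU_FC_EMBED,
    PCPU_FC_PAGE,
    PCPU_FC_LPAGE
};

static enum pcpu_fc pcpu_chosen_fc = PCPU_FC_AUTO;

/*
 * Hypothetical parser for the value of "percpu_alloc=".
 * Returns 0 on success; returns -1 and leaves the current choice
 * untouched when the string is not recognized.
 */
static int percpu_alloc_setup(const char *str)
{
    assert(str != NULL);

    if (!strcmp(str, "embed"))
        pcpu_chosen_fc = PCPU_FC_EMBED;
    else if (!strcmp(str, "page"))
        pcpu_chosen_fc = PCPU_FC_PAGE;
    else if (!strcmp(str, "lpage"))
        pcpu_chosen_fc = PCPU_FC_LPAGE;
    else
        return -1;

    return 0;
}
```

Arch code would then consult pcpu_chosen_fc when deciding which first chunk allocator to call.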
    • percpu: build first chunk allocators selectively · 08fc4580
      Tejun Heo committed
      There's no need to build unused first chunk allocators in.  Define
      CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them
      selectively.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • percpu: rename 4k first chunk allocator to page · 00ae4064
      Tejun Heo committed
      Page size isn't always 4k depending on arch and configuration.  Rename
      4k first chunk allocator to page.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David Howells <dhowells@redhat.com>
    • percpu, sparc64: fix sparse possible cpu map handling · 74d46d6b
      Tejun Heo committed
      percpu code has been assuming num_possible_cpus() == nr_cpu_ids which
      is incorrect if cpu_possible_map contains holes.  This causes percpu
      code to access beyond allocated memories and vmalloc areas.  On a
      sparc64 machine with cpus 0 and 2 (u60), this triggers the following
      warning or fails boot.
      
       WARNING: at /devel/tj/os/work/mm/vmalloc.c:106 vmap_page_range_noflush+0x1f0/0x240()
       Modules linked in:
       Call Trace:
        [00000000004b17d0] vmap_page_range_noflush+0x1f0/0x240
        [00000000004b1840] map_vm_area+0x20/0x60
        [00000000004b1950] __vmalloc_area_node+0xd0/0x160
        [0000000000593434] deflate_init+0x14/0xe0
        [0000000000583b94] __crypto_alloc_tfm+0xd4/0x1e0
        [00000000005844f0] crypto_alloc_base+0x50/0xa0
        [000000000058b898] alg_test_comp+0x18/0x80
        [000000000058dad4] alg_test+0x54/0x180
        [000000000058af00] cryptomgr_test+0x40/0x60
        [0000000000473098] kthread+0x58/0x80
        [000000000042b590] kernel_thread+0x30/0x60
        [0000000000472fd0] kthreadd+0xf0/0x160
       ---[ end trace 429b268a213317ba ]---
      
      This patch fixes generic percpu functions and sparc64
      setup_per_cpu_areas() so that they handle sparse cpu_possible_map
      properly.
      
      Please note that on x86, cpu_possible_map() doesn't contain holes and
      thus num_possible_cpus() == nr_cpu_ids and this patch doesn't cause
      any behavior difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
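Why num_possible_cpus() and nr_cpu_ids differ when the possible map has holes can be shown with a small user-space C sketch. The map is modelled as a plain bitmask and all function names are hypothetical; the real kernel uses cpumask helpers rather than this code.

```c
#include <assert.h>

/* Number of set bits: stands in for num_possible_cpus(). */
static int count_possible(unsigned long mask)
{
    int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Highest set bit plus one: stands in for nr_cpu_ids. */
static int highest_id_plus_one(unsigned long mask)
{
    int n = 0;

    while (mask) {
        mask >>= 1;
        n++;
    }
    return n;
}

/*
 * An array sized by the cpu count is indexed safely by cpu id only if
 * the id is below that count, which fails when the map has holes.
 */
static int index_in_bounds(unsigned long mask, int cpu)
{
    assert(cpu >= 0);
    return cpu < count_possible(mask);
}
```

For a machine with cpus 0 and 2, as in the report above, the count is 2 while the highest id plus one is 3, so indexing a count-sized array by cpu id 2 runs past its end.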
  2. 12 Aug, 2009 2 commits
    • perf_counter, x86: Fix/improve apic fallback · 04da8a43
      Ingo Molnar committed
      Johannes Stezenbach reported that his Pentium-M based
      laptop does not have the local APIC enabled by default,
      and hence perfcounters do not get initialized.
      
      Add a fallback for this case: allow non-sampled counters
      and return with an error on sampled counters. This allows
      'perf stat' to work out of box - and allows 'perf top'
      and 'perf record' to fall back on a hrtimer based sampling
      method.
      
      ( Passing 'lapic' on the boot line will allow hardware
        sampling to occur - but if the APIC is disabled
        permanently by the hardware then this fallback still
        allows more systems to use perfcounters. )
      
      Also decouple perfcounter support from X86_LOCAL_APIC.
      
      -v2: fix typo breaking counters on all other systems ...
      Reported-by: Johannes Stezenbach <js@sig21.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix oops in identify_cpu() on CPUs without CPUID · e8055139
      Ondrej Zary committed
      Kernel is broken for x86 CPUs without CPUID since 2.6.28. It
      crashes with NULL pointer dereference in identify_cpu():
      
      766        generic_identify(c);
      767
      768-->     if (this_cpu->c_identify)
      769               this_cpu->c_identify(c);
      
      this_cpu is NULL. This is because it's only initialized in
      get_cpu_vendor() function, which is not called if the CPU has
      no CPUID instruction.
      Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
      LKML-Reference: <200908112000.15993.linux@rainbow-software.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 11 Aug, 2009 6 commits
    • x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flag · fbd8b181
      Kevin Winchester committed
      Due to an erratum with certain AMD Athlon 64 processors, the
      BIOS may need to force enable the LAHF_LM capability.
      Unfortunately, in at least one case, the BIOS does this even
      for processors that do not support the functionality.
      
      Add a specific check that will clear the feature bit for
      processors known not to support the LAHF/SAHF instructions.
      Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
      Acked-by: Borislav Petkov <petkovbb@googlemail.com>
      LKML-Reference: <4A80A5AD.2000209@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter, x86: Fix generic cache events on P6-mobile CPUs · f64ccccb
      Ingo Molnar committed
      Johannes Stezenbach reported that 'perf stat' does not count
      cache-miss and cache-references events on his Pentium-M based
      laptop.
      
      This is because we left them blank in p6_perfmon_event_map[],
      fill them in.
      Reported-by: Johannes Stezenbach <js@sig21.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter, x86: Fix lapic printk message · 3c581a7f
      Ingo Molnar committed
      Instead of this garbled bootup on UP Pentium-M systems:
      
      [    0.015048] Performance Counters:
      [    0.016004] no Local APIC, try rebooting with lapicno PMU driver, software counters only.
      
      Print:
      
      [    0.015050] Performance Counters:
      [    0.016004] no APIC, boot with the "lapic" boot parameter to force-enable it.
      [    0.017003] no PMU driver, software counters only.
      
      Cf: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, mce: therm_throt - change when we print messages · 0d01f314
      Dmitry Torokhov committed
      My Latitude d630 seems to be handling thermal events in SMI by
      lowering the max frequency of the CPU till it cools down but
      still leaks the "everything is normal" events.
      
      This spams the console with high-priority printks.

      Adjust the therm_throt driver to print the message that the
      temperature has returned to normal only when leaving the
      throttling state.
      
      Also lower the severity of "back to normal" message from
      KERN_CRIT to KERN_INFO.
      Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <20090810051513.0558F526EC9@mailhub.coreip.homeip.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
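The idea of reporting "back to normal" only on the transition out of the throttling state can be sketched as a small state machine in user-space C. This is not the therm_throt driver code: the types and names are hypothetical, and reporting the throttled message only on entry is a simplifying assumption (any rate limiting the real driver applies is not modelled).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cpu state: are we currently throttled? */
struct throt_state {
    bool throttled;
};

enum throt_msg {
    MSG_NONE,
    MSG_THROTTLED,
    MSG_BACK_TO_NORMAL
};

/*
 * Decide which message, if any, an incoming thermal event produces.
 * Repeated "normal" events while not throttled produce nothing, so
 * leaked "everything is normal" events no longer reach the console.
 */
static enum throt_msg therm_event(struct throt_state *st, bool hot)
{
    assert(st);

    if (hot) {
        if (st->throttled)
            return MSG_NONE;        /* already reported */
        st->throttled = true;
        return MSG_THROTTLED;
    }

    if (st->throttled) {
        st->throttled = false;
        return MSG_BACK_TO_NORMAL;  /* leaving the throttled state */
    }
    return MSG_NONE;                /* spurious "normal" event */
}
```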
    • x86: Add reboot quirk for every 5 series MacBook/Pro · 3e03bbea
      Shunichi Fuji committed
      Reboot does not work on my MacBook Pro 13 inch (MacBookPro5,5)
      too. It seems all unibody MacBook and MacBookPro require
      PCI reboot handling, I guess.
      
      Following model/machine ID list shows unibody MacBook/Pro have
      the 5 series of model number:
      
         http://www.everymac.com/systems/by_capability/macs-by-machine-model-machine-id.html
      Signed-off-by: Shunichi Fuji <palglowr@gmail.com>
      Cc: Ozan Çağlayan <ozan@pardus.org.tr>
      LKML-Reference: <30046e3b0908101134p6487ddbftd8776e4ddef204be@mail.gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix serialization in pit_expect_msb() · b6e61eef
      Linus Torvalds committed
      Wei Chong Tan reported a fast-PIT-calibration corner-case:
      
      | pit_expect_msb() is vulnerable to SMI disturbance corner case
      | in some platforms which causes /proc/cpuinfo to show wrong
      | CPU MHz value when quick_pit_calibrate() jumps to success
      | section.
      
      I think that the real issue isn't even an SMI - but the fact
      that in the very last iteration of the loop, there's no
      serializing instruction _after_ the last 'rdtsc'. So even in
      the absence of SMI's, we do have a situation where the cycle
      counter was read without proper serialization.
      
      The last check should be done outside the outer loop, since
      _inside_ the outer loop, we'll be testing that the PIT MSB
      has the right value in the next iteration.
      
      So only the _last_ iteration is special, because that's the one
      that will not check the PIT MSB value any more, and because the
      final 'get_cycles()' isn't serialized.
      
      In other words:
      
       - I'd like to move the PIT MSB check to after the last
         iteration, rather than in every iteration
      
       - I think we should comment on the fact that it's also a
         serializing instruction and so 'fences in' the TSC read.
      
      Here's a suggested replacement.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
      Tested-by: "Tan, Wei Chong" <wei.chong.tan@intel.com>
      LKML-Reference: <B28277FD4E0F9247A3D55704C440A140D5D683F3@pgsmsx504.gar.corp.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 09 Aug, 2009 1 commit
  5. 08 Aug, 2009 2 commits
    • x86: Add quirk to make Apple MacBookPro5,1 use reboot=pci · 498cdbfb
      Ozan Çağlayan committed
      MacBookPro5,1 is not able to reboot unless reboot=pci is set.
      This patch forces it through a DMI quirk specific to this
      device.
      Signed-off-by: Ozan Çağlayan <ozan@pardus.org.tr>
      LKML-Reference: <1249403971-6543-1-git-send-email-ozan@pardus.org.tr>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix MSI-X initialization by using online_mask for x2apic target_cpus · 087d7e56
      Yinghai Lu committed
      Found a system where x2apic reports an MSI-X irq initialization
      failure:
      
      [  302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
      [  302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
      [  302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
      [  302.894386] igbvf 0000:81:10.4: enabling bus mastering
      [  302.898171] igbvf 0000:81:10.4: setting latency timer to 64
      [  302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
      [  302.940367]   alloc irq_desc for 265 on node 4
      [  302.956874]   alloc kstat_irqs on node 4
      [  302.959452] alloc irq_2_iommu on node 0
      [  302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
      [  302.977778]   alloc irq_desc for 266 on node 4
      [  302.980347]   alloc kstat_irqs on node 4
      [  302.995312] free_memtype request 0xefb28000-0xefb29000
      [  302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.
      
      ... it turns out that when trying to enable MSI-X,
      __assign_irq_vector(new, cfg_new, apic->target_cpus()) cannot
      get a vector, because for x2apic target_cpus() returns cpumask_of(0).

      Update that to online_mask, like xapic.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <4A785AFF.3050902@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 06 Aug, 2009 1 commit
  7. 05 Aug, 2009 8 commits
  8. 04 Aug, 2009 10 commits
    • x86: Work around compilation warning in arch/x86/kernel/apm_32.c · dc731fbb
      Subrata Modak committed
      The following fix was initially inspired by David Howells' fix
      from a few days back:

        http://lkml.org/lkml/2009/7/9/109

      However, Ingo disapproves of such fixes as dangerous (they can
      hide future, relevant warnings) in something as
      performance-uncritical as this.
      
      So, initialize 'err' to '0' to work around a GCC false positive
      warning:
      
        http://lkml.org/lkml/2009/7/18/89
      
      Signed-off-by: Subrata Modak <subrata@linux.vnet.ibm.com>
      Cc: Sachin P Sant <sachinp@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <20090721023226.31855.67236.sendpatchset@subratamodak.linux.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq() · 2a5ef416
      Jack Steiner committed
      In uv_setup_irq(), the call to create_irq() initially assigns
      IRQ vectors to cpu 0. The subsequent call to
      assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
      another cpu and frees the cpu 0 vector - at least it will be
      freed as soon as the "IRQ move" completes.
      
      arch_enable_uv_irq() needs to send a cleanup IPI to complete
      the IRQ move. Otherwise, assignment of GRU interrupts on large
      systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
      and initialization of the GRU driver will fail.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20090720142840.GA8885@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, 32-bit: Fix double accounting in reserve_top_address() · 6abf6551
      Jan Beulich committed
      With VMALLOC_END included in the calculation of MAXMEM (as of
      2.6.28) it is no longer correct to also bump __VMALLOC_RESERVE
      in reserve_top_address(). Doing so results in needlessly small
      lowmem.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      LKML-Reference: <4A71DD2A020000780000D482@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Don't use current_cpu_data in x2apic phys_pkg_id · d8c7eb34
      Yinghai Lu committed
      One system has socket 1 come up as BSP.
      
      kexeced kernel reports BSP as:
      
      [    1.524550] Initializing cgroup subsys cpuacct
      [    1.536064] initial_apicid:20
      [    1.537135] ht_mask_width:1
      [    1.538128] core_select_mask:f
      [    1.539126] core_plus_mask_width:5
      [    1.558479] CPU: Physical Processor ID: 0
      [    1.559501] CPU: Processor Core ID: 0
      [    1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
      [    1.579098] CPU: L2 cache: 256K
      [    1.580085] CPU: L3 cache: 24576K
      [    1.581108] CPU 0/0x20 -> Node 0
      [    1.596193] CPU 0 microcode level: 0xffff0008
      
      It doesn't have correct physical processor id and will get an
      error:
      
      [   38.840859] CPU0 attaching sched-domain:
      [   38.848287]  domain 0: span 0,8,72 level SIBLING
      [   38.851151]   groups: 0 8 72
      [   38.858137]   domain 1: span 0,8-15,72-79 level MC
      [   38.868944]    groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
      [   38.881383] ERROR: parent span is not a superset of domain->span
      [   38.890724]    domain 2: span 0-7,64-71 level CPU
      [   38.899237] ERROR: domain->groups does not contain CPU0
      [   38.909229]     groups: 8-15,72-79
      [   38.912547] ERROR: groups don't span domain->span
      [   38.919665]     domain 3: span 0-127 level NODE
      [   38.930739]      groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127
      
      It turns out that we cannot use current_cpu_data in phys_pkg_id
      for x2apic.
      
      identify_boot_cpu() is called by check_bugs() before
      smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
      for bsp is assigned with boot_cpu_data.
      
      Just align phys_pkg_id for x2apic with xapic.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A6ADD0D.10002@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Fix UV apic mode · c5997fa8
      Jack Steiner committed
      Change SGI UV default apicid mode to "physical". This is
      required to match settings in the UV hub chip.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143856.GA8905@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Fix macros for accessing large node numbers · 67e83f30
      Jack Steiner committed
      The UV chipset automatically supplies the upper bits on nodes
      being referenced by MMR accesses. These bits can be deleted from
      the hub addressing macros.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      LKML-Reference: <20090727143808.GA8076@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Delete mapping of MMR rangs mapped by BIOS · cc5e4fa1
      Jack Steiner committed
      The UV BIOS has added additional MMR ranges that are mapped via
      EFI virtual mode mappings. These ranges should be deleted from
      ranges mapped by uv_system_init().
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143656.GA7698@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, UV: Handle missing blade-local memory correctly · 6c7184b7
      Jack Steiner committed
      UV blades may not have any blade-local memory. Add a field
      (nid) to the UV blade structure to indicate whether the node
      has local memory. This is needed by the GRU driver (pushed
      separately).
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Cc: linux-mm@kvack.org
      LKML-Reference: <20090727143507.GA7006@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix assembly constraints in native_save_fl() · f1f029c7
      H. Peter Anvin committed
      From Gabe Black in bugzilla 13888:
      
      native_save_fl is implemented as follows:
      
        static inline unsigned long native_save_fl(void)
        {
                unsigned long flags;

                asm volatile("# __raw_save_flags\n\t"
                             "pushf ; pop %0"
                             : "=g" (flags)
                             : /* no input */
                             : "memory");

                return flags;
        }
      
      If gcc chooses to put flags on the stack, for instance because this is
      inlined into a larger function with more register pressure, the offset
      of the flags variable from the stack pointer will change when the
      pushf is performed. gcc doesn't attempt to understand that fact, and
      address used for pop will still be the same. It will write to
      somewhere near flags on the stack but not actually into it and
      overwrite some other value.
      
      I saw this happen in the ide_device_add_all function when running in a
      simulator I work on. I'm assuming that some quirk of how the simulated
      hardware is set up caused the code path this is on to be executed when
      it normally wouldn't.
      
      A simple fix might be to change "=g" to "=r".
      Reported-by: Gabe Black <spamforgabe@umich.edu>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
    • x86, msr: execute on the correct CPU subset · bab9a3da
      Borislav Petkov committed
      Make rdmsr_on_cpus/wrmsr_on_cpus execute on the current CPU only if it
      is in the supplied bitmask.
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
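The rule stated in the commit above, that the operation runs on the current CPU only if that CPU is in the supplied mask, can be sketched in user-space C. This does not reproduce rdmsr_on_cpus()/wrmsr_on_cpus(); run_on_cpus() is a hypothetical helper, the mask is a plain bitmask, and "running" the operation is modelled by setting a flag per cpu.

```c
#include <assert.h>

/*
 * Hypothetical helper: mark every cpu in mask as having run the
 * operation.  Remote cpus are handled first; the current cpu runs the
 * operation locally only when it is itself part of the mask.
 * Returns the number of cpus the operation ran on.
 */
static int run_on_cpus(unsigned long mask, int this_cpu, int nr_cpus,
                       int *ran)
{
    int cpu, count = 0;

    assert(nr_cpus > 0 && nr_cpus <= (int)(8 * sizeof(mask)));
    assert(this_cpu >= 0 && this_cpu < nr_cpus);

    for (cpu = 0; cpu < nr_cpus; cpu++)
        ran[cpu] = 0;

    /* Remote cpus named in the mask. */
    for (cpu = 0; cpu < nr_cpus; cpu++) {
        if (cpu != this_cpu && (mask & (1UL << cpu))) {
            ran[cpu] = 1;
            count++;
        }
    }

    /* Local execution only if the current cpu was requested. */
    if (mask & (1UL << this_cpu)) {
        ran[this_cpu] = 1;
        count++;
    }
    return count;
}
```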