1. 20 5月, 2011 2 次提交
  2. 18 5月, 2011 5 次提交
  3. 17 5月, 2011 2 次提交
  4. 16 5月, 2011 1 次提交
    • Y
      x86, apic: Fix spurious error interrupts triggering on all non-boot APs · e503f9e4
      Youquan Song 提交于
      This patch fixes a bug reported by a customer, who found
      that many unreasonable error interrupts reported on all
      non-boot CPUs (APs) during the system boot stage.
      
      According to Chapter 10 of Intel Software Developer Manual
      Volume 3A, Local APIC may signal an illegal vector error when
      an LVT entry is set as an illegal vector value (0~15) under
      FIXED delivery mode (bits 8-11 is 0), regardless of whether
      the mask bit is set or an interrupt actually happen. These
      errors are seen as error interrupts.
      
      The initial value of thermal LVT entries on all APs always reads
      0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
      sequence to them and LVT registers are reset to 0s except for
      the mask bits which are set to 1s when APs receive INIT IPI.
      
      When the BIOS takes over the thermal throttling interrupt,
      the LVT thermal deliver mode should be SMI and it is required
      from the kernel to keep AP's LVT thermal monitoring register
      programmed as such as well.
      
      This issue happens when BIOS does not take over thermal throttling
      interrupt, AP's LVT thermal monitor register will be restored to
      0x10000 which means vector 0 and fixed deliver mode, so all APs will
      signal illegal vector error interrupts.
      
      This patch check if interrupt delivery mode is not fixed mode before
      restoring AP's LVT thermal monitor register.
      Signed-off-by: NYouquan Song <youquan.song@intel.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: NYong Wang <yong.y.wang@intel.com>
      Cc: hpa@linux.intel.com
      Cc: joe@perches.com
      Cc: jbaron@redhat.com
      Cc: trenn@suse.de
      Cc: kent.liu@intel.com
      Cc: chaohong.guo@intel.com
      Cc: <stable@kernel.org> # As far back as possible
      Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      e503f9e4
  5. 14 5月, 2011 1 次提交
  6. 13 5月, 2011 2 次提交
    • C
      x86: Fix UV BAU for non-consecutive nasids · 77ed23f8
      Cliff Wickman 提交于
      This is a fix for the SGI Altix-UV Broadcast Assist Unit code,
      which is used for TLB flushing.
      
      Certain hardware configurations (that customers are ordering)
      cause nasids (numa address space id's) to be non-consecutive.
      Specifically, once you have more than 4 blades in a IRU
      (Individual Rack Unit - or 1/2 rack) but less than the maximum
      of 16, the nasid numbering becomes non-consecutive.  This
      currently results in a 'catastrophic error' (CATERR) detected by
      the firmware during OS boot.  The BAU is generating an 'INTD'
      request that is targeting a non-existent nasid value. Such
      configurations may also occur when a blade is configured off
      because of hardware errors. (There is one UV hub per blade.)
      
      This patch is required to support such configurations.
      
      The problem with the tlb_uv.c code is that is using the
      consecutive hub numbers as indices to the BAU distribution bit
      map. These are simply the ordinal position of the hub or blade
      within its partition.  It should be using physical node numbers
      (pnodes), which correspond to the physical nasid values. Use of
      the hub number only works as long as the nasids in the partition
      are consecutive and increase with a stride of 1.
      
      This patch changes the index to be the pnode number, thus
      allowing nasids to be non-consecutive.
      It also provides a table in local memory for each cpu to
      translate target cpu number to target pnode and nasid.
      And it improves naming to properly reflect 'node' and 'uvhub'
      versus 'nasid'.
      Signed-off-by: NCliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/E1QJmxX-0002Mz-Fk@eag09.americas.sgi.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      77ed23f8
    • S
      x86,xen: introduce x86_init.mapping.pagetable_reserve · 279b706b
      Stefano Stabellini 提交于
      Introduce a new x86_init hook called pagetable_reserve that at the end
      of init_memory_mapping is used to reserve a range of memory addresses for
      the kernel pagetable pages we used and free the other ones.
      
      On native it just calls memblock_x86_reserve_range while on xen it also
      takes care of setting the spare memory previously allocated
      for kernel pagetable pages from RO to RW, so that it can be used for
      other purposes.
      
      A detailed explanation of the reason why this hook is needed follows.
      
      As a consequence of the commit:
      
      commit 4b239f45
      Author: Yinghai Lu <yinghai@kernel.org>
      Date:   Fri Dec 17 16:58:28 2010 -0800
      
          x86-64, mm: Put early page table high
      
      at some point init_memory_mapping is going to reach the pagetable pages
      area and map those pages too (mapping them as normal memory that falls
      in the range of addresses passed to init_memory_mapping as argument).
      Some of those pages are already pagetable pages (they are in the range
      pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
      everything is fine.
      Some of these pages are not pagetable pages yet (they fall in the range
      pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
      are going to be mapped RW.  When these pages become pagetable pages and
      are hooked into the pagetable, xen will find that the guest has already
      a RW mapping of them somewhere and fail the operation.
      The reason Xen requires pagetables to be RO is that the hypervisor needs
      to verify that the pagetables are valid before using them. The validation
      operations are called "pinning" (more details in arch/x86/xen/mmu.c).
      
      In order to fix the issue we mark all the pages in the entire range
      pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
      is completed only the range pgt_buf_start-pgt_buf_end is reserved by
      init_memory_mapping. Hence the kernel is going to crash as soon as one
      of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
      ranges are RO).
      
      For this reason we need a hook to reserve the kernel pagetable pages we
      used and free the other ones so that they can be reused for other
      purposes.
      On native it just means calling memblock_x86_reserve_range, on Xen it
      also means marking RW the pagetable pages that we allocated before but
      that haven't been used before.
      
      Another way to fix this is without using the hook is by adding a 'if
      (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen
      counterpart, but that is just nasty.
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      279b706b
  7. 12 5月, 2011 1 次提交
  8. 10 5月, 2011 3 次提交
  9. 02 5月, 2011 13 次提交
    • T
      x86, NUMA: Enable emulation on 32bit too · 1b7e03ef
      Tejun Heo 提交于
      Now that NUMA init path is unified, NUMA emulation can be enabled on
      32bit.  Make numa_emluation.c safe on 32bit by doing the followings.
      
      * Define MAX_DMA32_PFN on 32bit too.
      
      * Include bootmem.h for max_pfn declaration.
      
      * Use u64 explicitly and always use PFN_PHYS() when converting page
        number to address.
      
      * Avoid __udivdi3() generation on 32bit by doing number of pages
        calculation instead in split_nodes_interleave().
      
      And drop X86_64 dependency from Kconfig.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      1b7e03ef
    • T
      x86, NUMA: Make numa_init_array() static · 752d4f37
      Tejun Heo 提交于
      numa_init_array() no longer has users outside of numa.c.  Make it
      static.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      752d4f37
    • T
      x86, NUMA: Make 32bit use common NUMA init path · bd6709a9
      Tejun Heo 提交于
      With both _numa_init() methods converted and the rest of init code
      adjusted, numa_32.c now can switch from the 32bit only init code to
      the common one in numa.c.
      
      * Shim get_memcfg_*()'s are dropped and initmem_init() calls
        x86_numa_init(), which is updated to handle NUMAQ.
      
      * All boilerplate operations including node range limiting, pgdat
        alloc/init are handled by numa_init().  32bit only implementation is
        removed.
      
      * 32bit numa_add_memblk(), numa_set_distance() and
        memory_add_physaddr_to_nid() removed and common versions in
        numa_32.c enabled for 32bit.
      
      This change causes the following behavior changes.
      
      * NODE_DATA()->node_start_pfn/node_spanned_pages properly initialized
        for 32bit too.
      
      * Much more sanity checks and configuration cleanups.
      
      * Proper handling of node distances.
      
      * The same NUMA init messages as 64bit.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      bd6709a9
    • T
      x86, NUMA: Enable build of generic NUMA init code on 32bit · 744baba0
      Tejun Heo 提交于
      Generic NUMA init code was moved to numa.c from numa_64.c but is still
      guaraded by CONFIG_X86_64.  This patch removes the compile guard and
      enables compiling on 32bit.
      
      * numa_add_memblk() and numa_set_distance() clash with the shim
        implementation in numa_32.c and are left out.
      
      * memory_add_physaddr_to_nid() clashes with 32bit implementation and
        is left out.
      
      * MAX_DMA_PFN definition in dma.h moved out of !CONFIG_X86_32.
      
      * node_data definition in numa_32.c removed in favor of the one in
        numa.c.
      
      There are places where ulong is assumed to be 64bit.  The next patch
      will fix them up.  Note that although the code is compiled it isn't
      used yet and this patch doesn't cause any functional change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      744baba0
    • T
      x86, NUMA: Move NUMA init logic from numa_64.c to numa.c · a4106eae
      Tejun Heo 提交于
      Move the generic 64bit NUMA init machinery from numa_64.c to numa.c.
      
      * node_data[], numa_mem_info and numa_distance
      * numa_add_memblk[_to](), numa_remove_memblk[_from]()
      * numa_set_distance() and friends
      * numa_init() and all the numa_meminfo handling helpers called from it
      * dummy_numa_init()
      * memory_add_physaddr_to_nid()
      
      A new function x86_numa_init() is added and the content of
      numa_64.c::initmem_init() is moved into it.  initmem_init() now simply
      calls x86_numa_init().
      
      Constants and numa_off declaration are moved from numa_{32|64}.h to
      numa.h.
      
      This is code reorganization and doesn't involve any functional change.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      a4106eae
    • T
      x86-32, NUMA: Update numaq to use new NUMA init protocol · 299a180a
      Tejun Heo 提交于
      Update numaq such that it calls numa_add_memblk() and sets
      numa_nodes_parsed instead of directly diddling with NUMA states.  The
      original get_memcfg_numaq() is renamed to numaq_numa_init() and new
      get_memcfg_numaq() is created in numa_32.c.
      
      The shim numa_add_memblk() implementation handles node_start/end_pfn[]
      and node_set_online() for nodes with memory.  The new
      get_memcfg_numaq() exactly the same with get_memcfg_from_srat() other
      than calling the numaq init function.  Things get_memcfgs_numaq() do
      are not strictly necessary for numaq but added for consistency and to
      help unifying NUMA init handling.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      299a180a
    • T
      x86-32, NUMA: Replace srat_32.c with srat.c · 5acd91ab
      Tejun Heo 提交于
      SRAT support implementation in srat_32.c and srat.c are generally
      similar; however, there are some differences.
      
      First of all, 64bit implementation supports more types of SRAT
      entries.  64bit supports x2apic, affinity, memory and SLIT.  32bit
      only supports processor and memory.
      
      Most other differences stem from different initialization protocols
      employed by 64bit and 32bit NUMA init paths.
      
      On 64bit,
      
      * Mappings among PXM, node and apicid are directly done in each SRAT
        entry callback.
      
      * Memory affinity information is passed to numa_add_memblk() which
        takes care of all interfacing with NUMA init.
      
      * Doesn't directly initialize NUMA configurations.  All the
        information is recorded in numa_nodes_parsed and memblks.
      
      On 32bit,
      
      * Checks numa_off.
      
      * Things go through one more level of indirection via private tables
        but eventually end up initializing the same mappings.
      
      * node_start/end_pfn[] are initialized and
        memblock_x86_register_active_regions() is called for each memory
        chunk.
      
      * node_set_online() is called for each online node.
      
      * sort_node_map() is called.
      
      There are also other minor differences in sanity checking and messages
      but taking 64bit version should be good enough.
      
      This patch drops the 32bit specific implementation and makes the 64bit
      implementation common for both 32 and 64bit.
      
      The init protocol differences are dealt with in two places - the
      numa_add_memblk() shim added in the previous patch and new temporary
      numa_32.c:get_memcfg_from_srat() which wraps invocation of
      x86_acpi_numa_init().
      
      The shim numa_add_memblk() handles the folowings.
      
      * node_start/end_pfn[] initialization.
      
      * node_set_online() for memory nodes.
      
      * Invocation of memblock_x86_register_active_regions().
      
      The shim get_memcfg_from_srat() handles the followings.
      
      * numa_off check.
      
      * node_set_online() for CPU nodes.
      
      * sort_node_map() invocation.
      
      * Clearing of numa_nodes_parsed and active_ranges on failure.
      
      The shims are temporary and will be removed as the generic NUMA init
      path in 32bit is replaced with 64bit one.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      5acd91ab
    • T
      x86-32, NUMA: implement temporary NUMA init shims · b0d31080
      Tejun Heo 提交于
      To help transition to common NUMA init, implement temporary 32bit
      shims for numa_add_memblk() and numa_set_distance().
      numa_add_memblk() registers the memblk and adjusts
      node_start/end_pfn[].  numa_set_distance() is noop.
      
      These shims will allow using 64bit NUMA init functions on 32bit and
      gradual transition to common NUMA init path.
      
      For detailed description, please read description of commits which
      make use of the shim functions.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      b0d31080
    • T
      x86, NUMA: Move numa_nodes_parsed to numa.[hc] · e6df595b
      Tejun Heo 提交于
      Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for
      NUMA init path unification.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      e6df595b
    • T
      x86-32, NUMA: Move get_memcfg_numa() into numa_32.c · daf4f480
      Tejun Heo 提交于
      There's no reason get_memcfg_numa() to be implemented inline in
      mmzone_32.h.  Move it to numa_32.c and also make
      get_memcfg_numa_flag() static.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      daf4f480
    • T
      x86, NUMA: trivial cleanups · 1201e10a
      Tejun Heo 提交于
      * Kill no longer used struct bootnode.
      
      * Kill dangling declaration of pxm_to_nid() in numa_32.h.
      
      * Make setup_node_bootmem() static.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      1201e10a
    • T
      x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional · 84914ed0
      Tejun Heo 提交于
      NUMAQ is the only meaningful user of this callback and
      setup_local_APIC() the only callsite.  Stop torturing everyone else by
      making the callback optional and removing all the boilerplate
      implementations and assignments.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      84914ed0
    • T
      x86, NUMA: Unify 32/64bit numa_cpu_node() implementation · 6bd26273
      Tejun Heo 提交于
      Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is
      NUMAQ which returns valid mapping only after CPU is initialized during
      SMP bringup; thus, the previous patch to set apicid -> node in
      setup_local_APIC() makes __apicid_to_node[] always contain the correct
      mapping whether custom apic->x86_32_numa_cpu_node() is used or not.
      
      So, there is no reason to keep separate 32bit implementation.  We can
      always consult __apicid_to_node[].  Move 64bit implementation from
      numa_64.c to numa.c and remove 32bit implementation from numa_32.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      6bd26273
  10. 30 4月, 2011 2 次提交
  11. 28 4月, 2011 1 次提交
    • S
      x86: devicetree: Configure IOAPIC pin only once · 20443598
      Sebastian Andrzej Siewior 提交于
      We use io_apic_setup_irq_pin() in order to configure pin's interrupt
      number polarity and type. This is done on every irq_create_of_mapping()
      which happens for instance during pci enable calls. Level typed
      interrupts are masked by default, edge are unmasked.
      
      On the first ->xlate() call the level interrupt is configured and
      masked. The driver calls request_irq() and the line is unmasked. Lets
      assume the interrupt line is shared with another device and we call
      pci_enable_device() for this device. The ->xlate() configures the pin
      again and it is masked. request_irq() does not unmask the line because
      it _is_ already unmasked according to its internal state. So the
      interrupt will never be unmasked again.
      
      This patch is based on an earlier work by Torben Hohn and solves the
      problem by configuring the pin only once. Since all devices must agree
      on the same type and polarity there is no point in configuring the pin
      more than once.
      
      [ tglx: Split out the ce4100 part into a separate patch ]
      
      Cc: Torben Hohn <torbenh@linutronix.de>
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Link: http://lkml.kernel.org/r/%3C20110427143052.GA15211%40linutronix.de%3ESigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      20443598
  12. 24 4月, 2011 1 次提交
  13. 21 4月, 2011 2 次提交
  14. 19 4月, 2011 4 次提交