1. 02 5月, 2011 12 次提交
    • T
      x86-32, NUMA: implement temporary NUMA init shims · b0d31080
      Tejun Heo 提交于
      To help transition to common NUMA init, implement temporary 32bit
      shims for numa_add_memblk() and numa_set_distance().
      numa_add_memblk() registers the memblk and adjusts
      node_start/end_pfn[].  numa_set_distance() is noop.
      
      These shims will allow using 64bit NUMA init functions on 32bit and
      gradual transition to common NUMA init path.
      
      For detailed description, please read description of commits which
      make use of the shim functions.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      b0d31080
    • T
      x86, NUMA: Move numa_nodes_parsed to numa.[hc] · e6df595b
      Tejun Heo 提交于
      Move numa_nodes_parsed from numa_64.[hc] to numa.[hc] to prepare for
      NUMA init path unification.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      e6df595b
    • T
      x86-32, NUMA: Move get_memcfg_numa() into numa_32.c · daf4f480
      Tejun Heo 提交于
      There's no reason get_memcfg_numa() to be implemented inline in
      mmzone_32.h.  Move it to numa_32.c and also make
      get_memcfg_numa_flag() static.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      daf4f480
    • T
      x86, NUMA: make srat.c 32bit safe · eca9ad31
      Tejun Heo 提交于
      Make srat.c 32bit safe by removing the assumption that unsigned long
      is 64bit.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      eca9ad31
    • T
      x86, NUMA: rename srat_64.c to srat.c · 7b2600f8
      Tejun Heo 提交于
      Rename srat_64.c to srat.c.  This is to prepare for unification of
      NUMA init paths between 32 and 64bit.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      7b2600f8
    • T
      x86, NUMA: trivial cleanups · 1201e10a
      Tejun Heo 提交于
      * Kill no longer used struct bootnode.
      
      * Kill dangling declaration of pxm_to_nid() in numa_32.h.
      
      * Make setup_node_bootmem() static.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      1201e10a
    • T
      x86-32, NUMA: use sparse_memory_present_with_active_regions() · 797390d8
      Tejun Heo 提交于
      Instead of calling memory_present() for each region from NUMA init,
      call sparse_memory_present_with_active_regions() from paging_init()
      similarly to x86-64.
      
      For flat and numaq, this results in exactly the same memory_present()
      calls.  For srat, if there are multiple memory chunks for a node,
      after this change, memory_present() will be called separately for each
      chunk instead of being called once to encompass the whole range, which
      doesn't cause any harm and actually is the better behavior.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      797390d8
    • T
      x86, NUMA: Unify 32/64bit numa_cpu_node() implementation · 6bd26273
      Tejun Heo 提交于
      Currently, the only meaningful user of apic->x86_32_numa_cpu_node() is
      NUMAQ which returns valid mapping only after CPU is initialized during
      SMP bringup; thus, the previous patch to set apicid -> node in
      setup_local_APIC() makes __apicid_to_node[] always contain the correct
      mapping whether custom apic->x86_32_numa_cpu_node() is used or not.
      
      So, there is no reason to keep separate 32bit implementation.  We can
      always consult __apicid_to_node[].  Move 64bit implementation from
      numa_64.c to numa.c and remove 32bit implementation from numa_32.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      6bd26273
    • T
      x86-64, NUMA: simplify nodedata allocation · acd26d61
      Tejun Heo 提交于
      With top-down memblock allocation, the allocation range limits in
      ealry_node_mem() can be simplified - try node-local first, then any
      node but in any case don't allocate below DMA limit.
      
      Remove early_node_mem() and implement simplified allocation directly
      in setup_node_bootmem().
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      acd26d61
    • T
      x86-64, NUMA: trivial cleanups for setup_node_bootmem() · ebe685f2
      Tejun Heo 提交于
      Make the following trivial changes in preparation for further updates.
      
      * nodeid -> nid, nid -> tnid
      * use nd_ prefix for nodedata related variables
      * remove start/end_pfn and use start/end directly
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      ebe685f2
    • T
      x86-64, NUMA: Simplify hotadd memory handling · 9688678a
      Tejun Heo 提交于
      The only special handling NUMA needs to do for hotadd memory is
      determining the node for the hotadd memory given the address of it and
      there's nothing specific to specific config method used.
      
      srat_64.c does somewhat elaborate error checking on
      ACPI_SRAT_MEM_HOT_PLUGGABLE regions, remembers them and implements
      memory_add_physaddr_to_nid() which determines the node for given
      hotadd address.
      
      This is almost completely redundant.  All the information is already
      available to the generic NUMA code which already performs all the
      sanity checking and merging.  All that's necessary is not using
      __initdata from numa_meminfo and providing a function which uses it to
      map address to node.
      
      Drop the specific implementation from srat_64.c and add generic
      memory_add_physaddr_to_nid() in numa_64.c, which is enabled if
      CONFIG_MEMORY_HOTPLUG is set.  Other than dropping the code, srat_64.c
      doesn't need any change as it already calls numa_add_memblk() for hot
      pluggable regions which is enough.
      
      While at it, change CONFIG_MEMORY_HOTPLUG_SPARSE in srat_64.c to
      CONFIG_MEMORY_HOTPLUG, for NUMA on x86-64, the two are always the
      same.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      9688678a
    • Y
      x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo() · 2be19102
      Yinghai Lu 提交于
      numa_cleanup_meminfo() trims each memblk between low (0) and
      high (max_pfn) limits and discards empty ones.  However, the
      emptiness detection incorrectly used equality test.  If the
      start of a memblk is higher than max_pfn, it is empty but fails
      the equality test and doesn't get discarded.
      
      The condition triggers when max_pfn is lower than start of a
      NUMA node and results in memory misconfiguration - leading to
      WARN_ON()s and other funnies.  The bug was discovered in devel
      branch where 32bit too uses this code path for NUMA init.  If a
      node is above the addressing limit, max_pfn ends up lower than
      the node triggering this problem.
      
      The failure hasn't been observed on x86-64 but is still possible
      with broken hardware e820/NUMA info.  As the fix is very low
      risk, it would be better to apply it even for 64bit.
      
      Fix it by using >= instead of ==.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      [ Extracted the actual fix from the original patch and rewrote patch description. ]
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20110501171204.GO29280@htj.dyndns.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      2be19102
  2. 29 4月, 2011 1 次提交
  3. 21 4月, 2011 1 次提交
  4. 07 4月, 2011 14 次提交
  5. 04 4月, 2011 2 次提交
    • T
      x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change · 765af22d
      Tejun Heo 提交于
      Commit d8fc3afc (x86, NUMA: Move *_numa_init() invocations
      into initmem_init()) moved acpi_numa_init() call into NUMA
      initmem_init() but forgot to update 32bit NUMA init breaking ACPI
      NUMA configuration for 32bit.
      
      acpi_numa_init() call was later moved again to srat_64.c.  Match
      it by adding the call to get_memcfg_from_srat() in srat_32.c.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <20110404100645.GE1420@mtj.dyndns.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      765af22d
    • F
      x86-64, NUMA: Remove unused variable · 711b8c87
      Florian Mickler 提交于
      In case !CONFIG_ACPI_NUMA and !CONFIG_AMD_NUMA gcc emits a warning
      about the unused variable ret.
      
      As that variable is in fact not needed I choose to remove it.
      Signed-off-by: NFlorian Mickler <florian@mickler.org>
      LKML-Reference: <1301843624-22364-1-git-send-email-florian@mickler.org>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      711b8c87
  6. 01 4月, 2011 1 次提交
    • T
      x86-64, NUMA: Remove custom phys_to_nid() implementation · 05293608
      Tejun Heo 提交于
      phys_to_nid() maps physical address to NUMA node id.  This is
      implemented by building perfect hash in compute_hash_shift() during
      initialization.
      
      However, with SPARSE memory model, the nid is encoded in page flags.
      The perfect hash implementation was for DISCONTIG memory model which
      got removed years ago by b263295d (x86: 64-bit, make sparsemem
      vmemmap the only memory model).
      
      So, the perfect hash ends up being used only during initialization
      when the core SPARSE code already provides perfectly acceptable
      generic early_pfn_to_nid() implementation.
      
      Drop phys_to_nid() and use the generic ealry_pfn_to_nid() instead.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      05293608
  7. 24 3月, 2011 3 次提交
  8. 20 3月, 2011 1 次提交
    • Y
      x86: Cleanup highmap after brk is concluded · e5f15b45
      Yinghai Lu 提交于
      Now cleanup_highmap actually is in two steps: one is early in head64.c
      and only clears above _end; a second one is in init_memory_mapping() and
      tries to clean from _brk_end to _end.
      It should check if those boundaries are PMD_SIZE aligned but currently
      does not.
      Also init_memory_mapping() is called several times for numa or memory
      hotplug, so we really should not handle initial kernel mappings there.
      
      This patch moves cleanup_highmap() down after _brk_end is settled so
      we can do everything in one step.
      Also we honor max_pfn_mapped in the implementation of cleanup_highmap.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
      LKML-Reference: <alpine.DEB.2.00.1103171739050.3382@kaball-desktop>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      e5f15b45
  9. 18 3月, 2011 2 次提交
    • S
      x86: Flush TLB if PGD entry is changed in i386 PAE mode · 4981d01e
      Shaohua Li 提交于
      According to intel CPU manual, every time PGD entry is changed in i386 PAE
      mode, we need do a full TLB flush. Current code follows this and there is
      comment for this too in the code.
      
      But current code misses the multi-threaded case. A changed page table
      might be used by several CPUs, every such CPU should flush TLB. Usually
      this isn't a problem, because we prepopulate all PGD entries at process
      fork. But when the process does munmap and follows new mmap, this issue
      will be triggered.
      
      When it happens, some CPUs keep doing page faults:
      
        http://marc.info/?l=linux-kernel&m=129915020508238&w=2
      
      Reported-by: Yasunori Goto<y-goto@jp.fujitsu.com>
      Tested-by: Yasunori Goto<y-goto@jp.fujitsu.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: Shaohua Li<shaohua.li@intel.com>
      Cc: Mallick Asit K <asit.k.mallick@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: stable <stable@kernel.org>
      LKML-Reference: <1300246649.2337.95.camel@sli10-conroe>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4981d01e
    • L
      x86: Fix common misspellings · 0d2eb44f
      Lucas De Marchi 提交于
      They were generated by 'codespell' and then manually reviewed.
      Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: trivial@kernel.org
      LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0d2eb44f
  10. 15 3月, 2011 1 次提交
  11. 12 3月, 2011 1 次提交
  12. 10 3月, 2011 1 次提交