1. 24 12月, 2010 4 次提交
    • D
      x86, numa: Fix cpu to node mapping for sparse node ids · a387e95a
      David Rientjes 提交于
      NUMA boot code assumes that physical node ids start at 0, but the DIMMs
      that the apic id represents may not be reachable.  If this is the case,
      node 0 is never online and cpus never end up getting appropriately
      assigned to a node.  This causes the cpumask of all online nodes to be
      empty and machines crash with kernel code assuming online nodes have
      valid cpus.
      
      The fix is to appropriately map all the address ranges for physical nodes
      and ensure the cpu to node mapping function checks all possible nodes (up
      to MAX_NUMNODES) instead of simply checking nodes 0-N, where N is the
      number of physical nodes, for valid address ranges.
      
      This requires no longer "compressing" the address ranges of nodes in the
      physical node map from 0-N, but rather leave indices in physnodes[] to
      represent the actual node id of the physical node.  Accordingly, the
      topology exported by both amd_get_nodes() and acpi_get_nodes() no longer
      must return the number of nodes to iterate through; all such iterations
      will now be to MAX_NUMNODES.
      
      This change also passes the end address of system RAM (which may be
      different from normal operation if mem= is specified on the command line)
      before the physnodes[] array is populated.  ACPI parsed nodes are
      truncated to fit within the address range that respect the mem=
      boundaries and even some physical nodes may become unreachable in such
      cases.
      
      When NUMA emulation does succeed, any apicid to node mapping that exists
      for unreachable nodes are given default values so that proximity domains
      can still be assigned.  This is important for node_distance() to
      function as desired.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      LKML-Reference: <alpine.DEB.2.00.1012221702090.3701@chino.kir.corp.google.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      a387e95a
    • D
      x86, numa: Fake apicid and pxm mappings for NUMA emulation · f51bf307
      David Rientjes 提交于
      This patch adds the equivalent of acpi_fake_nodes() for AMD Northbridge
      platforms.  The goal is to fake the apicid-to-node mappings for NUMA
      emulation so the physical topology of the machine is correctly maintained
      within the kernel.
      
      This change also fakes proximity domains for both ACPI and k8 code so the
      physical distance between emulated nodes is maintained via
      node_distance().  This exports the correct distances via
      /sys/devices/system/node/.../distance based on the underlying topology.
      
      A new helper function, fake_physnodes(), is introduced to correctly
      invoke the correct NUMA code to fake these two mappings based on the
      system type.  If there is no underlying NUMA configuration, all cpus are
      mapped to node 0 for local distance.
      
      Since acpi_fake_nodes() is no longer called with CONFIG_ACPI_NUMA, it's
      prototype can be removed from the header file for such a configuration.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      LKML-Reference: <alpine.DEB.2.00.1012221701360.3701@chino.kir.corp.google.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      f51bf307
    • D
      x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU · 4e76f4e6
      David Rientjes 提交于
      Both acpi_get_nodes() and amd_get_nodes() are only necessary when
      CONFIG_NUMA_EMU is enabled, so avoid compiling them when the option is
      disabled.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      LKML-Reference: <alpine.DEB.2.00.1012221701210.3701@chino.kir.corp.google.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      4e76f4e6
    • D
      x86, numa: Reduce minimum fake node size to 32M · 34dc9e74
      David Rientjes 提交于
      This patch changes the minimum fake node size from 64MB to 32MB so it is
      possible to test NUMA code at a greater scale on smaller machines
      (64 nodes on a 2G machine, 1024 nodes on 32G machine with
      CONFIG_NODES_SHIFT=10).
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      LKML-Reference: <alpine.DEB.2.00.1012221700590.3701@chino.kir.corp.google.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      34dc9e74
  2. 18 11月, 2010 3 次提交
  3. 10 11月, 2010 2 次提交
    • J
      x86, UV: Update node controller MMRs · 62b0cfc2
      Jack Steiner 提交于
      A new version of the SGI UV hub node controller is being
      developed. A few of the MMRs (control registers) that exist on
      the current hub no longer exist on the new hub. Fortunately,
      there are alternate MMRs that are are functionally equivalent
      and that exist on both hubs.
      
      This patch changes the UV code to use MMRs that exist in BOTH
      versions of the hub node controller.
      Signed-off-by: NJack Steiner <steiner@sgi.com>
      LKML-Reference: <20101106204056.GA27584@sgi.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      62b0cfc2
    • A
      x86: Address gcc4.6 "set but not used" warnings in apic.h · 0059b243
      Andi Kleen 提交于
      native_apic_msr_read() and x2apic_enabled() use rdmsr(msr, low, high),
      but only use the low part.
      
      gcc4.6 complains about this:
      .../apic.h:144:11: warning: variable 'high' set but not used [-Wunused-but-set-variable]
      
      rdmsr() is just a wrapper around rdmsrl() which splits the 64bit value
      into low and high, so using rdmsrl() directly solves this.
      
      [tglx: Changed the variables to u64 as suggested by Cyrill. It's less
             confusing and has no code impact as this is 64bit only anyway.
             Massaged changelog as well. ]
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: x86@kernel.org
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <1289251229-19589-1-git-send-email-andi@firstfloor.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      0059b243
  4. 27 10月, 2010 4 次提交
    • B
      x86-32: Allocate irq stacks seperate from percpu area · 22d4cd4c
      Brian Gerst 提交于
      The percpu allocator cannot handle alignments larger than one
      page. Allocate the irq stacks seperately, and only keep the
      pointers as percpu data.
      Signed-off-by: NBrian Gerst <brgerst@gmail.com>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Cc: tj@kernel.org
      LKML-Reference: <1288158182-1753-1-git-send-email-brgerst@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      22d4cd4c
    • P
      mm: remove pte_*map_nested() · ece0e2b6
      Peter Zijlstra 提交于
      Since we no longer need to provide KM_type, the whole pte_*map_nested()
      API is now redundant, remove it.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ece0e2b6
    • P
      mm: stack based kmap_atomic() · 3e4d3af5
      Peter Zijlstra 提交于
      Keep the current interface but ignore the KM_type and use a stack based
      approach.
      
      The advantage is that we get rid of crappy code like:
      
      	#define __KM_PTE			\
      		(in_nmi() ? KM_NMI_PTE : 	\
      		 in_irq() ? KM_IRQ_PTE :	\
      		 KM_PTE0)
      
      and in general can stop worrying about what context we're in and what kmap
      slots might be appropriate for that.
      
      The downside is that FRV kmap_atomic() gets more expensive.
      
      For now we use a CPP trick suggested by Andrew:
      
        #define kmap_atomic(page, args...) __kmap_atomic(page)
      
      to avoid having to touch all kmap_atomic() users in a single patch.
      
      [ not compiled on:
        - mn10300: the arch doesn't actually build with highmem to begin with ]
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
      Acked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NChris Metcalf <cmetcalf@tilera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3e4d3af5
    • R
      x86, uv: Enable Westmere support on SGI UV · c8f730b1
      Russ Anderson 提交于
      Enable Westmere support on SGI UV.  The UV initialization code is dependent on
      the APICID bits.  Westmere-EX uses different APIC bit mapping than Nehalem-EX.
      This code reads the apic shift value from a UV MMR to do the proper bit
      decoding to determint the pnode.
      Signed-off-by: NRuss Anderson <rja@sgi.com>
      LKML-Reference: <20101026212728.GB15071@sgi.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      c8f730b1
  5. 24 10月, 2010 27 次提交