1. 02 10月, 2010 1 次提交
  2. 03 8月, 2010 2 次提交
  3. 29 7月, 2010 1 次提交
    • C
      slub numa: Fix rare allocation from unexpected node · bc6488e9
      Christoph Lameter 提交于
      The network developers have seen sporadic allocations resulting in objects
      coming from unexpected NUMA nodes despite asking for objects from a
      specific node.
      
      This is due to get_partial() calling get_any_partial() if partial
      slabs are exhausted for a node even if a node was specified and therefore
      one would expect allocations only from the specified node.
      
      get_any_partial() sporadically may return a slab from a foreign
      node to gradually reduce the size of partial lists on remote nodes
      and thereby reduce total memory use for a slab cache.
      
      The behavior is controlled by the remote_defrag_ratio of each cache.
      
      Strictly speaking this is permitted behavior since __GFP_THISNODE was
      not specified for the allocation but it is certain surprising.
      
      This patch makes sure that the remote defrag behavior only occurs
      if no node was specified.
      Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      bc6488e9
  4. 16 7月, 2010 5 次提交
  5. 09 6月, 2010 1 次提交
  6. 25 5月, 2010 2 次提交
    • M
      cpuset,mm: fix no node to alloc memory when changing cpuset's mems · c0ff7453
      Miao Xie 提交于
      Before applying this patch, cpuset updates task->mems_allowed and
      mempolicy by setting all new bits in the nodemask first, and clearing all
      old unallowed bits later.  But in the way, the allocator may find that
      there is no node to alloc memory.
      
      The reason is that cpuset rebinds the task's mempolicy, it cleans the
      nodes which the allocater can alloc pages on, for example:
      
      (mpol: mempolicy)
      	task1			task1's mpol	task2
      	alloc page		1
      	  alloc on node0? NO	1
      				1		change mems from 1 to 0
      				1		rebind task1's mpol
      				0-1		  set new bits
      				0	  	  clear disallowed bits
      	  alloc on node1? NO	0
      	  ...
      	can't alloc page
      	  goto oom
      
      This patch fixes this problem by expanding the nodes range first(set newly
      allowed bits) and shrink it lazily(clear newly disallowed bits).  So we
      use a variable to tell the write-side task that read-side task is reading
      nodemask, and the write-side task clears newly disallowed nodes after
      read-side task ends the current memory allocation.
      
      [akpm@linux-foundation.org: fix spello]
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Paul Menage <menage@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Ravikiran Thirumalai <kiran@scalex86.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c0ff7453
    • A
      slub: move kmem_cache_node into it's own cacheline · 73367bd8
      Alexander Duyck 提交于
      This patch is meant to improve the performance of SLUB by moving the local
      kmem_cache_node lock into it's own cacheline separate from kmem_cache.
      This is accomplished by simply removing the local_node when NUMA is enabled.
      
      On my system with 2 nodes I saw around a 5% performance increase w/
      hackbench times dropping from 6.2 seconds to 5.9 seconds on average.  I
      suspect the performance gain would increase as the number of nodes
      increases, but I do not have the data to currently back that up.
      
      Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713
      Cc: <stable@kernel.org>
      Reported-by: NAlex Shi <alex.shi@intel.com>
      Tested-by: NAlex Shi <alex.shi@intel.com>
      Acked-by: NYanmin Zhang <yanmin_zhang@linux.intel.com>
      Acked-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      73367bd8
  7. 22 5月, 2010 3 次提交
  8. 20 5月, 2010 1 次提交
  9. 06 5月, 2010 1 次提交
  10. 10 4月, 2010 1 次提交
  11. 08 3月, 2010 2 次提交
  12. 04 3月, 2010 1 次提交
    • S
      SLUB: Fix per-cpu merge conflict · 1154fab7
      Stephen Rothwell 提交于
      The slab tree adds a percpu variable usage case (commit
      9dfc6e68 "SLUB: Use this_cpu operations in
      slub"), but the percpu tree removes the prefixing of percpu variables (commit
      dd17c8f7 "percpu: remove per_cpu__ prefix"),
      thus causing the following compilation error:
      
          CC      mm/slub.o
        mm/slub.c: In function ‘alloc_kmem_cache_cpus’:
        mm/slub.c:2078: error: implicit declaration of function ‘per_cpu_var’
        mm/slub.c:2078: warning: assignment makes pointer from integer without a cast
        make[1]: *** [mm/slub.o] Error 1
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      1154fab7
  13. 27 2月, 2010 1 次提交
  14. 05 2月, 2010 1 次提交
  15. 23 1月, 2010 2 次提交
  16. 20 12月, 2009 4 次提交
    • C
      SLUB: Make slub statistics use this_cpu_inc · 84e554e6
      Christoph Lameter 提交于
      this_cpu_inc() translates into a single instruction on x86 and does not
      need any register. So use it in stat(). We also want to avoid the
      calculation of the per cpu kmem_cache_cpu structure pointer. So pass
      a kmem_cache pointer instead of a kmem_cache_cpu pointer.
      Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      84e554e6
    • C
      SLUB: this_cpu: Remove slub kmem_cache fields · ff12059e
      Christoph Lameter 提交于
      Remove the fields in struct kmem_cache_cpu that were used to cache data from
      struct kmem_cache when they were in different cachelines. The cacheline that
      holds the per cpu array pointer now also holds these values. We can cut down
      the struct kmem_cache_cpu size to almost half.
      
      The get_freepointer() and set_freepointer() functions that used to be only
      intended for the slow path now are also useful for the hot path since access
      to the size field does not require accessing an additional cacheline anymore.
      This results in consistent use of functions for setting the freepointer of
      objects throughout SLUB.
      
      Also we initialize all possible kmem_cache_cpu structures when a slab is
      created. No need to initialize them when a processor or node comes online.
      Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      ff12059e
    • C
      SLUB: Get rid of dynamic DMA kmalloc cache allocation · 756dee75
      Christoph Lameter 提交于
      Dynamic DMA kmalloc cache allocation is troublesome since the
      new percpu allocator does not support allocations in atomic contexts.
      Reserve some statically allocated kmalloc_cpu structures instead.
      Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      756dee75
    • C
      SLUB: Use this_cpu operations in slub · 9dfc6e68
      Christoph Lameter 提交于
      Using per cpu allocations removes the needs for the per cpu arrays in the
      kmem_cache struct. These could get quite big if we have to support systems
      with thousands of cpus. The use of this_cpu_xx operations results in:
      
      1. The size of kmem_cache for SMP configuration shrinks since we will only
         need 1 pointer instead of NR_CPUS. The same pointer can be used by all
         processors. Reduces cache footprint of the allocator.
      
      2. We can dynamically size kmem_cache according to the actual nodes in the
         system meaning less memory overhead for configurations that may potentially
         support up to 1k NUMA nodes / 4k cpus.
      
      3. We can remove the diddle widdle with allocating and releasing of
         kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu
         alloc logic will do it all for us. Removes some portions of the cpu hotplug
         functionality.
      
      4. Fastpath performance increases since per cpu pointer lookups and
         address calculations are avoided.
      
      V7-V8
      - Convert missed get_cpu_slab() under CONFIG_SLUB_STATS
      Signed-off-by: NChristoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      9dfc6e68
  17. 11 12月, 2009 1 次提交
  18. 29 11月, 2009 1 次提交
  19. 16 10月, 2009 1 次提交
    • D
      slub: allow stats to be cleared · 78eb00cc
      David Rientjes 提交于
      When collecting slub stats for particular workloads, it's necessary to
      collect each statistic for all caches before the job is even started
      because the counters are usually greater than zero just from boot and
      initialization.
      
      This allows a statistic to be cleared on each cpu by writing '0' to its
      sysfs file.  This creates a baseline for statistics of interest before
      the workload is started.
      
      Setting a statistic to a particular value is not supported, so all values
      written to these files other than '0' returns -EINVAL.
      
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      78eb00cc
  20. 22 9月, 2009 1 次提交
  21. 16 9月, 2009 1 次提交
  22. 14 9月, 2009 1 次提交
  23. 04 9月, 2009 2 次提交
  24. 30 8月, 2009 1 次提交
  25. 20 8月, 2009 1 次提交
  26. 19 8月, 2009 1 次提交