1. 23 November 2015, 1 commit
    • slab/slub: adjust kmem_cache_alloc_bulk API · 865762a8
      Committed by Jesper Dangaard Brouer
      Adjust kmem_cache_alloc_bulk API before we have any real users.
      
      Adjust the API to return type 'int' instead of the previous 'bool'.  This
      is done to allow future extension of the bulk alloc API.
      
      A future extension could be to allow SLUB to stop at a page boundary, when
      specified by a flag, and then return the number of objects.
      
      The advantage of this approach is that it would make it easier to run bulk
      alloc without local IRQs disabled, using cmpxchg to "steal" the entire
      c->freelist or page->freelist.  To avoid overshooting we would stop
      processing at a slab-page boundary; otherwise we would always end up
      returning some objects at the cost of another cmpxchg.
      
      To stay compatible with future users of this API that link against an older
      kernel while using the new flag, we need this API change to return the
      number of allocated objects.
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      865762a8
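      A minimal caller-side sketch of the int-returning API described in this
      entry. This is not from the commit: the cache name, object type, and sizes
      are illustrative, and at this point the call is still effectively
      all-or-nothing (the page-boundary flag is only a possible future extension).

      #include <linux/errno.h>
      #include <linux/kernel.h>
      #include <linux/slab.h>

      struct demo_obj {
              int id;
      };

      static int demo_bulk(void)
      {
              struct kmem_cache *cache;
              void *objs[16];
              int got;

              cache = kmem_cache_create("demo_obj", sizeof(struct demo_obj),
                                        0, 0, NULL);
              if (!cache)
                      return -ENOMEM;

              /* After this change the return type is int: 0 on failure,
               * otherwise the number of objects placed in objs[]. */
              got = kmem_cache_alloc_bulk(cache, GFP_KERNEL, ARRAY_SIZE(objs), objs);
              if (!got) {
                      kmem_cache_destroy(cache);
                      return -ENOMEM;
              }

              /* ... use objs[0..got-1] ... */

              kmem_cache_free_bulk(cache, got, objs);
              kmem_cache_destroy(cache);
              return 0;
      }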
  2. 09 September 2015, 1 commit
    • mm: rename alloc_pages_exact_node() to __alloc_pages_node() · 96db800f
      Committed by Vlastimil Babka
      alloc_pages_exact_node() was introduced in commit 6484eb3e ("page
      allocator: do not check NUMA node ID when the caller knows the node is
      valid") as an optimized variant of alloc_pages_node(), that doesn't
      fallback to current node for nid == NUMA_NO_NODE.  Unfortunately the
      name of the function can easily suggest that the allocation is
      restricted to the given node and fails otherwise.  In truth, the node is
      only preferred, unless __GFP_THISNODE is passed among the gfp flags.
      
      The misleading name has led to mistakes in the past; see for example
      commits 5265047a ("mm, thp: really limit transparent hugepage
      allocation to local node") and b360edb4 ("mm, mempolicy:
      migrate_to_node should only migrate to node").
      
      Another issue with the name is that there's a family of
      alloc_pages_exact*() functions where 'exact' means exact size (instead
      of page order), which leads to more confusion.
      
      To prevent further mistakes, this patch effectively renames
      alloc_pages_exact_node() to __alloc_pages_node() to better convey that
      it's an optimized variant of alloc_pages_node() not intended for general
      usage.  Both functions get described in comments.
      
      Providing a true convenience function for allocations restricted to a node
      was also considered, but the prevailing opinion is that __GFP_THISNODE
      already provides that functionality and we shouldn't duplicate the API
      needlessly.  The number of users would be small anyway.
      
      Existing callers of alloc_pages_exact_node() are simply converted to
      call __alloc_pages_node(), with the exception of sba_alloc_coherent()
      which open-codes the check for NUMA_NO_NODE, so it is converted to use
      alloc_pages_node() instead.  This means it no longer performs some
      VM_BUG_ON checks, and since the current check for nid in
      alloc_pages_node() uses a 'nid < 0' comparison (which includes
      NUMA_NO_NODE), it may hide wrong values which would previously have been
      exposed.
      
      Both differences will be rectified by the next patch.
      
      To sum up, this patch makes no functional changes, except temporarily
      hiding potentially buggy callers.  Restricting the checks in
      alloc_pages_node() is left for the next patch which can in turn expose
      more existing buggy callers.
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Robin Holt <robinmholt@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Cliff Whickman <cpw@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      96db800f
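      An illustrative sketch of the distinction described in this entry; it is not
      from the commit, and GFP_KERNEL plus order 0 are arbitrary choices here.

      #include <linux/gfp.h>

      /* Preference only: may fall back to other nodes; nid may be NUMA_NO_NODE,
       * which means "use the current node". */
      static struct page *demo_alloc_preferred(int nid)
      {
              return alloc_pages_node(nid, GFP_KERNEL, 0);
      }

      /* Hard restriction: __GFP_THISNODE forbids falling back, so this fails if
       * node 'nid' has no free memory of the requested order. */
      static struct page *demo_alloc_exact(int nid)
      {
              return alloc_pages_node(nid, GFP_KERNEL | __GFP_THISNODE, 0);
      }

      /* The renamed __alloc_pages_node() is the optimized variant for callers
       * that already know 'nid' is a valid node id (no NUMA_NO_NODE handling). */
      static struct page *demo_alloc_known_node(int nid)
      {
              return __alloc_pages_node(nid, GFP_KERNEL, 0);
      }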
  3. 05 September 2015, 1 commit
    • slab: infrastructure for bulk object allocation and freeing · 484748f0
      Committed by Christoph Lameter
      Add the basic infrastructure for alloc/free operations on pointer arrays.
      It includes a generic function in the common slab code that this
      infrastructure patch uses to provide the unoptimized version of the slab
      bulk operations.
      
      Allocators can then provide optimized allocation functions for situations
      in which large numbers of objects are needed.  These optimizations may
      avoid taking locks repeatedly and bypass metadata creation if all objects
      in slab pages can be used to provide the objects required.
      
      Allocators can extend the skeletons provided and add their own code to the
      bulk alloc and free functions.  They can keep the generic allocation and
      freeing and simply fall back to it when the optimizations would not work
      (for example, when debugging is on).
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      484748f0
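      Conceptually, the generic fallback described in this entry amounts to a loop
      over the existing single-object entry points. The following is a simplified
      sketch of that idea, not the exact code added by the patch (which at this
      stage returns bool).

      #include <linux/slab.h>

      /* Unoptimized bulk allocator built on the existing single-object calls;
       * a specific allocator can replace it with a fast path that takes its
       * locks once and carves objects out of whole slab pages. */
      static bool demo_bulk_alloc(struct kmem_cache *s, gfp_t flags,
                                  size_t nr, void **p)
      {
              size_t i;

              for (i = 0; i < nr; i++) {
                      p[i] = kmem_cache_alloc(s, flags);
                      if (!p[i]) {
                              /* Undo partial work so the caller sees all-or-nothing. */
                              while (i--)
                                      kmem_cache_free(s, p[i]);
                              return false;
                      }
              }
              return true;
      }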
  4. 15 April 2015, 1 commit
  5. 13 February 2015, 1 commit
    • slub: make dead caches discard free slabs immediately · d6e0b7fa
      Committed by Vladimir Davydov
      To speed up further allocations SLUB may store empty slabs in per cpu/node
      partial lists instead of freeing them immediately.  This prevents the
      destruction of per-memcg caches, because kmem caches created for a memory cgroup
      are only destroyed after the last page charged to the cgroup is freed.
      
      To fix this issue, this patch resurrects the approach first proposed in [1].
      It forbids SLUB from caching empty slabs after the memory cgroup that the
      cache belongs to has been destroyed.  This is achieved by setting kmem_cache's
      cpu_partial and min_partial constants to 0 and tuning put_cpu_partial() so
      that it would drop frozen empty slabs immediately if cpu_partial = 0.
      
      The runtime overhead is minimal.  Of all the hot functions, we only
      touch the relatively cold put_cpu_partial(): we make it call
      unfreeze_partials() after freezing a slab that belongs to an offline
      memory cgroup.  Since slab freezing exists to avoid moving slabs from/to a
      partial list on free/alloc, and there can't be allocations from dead
      caches, it shouldn't cause any overhead.  We do have to disable preemption
      for put_cpu_partial() to achieve that though.
      
      The original patch was accepted well and even merged to the mm tree.
      However, I decided to withdraw it due to changes happening to the memcg
      core at that time.  I had an idea of introducing per-memcg shrinkers for
      kmem caches, but now, as memcg has finally settled down, I do not see it
      as an option, because SLUB shrinker would be too costly to call since SLUB
      does not keep free slabs on a separate list.  Besides, we currently do not
      even call per-memcg shrinkers for offline memcgs.  Overall, it would
      introduce much more complexity to both SLUB and memcg than this small
      patch.
      
      As for SLAB, there is no problem with it, because it shrinks its
      per-cpu/node caches periodically.  Thanks to list_lru reparenting, we no
      longer keep entries for offline cgroups in per-memcg arrays (such as
      memcg_cache_params->memcg_caches), so we do not have to bother if a
      per-memcg cache will be shrunk a bit later than it could be.
      
      [1] http://thread.gmane.org/gmane.linux.kernel.mm/118649/focus=118650
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d6e0b7fa
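      A heavily simplified, self-contained sketch of the policy this patch
      introduces. The types and helper names below are hypothetical stand-ins;
      the real logic lives in SLUB's put_cpu_partial()/unfreeze_partials().

      /* Hypothetical stand-ins for SLUB internals, only to make the sketch read. */
      struct demo_cache {
              unsigned int cpu_partial;  /* set to 0 once the owning memcg is offline */
      };

      struct demo_slab {
              int nr_free_objects;
      };

      static void demo_release_slab(struct demo_cache *c, struct demo_slab *s)
      {
              /* hand the now-empty slab page back to the page allocator */
      }

      static void demo_stash_partial(struct demo_cache *c, struct demo_slab *s)
      {
              /* keep the frozen slab on the per-cpu partial list for reuse */
      }

      static void demo_put_cpu_partial(struct demo_cache *c, struct demo_slab *empty_slab)
      {
              if (!c->cpu_partial) {
                      /* Dead cache: discard empty slabs immediately so the last
                       * charged page goes away and the memcg cache can be destroyed. */
                      demo_release_slab(c, empty_slab);
                      return;
              }
              demo_stash_partial(c, empty_slab);
      }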
  6. 10 October 2014, 1 commit
  7. 05 June 2014, 1 commit
    • slab: get_online_mems for kmem_cache_{create,destroy,shrink} · 03afc0e2
      Committed by Vladimir Davydov
      When we create a sl[au]b cache, we allocate kmem_cache_node structures
      for each online NUMA node.  To handle nodes taken online/offline, we
      register a memory hotplug notifier and allocate/free the kmem_cache_node
      corresponding to the node that changes its state, for each kmem cache.
      
      To synchronize between the two paths we hold the slab_mutex both during
      the cache creation/destruction path and while tuning the per-node parts of
      kmem caches in the memory hotplug handler, but that's not quite right,
      because it does not guarantee that a newly created cache will have all
      kmem_cache_nodes initialized in case it races with memory hotplug.  For
      instance, in case of slub:
      
          CPU0                            CPU1
          ----                            ----
          kmem_cache_create:              online_pages:
           __kmem_cache_create:            slab_memory_callback:
                                            slab_mem_going_online_callback:
                                             lock slab_mutex
                                             for each slab_caches list entry
                                                 allocate kmem_cache node
                                             unlock slab_mutex
            lock slab_mutex
            init_kmem_cache_nodes:
             for_each_node_state(node, N_NORMAL_MEMORY)
                 allocate kmem_cache node
            add kmem_cache to slab_caches list
            unlock slab_mutex
                                          online_pages (continued):
                                           node_states_set_node
      
      As a result we'll get a kmem cache with some of its kmem_cache_nodes
      left unallocated.
      
      To avoid issues like that we should hold get/put_online_mems() during
      the whole kmem cache creation/destruction/shrink paths, just like we
      deal with cpu hotplug.  This patch does the trick.
      
      Note that once this is applied there is no need to take the slab_mutex
      in kmem_cache_shrink() any more, so it is removed from there.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      03afc0e2
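      A sketch of the locking pattern this patch adopts (simplified and not the
      merged code; the real changes are inside mm/slab_common.c, and the local
      mutex below merely stands in for the slab_mutex declared in mm/slab.h).

      #include <linux/cpu.h>
      #include <linux/memory_hotplug.h>   /* get_online_mems()/put_online_mems() */
      #include <linux/mutex.h>
      #include <linux/slab.h>

      static DEFINE_MUTEX(demo_slab_mutex);   /* stand-in for the real slab_mutex */

      struct kmem_cache *demo_kmem_cache_create(const char *name, size_t size,
                                                size_t align, unsigned long flags,
                                                void (*ctor)(void *))
      {
              struct kmem_cache *s = NULL;

              get_online_cpus();
              get_online_mems();              /* pin the set of online memory nodes */
              mutex_lock(&demo_slab_mutex);

              /* ... duplicate-name checks and __kmem_cache_create() go here; with
               * hotplug pinned, the N_NORMAL_MEMORY node mask cannot change while
               * the per-node kmem_cache_node structures are being allocated ... */

              mutex_unlock(&demo_slab_mutex);
              put_online_mems();
              put_online_cpus();

              return s;
      }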
  8. 11 April 2014, 1 commit
  9. 05 September 2013, 1 commit
    • mm/sl[aou]b: Move kmallocXXX functions to common code · f1b6eb6e
      Committed by Christoph Lameter
      The kmalloc* functions of all slab allocators are similar now, so
      let's move them into slab.h. This requires some function naming changes
      in slob.
      
      As a result of this patch there is a common set of functions for
      all allocators. It also means that kmalloc_large() is now generally
      available to perform large-order allocations that go directly
      via the page allocator. kmalloc_large() can be substituted when
      kmalloc() warns because an allocation is too large.
      
      kmalloc_large() has exactly the same semantics as kmalloc() but
      can only be used for allocations larger than PAGE_SIZE.
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      f1b6eb6e
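      A small hedged sketch of the substitution described in this entry (the
      threshold check and function name are illustrative; kfree() releases either
      allocation).

      #include <linux/mm.h>
      #include <linux/slab.h>

      /* For sizes known to exceed PAGE_SIZE, go straight to kmalloc_large(); it
       * has kmalloc() semantics but hands the request directly to the page
       * allocator, avoiding the too-large-allocation warnings from kmalloc(). */
      static void *demo_big_buffer(size_t bytes, gfp_t gfp)
      {
              if (bytes <= PAGE_SIZE)
                      return kmalloc(bytes, gfp);

              return kmalloc_large(bytes, gfp);
      }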
  10. 08 July 2013, 1 commit
    • slob: Check for NULL pointer before calling ctor() · c1e854e9
      Committed by Steven Rostedt
      While doing some code inspection, I noticed that the slob constructor
      method can be called with a NULL pointer. If memory is tight and slob
      fails to allocate with slob_alloc() or slob_new_pages() it still calls
      the ctor() method with a NULL pointer. Looking at the first ctor()
      method I found, I noticed that it cannot handle a NULL pointer (I'm
      sure others probably can't either):
      
      static void sighand_ctor(void *data)
      {
              struct sighand_struct *sighand = data;
      
              spin_lock_init(&sighand->siglock);
              init_waitqueue_head(&sighand->signalfd_wqh);
      }
      
      The solution is to only call the ctor() method if allocation succeeded.
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      c1e854e9
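      The fix boils down to a one-line guard; below is a self-contained sketch of
      the pattern. The types and names are hypothetical, not slob's; the real
      change is in mm/slob.c.

      #include <linux/stddef.h>

      struct demo_cache {
              unsigned long size;
              void (*ctor)(void *);
      };

      /* Stand-in for slob_alloc()/slob_new_pages(); may return NULL under memory
       * pressure, which is exactly the case the fix guards against. */
      static void *demo_low_level_alloc(struct demo_cache *c, unsigned int flags)
      {
              return NULL;
      }

      static void *demo_cache_alloc(struct demo_cache *c, unsigned int flags)
      {
              void *b = demo_low_level_alloc(c, flags);

              /* The fix: run the constructor only when allocation succeeded.
               * Previously the ctor was invoked even when b == NULL. */
              if (b && c->ctor)
                      c->ctor(b);

              return b;
      }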
  11. 07 July 2013, 1 commit
  12. 24 February 2013, 1 commit
  13. 19 December 2012, 1 commit
  14. 11 December 2012, 1 commit
  15. 31 October 2012, 5 commits
  16. 10 October 2012, 1 commit
    • mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN · baaf1dd4
      Committed by Arnd Bergmann
      The definition of ARCH_SLAB_MINALIGN is architecture dependent
      and can be either of type size_t or int. Comparing that value
      with ARCH_KMALLOC_MINALIGN can cause harmless warnings on
      platforms where they are different. Since both are always
      small positive integer numbers, using the size_t type to compare
      them is safe and gets rid of the warning.
      
      Without this patch, building ARM collie_defconfig results in:
      
      mm/slob.c: In function '__kmalloc_node':
      mm/slob.c:431:152: warning: comparison of distinct pointer types lacks a cast [enabled by default]
      mm/slob.c: In function 'kfree':
      mm/slob.c:484:153: warning: comparison of distinct pointer types lacks a cast [enabled by default]
      mm/slob.c: In function 'ksize':
      mm/slob.c:503:153: warning: comparison of distinct pointer types lacks a cast [enabled by default]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      baaf1dd4
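      For context: the kernel's plain min()/max() macros type-check their
      arguments (a '(void)(&a == &b)' comparison), which is what emits the
      "comparison of distinct pointer types" warning when the two sides have
      different types; min_t()/max_t() cast both sides to one named type first.
      A hedged sketch of the kind of change involved, using stand-in constants
      rather than the real ARCH_* macros:

      #include <linux/kernel.h>       /* min(), min_t() */

      /* Stand-ins: on some architectures one minalign constant is size_t while
       * the other is int, which is the mismatch the patch addresses. */
      #define DEMO_SLAB_MINALIGN      ((size_t)8)
      #define DEMO_KMALLOC_MINALIGN   4

      static size_t demo_align(void)
      {
              /* min(DEMO_KMALLOC_MINALIGN, DEMO_SLAB_MINALIGN) would trigger
               * "comparison of distinct pointer types lacks a cast"; forcing a
               * common type is safe since both are small positive values. */
              return min_t(size_t, DEMO_KMALLOC_MINALIGN, DEMO_SLAB_MINALIGN);
      }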
  17. 26 September 2012, 1 commit
    • mm, slob: fix build breakage in __kmalloc_node_track_caller · 82bd5508
      Committed by David Rientjes
      On Sat, 8 Sep 2012, Ezequiel Garcia wrote:
      
      > @@ -454,15 +455,35 @@ void *__kmalloc_node(size_t size, gfp_t gfp, int node)
      >  			gfp |= __GFP_COMP;
      >  		ret = slob_new_pages(gfp, order, node);
      >
      > -		trace_kmalloc_node(_RET_IP_, ret,
      > +		trace_kmalloc_node(caller, ret,
      >  				   size, PAGE_SIZE << order, gfp, node);
      >  	}
      >
      >  	kmemleak_alloc(ret, size, 1, gfp);
      >  	return ret;
      >  }
      > +
      > +void *__kmalloc_node(size_t size, gfp_t gfp, int node)
      > +{
      > +	return __do_kmalloc_node(size, gfp, node, _RET_IP_);
      > +}
      >  EXPORT_SYMBOL(__kmalloc_node);
      >
      > +#ifdef CONFIG_TRACING
      > +void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller)
      > +{
      > +	return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller);
      > +}
      > +
      > +#ifdef CONFIG_NUMA
      > +void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
      > +					int node, unsigned long caller)
      > +{
      > +	return __do_kmalloc_node(size, gfp, node, caller);
      > +}
      > +#endif
      
      This breaks Pekka's slab/next tree with this:
      
      mm/slob.c: In function '__kmalloc_node_track_caller':
      mm/slob.c:488: error: 'gfp' undeclared (first use in this function)
      mm/slob.c:488: error: (Each undeclared identifier is reported only once
      mm/slob.c:488: error: for each function it appears in.)
      
      mm, slob: fix build breakage in __kmalloc_node_track_caller
      
      "mm, slob: Add support for kmalloc_track_caller()" breaks the build
      because gfp is undeclared.  Fix it.
      Acked-by: Ezequiel Garcia <elezegarcia@gmail.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>
      82bd5508
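      For reference, a hedged sketch of the corrected function: the declared
      parameter is gfpflags, so that is what gets passed to __do_kmalloc_node()
      from the quoted patch (this may differ cosmetically from the fix as merged).

      #ifdef CONFIG_NUMA
      void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
                                        int node, unsigned long caller)
      {
              /* The broken version passed the undeclared identifier 'gfp'. */
              return __do_kmalloc_node(size, gfpflags, node, caller);
      }
      #endif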
  18. 25 September 2012, 2 commits
  19. 05 September 2012, 8 commits
  20. 12 July 2012, 1 commit
  21. 09 July 2012, 2 commits
  22. 14 June 2012, 4 commits
  23. 31 October 2011, 1 commit
  24. 27 July 2011, 1 commit