1. 10 10月, 2014 3 次提交
  2. 07 8月, 2014 8 次提交
  3. 04 7月, 2014 1 次提交
  4. 07 6月, 2014 1 次提交
  5. 05 6月, 2014 10 次提交
    • C
      mm: replace __get_cpu_var uses with this_cpu_ptr · 7c8e0181
      Christoph Lameter 提交于
      Replace places where __get_cpu_var() is used for an address calculation
      with this_cpu_ptr().
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7c8e0181
    • V
      memcg, slab: merge memcg_{bind,release}_pages to memcg_{un}charge_slab · c67a8a68
      Vladimir Davydov 提交于
      Currently we have two pairs of kmemcg-related functions that are called on
      slab alloc/free.  The first is memcg_{bind,release}_pages that count the
      total number of pages allocated on a kmem cache.  The second is
      memcg_{un}charge_slab that {un}charge slab pages to kmemcg resource
      counter.  Let's just merge them to keep the code clean.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c67a8a68
    • V
      slab: get_online_mems for kmem_cache_{create,destroy,shrink} · 03afc0e2
      Vladimir Davydov 提交于
      When we create a sl[au]b cache, we allocate kmem_cache_node structures
      for each online NUMA node.  To handle nodes taken online/offline, we
      register memory hotplug notifier and allocate/free kmem_cache_node
      corresponding to the node that changes its state for each kmem cache.
      
      To synchronize between the two paths we hold the slab_mutex during both
      the cache creationg/destruction path and while tuning per-node parts of
      kmem caches in memory hotplug handler, but that's not quite right,
      because it does not guarantee that a newly created cache will have all
      kmem_cache_nodes initialized in case it races with memory hotplug.  For
      instance, in case of slub:
      
          CPU0                            CPU1
          ----                            ----
          kmem_cache_create:              online_pages:
           __kmem_cache_create:            slab_memory_callback:
                                            slab_mem_going_online_callback:
                                             lock slab_mutex
                                             for each slab_caches list entry
                                                 allocate kmem_cache node
                                             unlock slab_mutex
            lock slab_mutex
            init_kmem_cache_nodes:
             for_each_node_state(node, N_NORMAL_MEMORY)
                 allocate kmem_cache node
            add kmem_cache to slab_caches list
            unlock slab_mutex
                                          online_pages (continued):
                                           node_states_set_node
      
      As a result we'll get a kmem cache with not all kmem_cache_nodes
      allocated.
      
      To avoid issues like that we should hold get/put_online_mems() during
      the whole kmem cache creation/destruction/shrink paths, just like we
      deal with cpu hotplug.  This patch does the trick.
      
      Note, that after it's applied, there is no need in taking the slab_mutex
      for kmem_cache_shrink any more, so it is removed from there.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      03afc0e2
    • V
      mem-hotplug: implement get/put_online_mems · bfc8c901
      Vladimir Davydov 提交于
      kmem_cache_{create,destroy,shrink} need to get a stable value of
      cpu/node online mask, because they init/destroy/access per-cpu/node
      kmem_cache parts, which can be allocated or destroyed on cpu/mem
      hotplug.  To protect against cpu hotplug, these functions use
      {get,put}_online_cpus.  However, they do nothing to synchronize with
      memory hotplug - taking the slab_mutex does not eliminate the
      possibility of race as described in patch 2.
      
      What we need there is something like get_online_cpus, but for memory.
      We already have lock_memory_hotplug, which serves for the purpose, but
      it's a bit of a hammer right now, because it's backed by a mutex.  As a
      result, it imposes some limitations to locking order, which are not
      desirable, and can't be used just like get_online_cpus.  That's why in
      patch 1 I substitute it with get/put_online_mems, which work exactly
      like get/put_online_cpus except they block not cpu, but memory hotplug.
      
      [ v1 can be found at https://lkml.org/lkml/2014/4/6/68.  I NAK'ed it by
        myself, because it used an rw semaphore for get/put_online_mems,
        making them dead lock prune.  ]
      
      This patch (of 2):
      
      {un}lock_memory_hotplug, which is used to synchronize against memory
      hotplug, is currently backed by a mutex, which makes it a bit of a
      hammer - threads that only want to get a stable value of online nodes
      mask won't be able to proceed concurrently.  Also, it imposes some
      strong locking ordering rules on it, which narrows down the set of its
      usage scenarios.
      
      This patch introduces get/put_online_mems, which are the same as
      get/put_online_cpus, but for memory hotplug, i.e.  executing a code
      inside a get/put_online_mems section will guarantee a stable value of
      online nodes, present pages, etc.
      
      lock_memory_hotplug()/unlock_memory_hotplug() are removed altogether.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bfc8c901
    • V
      mm: get rid of __GFP_KMEMCG · 52383431
      Vladimir Davydov 提交于
      Currently to allocate a page that should be charged to kmemcg (e.g.
      threadinfo), we pass __GFP_KMEMCG flag to the page allocator.  The page
      allocated is then to be freed by free_memcg_kmem_pages.  Apart from
      looking asymmetrical, this also requires intrusion to the general
      allocation path.  So let's introduce separate functions that will
      alloc/free pages charged to kmemcg.
      
      The new functions are called alloc_kmem_pages and free_kmem_pages.  They
      should be used when the caller actually would like to use kmalloc, but
      has to fall back to the page allocator for the allocation is large.
      They only differ from alloc_pages and free_pages in that besides
      allocating or freeing pages they also charge them to the kmem resource
      counter of the current memory cgroup.
      
      [sfr@canb.auug.org.au: export kmalloc_order() to modules]
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Acked-by: NGreg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      52383431
    • V
      sl[au]b: charge slabs to kmemcg explicitly · 5dfb4175
      Vladimir Davydov 提交于
      We have only a few places where we actually want to charge kmem so
      instead of intruding into the general page allocation path with
      __GFP_KMEMCG it's better to explictly charge kmem there.  All kmem
      charges will be easier to follow that way.
      
      This is a step towards removing __GFP_KMEMCG.  It removes __GFP_KMEMCG
      from memcg caches' allocflags.  Instead it makes slab allocation path
      call memcg_charge_kmem directly getting memcg to charge from the cache's
      memcg params.
      
      This also eliminates any possibility of misaccounting an allocation
      going from one memcg's cache to another memcg, because now we always
      charge slabs against the memcg the cache belongs to.  That's why this
      patch removes the big comment to memcg_kmem_get_cache.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Acked-by: NGreg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5dfb4175
    • D
      mm: slub: fix ALLOC_SLOWPATH stat · 8eae1492
      Dave Hansen 提交于
      There used to be only one path out of __slab_alloc(), and ALLOC_SLOWPATH
      got bumped in that exit path.  Now there are two, and a bunch of gotos.
      ALLOC_SLOWPATH can now get set more than once during a single call to
      __slab_alloc() which is pretty bogus.  Here's the sequence:
      
      1. Enter __slab_alloc(), fall through all the way to the
         stat(s, ALLOC_SLOWPATH);
      2. hit 'if (!freelist)', and bump DEACTIVATE_BYPASS, jump to
         new_slab (goto #1)
      3. Hit 'if (c->partial)', bump CPU_PARTIAL_ALLOC, goto redo
         (goto #2)
      4. Fall through in the same path we did before all the way to
         stat(s, ALLOC_SLOWPATH)
      5. bump ALLOC_REFILL stat, then return
      
      Doing this is obviously bogus.  It keeps us from being able to
      accurately compare ALLOC_SLOWPATH vs.  ALLOC_FASTPATH.  It also means
      that the total number of allocs always exceeds the total number of
      frees.
      
      This patch moves stat(s, ALLOC_SLOWPATH) to be called from the same
      place that __slab_alloc() is.  This makes it much less likely that
      ALLOC_SLOWPATH will get botched again in the spaghetti-code inside
      __slab_alloc().
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8eae1492
    • D
      mm, slab: suppress out of memory warning unless debug is enabled · 9a02d699
      David Rientjes 提交于
      When the slab or slub allocators cannot allocate additional slab pages,
      they emit diagnostic information to the kernel log such as current
      number of slabs, number of objects, active objects, etc.  This is always
      coupled with a page allocation failure warning since it is controlled by
      !__GFP_NOWARN.
      
      Suppress this out of memory warning if the allocator is configured
      without debug supported.  The page allocation failure warning will
      indicate it is a failed slab allocation, the order, and the gfp mask, so
      this is only useful to diagnose allocator issues.
      
      Since CONFIG_SLUB_DEBUG is already enabled by default for the slub
      allocator, there is no functional change with this patch.  If debug is
      disabled, however, the warnings are now suppressed.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a02d699
    • F
      mm/slub.c: convert vnsprintf-static to va_format · ecc42fbe
      Fabian Frederick 提交于
      Inspired by Joe Perches suggestion in ntfs logging clean-up.
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ecc42fbe
    • F
      mm/slub.c: convert printk to pr_foo() · f9f58285
      Fabian Frederick 提交于
      All printk(KERN_foo converted to pr_foo()
      
      Default printk converted to pr_warn()
      
      Coalesce format fragments
      Signed-off-by: NFabian Frederick <fabf@skynet.be>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f9f58285
  6. 07 5月, 2014 2 次提交
    • C
      slub: use sysfs'es release mechanism for kmem_cache · 41a21285
      Christoph Lameter 提交于
      debugobjects warning during netfilter exit:
      
          ------------[ cut here ]------------
          WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
          ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
          Modules linked in:
          CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G        W 3.11.0-next-20130906-sasha #3984
          Workqueue: netns cleanup_net
          Call Trace:
            dump_stack+0x52/0x87
            warn_slowpath_common+0x8c/0xc0
            warn_slowpath_fmt+0x46/0x50
            debug_print_object+0x8d/0xb0
            __debug_check_no_obj_freed+0xa5/0x220
            debug_check_no_obj_freed+0x15/0x20
            kmem_cache_free+0x197/0x340
            kmem_cache_destroy+0x86/0xe0
            nf_conntrack_cleanup_net_list+0x131/0x170
            nf_conntrack_pernet_exit+0x5d/0x70
            ops_exit_list+0x5e/0x70
            cleanup_net+0xfb/0x1c0
            process_one_work+0x338/0x550
            worker_thread+0x215/0x350
            kthread+0xe7/0xf0
            ret_from_fork+0x7c/0xb0
      
      Also during dcookie cleanup:
      
          WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
          ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
          Modules linked in:
          CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
          Call Trace:
            dump_stack (lib/dump_stack.c:52)
            warn_slowpath_common (kernel/panic.c:430)
            warn_slowpath_fmt (kernel/panic.c:445)
            debug_print_object (lib/debugobjects.c:262)
            __debug_check_no_obj_freed (lib/debugobjects.c:697)
            debug_check_no_obj_freed (lib/debugobjects.c:726)
            kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
            kmem_cache_destroy (mm/slab_common.c:363)
            dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
            event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
            __fput (fs/file_table.c:217)
            ____fput (fs/file_table.c:253)
            task_work_run (kernel/task_work.c:125 (discriminator 1))
            do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
            int_signal (arch/x86/kernel/entry_64.S:807)
      
      Sysfs has a release mechanism.  Use that to release the kmem_cache
      structure if CONFIG_SYSFS is enabled.
      
      Only slub is changed - slab currently only supports /proc/slabinfo and
      not /sys/kernel/slab/*.  We talked about adding that and someone was
      working on it.
      
      [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
      [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Tested-by: NSasha Levin <sasha.levin@oracle.com>
      Acked-by: NGreg KH <greg@kroah.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41a21285
    • V
      slub: fix memcg_propagate_slab_attrs · 93030d83
      Vladimir Davydov 提交于
      After creating a cache for a memcg we should initialize its sysfs attrs
      with the values from its parent.  That's what memcg_propagate_slab_attrs
      is for.  Currently it's broken - we clearly muddled root-vs-memcg caches
      there.  Let's fix it up.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      93030d83
  7. 08 4月, 2014 6 次提交
    • C
      slub: use raw_cpu_inc for incrementing statistics · 88da03a6
      Christoph Lameter 提交于
      Statistics are not critical to the operation of the allocation but
      should also not cause too much overhead.
      
      When __this_cpu_inc is altered to check if preemption is disabled this
      triggers.  Use raw_cpu_inc to avoid the checks.  Using this_cpu_ops may
      cause interrupt disable/enable sequences on various arches which may
      significantly impact allocator performance.
      
      [akpm@linux-foundation.org: add comment]
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      88da03a6
    • D
      slub: fix leak of 'name' in sysfs_slab_add · 54b6a731
      Dave Jones 提交于
      The failure paths of sysfs_slab_add don't release the allocation of
      'name' made by create_unique_id() a few lines above the context of the
      diff below.  Create a common exit path to make it more obvious what
      needs freeing.
      
      [vdavydov@parallels.com: free the name only if !unmergeable]
      Signed-off-by: NDave Jones <davej@fedoraproject.org>
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      54b6a731
    • V
      slub: rework sysfs layout for memcg caches · 9a41707b
      Vladimir Davydov 提交于
      Currently, we try to arrange sysfs entries for memcg caches in the same
      manner as for global caches.  Apart from turning /sys/kernel/slab into a
      mess when there are a lot of kmem-active memcgs created, it actually
      does not work properly - we won't create more than one link to a memcg
      cache in case its parent is merged with another cache.  For instance, if
      A is a root cache merged with another root cache B, we will have the
      following sysfs setup:
      
        X
        A -> X
        B -> X
      
      where X is some unique id (see create_unique_id()).  Now if memcgs M and
      N start to allocate from cache A (or B, which is the same), we will get:
      
        X
        X:M
        X:N
        A -> X
        B -> X
        A:M -> X:M
        A:N -> X:N
      
      Since B is an alias for A, we won't get entries B:M and B:N, which is
      confusing.
      
      It is more logical to have entries for memcg caches under the
      corresponding root cache's sysfs directory.  This would allow us to keep
      sysfs layout clean, and avoid such inconsistencies like one described
      above.
      
      This patch does the trick.  It creates a "cgroup" kset in each root
      cache kobject to keep its children caches there.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Glauber Costa <glommer@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9a41707b
    • V
      slub: adjust memcg caches when creating cache alias · 84d0ddd6
      Vladimir Davydov 提交于
      Otherwise, kzalloc() called from a memcg won't clear the whole object.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Glauber Costa <glommer@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      84d0ddd6
    • V
      memcg, slab: never try to merge memcg caches · a44cb944
      Vladimir Davydov 提交于
      When a kmem cache is created (kmem_cache_create_memcg()), we first try to
      find a compatible cache that already exists and can handle requests from
      the new cache, i.e.  has the same object size, alignment, ctor, etc.  If
      there is such a cache, we do not create any new caches, instead we simply
      increment the refcount of the cache found and return it.
      
      Currently we do this procedure not only when creating root caches, but
      also for memcg caches.  However, there is no point in that, because, as
      every memcg cache has exactly the same parameters as its parent and cache
      merging cannot be turned off in runtime (only on boot by passing
      "slub_nomerge"), the root caches of any two potentially mergeable memcg
      caches should be merged already, i.e.  it must be the same root cache, and
      therefore we couldn't even get to the memcg cache creation, because it
      already exists.
      
      The only exception is boot caches - they are explicitly forbidden to be
      merged by setting their refcount to -1.  There are currently only two of
      them - kmem_cache and kmem_cache_node, which are used in slab internals (I
      do not count kmalloc caches as their refcount is set to 1 immediately
      after creation).  Since they are prevented from merging preliminary I
      guess we should avoid to merge their children too.
      
      So let's remove the useless code responsible for merging memcg caches.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Glauber Costa <glommer@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a44cb944
    • D
      mm, mempolicy: rename slab_node for clarity · 2a389610
      David Rientjes 提交于
      slab_node() is actually a mempolicy function, so rename it to
      mempolicy_slab_node() to make it clearer that it used for processes with
      mempolicies.
      
      At the same time, cleanup its code by saving numa_mem_id() in a local
      variable (since we require a node with memory, not just any node) and
      remove an obsolete comment that assumes the mempolicy is actually passed
      into the function.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Tim Hockin <thockin@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2a389610
  8. 04 4月, 2014 2 次提交
    • V
      slub: do not drop slab_mutex for sysfs_slab_add · 421af243
      Vladimir Davydov 提交于
      We release the slab_mutex while calling sysfs_slab_add from
      __kmem_cache_create since commit 66c4c35c ("slub: Do not hold
      slub_lock when calling sysfs_slab_add()"), because kobject_uevent called
      by sysfs_slab_add might block waiting for the usermode helper to exec,
      which would result in a deadlock if we took the slab_mutex while
      executing it.
      
      However, apart from complicating synchronization rules, releasing the
      slab_mutex on kmem cache creation can result in a kmemcg-related race.
      The point is that we check if the memcg cache exists before going to
      __kmem_cache_create, but register the new cache in memcg subsys after
      it.  Since we can drop the mutex there, several threads can see that the
      memcg cache does not exist and proceed to creating it, which is wrong.
      
      Fortunately, recently kobject_uevent was patched to call the usermode
      helper with the UMH_NO_WAIT flag, making the deadlock impossible.
      Therefore there is no point in releasing the slab_mutex while calling
      sysfs_slab_add, so let's simplify kmem_cache_create synchronization and
      fix the kmemcg-race mentioned above by holding the slab_mutex during the
      whole cache creation path.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Acked-by: NChristoph Lameter <cl@linux.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      421af243
    • M
      mm: optimize put_mems_allowed() usage · d26914d1
      Mel Gorman 提交于
      Since put_mems_allowed() is strictly optional, its a seqcount retry, we
      don't need to evaluate the function if the allocation was in fact
      successful, saving a smp_rmb some loads and comparisons on some relative
      fast-paths.
      
      Since the naming, get/put_mems_allowed() does suggest a mandatory
      pairing, rename the interface, as suggested by Mel, to resemble the
      seqcount interface.
      
      This gives us: read_mems_allowed_begin() and read_mems_allowed_retry(),
      where it is important to note that the return value of the latter call
      is inverted from its previous incarnation.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d26914d1
  9. 27 3月, 2014 1 次提交
  10. 11 2月, 2014 2 次提交
  11. 31 1月, 2014 2 次提交
    • D
      mm: slub: work around unneeded lockdep warning · 67b6c900
      Dave Hansen 提交于
      The slub code does some setup during early boot in
      early_kmem_cache_node_alloc() with some local data.  There is no
      possible way that another CPU can see this data, so the slub code
      doesn't unnecessarily lock it.  However, some new lockdep asserts
      check to make sure that add_partial() _always_ has the list_lock
      held.
      
      Just add the locking, even though it is technically unnecessary.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      67b6c900
    • D
      mm/slub.c: fix page->_count corruption (again) · a0320865
      Dave Hansen 提交于
      Commit abca7c49 ("mm: fix slab->page _count corruption when using
      slub") notes that we can not _set_ a page->counters directly, except
      when using a real double-cmpxchg.  Doing so can lose updates to
      ->_count.
      
      That is an absolute rule:
      
              You may not *set* page->counters except via a cmpxchg.
      
      Commit abca7c49 fixed this for the folks who have the slub
      cmpxchg_double code turned off at compile time, but it left the bad case
      alone.  It can still be reached, and the same bug triggered in two
      cases:
      
      1. Turning on slub debugging at runtime, which is available on
         the distro kernels that I looked at.
      2. On 64-bit CPUs with no CMPXCHG16B (some early AMD x86-64
         cpus, evidently)
      
      There are at least 3 ways we could fix this:
      
      1. Take all of the exising calls to cmpxchg_double_slab() and
         __cmpxchg_double_slab() and convert them to take an old, new
         and target 'struct page'.
      2. Do (1), but with the newly-introduced 'slub_data'.
      3. Do some magic inside the two cmpxchg...slab() functions to
         pull the counters out of new_counters and only set those
         fields in page->{inuse,frozen,objects}.
      
      I've done (2) as well, but it's a bunch more code.  This patch is an
      attempt at (3).  This was the most straightforward and foolproof way
      that I could think to do this.
      
      This would also technically allow us to get rid of the ugly
      
      #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
             defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
      
      in 'struct page', but leaving it alone has the added benefit that
      'counters' stays 'unsigned' instead of 'unsigned long', so all the
      copies that the slub code does stay a bit smaller.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a0320865
  12. 30 1月, 2014 1 次提交
  13. 24 1月, 2014 1 次提交