    slab: remove synchronous synchronize_sched() from memcg cache deactivation path · 01fb58bc
    Committed by Tejun Heo
    With kmem cgroup support enabled, kmem_caches can be created and
    destroyed frequently and a great number of near empty kmem_caches can
    accumulate if there are a lot of transient cgroups and the system is not
    under memory pressure.  When memory reclaim starts under such
    conditions, it can lead to consecutive deactivation and destruction of
    many kmem_caches, easily hundreds of thousands on moderately large
    systems, exposing scalability issues in the current slab management
    code.  This is one of the patches to address the issue.
    
    slub uses synchronize_sched() to deactivate a memcg cache.
    synchronize_sched() is an expensive and slow operation and doesn't scale
    when a huge number of caches are destroyed back-to-back.  While there
    used to be a simple batching mechanism, the batching was too restricted
    to be helpful.
    
    This patch implements slab_deactivate_memcg_cache_rcu_sched(), which slub
    can use to schedule a sched RCU callback instead of performing
    synchronize_sched() synchronously while holding cgroup_mutex.  While
    this adds online cpus, mems and slab_mutex operations, operating on
    these locks back-to-back from the same kworker, which is what happens
    when there are many caches to deactivate, isn't expensive at all, and
    this gets rid of the scalability problem completely.
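
As a rough illustration of why deferring the wait helps, here is a minimal userspace sketch in C. It is not the kernel code from this patch: struct model_cache, wait_for_grace_period(), deactivate_sync(), deactivate_deferred(), run_pending_callbacks() and model_grace_periods() are all hypothetical names, and a grace period is modeled as a simple counter. It assumes that callbacks queued back-to-back can share a single grace period, which is the property that asynchronous RCU callbacks are generally relied on for.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a memcg cache; not the kernel's struct kmem_cache. */
struct model_cache {
    int deactivated;
    struct model_cache *next_pending;   /* link in the pending-callback list */
};

static unsigned long grace_periods;     /* how many grace periods were waited for */
static struct model_cache *pending;     /* caches whose callback is still queued */

/* Stand-in for an expensive wait such as synchronize_sched(). */
static void wait_for_grace_period(void)
{
    grace_periods++;
}

/* Synchronous style: every deactivation pays for its own grace period. */
static void deactivate_sync(struct model_cache *c)
{
    wait_for_grace_period();
    c->deactivated = 1;
}

/* Deferred style: only queue the work; the caller never blocks here. */
static void deactivate_deferred(struct model_cache *c)
{
    c->next_pending = pending;
    pending = c;
}

/* One grace period elapses, then every queued callback runs. */
static void run_pending_callbacks(void)
{
    if (!pending)
        return;
    wait_for_grace_period();
    while (pending) {
        struct model_cache *c = pending;

        pending = c->next_pending;
        c->next_pending = NULL;
        c->deactivated = 1;
    }
}

/* Deactivate n caches (n > 0) and return the number of grace periods waited for. */
unsigned long model_grace_periods(size_t n, int deferred)
{
    struct model_cache *caches = calloc(n, sizeof(*caches));
    size_t i;

    assert(caches != NULL);
    grace_periods = 0;
    pending = NULL;

    for (i = 0; i < n; i++) {
        if (deferred)
            deactivate_deferred(&caches[i]);
        else
            deactivate_sync(&caches[i]);
    }
    if (deferred)
        run_pending_callbacks();

    for (i = 0; i < n; i++)
        assert(caches[i].deactivated);   /* both styles finish all the work */

    free(caches);
    return grace_periods;
}
```

In this model, model_grace_periods(1000, 0) waits for 1000 grace periods while model_grace_periods(1000, 1) waits for one. The real kernel behaviour is more involved (callbacks run from RCU context and the final work is handed to a kworker), so treat the numbers only as an illustration of the scaling difference.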
    
    Link: http://lkml.kernel.org/r/20170117235411.9408-9-tj@kernel.org
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Jay Vana <jsvana@fb.com>
    Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>