1. 18 3月, 2016 5 次提交
    • V
      mm: memcontrol: enable kmem accounting for all cgroups in the legacy hierarchy · b313aeee
      Vladimir Davydov 提交于
      Workingset code was recently made memcg aware, but shadow node shrinker
      is still global.  As a result, one small cgroup can consume all memory
      available for shadow nodes, possibly hurting other cgroups by reclaiming
      their shadow nodes, even though reclaim distances stored in its shadow
      nodes have no effect.  To avoid this, we need to make shadow node
      shrinker memcg aware.
      
      The actual work is done in patch 6 of the series.  Patches 1 and 2
      prepare memcg/shrinker infrastructure for the change.  Patch 3 is just a
      collateral cleanup.  Patch 4 makes radix_tree_node accounted, which is
      necessary for making shadow node shrinker memcg aware.  Patch 5 reduces
      shadow nodes overhead in case workload mostly uses anonymous pages.
      
      This patch:
      
      Currently, in the legacy hierarchy kmem accounting is off for all
      cgroups by default and must be enabled explicitly by writing something
      to memory.kmem.limit_in_bytes.  Since we don't support reclaim on
      hitting kmem limit, nor do we have any plans to implement it, this is
      likely to be -1, just to enable kmem accounting and limit kernel memory
      consumption by the memory.limit_in_bytes along with user memory.
      
      This user API was introduced when the implementation of kmem accounting
      lacked slab shrinker support and hence was useless in practice.  Things
      have changed since then - slab shrinkers were made memcg aware, the
      accounting overhead seems to be negligible, and a failure to charge a
      kmem allocation should not have critical consequences, because we only
      account those kernel objects that should be safe to fail.  That's why
      kmem accounting is enabled by default for all cgroups in the default
      hierarchy, which will eventually replace the legacy one.
      
      The ability to enable kmem accounting for some cgroups while keeping it
      disabled for others is getting difficult to maintain.  E.g.  to make
      shadow node shrinker memcg aware (see mm/workingset.c), we need to know
      the relationship between the number of shadow nodes allocated for a
      cgroup and the size of its lru list.  If kmem accounting is enabled for
      all cgroups there is no problem, but what should we do if kmem
      accounting is enabled only for half of cgroups? We've no other choice
      but use global lru stats while scanning root cgroup's shadow nodes, but
      that would be wrong if kmem accounting was enabled for all cgroups
      (which is the case if the unified hierarchy is used), in which case we
      should use lru stats of the root cgroup's lruvec.
      
      That being said, let's enable kmem accounting for all memory cgroups by
      default.  If one finds it unstable or too costly, it can always be
      disabled system-wide by passing cgroup.memory=nokmem to the kernel at
      boot time.
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b313aeee
    • V
      mm: memcontrol: report kernel stack usage in cgroup2 memory.stat · 12580e4b
      Vladimir Davydov 提交于
      Show how much memory is allocated to kernel stacks.
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      12580e4b
    • V
      mm: memcontrol: report slab usage in cgroup2 memory.stat · 27ee57c9
      Vladimir Davydov 提交于
      Show how much memory is used for storing reclaimable and unreclaimable
      in-kernel data structures allocated from slab caches.
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      27ee57c9
    • V
      mm: memcontrol: make tree_{stat,events} fetch all stats · 72b54e73
      Vladimir Davydov 提交于
      Currently, tree_{stat,events} helpers can only get one stat index at a
      time, so when there are a lot of stats to be reported one has to call it
      over and over again (see memory_stat_show).  This is neither effective,
      nor does it look good.  Instead, let's make these helpers take a
      snapshot of all available counters.
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72b54e73
    • V
      mm: memcontrol: do not bypass slab charge if memcg is offline · fcff7d7e
      Vladimir Davydov 提交于
      Slab pages are charged in two steps.  First, an appropriate per memcg
      cache is selected (see memcg_kmem_get_cache) basing on the current
      context, then the new slab page is charged to the memory cgroup which
      the selected cache was created for (see memcg_charge_slab ->
      __memcg_kmem_charge_memcg).  It is OK to bypass kmemcg charge at step 1,
      but if step 1 succeeded and we successfully allocated a new slab page,
      step 2 must be performed, otherwise we would get a per memcg kmem cache
      which contains a slab that does not hold a reference to the memory
      cgroup owning the cache.  Since per memcg kmem caches are destroyed on
      memcg css free, this could result in freeing a cache while there are
      still active objects in it.
      
      However, currently we will bypass slab page charge if the memory cgroup
      owning the cache is offline (see __memcg_kmem_charge_memcg).  This is
      very unlikely to occur in practice, because for this to happen a process
      must be migrated to a different cgroup and the old cgroup must be
      removed while the process is in kmalloc somewhere between steps 1 and 2
      (e.g.  trying to allocate a new page).  Nevertheless, it's still better
      to eliminate such a possibility.
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fcff7d7e
  2. 16 3月, 2016 5 次提交
  3. 22 1月, 2016 1 次提交
  4. 21 1月, 2016 19 次提交
  5. 16 1月, 2016 6 次提交
  6. 15 1月, 2016 4 次提交