1. 13 Nov, 2013 1 commit
    • memcg: support hierarchical memory.numa_stats · 071aee13
      Committed by Ying Han
      The memory.numa_stat file was not hierarchical.  Memory charged to the
      children was not shown in parent's numa_stat.
      
      This change adds the "hierarchical_" stats to the existing stats.  The
      new hierarchical stats include the sum of all children's values in
      addition to the value of the memcg.
      
      Tested: create cgroups a and a/b, then run a workload under b.  The
      values of b are included in the "hierarchical_*" entries under a.
      
      $ cd /sys/fs/cgroup
      $ echo 1 > memory.use_hierarchy
      $ mkdir a a/b
      
      Run workload in a/b:
      $ (echo $BASHPID >> a/b/cgroup.procs && cat /some/file && bash) &
      
      The hierarchical_ fields in the parent (a) show the memory used by the
      workload in a/b:
      $ cat a/memory.numa_stat
      total=0 N0=0 N1=0 N2=0 N3=0
      file=0 N0=0 N1=0 N2=0 N3=0
      anon=0 N0=0 N1=0 N2=0 N3=0
      unevictable=0 N0=0 N1=0 N2=0 N3=0
      hierarchical_total=908 N0=552 N1=317 N2=39 N3=0
      hierarchical_file=850 N0=549 N1=301 N2=0 N3=0
      hierarchical_anon=58 N0=3 N1=16 N2=39 N3=0
      hierarchical_unevictable=0 N0=0 N1=0 N2=0 N3=0
      
      $ cat a/b/memory.numa_stat
      total=908 N0=552 N1=317 N2=39 N3=0
      file=850 N0=549 N1=301 N2=0 N3=0
      anon=58 N0=3 N1=16 N2=39 N3=0
      unevictable=0 N0=0 N1=0 N2=0 N3=0
      hierarchical_total=908 N0=552 N1=317 N2=39 N3=0
      hierarchical_file=850 N0=549 N1=301 N2=0 N3=0
      hierarchical_anon=58 N0=3 N1=16 N2=39 N3=0
      hierarchical_unevictable=0 N0=0 N1=0 N2=0 N3=0
      Signed-off-by: Ying Han <yinghan@google.com>
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
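The check described in the "Tested:" note of this commit can be scripted. The following is a minimal Python sketch, not part of the commit: it parses text in the memory.numa_stat format shown above and verifies that each hierarchical_ value of the parent equals the parent's own value plus the hierarchical_ values of its direct children. The helper names (parse_numa_stat, check_hierarchy) are hypothetical; the sample strings are copied from the commit message.

```python
def parse_numa_stat(text):
    """Parse memory.numa_stat text into {stat_name: (total, {node: pages})}."""
    stats = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue
        name, total = fields[0].split("=")
        per_node = {}
        for field in fields[1:]:
            node, value = field.split("=")
            per_node[node] = int(value)
        stats[name] = (int(total), per_node)
    return stats


def check_hierarchy(parent, children):
    """Return True if every hierarchical_ stat of `parent` equals the parent's
    own value plus the hierarchical_ values of its direct `children`."""
    for name, (own_total, own_nodes) in parent.items():
        if name.startswith("hierarchical_"):
            continue
        key = "hierarchical_" + name
        expected_total = own_total + sum(c[key][0] for c in children)
        if parent[key][0] != expected_total:
            return False
        for node, own in own_nodes.items():
            expected = own + sum(c[key][1][node] for c in children)
            if parent[key][1][node] != expected:
                return False
    return True


# Sample data copied from the commit message (cgroup a and its child a/b).
PARENT_A = """
total=0 N0=0 N1=0 N2=0 N3=0
file=0 N0=0 N1=0 N2=0 N3=0
anon=0 N0=0 N1=0 N2=0 N3=0
unevictable=0 N0=0 N1=0 N2=0 N3=0
hierarchical_total=908 N0=552 N1=317 N2=39 N3=0
hierarchical_file=850 N0=549 N1=301 N2=0 N3=0
hierarchical_anon=58 N0=3 N1=16 N2=39 N3=0
hierarchical_unevictable=0 N0=0 N1=0 N2=0 N3=0
"""

CHILD_B = """
total=908 N0=552 N1=317 N2=39 N3=0
file=850 N0=549 N1=301 N2=0 N3=0
anon=58 N0=3 N1=16 N2=39 N3=0
unevictable=0 N0=0 N1=0 N2=0 N3=0
hierarchical_total=908 N0=552 N1=317 N2=39 N3=0
hierarchical_file=850 N0=549 N1=301 N2=0 N3=0
hierarchical_anon=58 N0=3 N1=16 N2=39 N3=0
hierarchical_unevictable=0 N0=0 N1=0 N2=0 N3=0
"""

if __name__ == "__main__":
    a = parse_numa_stat(PARENT_A)
    b = parse_numa_stat(CHILD_B)
    print("hierarchy consistent:", check_hierarchy(a, [b]))
```

On a live system the same functions could be fed the contents of a/memory.numa_stat and a/b/memory.numa_stat, although values may change between the two reads.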
  2. 13 Sep, 2013 1 commit
  3. 04 Jul, 2013 1 commit
  4. 24 Jun, 2013 1 commit
  5. 28 May, 2013 1 commit
  6. 08 May, 2013 1 commit
  7. 30 Apr, 2013 1 commit
    • memcg: add memory.pressure_level events · 70ddf637
      Committed by Anton Vorontsov
      With this patch userland applications that want to maintain the
      interactivity/memory allocation cost can use the pressure level
      notifications.  The levels are defined like this:
      
      The "low" level means that the system is reclaiming memory for new
      allocations.  Monitoring this reclaiming activity might be useful for
      maintaining cache level.  Upon notification, the program (typically
      "Activity Manager") might analyze vmstat and act in advance (e.g.
      shut down unimportant services early).
      
      The "medium" level means that the system is experiencing medium memory
      pressure; the system might be swapping, paging out active file
      caches, etc.  Upon this event applications may decide to further analyze
      vmstat/zoneinfo/memcg or internal memory usage statistics and free any
      resources that can be easily reconstructed or re-read from a disk.
      
      The "critical" level means that the system is actively thrashing; it is
      about to run out of memory (OOM), or the in-kernel OOM killer is even
      about to trigger.  Applications should do whatever they can to help the
      system.  It might be too late to consult vmstat or any other
      statistics, so it's advisable to take immediate action.
      
      The events are propagated upward until the event is handled, i.e.  the
      events are not pass-through.  Here is what this means: for example you
      have three cgroups: A->B->C.  Now you set up an event listener on
      cgroups A, B and C, and suppose group C experiences some pressure.  In
      this situation, only group C will receive the notification, i.e.  groups
      A and B will not receive it.  This is done to avoid excessive
      "broadcasting" of messages, which disturbs the system and which is
      especially bad if we are low on memory or thrashing.  So, organize the
      cgroups wisely, or propagate the events manually (or ask us to
      implement the pass-through events, explaining why you would need them).
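
The delivery rule described above (an event travels upward and stops at the first cgroup that handles it) can be modelled in a few lines. This is an illustrative Python sketch of the described behaviour, not kernel code; the Cgroup class and the deliver_pressure function are hypothetical names introduced only for this example.

```python
class Cgroup:
    """Toy model of a cgroup: a name, a parent link, and a listener flag."""

    def __init__(self, name, parent=None, has_listener=False):
        self.name = name
        self.parent = parent
        self.has_listener = has_listener


def deliver_pressure(origin):
    """Walk upward from `origin` and return the first cgroup that has a
    listener registered. Ancestors above that cgroup are not notified.
    Returns None if no cgroup on the path has a listener."""
    group = origin
    while group is not None:
        if group.has_listener:
            return group
        group = group.parent
    return None


if __name__ == "__main__":
    # The A->B->C example from the commit message, with listeners on all three.
    a = Cgroup("A", has_listener=True)
    b = Cgroup("B", parent=a, has_listener=True)
    c = Cgroup("C", parent=b, has_listener=True)
    receiver = deliver_pressure(c)
    print("pressure in C is delivered to:", receiver.name)
```

In this model, removing the listener from C makes B the receiver, which corresponds to "propagated upward until the event is handled".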
      
      Performance-wise, the memory pressure notifications feature itself is
      lightweight and does not require much bookkeeping, in contrast to the
      rest of the memcg features.  Unfortunately, in the current memcg
      implementation, page accounting is an inseparable part and cannot be
      turned off.  The good news is that there are some efforts[1] to improve
      the situation; plus, implementing the same, fully API-compatible[2]
      interface for CONFIG_MEMCG=n case (e.g.  embedded) is also a viable
      option, so it will not require any changes on the userland side.
      
      [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291
      [2] http://lkml.org/lkml/2013/2/21/454
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_CGROUPS=n warnings]
      Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
      Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Leonid Moiseichuk <leonid.moiseichuk@nokia.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
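For readers who want to try the notifications from userland, the sketch below shows one way to register a listener. It is a hedged example, not taken from the commit: it assumes the cgroup v1 registration convention of writing "<event_fd> <fd of memory.pressure_level> <level>" to cgroup.event_control, a Linux system with the v1 memory controller mounted, and Python 3.10 or later for os.eventfd. The function names and the cgroup_dir argument are hypothetical.

```python
import os

# The three levels named in the commit message.
LEVELS = ("low", "medium", "critical")


def event_control_line(event_fd, pressure_fd, level):
    """Build the registration string written to cgroup.event_control."""
    if level not in LEVELS:
        raise ValueError(f"unknown pressure level: {level!r}")
    return f"{event_fd} {pressure_fd} {level}"


def register_listener(cgroup_dir, level):
    """Register for pressure events on the cgroup at `cgroup_dir`.

    Returns (event_fd, pressure_fd); the caller should close both when done.
    """
    event_fd = os.eventfd(0)
    pressure_fd = os.open(
        os.path.join(cgroup_dir, "memory.pressure_level"), os.O_RDONLY
    )
    control_fd = os.open(
        os.path.join(cgroup_dir, "cgroup.event_control"), os.O_WRONLY
    )
    try:
        line = event_control_line(event_fd, pressure_fd, level)
        os.write(control_fd, line.encode())
    finally:
        os.close(control_fd)
    return event_fd, pressure_fd


def wait_for_event(event_fd):
    """Block until the kernel signals the eventfd, then return its counter."""
    return os.eventfd_read(event_fd)
```

A caller would typically run register_listener once, then call wait_for_event in a loop and release caches or other easily rebuilt resources each time it returns.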
  8. 27 Mar, 2013 1 commit
  9. 19 Dec, 2012 2 commits
  10. 12 Dec, 2012 1 commit
  11. 17 Nov, 2012 1 commit
  12. 09 Oct, 2012 1 commit
  13. 01 Aug, 2012 2 commits
  14. 30 May, 2012 3 commits
  15. 13 Apr, 2012 1 commit
  16. 13 Jan, 2012 2 commits
  17. 23 Dec, 2011 1 commit
    • Partial revert "Basic kernel memory functionality for the Memory Controller" · 65c64ce8
      Committed by Glauber Costa
      This reverts commit e5671dfa.
      
      After a follow-up discussion with Michal, it was agreed it would
      be better to leave the kmem controller with just the tcp files,
      deferring the behavior of the other general memory.kmem.* files
      for a later time, when more caches are controlled. This is because
      generic kmem files are not used by tcp accounting and it is
      not clear how other slab caches would fit into the scheme.
      
      We are reverting the original commit so we can track the reference.
      Part of the patch is kept, because it was used by the later tcp
      code. Conflicts are shown at the bottom. init/Kconfig is removed from
      the revert entirely.
      Signed-off-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      CC: Kirill A. Shutemov <kirill@shutemov.name>
      CC: Paul Menage <paul@paulmenage.org>
      CC: Greg Thelen <gthelen@google.com>
      CC: Johannes Weiner <jweiner@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      
      Conflicts:
      
      	Documentation/cgroups/memory.txt
      	mm/memcontrol.c
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 13 Dec, 2011 5 commits
  19. 03 Nov, 2011 1 commit
  20. 15 Sep, 2011 1 commit
  21. 27 Jul, 2011 1 commit
    • memcg: add memory.vmscan_stat · 82f9d486
      Committed by KAMEZAWA Hiroyuki
      The commit log of 0ae5e89c ("memcg: count the soft_limit reclaim
      in...") says it adds scanning stats to the memory.stat file.  But it
      doesn't, because we considered that we needed to reach a consensus on
      such new APIs.
      
      This patch is a trial to add memory.vmscan_stat. This shows
        - the number of scanned pages (total, anon, file)
        - the number of rotated pages (total, anon, file)
        - the number of freed pages (total, anon, file)
        - the elapsed time (including sleep/pause time)
      
        for both direct and soft reclaim.
      
      The biggest difference from Ying's original version is that this file
      can be reset by a write, as in
      
        # echo 0 > ...../memory.vmscan_stat
      
      Example output follows. This is the result after a kernel build with
      make -j 6 under a 300M limit.
      
        [kamezawa@bluextal ~]$ cat /cgroup/memory/A/memory.vmscan_stat
        scanned_pages_by_limit 9471864
        scanned_anon_pages_by_limit 6640629
        scanned_file_pages_by_limit 2831235
        rotated_pages_by_limit 4243974
        rotated_anon_pages_by_limit 3971968
        rotated_file_pages_by_limit 272006
        freed_pages_by_limit 2318492
        freed_anon_pages_by_limit 962052
        freed_file_pages_by_limit 1356440
        elapsed_ns_by_limit 351386416101
        scanned_pages_by_system 0
        scanned_anon_pages_by_system 0
        scanned_file_pages_by_system 0
        rotated_pages_by_system 0
        rotated_anon_pages_by_system 0
        rotated_file_pages_by_system 0
        freed_pages_by_system 0
        freed_anon_pages_by_system 0
        freed_file_pages_by_system 0
        elapsed_ns_by_system 0
        scanned_pages_by_limit_under_hierarchy 9471864
        scanned_anon_pages_by_limit_under_hierarchy 6640629
        scanned_file_pages_by_limit_under_hierarchy 2831235
        rotated_pages_by_limit_under_hierarchy 4243974
        rotated_anon_pages_by_limit_under_hierarchy 3971968
        rotated_file_pages_by_limit_under_hierarchy 272006
        freed_pages_by_limit_under_hierarchy 2318492
        freed_anon_pages_by_limit_under_hierarchy 962052
        freed_file_pages_by_limit_under_hierarchy 1356440
        elapsed_ns_by_limit_under_hierarchy 351386416101
        scanned_pages_by_system_under_hierarchy 0
        scanned_anon_pages_by_system_under_hierarchy 0
        scanned_file_pages_by_system_under_hierarchy 0
        rotated_pages_by_system_under_hierarchy 0
        rotated_anon_pages_by_system_under_hierarchy 0
        rotated_file_pages_by_system_under_hierarchy 0
        freed_pages_by_system_under_hierarchy 0
        freed_anon_pages_by_system_under_hierarchy 0
        freed_file_pages_by_system_under_hierarchy 0
        elapsed_ns_by_system_under_hierarchy 0
      
      The *_under_hierarchy entries are for hierarchy management.
      
      This will be useful for further memcg development and needs to be
      developed before we do some complicated rework on LRU/softlimit
      management.
      
      This patch adds a new struct memcg_scanrecord into the scan_control
      struct.  sc->nr_scanned et al. are not designed for exporting
      information.  For example, nr_scanned is reset frequently and
      incremented by 2 when scanning mapped pages.
      
      To avoid complexity, I added a new param in scan_control which is for
      exporting the scanning score.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Ying Han <yinghan@google.com>
      Cc: Andrew Bresticker <abrestic@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
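The "key value" output shown in this commit message is easy to post-process. The following minimal Python sketch, which is not part of the commit, parses text in that format and derives a reclaim efficiency figure (freed pages divided by scanned pages). The function names and the efficiency metric are assumptions made for illustration; the sample values are copied from the output above, and the file itself may not exist on a given kernel.

```python
def parse_vmscan_stat(text):
    """Parse 'name value' lines into a dict of integers."""
    stats = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats


def reclaim_efficiency(stats, suffix="by_limit"):
    """Return freed/scanned pages for the given counter suffix,
    or None when nothing was scanned."""
    scanned = stats["scanned_pages_" + suffix]
    if scanned == 0:
        return None
    return stats["freed_pages_" + suffix] / scanned


# A subset of the counters from the commit message.
SAMPLE = """
scanned_pages_by_limit 9471864
scanned_anon_pages_by_limit 6640629
scanned_file_pages_by_limit 2831235
freed_pages_by_limit 2318492
freed_anon_pages_by_limit 962052
freed_file_pages_by_limit 1356440
scanned_pages_by_system 0
freed_pages_by_system 0
"""

if __name__ == "__main__":
    stats = parse_vmscan_stat(SAMPLE)
    print("limit reclaim efficiency:", reclaim_efficiency(stats))
    print("system reclaim efficiency:", reclaim_efficiency(stats, "by_system"))
```

The anon and file counters in the sample add up to the corresponding totals, which gives a quick sanity check when reading such a file.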
  22. 16 Jun, 2011 3 commits
  23. 29 Apr, 2011 1 commit
  24. 17 Feb, 2011 1 commit
  25. 14 Jan, 2011 2 commits
  26. 28 May, 2010 3 commits