1. 01 8月, 2013 2 次提交
  2. 30 4月, 2013 1 次提交
    • A
      memcg: add memory.pressure_level events · 70ddf637
      Anton Vorontsov 提交于
      With this patch userland applications that want to maintain the
      interactivity/memory allocation cost can use the pressure level
      notifications.  The levels are defined like this:
      
      The "low" level means that the system is reclaiming memory for new
      allocations.  Monitoring this reclaiming activity might be useful for
      maintaining cache level.  Upon notification, the program (typically
      "Activity Manager") might analyze vmstat and act in advance (i.e.
      prematurely shutdown unimportant services).
      
      The "medium" level means that the system is experiencing medium memory
      pressure, the system might be making swap, paging out active file
      caches, etc.  Upon this event applications may decide to further analyze
      vmstat/zoneinfo/memcg or internal memory usage statistics and free any
      resources that can be easily reconstructed or re-read from a disk.
      
      The "critical" level means that the system is actively thrashing, it is
      about to out of memory (OOM) or even the in-kernel OOM killer is on its
      way to trigger.  Applications should do whatever they can to help the
      system.  It might be too late to consult with vmstat or any other
      statistics, so it's advisable to take an immediate action.
      
      The events are propagated upward until the event is handled, i.e.  the
      events are not pass-through.  Here is what this means: for example you
      have three cgroups: A->B->C.  Now you set up an event listener on
      cgroups A, B and C, and suppose group C experiences some pressure.  In
      this situation, only group C will receive the notification, i.e.  groups
      A and B will not receive it.  This is done to avoid excessive
      "broadcasting" of messages, which disturbs the system and which is
      especially bad if we are low on memory or thrashing.  So, organize the
      cgroups wisely, or propagate the events manually (or, ask us to
      implement the pass-through events, explaining why would you need them.)
      
      Performance wise, the memory pressure notifications feature itself is
      lightweight and does not require much of bookkeeping, in contrast to the
      rest of memcg features.  Unfortunately, as of current memcg
      implementation, pages accounting is an inseparable part and cannot be
      turned off.  The good news is that there are some efforts[1] to improve
      the situation; plus, implementing the same, fully API-compatible[2]
      interface for CONFIG_MEMCG=n case (e.g.  embedded) is also a viable
      option, so it will not require any changes on the userland side.
      
      [1] http://permalink.gmane.org/gmane.linux.kernel.cgroups/6291
      [2] http://lkml.org/lkml/2013/2/21/454
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix CONFIG_CGROPUPS=n warnings]
      Signed-off-by: NAnton Vorontsov <anton.vorontsov@linaro.org>
      Acked-by: NKirill A. Shutemov <kirill@shutemov.name>
      Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Leonid Moiseichuk <leonid.moiseichuk@nokia.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      70ddf637