1. 09 Jan, 2009 6 commits
    • memcg: show reclaim stat · 7f016ee8
      KOSAKI Motohiro committed
      Add the following five fields to the memory.stat file:
      
        - inactive_ratio
        - recent_rotated_anon
        - recent_rotated_file
        - recent_scanned_anon
        - recent_scanned_file
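
A minimal sketch of how a userspace tool might pick one of these fields out of the file's contents, assuming memory.stat uses one "name value" pair per line. The helper name memcg_stat_lookup is hypothetical and is not part of this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one "key value" line in a memory.stat-style buffer.
 * Returns 1 and stores the value in *out on success, 0 if the key is absent.
 * (Hypothetical helper for illustration only.)
 */
int memcg_stat_lookup(const char *buf, const char *key, unsigned long long *out)
{
    assert(buf && key && out);

    size_t klen = strlen(key);
    const char *p = buf;

    while (p && *p) {
        /* Match the whole key, followed by the separating space. */
        if (strncmp(p, key, klen) == 0 && p[klen] == ' ')
            return sscanf(p + klen + 1, "%llu", out) == 1;
        p = strchr(p, '\n');
        if (p)
            p++;    /* step past the newline to the next line */
    }
    return 0;
}
```

Requiring the space after the key keeps a prefix such as "recent_rotated" from matching "recent_rotated_anon".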
      Acked-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: memory cgroup hierarchy documentation · 52bc0d82
      Balbir Singh committed
      Documentation updates for hierarchy support.
      Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Cc: Paul Menage <menage@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pavel Emelianov <xemul@openvz.org>
      Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: mem+swap controller core · 8c7c6e34
      KAMEZAWA Hiroyuki committed
      This patch implements a per-cgroup limit on memory+swap usage.  Because
      pages can also sit in SwapCache, care is taken to avoid double counting
      a swap-cache page and its swap entry.
      
      The mem+swap controller works as follows.
        - memory usage is limited by memory.limit_in_bytes.
        - memory + swap usage is limited by memory.memsw_limit_in_bytes.
      
      This has the following benefits.
        - A user can limit total resource usage of mem+swap.
      
          Without this, because the memory resource controller does not account
          for swap usage, a process can exhaust all the swap (for example, through
          a memory leak).  This limit avoids that case.
      
          Swap is also a shared resource, but it cannot be reclaimed (brought back
          into memory) until it is used.  This characteristic can cause trouble when
          memory is divided into parts by cpuset or memcg.
          Consider group A and group B.
          After some applications have run, the system can end up like this:
      
          Group A -- very large free memory space but occupies 99% of swap.
          Group B -- under memory shortage but cannot use swap because it is nearly full.
      
          The ability to set an appropriate swap limit for each group is required.
      
      Some may wonder, "why not swap but mem+swap?"
      
        - The global LRU (kswapd) can swap out arbitrary pages.  Swap-out moves
          the charge from memory to swap, so the usage of mem+swap does not
          change.
      
          In other words, when we want to limit swap usage without affecting the
          global LRU, a mem+swap limit is better than limiting swap alone.
      
      Accounting target information is stored in swap_cgroup, which is a
      per-swap-entry record.
      
      Charging is done as follows.
        map
          - charge page and memsw.
      
        unmap
          - uncharge page/memsw if not SwapCache.
      
        swap-out (__delete_from_swap_cache)
          - uncharge page
          - record mem_cgroup information to swap_cgroup.
      
        swap-in (do_swap_page)
          - charged as page and memsw.
            record in swap_cgroup is cleared.
            memsw accounting is decremented.
      
        swap-free (swap_free())
          - if swap entry is freed, memsw is uncharged by PAGE_SIZE.
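
The charge flow above can be sketched as a toy userspace model with two counters per group, counted in pages. This is only an illustration under simplifying assumptions (one group, no SwapCache corner cases, no limits enforced); the toy_* names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>

/* Toy counters for one group, in pages.  Not kernel code. */
struct toy_memsw {
    long mem;     /* pages charged as memory */
    long memsw;   /* pages charged as memory + swap */
};

/* map: charge both page and memsw. */
void toy_map(struct toy_memsw *g)
{
    g->mem++;
    g->memsw++;
}

/* unmap (page not in SwapCache): uncharge both page and memsw. */
void toy_unmap(struct toy_memsw *g)
{
    assert(g->mem > 0 && g->memsw > 0);
    g->mem--;
    g->memsw--;
}

/* swap-out: uncharge the page only; the swap entry keeps the memsw charge. */
void toy_swap_out(struct toy_memsw *g)
{
    assert(g->mem > 0);
    g->mem--;
}

/* swap-in: charge page and memsw, then drop the charge recorded for the entry. */
void toy_swap_in(struct toy_memsw *g)
{
    g->mem++;
    g->memsw++;
    g->memsw--;
}

/* swap-free: the swap entry is freed, so memsw is uncharged by one page. */
void toy_swap_free(struct toy_memsw *g)
{
    assert(g->memsw > g->mem);   /* at least one page is held only by swap */
    g->memsw--;
}
```

In this model swap-out and swap-in leave memsw unchanged, which is the property that lets the global LRU swap pages out without disturbing the mem+swap limit.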
      
      Some people work in never-swap environments and consider swap to be
      something bad.  For them, this mem+swap controller extension is just
      overhead.  The overhead can be avoided with a config or boot option.
      (See Kconfig; the details are not in this patch.)
      
      TODO:
       - maybe more optimization can be done in the swap-in path (but it is not
         very safe).  For now we just do simple accounting.
      
      [nishimura@mxp.nes.nec.co.jp: make resize limit hold mutex]
      [hugh@veritas.com: memswap controller core swapcache fixes]
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: handle swap caches · d13d1443
      KAMEZAWA Hiroyuki committed
      SwapCache support for the memory resource controller (memcg).
      
      Before the mem+swap controller is added, memcg itself should handle
      SwapCache properly.  This patch is split out from that work.
      
      In the current memcg, SwapCache is simply not accounted, so a user can
      create tons of SwapCache.  This is an accounting leak and should be fixed.
      
      SwapCache accounting is done as follows.
      
        charge (anon)
      	- charged when it's mapped.
      	  (because of readahead, charging at add_to_swap_cache() is not sane)
        uncharge (anon)
      	- uncharged when it's dropped from swapcache and fully unmapped.
      	  This means it is not uncharged at unmap.
      	  Note: deletion from the swap cache at swap-in is done after rmap
      	        information is established.
        charge (shmem)
      	- charged at swap-in.  This prevents charging at add_to_page_cache().
      
        uncharge (shmem)
      	- uncharged when it's dropped from swapcache and not on shmem's
      	  radix-tree.
      
        At migration, the check against the 'old page' is modified to handle shmem.
      
      Compared to the old version that was discussed (and caused trouble), we
      now have the advantages of
        - PCG_USED bit.
        - simple migration handling.
      
      So the situation is probably much easier than it was several months ago.
      
      [hugh@veritas.com: memcg: handle swap caches build fix]
      Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: new force_empty to free pages under group · c1e862c1
      KAMEZAWA Hiroyuki committed
      With memcg-move-all-accounts-to-parent-at-rmdir.patch, there is no leak of
      memory usage and force_empty has been removed.
      
      This patch adds "force_empty" back, in a reasonable manner.
      
      The memory.force_empty file is triggered by
      
        # echo 0 (or some value) > memory.force_empty
      
      and behaves as follows.
      
        1. It only works when there are no tasks in this cgroup.
        2. It frees as many pages under this cgroup as possible.
        3. Pages which cannot be freed are moved up to the parent.
        4. The memcg is then empty after the above echo returns.
      
      This is much better behavior than the old "force_empty", which just forgot
      all accounts.  This patch also checks signal_pending(), so the above "echo"
      can be interrupted with "Ctrl-C".
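
The same write can be issued from a program. The following is a minimal sketch, assuming the caller already knows the directory of an empty memory cgroup; the function name memcg_force_empty and the path handling are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write "0" to <cgroup_dir>/memory.force_empty, like the echo above.
 * Returns 0 on success, -1 on failure.
 * (Hypothetical helper; cgroup_dir is supplied by the caller.)
 */
int memcg_force_empty(const char *cgroup_dir)
{
    assert(cgroup_dir != NULL);

    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/memory.force_empty", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path did not fit in the buffer */

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;              /* no such cgroup, or no permission */

    ssize_t written = write(fd, "0", 1);
    int rc = (written == 1) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Because the kernel side checks signal_pending(), a caller should be prepared for the write to fail if the process is interrupted.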
      
      [akpm@linux-foundation.org: cleanup]
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcg: move all accounting to parent at rmdir() · f817ed48
      KAMEZAWA Hiroyuki committed
      This patch provides a function to move the account information of a page
      between mem_cgroups and rewrites force_empty to make use of it.
      
      This moving of page_cgroup is done while
       - lru_lock of the source/destination mem_cgroup is held.
       - lock_page_cgroup() is held.
      
      Therefore, a routine which touches pc->mem_cgroup without lock_page_cgroup()
      should confirm that pc->mem_cgroup is still valid.  Typical code looks like
      the following.
      
      (while page is not under lock_page())
      	unsigned long flags;

      	mem = pc->mem_cgroup;
      	mz = page_cgroup_zoneinfo(pc);
      	spin_lock_irqsave(&mz->lru_lock, flags);
      	if (pc->mem_cgroup == mem)	/* still owned by the group we sampled? */
      		...../* some list handling */
      	spin_unlock_irqrestore(&mz->lru_lock, flags);
      
      Of course, a better way is
      	lock_page_cgroup(pc);
      	....
      	unlock_page_cgroup(pc);
      
      But you should check the lock nesting and avoid deadlock.
      
      If you handle a page_cgroup from the mem_cgroup's LRU under mz->lru_lock,
      you don't have to worry about what pc->mem_cgroup points to.
      Moved pages are added to the head of the LRU, not to the tail.
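
The "sample, lock, re-check" pattern described above can be modelled in userspace as follows. This is a sketch only: the toy_* types stand in for mem_cgroup and page_cgroup, and a C11 atomic_flag stands in for mz->lru_lock; none of these names come from the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for a mem_cgroup with one LRU lock.  Not a kernel structure. */
struct toy_lru_group {
    atomic_flag lru_lock;   /* models mz->lru_lock as a tiny spinlock */
    int nr_handled;         /* counts how often the list handling ran */
};

/* Toy stand-in for a page_cgroup. */
struct toy_page_cgroup {
    struct toy_lru_group *mem_cgroup;
};

static void toy_lock(atomic_flag *l)   { while (atomic_flag_test_and_set(l)) { } }
static void toy_unlock(atomic_flag *l) { atomic_flag_clear(l); }

/*
 * Handle pc under the lru_lock of the group sampled earlier in `snapshot`.
 * Returns 1 if pc still belonged to that group once the lock was held,
 * 0 if it had been moved in the meantime (the caller may retry).
 */
int toy_handle_under_lru_lock(struct toy_page_cgroup *pc,
                              struct toy_lru_group *snapshot)
{
    assert(pc != NULL && snapshot != NULL);

    toy_lock(&snapshot->lru_lock);
    int still_valid = (pc->mem_cgroup == snapshot);
    if (still_valid)
        snapshot->nr_handled++;     /* placeholder for the list handling */
    toy_unlock(&snapshot->lru_lock);
    return still_valid;
}
```

The re-check matters because a move may have happened between sampling pc->mem_cgroup and taking the lock; in that case the lock held is the old group's and the list handling must be skipped.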
      
      Expected users of this routine are:
        - force_empty (rmdir)
        - moving tasks between cgroups (for moving account information.)
        - hierarchy (maybe useful.)
      
      force_empty (rmdir) uses this move_account and moves pages to the parent.
      This "move" will not cause OOM (I added an "oom" parameter to try_charge()).
      
      If the parent is busy (not enough memory), force_empty calls try_to_free_page()
      and reduces usage.
      
      The purpose of this behavior is to
        - Fix the "forget all" behavior of force_empty and avoid leaking accounting.
        - By "moving first, free if necessary", keep pages in memory as much as
          possible.
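
A toy model of the "moving first, free if necessary" policy is sketched below. It assumes per-group usage and limit counters in pages and a fixed pool of reclaimable pages in the parent; the toy_* names are hypothetical and the real move_account path is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Toy group with counters in pages.  Not a kernel structure. */
struct toy_group {
    long usage;         /* pages currently charged */
    long limit;         /* maximum pages allowed */
    long reclaimable;   /* pages this group could free if asked */
};

/* Free up to `want` pages from g; returns how many were actually freed. */
static long toy_reclaim(struct toy_group *g, long want)
{
    long n = want < g->reclaimable ? want : g->reclaimable;
    g->reclaimable -= n;
    g->usage -= n;
    return n;
}

/*
 * Move one page's charge from child to parent.
 * If the parent is at its limit, try to reclaim one page from it first.
 * Returns 0 on success, -1 if no room could be made (no OOM is triggered).
 */
int toy_move_to_parent(struct toy_group *child, struct toy_group *parent)
{
    assert(child != NULL && parent != NULL);

    if (child->usage == 0)
        return -1;                      /* nothing left to move */
    if (parent->usage >= parent->limit && toy_reclaim(parent, 1) == 0)
        return -1;                      /* parent is full and nothing is freeable */

    parent->usage++;
    child->usage--;
    return 0;
}
```

Reclaim only runs when the parent has no room, so pages stay in memory whenever the move alone is enough.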
      
      Adding a switch to change the behavior of force_empty to
        - free first, move if necessary
        - free all, and return -EBUSY if there are mlocked/busy pages
      is under consideration. (I'll add it if someone requests it.)
      
      This patch also removes the memory.force_empty file, a brutal debug-only interface.
      Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Paul Menage <menage@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 20 Oct, 2008 1 commit
  3. 26 Jul, 2008 1 commit
  4. 05 Mar, 2008 2 commits
  5. 24 Feb, 2008 1 commit
  6. 08 Feb, 2008 3 commits