1. 24 9月, 2009 1 次提交
  2. 23 9月, 2009 17 次提交
  3. 22 9月, 2009 8 次提交
  4. 19 8月, 2009 1 次提交
    • K
      mm: revert "oom: move oom_adj value" · 0753ba01
      KOSAKI Motohiro 提交于
      The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to
      the mm_struct.  It was a very good first step for sanitize OOM.
      
      However Paul Menage reported the commit makes regression to his job
      scheduler.  Current OOM logic can kill OOM_DISABLED process.
      
      Why? His program has the code of similar to the following.
      
      	...
      	set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
      	...
      	if (vfork() == 0) {
      		set_oom_adj(0); /* Invoked child can be killed */
      		execve("foo-bar-cmd");
      	}
      	....
      
      vfork() parent and child are shared the same mm_struct.  then above
      set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also
      change oom_adj for vfork() parent.  Then, vfork() parent (job scheduler)
      lost OOM immune and it was killed.
      
      Actually, fork-setting-exec idiom is very frequently used in userland program.
      We must not break this assumption.
      
      Then, this patch revert commit 2ff05b2b and related commit.
      
      Reverted commit list
      ---------------------
      - commit 2ff05b2b (oom: move oom_adj value from task_struct to mm_struct)
      - commit 4d8b9135 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
      - commit 81236810 (oom: only oom kill exiting tasks with attached memory)
      - commit 933b787b (mm: copy over oom_adj value at fork time)
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0753ba01
  5. 10 8月, 2009 5 次提交
  6. 24 6月, 2009 1 次提交
    • O
      mm_for_maps: simplify, use ptrace_may_access() · a893a84e
      Oleg Nesterov 提交于
      It would be nice to kill __ptrace_may_access(). It requires task_lock(),
      but this lock is only needed to read mm->flags in the middle.
      
      Convert mm_for_maps() to use ptrace_may_access(), this also simplifies
      the code a little bit.
      
      Also, we do not need to take ->mmap_sem in advance. In fact I think
      mm_for_maps() should not play with ->mmap_sem at all, the caller should
      take this lock.
      
      With or without this patch, without ->cred_guard_mutex held we can race
      with exec() and get the new ->mm but check old creds.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      a893a84e
  7. 19 6月, 2009 3 次提交
  8. 17 6月, 2009 4 次提交
    • D
      oom: move oom_adj value from task_struct to mm_struct · 2ff05b2b
      David Rientjes 提交于
      The per-task oom_adj value is a characteristic of its mm more than the
      task itself since it's not possible to oom kill any thread that shares the
      mm.  If a task were to be killed while attached to an mm that could not be
      freed because another thread were set to OOM_DISABLE, it would have
      needlessly been terminated since there is no potential for future memory
      freeing.
      
      This patch moves oomkilladj (now more appropriately named oom_adj) from
      struct task_struct to struct mm_struct.  This requires task_lock() on a
      task to check its oom_adj value to protect against exec, but it's already
      necessary to take the lock when dereferencing the mm to find the total VM
      size for the badness heuristic.
      
      This fixes a livelock if the oom killer chooses a task and another thread
      sharing the same memory has an oom_adj value of OOM_DISABLE.  This occurs
      because oom_kill_task() repeatedly returns 1 and refuses to kill the
      chosen task while select_bad_process() will repeatedly choose the same
      task during the next retry.
      
      Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in
      oom_kill_task() to check for threads sharing the same memory will be
      removed in the next patch in this series where it will no longer be
      necessary.
      
      Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since
      these threads are immune from oom killing already.  They simply report an
      oom_adj value of OOM_DISABLE.
      
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ff05b2b
    • K
      mm: remove CONFIG_UNEVICTABLE_LRU config option · 68377659
      KOSAKI Motohiro 提交于
      Currently, nobody wants to turn UNEVICTABLE_LRU off.  Thus this
      configurability is unnecessary.
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Acked-by: NMinchan Kim <minchan.kim@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      68377659
    • W
      proc: export more page flags in /proc/kpageflags · 17797549
      Wu Fengguang 提交于
      Export all page flags faithfully in /proc/kpageflags.
      
      	11. KPF_MMAP		(pseudo flag) memory mapped page
      	12. KPF_ANON		(pseudo flag) memory mapped page (anonymous)
      	13. KPF_SWAPCACHE	page is in swap cache
      	14. KPF_SWAPBACKED	page is swap/RAM backed
      	15. KPF_COMPOUND_HEAD	(*)
      	16. KPF_COMPOUND_TAIL	(*)
      	17. KPF_HUGE		hugeTLB pages
      	18. KPF_UNEVICTABLE	page is in the unevictable LRU list
      	19. KPF_HWPOISON(TBD)	hardware detected corruption
      	20. KPF_NOPAGE		(pseudo flag) no page frame at the address
      	32-39.			more obscure flags for kernel developers
      
      	(*) For compound pages, exporting _both_ head/tail info enables
      	    users to tell where a compound page starts/ends, and its order.
      
      The accompanying page-types tool will handle the details like decoupling
      overloaded flags and hiding obscure flags to normal users.
      
      Thanks to KOSAKI and Andi for their valuable recommendations!
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      17797549
    • W
      proc: kpagecount/kpageflags code cleanup · ed7ce0f1
      Wu Fengguang 提交于
      Move increments of pfn/out to bottom of the loop.
      Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Acked-by: NMatt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ed7ce0f1