Commit 37b23e05, author: KOSAKI Motohiro, committer: Linus Torvalds

x86,mm: make pagefault killable

When the OOM killer runs, almost all processes get stuck at one of the
following two points.

	1) __alloc_pages_nodemask
	2) __lock_page_or_retry

1) is not very problematic because TIF_MEMDIE makes the allocation fail,
so the task gets out of the page allocator.

2) is more problematic.  In an OOM situation, zones typically have no
page cache left at all, and memory starvation can greatly reduce I/O
performance.  When a fork bomb occurs, TIF_MEMDIE tasks don't die quickly,
so the fork bomb may create new processes faster than the oom-killer can
kill them.  The system may then become livelocked.

This patch makes the pagefault interruptible by SIGKILL.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Parent f62e00cc
@@ -965,7 +965,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	struct mm_struct *mm;
 	int fault;
 	int write = error_code & PF_WRITE;
-	unsigned int flags = FAULT_FLAG_ALLOW_RETRY |
+	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
 					(write ? FAULT_FLAG_WRITE : 0);
 
 	tsk = current;
@@ -1138,6 +1138,16 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 		return;
 	}
 
+	/*
+	 * Pagefault was interrupted by SIGKILL. We have no reason to
+	 * continue pagefault.
+	 */
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current)) {
+		if (!(error_code & PF_USER))
+			no_context(regs, error_code, address);
+		return;
+	}
+
 	/*
 	 * Major/minor page fault accounting is only done on the
 	 * initial attempt. If we go through a retry, it is extremely
......
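The check added to do_page_fault() above can be read as a three-way decision. The following is a minimal user-space sketch of that decision only, not kernel code: the enum and the function names are hypothetical, the fatal_signal parameter stands in for fatal_signal_pending(current), and the numeric values of VM_FAULT_RETRY and PF_USER are assumptions restated so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, restated here so the sketch compiles on its own. */
#define VM_FAULT_RETRY 0x0400u
#define PF_USER        (1u << 2)

/* Hypothetical labels for the three ways the handler can proceed. */
enum fault_exit {
	FAULT_CONTINUE,      /* fall through to accounting / retry logic */
	FAULT_ABORT_USER,    /* plain return: the task is about to die anyway */
	FAULT_ABORT_KERNEL,  /* kernel-mode fault: no_context() runs first */
};

/* Models only the new check; fatal_signal replaces fatal_signal_pending(). */
static enum fault_exit after_handle_mm_fault(unsigned int fault,
					     unsigned long error_code,
					     bool fatal_signal)
{
	if ((fault & VM_FAULT_RETRY) && fatal_signal) {
		if (!(error_code & PF_USER))
			return FAULT_ABORT_KERNEL;
		return FAULT_ABORT_USER;
	}
	return FAULT_CONTINUE;
}

/* Usage example covering each branch. */
static void fault_exit_examples(void)
{
	assert(after_handle_mm_fault(VM_FAULT_RETRY, PF_USER, true) == FAULT_ABORT_USER);
	assert(after_handle_mm_fault(VM_FAULT_RETRY, 0, true) == FAULT_ABORT_KERNEL);
	assert(after_handle_mm_fault(VM_FAULT_RETRY, PF_USER, false) == FAULT_CONTINUE);
	assert(after_handle_mm_fault(0, PF_USER, true) == FAULT_CONTINUE);
}
```

Note that a pending SIGKILL alone does not abort the fault in this model; the fault must also have come back with VM_FAULT_RETRY.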
@@ -153,6 +153,7 @@ extern pgprot_t protection_map[16];
 #define FAULT_FLAG_MKWRITE	0x04	/* Fault was mkwrite of existing pte */
 #define FAULT_FLAG_ALLOW_RETRY	0x08	/* Retry fault if blocking */
 #define FAULT_FLAG_RETRY_NOWAIT	0x10	/* Don't drop mmap_sem and wait when retrying */
+#define FAULT_FLAG_KILLABLE	0x20	/* The fault task is in SIGKILL killable region */
 
 /*
  * This interface is used by x86 PAT code to identify a pfn mapping that is
......
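To illustrate how the new flag combines with the existing ones, here is a small stand-alone sketch of the flags expression from do_page_fault(). The FAULT_FLAG_MKWRITE, FAULT_FLAG_ALLOW_RETRY, FAULT_FLAG_RETRY_NOWAIT and FAULT_FLAG_KILLABLE values come from the hunk above; FAULT_FLAG_WRITE and PF_WRITE are not shown in this diff, so their values here are assumptions, and the function names are hypothetical.

```c
#include <assert.h>

#define FAULT_FLAG_WRITE        0x01u      /* assumption: not shown in the diff */
#define FAULT_FLAG_MKWRITE      0x04u
#define FAULT_FLAG_ALLOW_RETRY  0x08u
#define FAULT_FLAG_RETRY_NOWAIT 0x10u
#define FAULT_FLAG_KILLABLE     0x20u      /* added by this patch */
#define PF_WRITE                (1u << 1)  /* assumption: not shown in the diff */

/* Mirrors the initializer of 'flags' in the patched do_page_fault(). */
static unsigned int initial_fault_flags(unsigned long error_code)
{
	int write = (error_code & PF_WRITE) != 0;

	return FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
	       (write ? FAULT_FLAG_WRITE : 0);
}

/* The new bit must not collide with any flag listed above. */
static void check_killable_bit_is_free(void)
{
	unsigned int others = FAULT_FLAG_WRITE | FAULT_FLAG_MKWRITE |
			      FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;

	assert((FAULT_FLAG_KILLABLE & others) == 0);
}
```

Under these assumptions a read fault starts with flags 0x28 and a write fault with 0x29.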
@@ -654,15 +654,32 @@ EXPORT_SYMBOL_GPL(__lock_page_killable);
 int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
 			 unsigned int flags)
 {
-	if (!(flags & FAULT_FLAG_ALLOW_RETRY)) {
-		__lock_page(page);
-		return 1;
-	} else {
-		if (!(flags & FAULT_FLAG_RETRY_NOWAIT)) {
-			up_read(&mm->mmap_sem);
-			wait_on_page_locked(page);
-		}
+	if (flags & FAULT_FLAG_ALLOW_RETRY) {
+		/*
+		 * CAUTION! In this case, mmap_sem is not released
+		 * even though return 0.
+		 */
+		if (flags & FAULT_FLAG_RETRY_NOWAIT)
+			return 0;
+
+		up_read(&mm->mmap_sem);
+		if (flags & FAULT_FLAG_KILLABLE)
+			wait_on_page_locked_killable(page);
+		else
+			wait_on_page_locked(page);
 		return 0;
+	} else {
+		if (flags & FAULT_FLAG_KILLABLE) {
+			int ret;
+
+			ret = __lock_page_killable(page);
+			if (ret) {
+				up_read(&mm->mmap_sem);
+				return 0;
+			}
+		} else
+			__lock_page(page);
+		return 1;
 	}
 }
 
......
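The new __lock_page_or_retry() has four distinct outcomes, which differ in the return value and in whether mmap_sem has been dropped. The sketch below models just that bookkeeping in user space. It is an illustration under stated assumptions, not the kernel function: struct outcome, its fields, the function names and the killed parameter (standing in for __lock_page_killable() returning nonzero) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define FAULT_FLAG_ALLOW_RETRY  0x08u
#define FAULT_FLAG_RETRY_NOWAIT 0x10u
#define FAULT_FLAG_KILLABLE     0x20u

/* Hypothetical record of what the caller observes afterwards. */
struct outcome {
	int  ret;                /* 1: page locked, 0: page not locked */
	bool mmap_sem_released;  /* did the function call up_read()? */
	bool killable_sleep;     /* could SIGKILL cut the sleep short? */
};

/* 'killed' models __lock_page_killable() being interrupted by SIGKILL. */
static struct outcome model_lock_page_or_retry(unsigned int flags, bool killed)
{
	struct outcome o = { 0, false, false };

	if (flags & FAULT_FLAG_ALLOW_RETRY) {
		/* NOWAIT: return 0 without sleeping, mmap_sem still held. */
		if (flags & FAULT_FLAG_RETRY_NOWAIT)
			return o;

		o.mmap_sem_released = true;
		o.killable_sleep = (flags & FAULT_FLAG_KILLABLE) != 0;
		return o;
	}

	if (flags & FAULT_FLAG_KILLABLE) {
		o.killable_sleep = true;
		if (killed) {
			/* Lock attempt aborted: drop mmap_sem, report 0. */
			o.mmap_sem_released = true;
			return o;
		}
	}
	o.ret = 1;  /* page is locked, mmap_sem still held */
	return o;
}

/* Usage example: the NOWAIT case returns 0 but keeps mmap_sem. */
static void lock_page_or_retry_examples(void)
{
	struct outcome o = model_lock_page_or_retry(
		FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT, false);

	assert(o.ret == 0 && !o.mmap_sem_released);
}
```

The model makes the point of the CAUTION comment explicit: a return value of 0 does not by itself tell the caller whether mmap_sem is still held, so the caller has to look at the flags it passed in.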