Commit f4e53d91, authored by Lee Schermerhorn, committed by Linus Torvalds

mempolicy: write lock mmap_sem while changing task mempolicy

A read of /proc/<pid>/numa_maps holds the target task's mmap_sem for read
while examining each vma's mempolicy.  A vma's mempolicy can fall back to the
task's policy.  However, the task could be changing its task policy and
freeing the one that show_numa_maps() is examining.

To prevent this, grab the mmap_sem for write when updating task mempolicy.
Pointed out to me by Christoph Lameter and extracted and reworked from
Christoph's alternative mempol reference counting patch.

This is analogous to the way that do_mbind() and do_get_mempolicy() prevent
races between tasks sharing an mm_struct [a.k.a. threads] setting and
querying a mempolicy for a particular address.

Note: this is necessary, but not sufficient, to allow us to stop taking an
extra reference on "other task's mempolicy" in get_vma_policy.  Subsequent
patches will complete this update, allowing us to simplify the tests for
whether we need to unref a mempolicy at various points in the code.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Parent 846a16bf
@@ -591,16 +591,29 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 			     nodemask_t *nodes)
 {
 	struct mempolicy *new;
+	struct mm_struct *mm = current->mm;
 	new = mpol_new(mode, flags, nodes);
 	if (IS_ERR(new))
 		return PTR_ERR(new);
+	/*
+	 * prevent changing our mempolicy while show_numa_maps()
+	 * is using it.
+	 * Note: do_set_mempolicy() can be called at init time
+	 * with no 'mm'.
+	 */
+	if (mm)
+		down_write(&mm->mmap_sem);
 	mpol_put(current->mempolicy);
 	current->mempolicy = new;
 	mpol_set_task_struct_flag();
 	if (new && new->policy == MPOL_INTERLEAVE &&
 	    nodes_weight(new->v.nodes))
 		current->il_next = first_node(new->v.nodes);
+	if (mm)
+		up_write(&mm->mmap_sem);
 	return 0;
 }