Commit abb242f5, authored by Johannes Weiner, committed by Linus Torvalds

mm: memcontrol: fix stat-corrupting race in charge moving

The move_lock is a per-memcg lock, but the VM accounting code that needs
to acquire it comes from the page and follows page->mem_cgroup under RCU
protection.  That means that the page becomes unlocked not when we drop
the move_lock, but when we update page->mem_cgroup.  And that assignment
doesn't imply any memory ordering.  If that pointer write gets reordered
against the reads of the page state (page_mapped, PageDirty, etc.), the
state may change while we rely on it being stable and we can end up
corrupting the counters.

Place an SMP memory barrier to make sure we're done with all page state by
the time the new page->mem_cgroup becomes visible.
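
The ordering requirement can be modelled in userspace with C11 atomics. This is only a rough sketch, not the kernel code: `struct group`, `struct page_model` and `move_account_model()` are hypothetical names, and `atomic_thread_fence(memory_order_seq_cst)` is used here as a stand-in for `smp_mb()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a memcg and a page; not kernel types. */
struct group {
	long nr_file_mapped;
};

struct page_model {
	_Atomic(struct group *) owner;	/* plays the role of page->mem_cgroup */
	bool mapped;			/* plays the role of page_mapped() */
};

/*
 * Move the page's accounting from 'from' to 'to'. The caller is assumed
 * to hold whatever lock protects 'from', so the page state is stable
 * until the owner pointer changes.
 */
static void move_account_model(struct page_model *page,
			       struct group *from, struct group *to)
{
	assert(from != to);

	/* Read page state and transfer the counter while state is stable. */
	if (page->mapped) {
		from->nr_file_mapped -= 1;
		to->nr_file_mapped += 1;
	}

	/*
	 * Full fence: the reads of page state above must complete before
	 * the new owner becomes visible below.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	/* Publishing the new owner is what effectively unlocks the page. */
	atomic_store_explicit(&page->owner, to, memory_order_relaxed);
}
```

Without the fence, the relaxed store could in principle be reordered ahead of the read of `page->mapped`, which is the kind of reordering described above.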

Also replace the open-coded move_lock with a lock_page_memcg() to make it
more obvious what we're serializing against.
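
To illustrate what locking a per-memcg lock "through the page" means, here is a simplified userspace sketch. All names (`struct locked_group`, `struct tracked_page`, `lock_page_group()`) are hypothetical, and the sketch leaves out RCU and the moving_account check; it only shows the look-up, lock, and re-check pattern.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a memcg carrying a per-group move lock. */
struct locked_group {
	atomic_flag move_lock;	/* simple spinlock standing in for move_lock */
};

struct tracked_page {
	_Atomic(struct locked_group *) owner;	/* stands in for page->mem_cgroup */
};

static void group_lock(struct locked_group *g)
{
	while (atomic_flag_test_and_set_explicit(&g->move_lock,
						 memory_order_acquire))
		;	/* spin until the flag is clear */
}

static void group_unlock(struct locked_group *g)
{
	atomic_flag_clear_explicit(&g->move_lock, memory_order_release);
}

/*
 * Lock the group that currently owns the page. Returns the locked group,
 * or NULL if the page has no owner. The owner is re-checked after the
 * lock is taken because it may have changed while we were waiting.
 */
static struct locked_group *lock_page_group(struct tracked_page *page)
{
	for (;;) {
		struct locked_group *g = atomic_load(&page->owner);

		if (!g)
			return NULL;
		group_lock(g);
		if (g == atomic_load(&page->owner))
			return g;	/* still the owner: page is locked */
		group_unlock(g);	/* owner changed under us: retry */
	}
}
```

Because the lock lives in the group rather than in the page, a page stops being covered by it the moment its owner pointer is switched to a group whose lock is not held.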

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Link: http://lkml.kernel.org/r/20200508183105.225460-3-hannes@cmpxchg.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Parent f4129ea3
@@ -5432,7 +5432,6 @@ static int mem_cgroup_move_account(struct page *page,
 {
 	struct lruvec *from_vec, *to_vec;
 	struct pglist_data *pgdat;
-	unsigned long flags;
 	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
 	int ret;
 	bool anon;
@@ -5459,18 +5458,13 @@ static int mem_cgroup_move_account(struct page *page,
 	from_vec = mem_cgroup_lruvec(from, pgdat);
 	to_vec = mem_cgroup_lruvec(to, pgdat);
 
-	spin_lock_irqsave(&from->move_lock, flags);
+	lock_page_memcg(page);
 
 	if (!anon && page_mapped(page)) {
 		__mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages);
 		__mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages);
 	}
 
-	/*
-	 * move_lock grabbed above and caller set from->moving_account, so
-	 * mod_memcg_page_state will serialize updates to PageDirty.
-	 * So mapping should be stable for dirty pages.
-	 */
 	if (!anon && PageDirty(page)) {
 		struct address_space *mapping = page_mapping(page);
 
@@ -5486,15 +5480,23 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 	/*
+	 * All state has been migrated, let's switch to the new memcg.
+	 *
 	 * It is safe to change page->mem_cgroup here because the page
-	 * is referenced, charged, and isolated - we can't race with
-	 * uncharging, charging, migration, or LRU putback.
+	 * is referenced, charged, isolated, and locked: we can't race
+	 * with (un)charging, migration, LRU putback, or anything else
+	 * that would rely on a stable page->mem_cgroup.
+	 *
+	 * Note that lock_page_memcg is a memcg lock, not a page lock,
+	 * to save space. As soon as we switch page->mem_cgroup to a
+	 * new memcg that isn't locked, the above state can change
+	 * concurrently again. Make sure we're truly done with it.
 	 */
+	smp_mb();
 
-	/* caller should have done css_get */
-	page->mem_cgroup = to;
+	page->mem_cgroup = to; 	/* caller should have done css_get */
 
-	spin_unlock_irqrestore(&from->move_lock, flags);
+	__unlock_page_memcg(from);
 
 	ret = 0;
 
......