mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync... (388cef14) · Commit · openeuler / Kernel

Commit 388cef14, authored Oct 10, 2020 by Roman Gushchin; committed by Yang Yingliang on Oct 10, 2020

mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones"

mainline inclusion
from mainline-v5.3-rc7
commit b4c46484
category: bugfix
bugzilla: 34611
CVE: NA

-------------------------------------------------

Commit 766a4c19 ("mm/memcontrol.c: keep local VM counters in sync
with the hierarchical ones") effectively decreased the precision of
per-memcg vmstats_local and per-memcg-per-node lruvec percpu counters.

That's good for displaying in memory.stat, but brings a serious
regression into the reclaim process.

One issue I've discovered and debugged is the following:
lruvec_lru_size() can return 0 instead of the actual number of pages in
the lru list, preventing the kernel from reclaiming the last remaining
pages.  The result is yet another flood of dying memory cgroups.  The
opposite also happens: scanning an empty lru list is a waste of CPU
time.
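
The under-reporting described above can be reproduced with a small
user-space model.  This is only a sketch, not kernel code: the struct,
the function names and the single-CPU simplification are hypothetical,
and BATCH merely stands in for MEMCG_CHARGE_BATCH (assumed to be 32
here).

```c
#include <stdlib.h>

#define BATCH 32	/* assumed threshold, standing in for MEMCG_CHARGE_BATCH */

/* Hypothetical user-space model of one batched counter on a single CPU. */
struct batched_counter {
	long pending;	/* per-CPU delta that has not been flushed yet */
	long visible;	/* value that readers of the counter see */
};

/* Add val; flush the pending delta only once its magnitude exceeds BATCH. */
static inline void batched_add(struct batched_counter *c, long val)
{
	long x = c->pending + val;

	if (labs(x) > BATCH) {
		c->visible += x;
		x = 0;
	}
	c->pending = x;
}

/* Readers only see the flushed part, as with a batched local counter. */
static inline long batched_read(const struct batched_counter *c)
{
	return c->visible;
}
```

In this model, a list holding 20 pages still reads as empty, because
the 20 increments never cross the batch threshold.  That is the shape
of the problem for a reclaim path that trusts the counter.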

Also, inactive_list_is_low() can return incorrect values, preventing the
active lru from being scanned and freed.  It can fail both because the
sizes of the active and inactive lists are inaccurate, and because the
number of workingset refaults isn't precise.  In other words, the result
is pretty random.

I'm not sure whether using the approximate number of slab pages in
count_shadow_number() is acceptable, but the issues described above are
enough to partially revert the patch.

Let's keep per-memcg vmstats_local batched (they are only used for
displaying stats to userspace), but keep lruvec stats precise.  This
change fixes the dead memcg flooding on my setup.
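
The split between a precise local count and a batched hierarchical
value can be sketched in user space as follows.  This is an
illustration under assumptions, not the kernel implementation: the
struct and function names are hypothetical, a single CPU and a single
hierarchy level are modelled, and SKETCH_BATCH stands in for
MEMCG_CHARGE_BATCH.

```c
#include <stdlib.h>

#define SKETCH_BATCH 32	/* assumed stand-in for MEMCG_CHARGE_BATCH */

/* Hypothetical model of the three values touched per lruvec update. */
struct lruvec_model {
	long local;	/* precise local count, in the role of lruvec_stat_local */
	long pending;	/* per-CPU delta, in the role of lruvec_stat_cpu */
	long hier;	/* batched value, in the role of lruvec_stat */
};

static inline void model_mod_state(struct lruvec_model *m, long val)
{
	long x;

	/* The local count is updated on every call, so it stays exact. */
	m->local += val;

	/* The hierarchical value is still flushed in batches. */
	x = val + m->pending;
	if (labs(x) > SKETCH_BATCH) {
		m->hier += x;
		x = 0;
	}
	m->pending = x;
}
```

After five single-page updates, the model reports local == 5 while hier
is still 0.  A later large update flushes the pending delta into hier,
and local has been exact the whole time.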

Link: http://lkml.kernel.org/r/20190817004726.2530670-1-guro@fb.com
Fixes: 766a4c19 ("mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

Parent e65fecbb


@@ -762,15 +762,13 @@ void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	/* Update memcg */
 	__mod_memcg_state(memcg, idx, val);
 
+	/* Update lruvec */
+	__this_cpu_add(pn->lruvec_stat_local->count[idx], val);
+
 	x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
 	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
 		struct mem_cgroup_per_node *pi;
 
-		/*
-		 * Batch local counters to keep them in sync with
-		 * the hierarchical ones.
-		 */
-		__this_cpu_add(pn->lruvec_stat_local->count[idx], x);
 		for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id))
 			atomic_long_add(x, &pi->lruvec_stat[idx]);
 		x = 0;
-...
