Commit ef97e894, authored by Xunlei Pang, committed by Joseph Qi

alinux: memcg: Point wb to root memcg/blkcg when offlining to avoid zombie

Even after turning off memcg kmem charging, we still suffer
from various zombie memcg problems in production environments,
because a dying memcg keeps a non-zero reference count from
both page caches and the per-memcg writeback structure
(bdi_writeback takes a reference).

After we reclaimed all the page caches of the zombie memcg,
it still can't be dropped due to its bdi_writeback.

bdi_writeback is in turn referenced by the inodes of files,
so the memcg can't be truly released until those inodes are
destroyed, which is quite unlikely in the short term.

When a memcg is going offline, point its bdi_writeback at the
root memcg and call css_put to formally release the dying
memcg. We've tested this in production environments and it
is effective.

Ditto for wb_blkcg_offline().

Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Gavin Shan <shan.gavin@linux.alibaba.com>
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
Parent 81389d70
@@ -963,8 +963,16 @@ void wb_memcg_offline(struct mem_cgroup *memcg)
 	struct bdi_writeback *wb, *next;
 
 	spin_lock_irq(&cgwb_lock);
-	list_for_each_entry_safe(wb, next, memcg_cgwb_list, memcg_node)
+	list_for_each_entry_safe(wb, next, memcg_cgwb_list, memcg_node) {
+		percpu_ref_get(&wb->refcnt);
 		cgwb_kill(wb);
+		if (wb->memcg_css) {
+			css_put(wb->memcg_css);
+			wb->memcg_css = &root_mem_cgroup->css;
+			css_get(wb->memcg_css);
+		}
+		percpu_ref_put(&wb->refcnt);
+	}
 	memcg_cgwb_list->next = NULL;	/* prevent new wb's */
 	spin_unlock_irq(&cgwb_lock);
 }
@@ -980,8 +988,16 @@ void wb_blkcg_offline(struct blkcg *blkcg)
 	struct bdi_writeback *wb, *next;
 
 	spin_lock_irq(&cgwb_lock);
-	list_for_each_entry_safe(wb, next, &blkcg->cgwb_list, blkcg_node)
+	list_for_each_entry_safe(wb, next, &blkcg->cgwb_list, blkcg_node) {
+		percpu_ref_get(&wb->refcnt);
 		cgwb_kill(wb);
+		if (wb->memcg_css) {
+			css_put(wb->memcg_css);
+			wb->memcg_css = &root_mem_cgroup->css;
+			css_get(wb->memcg_css);
+		}
+		percpu_ref_put(&wb->refcnt);
+	}
 	blkcg->cgwb_list.next = NULL;	/* prevent new wb's */
 	spin_unlock_irq(&cgwb_lock);
 }
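
The reference hand-off that both hunks perform can be modeled in userspace. The following is only a minimal sketch of the counting argument from the commit message, not kernel code: every `toy_*` name is hypothetical, the reference counts are plain integers instead of the kernel's css and percpu_ref machinery, and locking is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cgroup_subsys_state: a name and a reference count. */
struct toy_css {
	const char *name;
	int refcnt;
};

/* Toy stand-in for bdi_writeback: it pins the memcg it was created for. */
struct toy_wb {
	struct toy_css *memcg_css;
};

static void toy_css_get(struct toy_css *css)
{
	css->refcnt++;
}

static void toy_css_put(struct toy_css *css)
{
	assert(css->refcnt > 0);
	css->refcnt--;
}

/* Creating a wb takes a reference on its memcg. */
static void toy_wb_init(struct toy_wb *wb, struct toy_css *memcg)
{
	toy_css_get(memcg);
	wb->memcg_css = memcg;
}

/* Offline step modeled on the patch: drop the dying memcg, pin root instead. */
static void toy_wb_reparent(struct toy_wb *wb, struct toy_css *root)
{
	if (wb->memcg_css) {
		toy_css_put(wb->memcg_css);
		wb->memcg_css = root;
		toy_css_get(wb->memcg_css);
	}
}

/* Returns the child's refcount after offlining; 0 means it could be freed. */
static int toy_offline_demo(int reparent)
{
	struct toy_css root = { "root", 1 };	/* root is never released */
	struct toy_css child = { "child", 0 };
	struct toy_wb wb = { NULL };

	toy_css_get(&child);		/* page cache charged to child */
	toy_wb_init(&wb, &child);	/* an inode's wb also pins child */

	toy_css_put(&child);		/* all page cache reclaimed */
	if (reparent)
		toy_wb_reparent(&wb, &root);

	return child.refcnt;
}
```

Calling `toy_offline_demo(0)` from a `main` leaves the child with one outstanding reference (the zombie case), while `toy_offline_demo(1)` brings it to zero because the wb now pins the root instead.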