Commit fb4c5ea6, author: Xu Yu, committer: Shile Zhang

alinux: mm, memcg: fix soft lockup in priority oom

Assume a memory cgroup tree as follows:

        A (use_priority_oom=1, limit=2.5G)
       / \
      /   C (priority=3, usage=1.5G)
     B (priority=0, usage=1G)

When a task in C (task-c) invokes the oom-killer, the task in B (task-b)
is chosen and killed; task-c then returns from mem_cgroup_oom and retries
in try_charge.

If the memory page_counter of B has not been reset yet, task-c invokes
the oom-killer again, and a soft lockup may happen. In this situation,
task-c keeps looking for a bad process in B, while the only task in B,
task-b, already has the PF_EXITING flag set, which makes
css_task_iter_advance skip it.

As a result, task-c selects no bad process in B and keeps retrying, while
task-b is stalled in synchronize_rcu during do_exit, specifically in
exit_task_namespaces.

In a nutshell, the new behavior of css_task_iter_advance, introduced by
commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in
PROCS iterations"), causes priority oom to misbehave.

Fix the soft lockup by accounting num_oom_skip for the victim memcg and
its ancestors (walking up to oc->memcg) when no bad process is chosen
from it.

Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
Parent f1046eaf
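
The scenario from the commit message can be modelled in userspace to see why the retry loop never terminates without the accounting step. The sketch below is not kernel code: `struct model_memcg`, all `model_*` functions, the `nr_killable` field, the `max_retries` budget, and the rule "skip a child once `num_oom_skip` covers all of its tasks" are assumptions made for illustration. Only the idea of bumping `num_oom_skip` on the victim and its ancestors up to the OOMing memcg is taken from the patch, and the later reset of `num_oom_skip` is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a memcg; not the kernel's struct. */
struct model_memcg {
	const char *name;
	struct model_memcg *parent;
	int priority;      /* lower value is picked as victim first */
	int nr_tasks;      /* tasks attached, including exiting ones */
	int nr_killable;   /* tasks the scan can still choose */
	int num_oom_skip;  /* "nothing killable here" events accounted */
};

/* Models do_mem_cgroup_account_oom_skip(): walk from memcg up to root. */
static void model_account_oom_skip(struct model_memcg *memcg,
				   struct model_memcg *root)
{
	struct model_memcg *pos;

	assert(root != NULL);
	for (pos = memcg; pos; pos = pos->parent) {
		pos->num_oom_skip++;
		if (pos == root)
			break;
	}
}

/*
 * Assumed selection rule: the child with the lowest priority wins, and a
 * child is skipped once num_oom_skip covers all of its tasks. Falls back
 * to root when no child qualifies.
 */
static struct model_memcg *model_select_victim(struct model_memcg *root,
					       struct model_memcg **kids,
					       size_t nr_kids)
{
	struct model_memcg *victim = NULL;
	size_t i;

	for (i = 0; i < nr_kids; i++) {
		if (kids[i]->num_oom_skip >= kids[i]->nr_tasks)
			continue;
		if (!victim || kids[i]->priority < victim->priority)
			victim = kids[i];
	}
	return victim ? victim : root;
}

/*
 * Models the retry loop in mem_cgroup_select_bad_process(). Returns the
 * memcg a task was chosen from, or NULL when no task was chosen, either
 * because nothing is killable or because max_retries was exhausted (the
 * stand-in for the soft lockup).
 */
static struct model_memcg *model_select_bad_process(struct model_memcg *root,
						    struct model_memcg **kids,
						    size_t nr_kids,
						    bool with_fix,
						    int max_retries)
{
	int tries;

	for (tries = 0; tries <= max_retries; tries++) {
		struct model_memcg *victim =
			model_select_victim(root, kids, nr_kids);

		if (victim->nr_killable > 0)
			return victim;	/* corresponds to oc->chosen being set */
		if (victim == root)
			return NULL;	/* nothing killable anywhere */
		if (with_fix)
			model_account_oom_skip(victim, root);
		/* without the fix, the next pass sees identical state */
	}
	return NULL;
}
```

With A as root, B (priority 0, one exiting task, so `nr_killable` is 0) and C (priority 3, one killable task), calling `model_select_bad_process` with `with_fix` set to false keeps selecting B until the retry budget runs out. With `with_fix` set to true, the first pass accounts B and A, the second pass skips B, and C is returned.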
@@ -1078,10 +1078,10 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
 						dead_memcg);
 }

-/* memcg priority */
+/* memcg oom priority */
 /*
- * mem_cgroup_account_oom_skip - account the OOM-unkillable task
- * @task: non OOM-killable task
+ * do_mem_cgroup_account_oom_skip - account the memcg with OOM-unkillable task
+ * @memcg: mem_cgroup struct with OOM-unkillable task
  * @oc: oom_control struct
  *
  * Account OOM-unkillable task to its cgroup and up to the OOMing cgroup's
@@ -1093,21 +1093,20 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
  * tasks might become killable.
  *
  */
-void mem_cgroup_account_oom_skip(struct task_struct *task,
+static void do_mem_cgroup_account_oom_skip(struct mem_cgroup *memcg,
 		struct oom_control *oc)
 {
-	struct mem_cgroup *root, *memcg;
+	struct mem_cgroup *root;
 	struct cgroup_subsys_state *css;

 	if (!oc->use_priority_oom)
 		return;
+	if (unlikely(!memcg))
+		return;
 	root = oc->memcg;
 	if (!root)
 		root = root_mem_cgroup;

-	memcg = mem_cgroup_from_task(task);
-	if (unlikely(!memcg))
-		return;
 	css = &memcg->css;
 	while (css) {
 		struct mem_cgroup *tmp;
@@ -1132,6 +1131,12 @@ void mem_cgroup_account_oom_skip(struct task_struct *task,
 	}
 }

+void mem_cgroup_account_oom_skip(struct task_struct *task,
+		struct oom_control *oc)
+{
+	do_mem_cgroup_account_oom_skip(mem_cgroup_from_task(task), oc);
+}
+
 static struct mem_cgroup *
 mem_cgroup_select_victim_cgroup(struct mem_cgroup *memcg)
 {
@@ -1261,9 +1266,11 @@ void mem_cgroup_select_bad_process(struct oom_control *oc)
 	mem_cgroup_scan_tasks(victim, oom_evaluate_task, oc);
 	if (oc->use_priority_oom) {
 		css_put(&victim->css);
-		if (!oc->chosen && victim != memcg)
+		if (!oc->chosen && victim != memcg) {
+			do_mem_cgroup_account_oom_skip(victim, oc);
 			goto retry;
+		}
 	}
 out:
 	/* See commets in mem_cgroup_account_oom_skip() */
 	while (oc->reset_list) {