alinux: mm, memcg: fix soft lockup in priority oom

Assume there is a memory cgroup tree as follows:

          A (use_priority_oom=1, limit=2.5G)
         / \
        /   C (priority=3, usage=1.5G)
       B (priority=0, usage=1G)

When a task in C (task-c) invokes the oom-killer, a task in B (task-b) is
chosen and killed. task-c then returns from mem_cgroup_oom and retries in
try_charge.

If the memory page_counter of B has not been reset yet, task-c invokes
the oom-killer again, and a soft lockup may happen. In this situation,
task-c keeps trying to select a bad process in B, but the only task in B,
task-b, already has the PF_EXITING flag set, so it is skipped in
css_task_iter_advance.

As a result, task-c finds no bad process in B and keeps retrying, while
task-b is stalled in synchronize_rcu during do_exit, specifically in
exit_task_namespaces.

In a nutshell, the new behavior of css_task_iter_advance, introduced by
commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in
PROCS iterations"), causes priority oom to misbehave.

Fix the soft lockup by accounting num_oom_skip for the victim memcg and
its ancestors (sifting up to oc->memcg) when no bad process is chosen
from it.
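
The retry loop and the effect of the accounting can be illustrated with a
small user-space model. This is only a sketch, not the kernel code: struct
toy_memcg, select_victim, account_oom_skip and simulate_priority_oom are
hypothetical names, the rule "skip a memcg once num_oom_skip reaches its
task count" is an assumption made for the model, and a retry budget of 5
stands in for the soft lockup.

```c
#include <stddef.h>

/* Toy stand-in for struct mem_cgroup; every field here is illustrative. */
struct toy_memcg {
    struct toy_memcg *parent;
    int priority;      /* lower priority is selected first */
    int nr_tasks;      /* tasks charged to this subtree */
    int nr_exiting;    /* tasks already marked PF_EXITING */
    int num_oom_skip;  /* tasks found unkillable so far */
};

/* Pick the lowest-priority leaf that still has tasks worth scanning. */
static struct toy_memcg *select_victim(struct toy_memcg **leaves, size_t n)
{
    struct toy_memcg *victim = NULL;

    for (size_t i = 0; i < n; i++) {
        struct toy_memcg *m = leaves[i];

        if (m->num_oom_skip >= m->nr_tasks)
            continue;  /* every task here was already skipped */
        if (!victim || m->priority < victim->priority)
            victim = m;
    }
    return victim;
}

/* Sift the skip count up from the victim to the OOM root (oc->memcg). */
static void account_oom_skip(struct toy_memcg *victim, struct toy_memcg *root)
{
    for (struct toy_memcg *it = victim; it; it = it->parent) {
        it->num_oom_skip++;
        if (it == root)
            break;
    }
}

/*
 * Returns the attempt number on which a killable task was found,
 * or -1 if the retry budget (a stand-in for the soft lockup) ran out.
 */
int simulate_priority_oom(int with_fix)
{
    struct toy_memcg a = { NULL, 0, 2, 1, 0 };
    struct toy_memcg b = { &a, 0, 1, 1, 0 };  /* task-b is exiting */
    struct toy_memcg c = { &a, 3, 1, 0, 0 };
    struct toy_memcg *leaves[] = { &b, &c };

    for (int attempt = 1; attempt <= 5; attempt++) {
        struct toy_memcg *victim = select_victim(leaves, 2);

        if (!victim)
            return -1;
        if (victim->nr_tasks - victim->nr_exiting > 0)
            return attempt;  /* found a bad process to kill */
        if (with_fix)
            account_oom_skip(victim, &a);
    }
    return -1;
}
```

In this model, simulate_priority_oom(0) keeps selecting B until the retry
budget is exhausted, while simulate_priority_oom(1) marks B as skipped
after the first failed scan and moves on to C on the second attempt.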

Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>