Commit b2b6c3df, authored by Chen Wandun, committed by Zheng Zengkai

swapfile: fix soft lockup in scan_swap_map_slots

mainline inclusion
from mainline-v6.1-rc7
commit de1ccfb6
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I645DG
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de1ccfb648243a031cfbdc2d5571dfdaf5023106

--------------------------------

A soft lockup occurs while scanning for free swap slots under heavy
memory pressure.  The test scenario is 64 CPU cores, 64GB of memory, and
28 zram devices, each with a disksize of 50MB.

LATENCY_LIMIT is used to prevent soft lockups in scan_swap_map_slots(),
but the real number of loop iterations can exceed LATENCY_LIMIT, because
"goto checks" and "goto scan" can repeat without decreasing
latency_ration.

To fix it, decrease latency_ration in advance, at the top of each scan
loop iteration.
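
To illustrate why the position of the decrement matters, here is a minimal
user-space sketch, not the kernel code. It assumes LATENCY_LIMIT is 256, and
the names simulate_scan, scan_stats and looks_free are hypothetical. A plain
counter stands in for cond_resched(), and a `continue` models a slot that
jumps to "checks", fails there, and returns through "goto scan".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value assumed for this sketch; it mirrors the constant named in the text. */
#define LATENCY_LIMIT 256

/* Hypothetical bookkeeping type, not a kernel structure. */
struct scan_stats {
	long yields;   /* simulated cond_resched() calls */
	long max_run;  /* longest run of iterations without a yield */
};

/*
 * Walk nslots entries. looks_free[i] == true models a slot that sends the
 * scan to "checks", fails there, and comes straight back via "goto scan",
 * which skips the rest of the loop body.
 */
static struct scan_stats simulate_scan(const bool *looks_free, long nslots,
				       bool decrement_first)
{
	struct scan_stats st = { 0, 0 };
	int latency_ration = LATENCY_LIMIT;
	long run = 0;

	assert(looks_free != NULL && nslots >= 0);

	for (long offset = 0; offset < nslots; offset++) {
		if (++run > st.max_run)
			st.max_run = run;

		/* Patched order: account for the iteration before any jump. */
		if (decrement_first && --latency_ration < 0) {
			st.yields++;	/* stands in for cond_resched() */
			latency_ration = LATENCY_LIMIT;
			run = 0;
		}

		if (looks_free[offset])
			continue;	/* "goto checks" then "goto scan" */

		/* Original order: only reached when no jump was taken. */
		if (!decrement_first && --latency_ration < 0) {
			st.yields++;
			latency_ration = LATENCY_LIMIT;
			run = 0;
		}
	}
	return st;
}
```

When every slot is marked looks_free, the decrement_first == false variant
never yields and max_run grows to nslots, while the decrement_first == true
variant yields once every LATENCY_LIMIT + 1 iterations.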

There is also a suspicious place in get_swap_pages() that may cause soft
lockups.  In this function, the "goto start_over" may result in
continuous scanning of the swap partition.  If there is no cond_resched()
in scan_swap_map_slots(), it would cause a soft lockup (I am not sure
about this).

WARN: soft lockup - CPU#11 stuck for 11s! [kswapd0:466]
CPU: 11 PID: 466 Comm: kswapd0 Kdump: loaded Tainted: G
dump_backtrace+0x0/0x1le4
show_stack+0x20/0x2c
dump_stack+0xd8/0x140
watchdog_print_info+0x48/0x54
watchdog_process_before_softlockup+0x98/0xa0
watchdog_timer_fn+0x1ac/0x2d0
__hrtimer_run_queues+0xb0/0x130
hrtimer_interrupt+0x13c/0x3c0
arch_timer_handler_virt+0x3c/0x50
handle_percpu_devid_irq+0x90/0x1f4
__handle_domain_irq+0x84/0x100
gic_handle_irq+0x88/0x2b0
el1_irq+0xb8/0x140
scan_swap_map_slots+0x678/0x890
get_swap_pages+0x29c/0x440
get_swap_page+0x120/0x2e0
add_to_swap+0x20/0x9c
shrink_page_list+0x5d0/0x152c
shrink_inactive_list+0x16c/0x500
shrink_lruvec+0x270/0x304

WARN: soft lockup - CPU#32 stuck for 11s! [stress-ng:309915]
watchdog_timer_fn+0x1ac/0x2d0
__run_hrtimer+0x98/0x2a0
__hrtimer_run_queues+0xb0/0x130
hrtimer_interrupt+0x13c/0x3c0
arch_timer_handler_virt+0x3c/0x50
handle_percpu_devid_irq+0x90/0x1f4
__handle_domain_irq+0x84/0x100
gic_handle_irq+0x88/0x2b0
el1_irq+0xb8/0x140
get_swap_pages+0x1e8/0x440
get_swap_page+0x1c8/0x2e0
add_to_swap+0x20/0x9c
shrink_page_list+0x5d0/0x152c
reclaim_pages+0x160/0x310
madvise_cold_or_pageout_pte_range+0x7bc/0xe3c
walk_pmd_range.isra.0+0xac/0x22c
walk_pud_range+0xfc/0x1c0
walk_pgd_range+0x158/0x1b0
__walk_page_range+0x64/0x100
walk_page_range+0x104/0x150

Link: https://lkml.kernel.org/r/20221118133850.3360369-1-chenwandun@huawei.com
Fixes: 048c27fd ("[PATCH] swap: scan_swap_map latency breaks")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: <xialonglong1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Conflicts:
	mm/swapfile.c
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Parent 73c99647
@@ -944,6 +944,11 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 scan:
 	spin_unlock(&si->lock);
 	while (++offset <= READ_ONCE(si->highest_bit)) {
+		if (unlikely(--latency_ration < 0)) {
+			cond_resched();
+			latency_ration = LATENCY_LIMIT;
+			scanned_many = true;
+		}
 		if (data_race(!si->swap_map[offset])) {
 			spin_lock(&si->lock);
 			goto checks;
@@ -953,14 +958,14 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 			spin_lock(&si->lock);
 			goto checks;
 		}
+	}
+	offset = si->lowest_bit;
+	while (offset < scan_base) {
 		if (unlikely(--latency_ration < 0)) {
 			cond_resched();
 			latency_ration = LATENCY_LIMIT;
 			scanned_many = true;
 		}
-	}
-	offset = si->lowest_bit;
-	while (offset < scan_base) {
 		if (data_race(!si->swap_map[offset])) {
 			spin_lock(&si->lock);
 			goto checks;
@@ -970,11 +975,6 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 			spin_lock(&si->lock);
 			goto checks;
 		}
-		if (unlikely(--latency_ration < 0)) {
-			cond_resched();
-			latency_ration = LATENCY_LIMIT;
-			scanned_many = true;
-		}
 		offset++;
 	}
 	spin_lock(&si->lock);