• R
    mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() · 6fe7d6b9
    Rongwei Wang 提交于
    The si->lock must be held when deleting the si from the available list. 
    Otherwise, another thread can re-add the si to the available list, which
    can lead to memory corruption.  The only place we have found where this
    happens is in the swapoff path.  This case can be described as below:
    
    core 0                       core 1
    swapoff
    
    del_from_avail_list(si)      waiting
    
    try lock si->lock            acquire swap_avail_lock
                                 and re-add si into
                                 swap_avail_head
    
    acquire si->lock but missing si already being added again, and continuing
    to clear SWP_WRITEOK, etc.
    
    It can be easily found that a massive warning messages can be triggered
    inside get_swap_pages() by some special cases, for example, we call
    madvise(MADV_PAGEOUT) on blocks of touched memory concurrently, meanwhile,
    run much swapon-swapoff operations (e.g.  stress-ng-swap).
    
    However, in the worst case, panic can be caused by the above scene.  In
    swapoff(), the memory used by si could be kept in swap_info[] after
    turning off a swap.  This means memory corruption will not be caused
    immediately until allocated and reset for a new swap in the swapon path. 
    A panic message caused: (with CONFIG_PLIST_DEBUG enabled)
    
    ------------[ cut here ]------------
    top: 00000000e58a3003, n: 0000000013e75cda, p: 000000008cd4451a
    prev: 0000000035b1e58a, n: 000000008cd4451a, p: 000000002150ee8d
    next: 000000008cd4451a, n: 000000008cd4451a, p: 000000008cd4451a
    WARNING: CPU: 21 PID: 1843 at lib/plist.c:60 plist_check_prev_next_node+0x50/0x70
    Modules linked in: rfkill(E) crct10dif_ce(E)...
    CPU: 21 PID: 1843 Comm: stress-ng Kdump: ... 5.10.134+
    Hardware name: Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
    pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
    pc : plist_check_prev_next_node+0x50/0x70
    lr : plist_check_prev_next_node+0x50/0x70
    sp : ffff0018009d3c30
    x29: ffff0018009d3c40 x28: ffff800011b32a98
    x27: 0000000000000000 x26: ffff001803908000
    x25: ffff8000128ea088 x24: ffff800011b32a48
    x23: 0000000000000028 x22: ffff001800875c00
    x21: ffff800010f9e520 x20: ffff001800875c00
    x19: ffff001800fdc6e0 x18: 0000000000000030
    x17: 0000000000000000 x16: 0000000000000000
    x15: 0736076307640766 x14: 0730073007380731
    x13: 0736076307640766 x12: 0730073007380731
    x11: 000000000004058d x10: 0000000085a85b76
    x9 : ffff8000101436e4 x8 : ffff800011c8ce08
    x7 : 0000000000000000 x6 : 0000000000000001
    x5 : ffff0017df9ed338 x4 : 0000000000000001
    x3 : ffff8017ce62a000 x2 : ffff0017df9ed340
    x1 : 0000000000000000 x0 : 0000000000000000
    Call trace:
     plist_check_prev_next_node+0x50/0x70
     plist_check_head+0x80/0xf0
     plist_add+0x28/0x140
     add_to_avail_list+0x9c/0xf0
     _enable_swap_info+0x78/0xb4
     __do_sys_swapon+0x918/0xa10
     __arm64_sys_swapon+0x20/0x30
     el0_svc_common+0x8c/0x220
     do_el0_svc+0x2c/0x90
     el0_svc+0x1c/0x30
     el0_sync_handler+0xa8/0xb0
     el0_sync+0x148/0x180
    irq event stamp: 2082270
    
    Now, si->lock locked before calling 'del_from_avail_list()' to make sure
    other thread see the si had been deleted and SWP_WRITEOK cleared together,
    will not reinsert again.
    
    This problem exists in versions after stable 5.10.y.
    
    Link: https://lkml.kernel.org/r/20230404154716.23058-1-rongwei.wang@linux.alibaba.com
    Fixes: a2468cc9 ("swap: choose swap device according to numa node") 
    Tested-by: NYongchen Yin <wb-yyc939293@alibaba-inc.com>
    Signed-off-by: NRongwei Wang <rongwei.wang@linux.alibaba.com>
    Cc: Bagas Sanjaya <bagasdotme@gmail.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Aaron Lu <aaron.lu@intel.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
    6fe7d6b9
swapfile.c 92.5 KB