    md/raid1: stop mdx_raid1 thread when raid1 array run failed · 025dac6f
    Jiang Li committed
    mainline inclusion
    from mainline-v6.2-rc1
    commit b611ad14
    category: bugfix
    bugzilla: 188662, https://gitee.com/openeuler/kernel/issues/I6UMUF
    CVE: NA
    
    Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=b611ad14006e5be2170d9e8e611bf49dff288911
    
    --------------------------------
    
    Running a raid1 array fails when the array is assembled with only an
    inactive disk, but the mdx_raid1 thread is not stopped even though the
    associated resources have already been released. The still-running
    thread then hits a NULL pointer dereference at poweroff.
    
    This causes the following Oops:
        [  287.587787] BUG: kernel NULL pointer dereference, address: 0000000000000070
        [  287.594762] #PF: supervisor read access in kernel mode
        [  287.599912] #PF: error_code(0x0000) - not-present page
        [  287.605061] PGD 0 P4D 0
        [  287.607612] Oops: 0000 [#1] SMP NOPTI
        [  287.611287] CPU: 3 PID: 5265 Comm: md0_raid1 Tainted: G     U            5.10.146 #0
        [  287.619029] Hardware name: xxxxxxx/To be filled by O.E.M, BIOS 5.19 06/16/2022
        [  287.626775] RIP: 0010:md_check_recovery+0x57/0x500 [md_mod]
        [  287.632357] Code: fe 01 00 00 48 83 bb 10 03 00 00 00 74 08 48 89 ......
        [  287.651118] RSP: 0018:ffffc90000433d78 EFLAGS: 00010202
        [  287.656347] RAX: 0000000000000000 RBX: ffff888105986800 RCX: 0000000000000000
        [  287.663491] RDX: ffffc90000433bb0 RSI: 00000000ffffefff RDI: ffff888105986800
        [  287.670634] RBP: ffffc90000433da0 R08: 0000000000000000 R09: c0000000ffffefff
        [  287.677771] R10: 0000000000000001 R11: ffffc90000433ba8 R12: ffff888105986800
        [  287.684907] R13: 0000000000000000 R14: fffffffffffffe00 R15: ffff888100b6b500
        [  287.692052] FS:  0000000000000000(0000) GS:ffff888277f80000(0000) knlGS:0000000000000000
        [  287.700149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [  287.705897] CR2: 0000000000000070 CR3: 000000000320a000 CR4: 0000000000350ee0
        [  287.713033] Call Trace:
        [  287.715498]  raid1d+0x6c/0xbbb [raid1]
        [  287.719256]  ? __schedule+0x1ff/0x760
        [  287.722930]  ? schedule+0x3b/0xb0
        [  287.726260]  ? schedule_timeout+0x1ed/0x290
        [  287.730456]  ? __switch_to+0x11f/0x400
        [  287.734219]  md_thread+0xe9/0x140 [md_mod]
        [  287.738328]  ? md_thread+0xe9/0x140 [md_mod]
        [  287.742601]  ? wait_woken+0x80/0x80
        [  287.746097]  ? md_register_thread+0xe0/0xe0 [md_mod]
        [  287.751064]  kthread+0x11a/0x140
        [  287.754300]  ? kthread_park+0x90/0x90
        [  287.757974]  ret_from_fork+0x1f/0x30
    
    To fix this, call md_unregister_thread() before raid1_free() when
    running the raid1 array fails.
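
    The ordering can be illustrated with a small userspace model. This is
    only a sketch, not the kernel patch: every toy_* name is hypothetical,
    toy_unregister_thread() stands in for md_unregister_thread(),
    toy_free_conf() stands in for raid1_free(), and toy_wake() models a late
    wakeup of the thread such as the one seen at poweroff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real md structures. */
struct toy_conf {
    int raid_disks;
    int working_disks;
};

struct toy_thread {
    struct toy_conf *conf;      /* data the thread body reads when woken */
};

struct toy_array {
    struct toy_conf *conf;
    struct toy_thread *thread;
};

/* Models md_unregister_thread(): drop the thread and clear the pointer. */
static void toy_unregister_thread(struct toy_thread **threadp)
{
    free(*threadp);             /* free(NULL) is a no-op */
    *threadp = NULL;
}

/* Models raid1_free(): release the per-array configuration. */
static void toy_free_conf(struct toy_array *a)
{
    free(a->conf);
    a->conf = NULL;
}

/*
 * Models a late wakeup. Returns the number of working disks the thread
 * saw, or -1 if no thread is registered any more.
 */
static int toy_wake(struct toy_array *a)
{
    if (!a->thread)
        return -1;
    /* Invariant: a registered thread always has a live conf. */
    assert(a->thread->conf != NULL);
    return a->thread->conf->working_disks;
}

/* Models the run path: set up conf and thread, fail if no disk is active. */
static int toy_run(struct toy_array *a, int raid_disks, int working_disks)
{
    a->conf = malloc(sizeof(*a->conf));
    a->thread = malloc(sizeof(*a->thread));
    if (!a->conf || !a->thread) {
        toy_unregister_thread(&a->thread);
        toy_free_conf(a);
        return -ENOMEM;
    }
    a->conf->raid_disks = raid_disks;
    a->conf->working_disks = working_disks;
    a->thread->conf = a->conf;

    if (working_disks < 1) {
        /* Stop the thread first, then free what it was using. */
        toy_unregister_thread(&a->thread);
        toy_free_conf(a);
        return -EINVAL;
    }
    return 0;
}
```

    In this model, a failed toy_run() leaves no registered thread behind, so
    a later toy_wake() finds nothing to run instead of touching freed data.
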
    Signed-off-by: Jiang Li <jiang.li@ugreen.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Li Nan <linan122@huawei.com>
    Reviewed-by: Hou Tao <houtao1@huawei.com>
    (cherry picked from commit 22eeb5d1)
raid1.c 93.5 KB