blk-wbt: fix IO hang due to negative inflight counter
hulk inclusion
category: bugfix
bugzilla: 182135, https://gitee.com/openeuler/kernel/issues/I4ENC8
CVE: NA

--------------------------

Block test reported the following stack. Some requests have been waiting
for wakeup in wbt_wait, and the vmcore showed that the wbt inflight counter
is -1, so the waiting requests can never be woken up.

PID: 75416  TASK: ffff88836c098000  CPU: 2  COMMAND: "fsstress"
 [ffff8882e59a7608] __schedule at ffffffffb2d22a25
 [ffff8882e59a7720] schedule at ffffffffb2d2358f
 [ffff8882e59a7738] io_schedule at ffffffffb2d23bdc
 [ffff8882e59a7750] rq_qos_wait at ffffffffb2400fde
 [ffff8882e59a7878] wbt_wait at ffffffffb243a051
 [ffff8882e59a7910] __rq_qos_throttle at ffffffffb2400a20
 [ffff8882e59a7930] blk_mq_make_request at ffffffffb23de038
 [ffff8882e59a7a98] generic_make_request at ffffffffb23c393d
 [ffff8882e59a7b80] submit_bio at ffffffffb23c3db8
 [ffff8882e59a7c48] submit_bio_wait at ffffffffb23b3a5d
 [ffff8882e59a7cf0] blkdev_issue_flush at ffffffffb23c8f4c
 [ffff8882e59a7d20] ext4_sync_fs at ffffffffc06dd708 [ext4]
 [ffff8882e59a7dd0] sync_filesystem at ffffffffb21e8335
 [ffff8882e59a7df8] ovl_sync_fs at ffffffffc0fd853a [overlay]
 [ffff8882e59a7e10] sync_fs_one_sb at ffffffffb21e8221
 [ffff8882e59a7e30] iterate_supers at ffffffffb218401e
 [ffff8882e59a7e70] ksys_sync at ffffffffb21e8588
 [ffff8882e59a7f20] __x64_sys_sync at ffffffffb21e861f
 [ffff8882e59a7f28] do_syscall_64 at ffffffffb1c06bc8
 [ffff8882e59a7f50] entry_SYSCALL_64_after_hwframe at ffffffffb2e000ad
    RIP: 00007f479ab13347  RSP: 00007ffd4dda9fe8  RFLAGS: 00000202
    RAX: ffffffffffffffda  RBX: 0000000000000068  RCX: 00007f479ab13347
    RDX: 0000000000000000  RSI: 000000003e1b142d  RDI: 0000000000000068
    RBP: 0000000051eb851f   R8: 00007f479abd4034   R9: 00007f479abd40a0
    R10: 0000000000000000  R11: 0000000000000202  R12: 0000000000402c20
    R13: 0000000000000001  R14: 0000000000000000  R15: 7fffffffffffffff

The ->inflight counter may become negative (-1) if:

1) blk-wbt was disabled when the IO was issued, so the inflight count
   was not incremented for it.
2) blk-wbt was enabled before this IO was tracked.
3) the ->inflight counter was decreased from 0 to -1 in endio().

Fix the problem by freezing the queue while enabling wbt, so that no
request is in flight when wbt is turned on.

Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>