Commit 02512915, authored by Laibin Qiu, committed by Yang Yingliang

blk-wbt: fix IO hang due to negative inflight counter

hulk inclusion
category: bugfix
bugzilla: 182135, https://gitee.com/openeuler/kernel/issues/I4ENC8
CVE: NA

--------------------------

A block test reported the following stack. Some requests had been waiting
for a wakeup in wbt_wait, and the vmcore showed that the wbt inflight
counter was -1, so the waiting requests could never be woken up.

PID: 75416  TASK: ffff88836c098000  CPU: 2   COMMAND: "fsstress"
[ffff8882e59a7608] __schedule at ffffffffb2d22a25
[ffff8882e59a7720] schedule at ffffffffb2d2358f
[ffff8882e59a7738] io_schedule at ffffffffb2d23bdc
[ffff8882e59a7750] rq_qos_wait at ffffffffb2400fde
[ffff8882e59a7878] wbt_wait at ffffffffb243a051
[ffff8882e59a7910] __rq_qos_throttle at ffffffffb2400a20
[ffff8882e59a7930] blk_mq_make_request at ffffffffb23de038
[ffff8882e59a7a98] generic_make_request at ffffffffb23c393d
[ffff8882e59a7b80] submit_bio at ffffffffb23c3db8
[ffff8882e59a7c48] submit_bio_wait at ffffffffb23b3a5d
[ffff8882e59a7cf0] blkdev_issue_flush at ffffffffb23c8f4c
[ffff8882e59a7d20] ext4_sync_fs at ffffffffc06dd708 [ext4]
[ffff8882e59a7dd0] sync_filesystem at ffffffffb21e8335
[ffff8882e59a7df8] ovl_sync_fs at ffffffffc0fd853a [overlay]
[ffff8882e59a7e10] sync_fs_one_sb at ffffffffb21e8221
[ffff8882e59a7e30] iterate_supers at ffffffffb218401e
[ffff8882e59a7e70] ksys_sync at ffffffffb21e8588
[ffff8882e59a7f20] __x64_sys_sync at ffffffffb21e861f
[ffff8882e59a7f28] do_syscall_64 at ffffffffb1c06bc8
[ffff8882e59a7f50] entry_SYSCALL_64_after_hwframe at ffffffffb2e000ad
RIP: 00007f479ab13347  RSP: 00007ffd4dda9fe8  RFLAGS: 00000202
RAX: ffffffffffffffda  RBX: 0000000000000068  RCX: 00007f479ab13347
RDX: 0000000000000000  RSI: 000000003e1b142d  RDI: 0000000000000068
RBP: 0000000051eb851f   R8: 00007f479abd4034   R9: 00007f479abd40a0
R10: 0000000000000000  R11: 0000000000000202  R12: 0000000000402c20
R13: 0000000000000001  R14: 0000000000000000  R15: 7fffffffffffffff

The ->inflight counter may become negative (-1) if

1) blk-wbt was disabled when the IO was issued, so the
inflight count was not incremented for it,

2) blk-wbt was enabled before this IO was tracked, and

3) the ->inflight counter was then decreased from
0 to -1 in endio().

Fix the problem by freezing the queue while enabling wbt, so that
no inflight request is running at that point.

Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Parent 01f486af
@@ -23,6 +23,9 @@
 #include <linux/slab.h>
 #include <linux/backing-dev.h>
 #include <linux/swap.h>
+#ifndef __GENKSYMS__
+#include <linux/blk-mq.h>
+#endif
 #include "blk-wbt.h"
 #include "blk-rq-qos.h"
@@ -824,9 +827,16 @@ int wbt_init(struct request_queue *q)
 	rq_qos_add(q, &rwb->rqos);
 	blk_stat_add_callback(q, rwb->cb);
-	rwb->min_lat_nsec = wbt_default_latency_nsec(q);
+	/*
+	 * Ensure that the queue is idled by freezing the queue
+	 * while enabling wbt, there is no inflight rq running.
+	 */
+	blk_mq_freeze_queue(q);
+	rwb->min_lat_nsec = wbt_default_latency_nsec(q);
 	wbt_set_queue_depth(q, blk_queue_depth(q));
+	blk_mq_unfreeze_queue(q);
 	wbt_set_write_cache(q, test_bit(QUEUE_FLAG_WC, &q->queue_flags));
 	return 0;