- 29 March 2023, 1 commit
Submitted by Zhong Jinghua
hulk inclusion
category: bugfix
bugzilla: 187268, https://gitee.com/openeuler/kernel/issues/I5N162

----------------------------------------

This reverts commit 51e35e67. The mainline kernel now carries a proper fix
for this problem, so drop the local workaround and return to the mainline
solution.

mainline patch: d36a9ea5 ("block: fix use-after-free of q->q_usage_counter")

Fixes: 51e35e67 ("block: fix null-deref in percpu_ref_put")
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
- 15 March 2023, 1 commit
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I6L586
CVE: NA

--------------------------------

Currently, for bio-based devices, 'ios' and 'sectors' are counted when the
io is started, while 'nsecs' is counted when the io is done. This behaviour
is obviously wrong, but we can't fix the existing APIs because that would
require a new parameter and break kabi. Hence this patch adds new APIs that
make io accounting for bio-based devices precise.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
- 04 January 2023, 1 commit
Submitted by Li Nan
hulk inclusion
category: bugfix
bugzilla: 187921, https://gitee.com/openeuler/kernel/issues/I66VDB
CVE: NA

--------------------------------

Enabling CONFIG_BLK_RQ_ALLOC_TIME would break kabi; use the request wrapper
to fix it.

Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
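For readers unfamiliar with the trick, a minimal sketch of the wrapper
approach follows. The layout and the alloc-time field are illustrative
(only the existence of a wrapper around struct request is implied by the
commit); the real openEuler definition lives in block-layer internal
headers.

  /*
   * Illustrative only: kabi forbids growing struct request, so the block
   * layer over-allocates and keeps new state in a wrapper that embeds the
   * request.  Out-of-tree code must never rely on this layout.
   */
  struct request_wrapper {
          struct request rq;              /* kabi-visible part, must stay first */
  #ifdef CONFIG_BLK_RQ_ALLOC_TIME
          u64 alloc_time_ns;              /* new field lives outside the kabi */
  #endif
  };

  static inline struct request_wrapper *request_to_wrapper(struct request *rq)
  {
          return container_of(rq, struct request_wrapper, rq);
  }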
- 12 December 2022, 1 commit
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I65K8D
CVE: NA

--------------------------------

request_wrapper is used to fix kabi breakage for struct request and is only
for internal use. This patch makes sure out-of-tree drivers won't access
the request_wrapper if the request is not managed by the block layer.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 02 November 2022, 5 commits
Submitted by Zhang Wensheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5X6RT
CVE: NA

--------------------------------

Since commit 5b18b5a7 ("block: delete part_round_stats and switch to less
precise counting"), the '%util' reported by iostat can exceed reality: the
device may be quite idle while iostat still shows a large '%util'
(e.g. 50%).

It can be reproduced with fio:

  fio --name=1 --direct=1 --bs=4k --rw=read --filename=/dev/sda \
      --thinktime=4ms --runtime=180

Fix this by adding a switch (precise_iostat=1) that controls whether
io_ticks is accounted precisely.

Fixes: 5b18b5a7 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5
CVE: NA

--------------------------------

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by John Garry
mainline inclusion
from mainline-v5.16-rc1
commit e155b0c2
category: performance
bugzilla: 186917, https://gitee.com/openeuler/kernel/issues/I5N1S5
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e155b0c238b20f0a866f4334d292656665836c8a

--------------------------------

Currently we use separate sbitmap pairs and active_queues atomic_t for
shared sbitmap support. However a full set of static requests is used per
HW queue, which is quite wasteful, considering that the total number of
requests usable at any given time across all HW queues is limited by the
shared sbitmap depth.

As such, it is considerably more memory efficient in the case of shared
sbitmap to allocate a set of static rqs per tag set or request queue, and
not per HW queue.

So replace the sbitmap pairs and active_queues atomic_t with shared tags
per tagset and request queue, which will hold a set of shared static rqs.

Since there is now no valid HW queue index to be passed to the blk_mq_ops
.init and .exit_request callbacks, pass an invalid index token. This
changes the semantics of the APIs, such that the callback would need to
validate the HW queue index before using it. Currently no user of shared
sbitmap actually uses the HW queue index (as would be expected).

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflict: c6f9c0e2 ("blk-mq: allow hardware queue to get more tag while
sharing a tag set") is merged, which will cause lots of conflicts for this
patch, and in the meantime the functionality will need to be adapted in
that patch.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by Zhang Wensheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5N1S5
CVE: NA

--------------------------------

This reverts commit b97f541e.

The related fields will be modified in later patches which are backported
from mainline.

Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by Zhang Wensheng
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5N162
CVE: NA

--------------------------------

When waiting on q_usage_counter of a request_queue, blk_cleanup_queue() uses
"wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
to wait for q_usage_counter to reach zero. However, if q_usage_counter
reaches zero quickly, percpu_ref_exit() runs and ref->data is freed, and
another process may then hit a null-deref like below:

CPU0                                  CPU1
blk_cleanup_queue
  blk_freeze_queue
    blk_mq_freeze_queue_wait
                                      scsi_end_request
                                        percpu_ref_get
                                        ...
                                        percpu_ref_put
                                          atomic_long_sub_and_test
  percpu_ref_exit
    ref->data -> NULL
                                          ref->data->release(ref) -> null-deref

Fix it by adding a synchronization flag (QUEUE_FLAG_USAGE_COUNT_SYNC): the
flag is set when ref->data->release is called, and the "wait_event" in
blk_mq_freeze_queue_wait must also wait for the flag to become true, which
prevents percpu_ref_exit() from executing ahead of time.

Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
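A minimal sketch of the synchronization idea described above; the flag name
comes from the text, while the function bodies are an approximation rather
than the exact openEuler patch.

  /* release() marks the flag and wakes the waiter ... */
  static void blk_queue_usage_counter_release(struct percpu_ref *ref)
  {
          struct request_queue *q =
                  container_of(ref, struct request_queue, q_usage_counter);

          blk_queue_flag_set(QUEUE_FLAG_USAGE_COUNT_SYNC, q);
          wake_up_all(&q->mq_freeze_wq);
  }

  /*
   * ... and the cleanup path may not return (and later call percpu_ref_exit())
   * until release() has actually finished running.
   */
  static void example_wait_frozen(struct request_queue *q)
  {
          wait_event(q->mq_freeze_wq,
                     percpu_ref_is_zero(&q->q_usage_counter) &&
                     test_bit(QUEUE_FLAG_USAGE_COUNT_SYNC, &q->queue_flags));
  }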
- 13 July 2022, 2 commits
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I57S8D
CVE: NA

--------------------------------

Since there are no reserved fields, declare a wrapper to fix the kabi
breakage.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I57S8D
CVE: NA

--------------------------------

commit 7ec2ec68 ("block: update io_ticks when io hang") fixed that '%util'
would be zero in iostat when io hangs; however, avgqu-sz is still zero even
though it is supposed to reflect the number of hung ios. On the other hand,
for some slow devices, if an io is started before and completed after
diskstats is read, avgqu-sz will be miscalculated.

To fix the problem, update 'nsecs[]' when part_stat_show() or
diskstats_show() is called. In order to do that, add 'stat_time' in struct
hd_struct and 'rq_stat_time' in struct request to record the time, and
during iteration update 'nsecs[]' for each inflight request.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 31 May 2022, 2 commits
Submitted by Yufen Yu
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I597XM
CVE: NA

---------------------------

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by John Garry
mainline inclusion
from mainline-v5.14-rc1
commit d97e594c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I597XM
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d97e594c51660bea510a387731637b894651e4b5

--------------------------------

The tags used for an IO scheduler are currently per hctx. As such, when
q->nr_hw_queues grows, so does the request queue total IO scheduler tag
depth. This may cause problems for SCSI MQ HBAs whose total driver depth
is fixed.

Ming and Yanhui report higher CPU usage and lower throughput in scenarios
where the fixed total driver tag depth is appreciably lower than the total
scheduler tag depth:
https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@huawei.com/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b

In that scenario, since the scheduler tag is got first, much contention is
introduced since a driver tag may not be available after we have got the
sched tag.

Improve this scenario by introducing request queue-wide tags for when a
tagset-wide sbitmap is used.

The static sched requests are still allocated per hctx, as requests are
initialised per hctx, as in blk_mq_init_request(..., hctx_idx, ...) ->
set->ops->init_request(.., hctx_idx, ...).

For simplicity of resizing the request queue sbitmap when updating the
request queue depth, just init at the max possible size, so we don't need
to deal with possibly swapping out a new sbitmap for the old one if we
need to grow.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1620907258-30910-3-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

conflict:
  block/blk-mq-sched.c
  block/blk-mq-sched.h
  block/blk-mq-tag.c

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 21 March 2022, 1 commit
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: 186389, https://gitee.com/openeuler/kernel/issues/I4Y43S
CVE: NA

--------------------------------

blk_mq_realloc_hw_ctxs() will free the 'queue_hw_ctx' (e.g. when updating
submit_queues through configfs for null_blk), while it might still be used
from another context (e.g. switching the elevator to none):

t1                                    t2
elevator_switch
  blk_mq_unquiesce_queue
    blk_mq_run_hw_queues
      queue_for_each_hw_ctx
        // assembly code for hctx = (q)->queue_hw_ctx[i]
        mov 0x48(%rbp),%rdx  -> read old queue_hw_ctx
                                      __blk_mq_update_nr_hw_queues
                                        blk_mq_realloc_hw_ctxs
                                          hctxs = q->queue_hw_ctx
                                          q->queue_hw_ctx = new_hctxs
                                          kfree(hctxs)
        movslq %ebx,%rax
        mov (%rdx,%rax,8),%rdi  -> uaf

Since the queue is frozen in __blk_mq_update_nr_hw_queues(), fix the
problem by protecting 'queue_hw_ctx' through RCU where it can be accessed
without grabbing 'q_usage_counter'.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
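A sketch of the RCU pattern the fix relies on; the helper names below
(example_*) are invented for illustration, and only the hctx array, not the
hctx objects themselves, is protected this way.

  /* writer: publish the new array, then wait for lockless readers */
  static void example_replace_hw_ctxs(struct request_queue *q,
                                      struct blk_mq_hw_ctx **new_hctxs)
  {
          struct blk_mq_hw_ctx **old = q->queue_hw_ctx;

          rcu_assign_pointer(q->queue_hw_ctx, new_hctxs);
          synchronize_rcu();              /* readers still using 'old' finish */
          kfree(old);
  }

  /* reader: look up an hctx without holding q_usage_counter */
  static struct blk_mq_hw_ctx *example_get_hctx(struct request_queue *q, int i)
  {
          struct blk_mq_hw_ctx *hctx;

          rcu_read_lock();
          hctx = rcu_dereference(q->queue_hw_ctx)[i];
          rcu_read_unlock();
          return hctx;                    /* hctx lifetime is handled elsewhere */
  }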
- 08 March 2022, 2 commits
Submitted by Yu Kuai
hulk inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I4S8DW

---------------------------

Once pending_queues has been increased, it is only decreased when nr_active
is zero. That leads to under-utilization of host tags: while pending_queues
is non-zero, the tags available to a queue are capped at
max(host tags / active_queues, 4) rather than the number of tags the queue
actually needs.

Fix it by attaching an expiration time to the increase of pending_queues
and decreasing it when the time expires. This way pending_queues drops back
to zero when there are no tag allocation failures, and the queue can then
use the whole set of host tags.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Submitted by Yu Kuai
hulk inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I4S8DW

---------------------------

When sharing a tag set, most disks may issue only a small amount of IO
while a few issue a large amount. The current approach limits the maximum
number of tags a disk can get to the average share of the total tags, so
the few heavily loaded disks can't get enough tags even though many tags
are still free in the tag set.

Add 'pending_queues' to blk_mq_tag_set to count how many queues cannot get
a driver tag. If this value is zero, there is no need to limit the maximum
number of available tags.

On the other hand, if a queue does not issue IO, 'active_queues' is not
decreased for a period of time (the request timeout), so a lot of tags stay
unavailable because the maximum number of available tags is set to
max(total tags / active_queues, 4). Thus we decrease it when 'nr_active'
is 0.

This functionality is enabled by default; to disable it, add
"blk_mq.unfair_dtag=0" to the boot cmdline.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
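A simplified sketch of the fairness cap being bypassed; this is modelled on
the stock hctx_may_queue() arithmetic, with the pending_queues check added
per the description (the function below is illustrative, not the actual
patch).

  /*
   * With a shared tag set, each active queue is normally limited to
   * max(total_tags / active_queues, 4) driver tags.  The change described
   * above skips that cap while no queue has failed a tag allocation.
   */
  static bool example_may_queue(unsigned int total_tags,
                                unsigned int active_queues,
                                unsigned int pending_queues,
                                unsigned int nr_active)
  {
          unsigned int depth;

          if (!active_queues)
                  return true;
          if (!pending_queues)            /* nobody is starving: no cap */
                  return true;

          depth = max((total_tags + active_queues - 1) / active_queues, 4U);
          return nr_active < depth;
  }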
- 31 December 2021, 1 commit
Submitted by Zhihao Cheng
hulk inclusion
category: feature
bugzilla: 185747 https://gitee.com/openeuler/kernel/issues/I4OUFN
CVE: NA

-------------------------------

Introduce kabi reserved fields for the storage module.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
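What such a reservation typically looks like, assuming openEuler's
KABI_RESERVE helper; the structure below is a made-up example, and the
actual members and counts reserved by this commit are not shown here.

  #include <linux/kabi.h>

  struct example_storage_struct {
          unsigned long   flags;

          /*
           * Padding reserved up front so later bugfixes can add members
           * without changing the structure size or existing offsets.
           */
          KABI_RESERVE(1);
          KABI_RESERVE(2);
          KABI_RESERVE(3);
          KABI_RESERVE(4);
  };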
- 10 December 2021, 1 commit
Submitted by Keith Busch
mainline inclusion
from mainline-v5.14-rc1
commit fb9b16e1
category: bugfix
bugzilla: 185778 https://gitee.com/openeuler/kernel/issues/I4LM14
CVE: NA

-----------------------------------------

The synchronous blk_execute_rq() had not provided a way for its callers to
know if its request was successful or not. Return the blk_status_t result
of the request.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210610214437.641245-4-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

conflict:
1. in blkdev.h and blk-exec, blk_execute_rq return value change;
2. input parameter in blk_execute_rq is not the same as mainline;

Signed-off-by: zhangwensheng <zhangwensheng5@huawei.com>
Reviewed-by: qiulaibin <qiulaibin@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 06 December 2021, 2 commits
Submitted by Xie Yongji
stable inclusion
from stable-5.10.81
commit 79ff56c613c193744d6be77d4c50a7ae22d6dd01
bugzilla: 185832 https://gitee.com/openeuler/kernel/issues/I4L9CF

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=79ff56c613c193744d6be77d4c50a7ae22d6dd01

--------------------------------

commit 570b1cac upstream.

There is some duplicated code to validate the block size in block drivers.
This limitation actually comes from the block layer, so this patch adds a
new block layer helper for that.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20211026144015.188-2-xieyongji@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
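The helper centralizes the check drivers used to duplicate; its logic is
roughly the following (reconstructed from the description, so treat the
exact signature as approximate).

  /* a block size must be a power of two between 512 and PAGE_SIZE */
  static inline int blk_validate_block_size(unsigned long bsize)
  {
          if (bsize < 512 || bsize > PAGE_SIZE || !is_power_of_2(bsize))
                  return -EINVAL;

          return 0;
  }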
Submitted by Jens Axboe
stable inclusion
from stable-5.10.80
commit b34ea3c91eacdc50c761506cab35b14f67216f76
bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b34ea3c91eacdc50c761506cab35b14f67216f76

--------------------------------

[ Upstream commit ba0ffdd8 ]

Particularly for NVMe with efficient deferred submission for many
requests, there are nice benefits to be seen by bumping the default max
plug count from 16 to 32. This is especially true for virtualized setups,
where the submit part is more expensive. But can be noticed even on native
hardware.

Reduce the multiple queue factor from 4 to 2, since we're changing the
default size.

While changing it, move the defines into the block layer private header.
These aren't values that anyone outside of the block layer uses, or should
use.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
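Roughly what the resulting private defines look like; the values follow the
description above, but the exact names and their placement in block/blk.h
should be treated as an approximation.

  #define BLK_MAX_REQUEST_COUNT   32              /* was 16 */
  #define BLK_PLUG_FLUSH_SIZE     (128 * 1024)

  static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
  {
          if (plug->multiple_queues)
                  return BLK_MAX_REQUEST_COUNT * 2;       /* factor was 4 */
          return BLK_MAX_REQUEST_COUNT;
  }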
- 15 November 2021, 2 commits
Submitted by Ming Lei
mainline inclusion
from mainline-v5.16
commit e70feb8b
category: bugfix
bugzilla: 182378 https://gitee.com/openeuler/kernel/issues/I4DDEL

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e70feb8b3e6886c525c88943b5f1508d02f5a683

---------------------------

blk_mq_quiesce_queue() has been used fairly widely now, but so far we don't
support concurrent/nested quiesce. The biggest issue is that an unquiesce
can happen unexpectedly when quiesce/unquiesce are run concurrently from
more than one context.

This patch introduces q->mq_quiesce_depth to deal with concurrent quiesce,
and we only unquiesce the queue when it is the last/outer-most one of all
contexts.

Several kernel panic issues have been reported [1][2][3] when running the
stress quiesce test, and this patch has been verified in these reports.

[1] https://lore.kernel.org/linux-block/9b21c797-e505-3821-4f5b-df7bf9380328@huawei.com/T/#m1fc52431fad7f33b1ffc3f12c4450e4238540787
[2] https://lore.kernel.org/linux-block/9b21c797-e505-3821-4f5b-df7bf9380328@huawei.com/T/#m10ad90afeb9c8cc318334190a7c24c8b5c5e0722
[3] https://listman.redhat.com/archives/dm-devel/2021-September/msg00189.html

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211014081710.1871747-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
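A sketch of the depth counting, modelled closely on the mainline change
(the struct member is called quiesce_depth in the mainline code; details
may differ in this backport).

  void blk_mq_quiesce_queue_nowait(struct request_queue *q)
  {
          unsigned long flags;

          spin_lock_irqsave(&q->queue_lock, flags);
          if (!q->quiesce_depth++)
                  blk_queue_flag_set(QUEUE_FLAG_QUIESCED, q);
          spin_unlock_irqrestore(&q->queue_lock, flags);
  }

  void blk_mq_unquiesce_queue(struct request_queue *q)
  {
          unsigned long flags;
          bool run_queue = false;

          spin_lock_irqsave(&q->queue_lock, flags);
          if (WARN_ON_ONCE(q->quiesce_depth <= 0)) {
                  ;
          } else if (!--q->quiesce_depth) {
                  blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
                  run_queue = true;
          }
          spin_unlock_irqrestore(&q->queue_lock, flags);

          /* dispatch requests which were inserted while the queue was quiesced */
          if (run_queue)
                  blk_mq_run_hw_queues(q, true);
  }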
Submitted by yangerkun
hulk inclusion
category: performance
bugzilla: 174005 https://gitee.com/openeuler/kernel/issues/I4DDEL

---------------------------

This reverts commit b2d85dfb8c1de87700afae78df99715d3a0788a5.

That commit was a local patch that tried to remove the useless rcu gap for
loop setup. Now mainline has a solution too, so revert the local patch.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 19 October 2021, 2 commits
Submitted by Ming Lei
stable inclusion
from stable-5.10.65
commit 87aa69aa10b420823174eedcfd16366ad3d7fe93
bugzilla: 182361 https://gitee.com/openeuler/kernel/issues/I4EH3U

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=87aa69aa10b420823174eedcfd16366ad3d7fe93

--------------------------------

[ Upstream commit 866663b7 ]

When merging one bio into a request, if they are discard IO and the queue
supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE,
because neither the block core nor the related drivers (nvme, virtio-blk)
handle mixed discard io merge (traditional IO merge together with discard
merge) well.

Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation, so
both blk-mq and drivers just need to handle multi-range discard.

Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Fixes: 2705dfb2 ("block: fix discard request merge")
Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
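A sketch of the merge decision described above; blk_discard_mergable()
mirrors the upstream helper, while example_try_merge() only illustrates
where the check sits.

  static inline bool blk_discard_mergable(struct request *req)
  {
          if (req_op(req) == REQ_OP_DISCARD &&
              queue_max_discard_segments(req->q) > 1)
                  return true;
          return false;
  }

  /* checked before the usual back/front merge logic */
  static enum elv_merge example_try_merge(struct request *rq, struct bio *bio)
  {
          if (blk_discard_mergable(rq))
                  return ELEVATOR_DISCARD_MERGE;
          /* ... fall back to the regular sector-adjacency checks ... */
          return ELEVATOR_NO_MERGE;
  }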
Submitted by Yu Kuai
hulk inclusion
category: bugfix
bugzilla: 177149 https://gitee.com/openeuler/kernel/issues/I4DDEL

-----------------------------------------------

If blk-throttle is enabled and io is issued before
blk_throtl_register_queue() is done, a divide-by-zero crash will be
triggered in tg_may_dispatch() because 'throtl_slice' is uninitialized.

Thus introduce a new flag QUEUE_FLAG_THROTL_INIT_DONE. It will be set after
blk_throtl_register_queue() is done, and will be checked before applying
any config.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 12 October 2021, 1 commit
Submitted by yangerkun
Offering: HULK
hulk inclusion
category: performance
bugzilla: 174005 https://gitee.com/openeuler/kernel/issues/I4DDEL

---------------------------

Commit 737eb78e ("block: Delay default elevator initialization") delayed
elevator init to fix problems for special devices like SMR. The commit also
added logic to ensure no IO can happen while blk_mq_init_sched() runs.
However, blk_mq_freeze_queue/blk_mq_quiesce_queue add RCU grace periods,
which can introduce noticeable overhead (about 36 loop devices trying to
mount, with each grace period around 20ms).

For loop devices, no io can happen during add_disk, so it is safe to skip
this step. Add the flag QUEUE_FLAG_NO_INIT_IO to identify this case.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
- 27 January 2021, 2 commits
Submitted by Alan Stern
stable inclusion
from stable-5.10.7
commit d55d15a332ec651ccb49c42a8a10c03447fdf418
bugzilla: 47429

--------------------------------

[ Upstream commit 52abca64 ]

blk_queue_enter() accepts BLK_MQ_REQ_PM requests independent of the runtime
power management state. Now that SCSI domain validation no longer depends
on this behavior, modify the behavior of blk_queue_enter() as follows:

- Do not accept any requests while suspended.
- Only process power management requests while suspending or resuming.

Submitting BLK_MQ_REQ_PM requests to a device that is runtime suspended
causes runtime-suspended devices not to resume as they should. The request
which should cause a runtime resume instead gets issued directly, without
resuming the device first. Of course the device can't handle it properly,
the I/O fails, and the device remains suspended.

The problem is fixed by checking that the queue's runtime-PM status isn't
RPM_SUSPENDED before allowing a request to be issued, and queuing a
runtime-resume request if it is. In particular, the inline
blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and the
code is unified by merging the surrounding checks into the routine. If the
queue isn't set up for runtime PM, or there currently is no restriction on
allowed requests, the request is allowed. Likewise if the BLK_MQ_REQ_PM
flag is set and the status isn't RPM_SUSPENDED. Otherwise a runtime resume
is queued and the request is blocked until conditions are more suitable.

[ bvanassche: modified commit message and removed Cc: stable because
  without the previous patches from this series this patch would break
  parallel SCSI domain validation + introduced queue_rpm_status() ]

Link: https://lore.kernel.org/r/20201209052951.16136-9-bvanassche@acm.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
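A sketch of the gating logic described in the message, close to the
upstream blk_pm_resume_queue() but reconstructed from the description
rather than copied.

  static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
  {
          if (!q->dev || !blk_queue_pm_only(q))
                  return 1;       /* runtime PM not in use: nothing to do */

          /* BLK_MQ_REQ_PM requests are allowed unless the queue is suspended */
          if (pm && q->rpm_status != RPM_SUSPENDED)
                  return 1;

          /* otherwise queue a runtime resume and make the caller wait */
          pm_request_resume(q->dev);
          return 0;
  }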
Submitted by Bart Van Assche
stable inclusion
from stable-5.10.7
commit 782c9ef2ac059a25d6afbac344319574414258db
bugzilla: 47429

--------------------------------

[ Upstream commit a4d34da7 ]

Remove flag RQF_PREEMPT and BLK_MQ_REQ_PREEMPT since these are no longer
used by any kernel code.

Link: https://lore.kernel.org/r/20201209052951.16136-8-bvanassche@acm.org
Cc: Can Guo <cang@codeaurora.org>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
- 05 December 2020, 2 commits
Submitted by Mike Snitzer
If a non-zero 'chunk_sectors' is passed in to blk_max_size_offset(), that
override will be incorrectly ignored. The old blk_max_size_offset()
branching, prior to commit 3ee16db3, must be used only if the passed
'chunk_sectors' override is zero.

Fixes: 3ee16db3 ("dm: fix IO splitting")
Cc: stable@vger.kernel.org # 5.9
Reported-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
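The corrected branching looks roughly like this: the caller's non-zero
'chunk_sectors' now takes precedence, and the pre-3ee16db3 logic only runs
when it is zero (a sketch based on the description, not a verbatim copy).

  static inline unsigned int blk_max_size_offset(struct request_queue *q,
                                                 sector_t offset,
                                                 unsigned int chunk_sectors)
  {
          if (!chunk_sectors) {
                  if (q->limits.chunk_sectors)
                          chunk_sectors = q->limits.chunk_sectors;
                  else
                          return q->limits.max_sectors;
          }

          if (likely(is_power_of_2(chunk_sectors)))
                  chunk_sectors -= offset & (chunk_sectors - 1);
          else
                  chunk_sectors -= sector_div(offset, chunk_sectors);

          return min(q->limits.max_sectors, chunk_sectors);
  }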
Submitted by Mike Snitzer
Commit 882ec4e6 ("dm table: stack 'chunk_sectors' limit to account for
target-specific splitting") caused a couple of regressions:

1) Using lcm_not_zero() when stacking chunk_sectors was a bug because
   chunk_sectors must reflect the most limited of all devices in the IO
   stack.

2) DM targets that set max_io_len but that do _not_ provide an
   .iterate_devices method no longer had their IO split properly.

And commit 5091cdec ("dm: change max_io_len() to use
blk_max_size_offset()") also caused a regression where DM no longer
supported varied (per target) IO splitting. The implication being the
potential for severely reduced performance for IO stacks that use a DM
target like dm-cache to hide performance limitations of a slower device
(e.g. one that requires 4K IO splitting).

Coming full circle: Fix all these issues by discontinuing stacking
chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(), add
an optional chunk_sectors override argument to blk_max_size_offset() and
update DM's max_io_len() to pass ti->max_io_len to its
blk_max_size_offset() call.

Passing in an optional chunk_sectors override to blk_max_size_offset()
allows for code reuse of block's centralized calculation for max IO size
based on provided offset and split boundary.

Fixes: 882ec4e6 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
Fixes: 5091cdec ("dm: change max_io_len() to use blk_max_size_offset()")
Cc: stable@vger.kernel.org
Reported-by: John Dorminy <jdorminy@redhat.com>
Reported-by: Bruce Johnston <bjohnsto@redhat.com>
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: John Dorminy <jdorminy@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
- 17 October 2020, 1 commit
Submitted by Andy Shevchenko
kernel.h has been used as a dump for all kinds of stuff for a long time.
Here is an attempt to start cleaning it up by splitting out the min()/max()
et al. helpers.

At the same time, convert users in the header and lib folders to use the
new header. Though for the time being, include the new header back into
kernel.h to avoid twisted indirect includes for other existing users.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200910164152.GA1891694@smile.fi.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- 07 October 2020, 2 commits
Submitted by Johannes Thumshirn
Martin rightfully noted that for normal filesystem IO we have soft limits
in place to prevent IOs from getting too big and leading to unpredictable
latencies. For zone append we only have the hardware limit in place.

Cap the max sectors we submit via zone-append to the maximal number of
sectors if the second limit is lower.

Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/linux-btrfs/yq1k0w8g3rw.fsf@ca-mkp.ca.oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
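The capping amounts to reporting the smaller of the two limits; a sketch of
the helper, reconstructed from the description, so treat the details as
approximate.

  static inline unsigned int
  queue_max_zone_append_sectors(const struct request_queue *q)
  {
          const struct queue_limits *l = &q->limits;

          /* honour the soft max_sectors limit as well as the hardware one */
          return min(l->max_zone_append_sectors, l->max_sectors);
  }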
Submitted by Christoph Hellwig
Always return BLK_ZONED_NONE if zoned device support is not enabled. This
allows various compiler optimizations including the dead code elimination
that we so like for avoiding ifdefs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
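The resulting helper is essentially a compile-time switch; a sketch, with
the IS_ENABLED() form being an assumption about how the ifdef is avoided.

  static inline enum blk_zoned_model
  blk_queue_zoned_model(struct request_queue *q)
  {
          if (IS_ENABLED(CONFIG_BLK_DEV_ZONED))
                  return q->limits.zoned;

          /* dead-code-eliminated when zoned support is compiled out */
          return BLK_ZONED_NONE;
  }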
- 06 October 2020, 4 commits
Submitted by Christoph Hellwig
Also move the definition from the public blkdev.h to the private
block/blk.h header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Submitted by Christoph Hellwig
Also move the definition from the public blkdev.h to the private
block/blk.h header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Submitted by Ming Lei
The 'q_usage_counter' field is fetched in the fast path of every block
driver, so move it to the front of 'request_queue' so that it lands in the
first cacheline of the 'request_queue' instance.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Veronika Kabatova <vkabatov@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Submitted by Christoph Hellwig
All remaining callers of bdget() outside of fs/block_dev.c want to get a
reference to the struct block_device for a given struct hd_struct. Add a
helper just for that and then mark bdget static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- 25 September 2020, 4 commits
Submitted by Mike Snitzer
Add QUEUE_FLAG_NOWAIT to allow a block device to advertise support for
REQ_NOWAIT. Bio-based devices may set QUEUE_FLAG_NOWAIT where applicable.

Update QUEUE_FLAG_MQ_DEFAULT to include QUEUE_FLAG_NOWAIT. Also update
submit_bio_checks() to verify it is set for REQ_NOWAIT bios.

Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
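A sketch of the submit_bio_checks() verification mentioned above, wrapped
into a standalone helper here purely for illustration.

  /* fail fast if the bio asked for REQ_NOWAIT but the queue never set
   * QUEUE_FLAG_NOWAIT */
  static bool example_nowait_supported(struct request_queue *q, struct bio *bio)
  {
          if (!(bio->bi_opf & REQ_NOWAIT))
                  return true;
          return blk_queue_nowait(q);
  }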
Submitted by Christoph Hellwig
Add a little helper to make the somewhat arcane bd_contains checks a little
more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Submitted by Christoph Hellwig
The BDI_CAP_STABLE_WRITES flag is one of the few bits of information in the
backing_dev_info shared between the block drivers and the writeback code.
To help untangle the dependency, replace it with a queue flag and a
superblock flag derived from it. This also helps with the case of e.g. a
file system requiring stable writes due to its own checksumming, but not
forcing it on other users of the block device like the swap code.

One downside is that we can't support the stable_pages_required bdi
attribute in sysfs anymore. It is replaced with a queue attribute which
also is writable for easier testing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Submitted by Christoph Hellwig
Drivers shouldn't really mess with the readahead size, as that is a VM
concept. Instead set it based on the optimal I/O size by lifting the
algorithm from the md driver when registering the disk. Also set
bdi->io_pages there as well by applying the same scheme based on
max_sectors. To ensure the limits work well for stacking drivers, a new
helper is added to update the readahead limits from the block limits, which
is also called from disk_stack_limits.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
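A sketch of the new helper described above, assuming the md-derived rule of
reading ahead at least twice the optimal I/O size; the function name and
exact expressions are approximate.

  void example_update_readahead(struct request_queue *q)
  {
          /* read ahead at least twice the optimal I/O size */
          q->backing_dev_info->ra_pages =
                  max(queue_io_opt(q) * 2 / PAGE_SIZE, VM_READAHEAD_PAGES);
          /* cap a single readahead I/O by max_sectors */
          q->backing_dev_info->io_pages =
                  queue_max_sectors(q) >> (PAGE_SHIFT - 9);
  }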