- 22 June 2022, 1 commit
-
-
Committed by Jens Axboe
If rq_qos_throttle() ends up blocking, then we will have invalidated and flushed our current plug. Since blk_mq_get_cached_request() hasn't popped the cached request off the plug list just yet, we end up holding a pointer to a request that is no longer valid. This insta-crashes with rq->mq_hctx being NULL in the validity checks just after. Pop the request off the cached list before doing rq_qos_throttle() to avoid using a potentially stale request. Fixes: 0a5aa8d1 ("block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection") Reported-by: Dylan Yudaken <dylany@fb.com> Tested-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
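A minimal kernel-style sketch of the reordering described above (helper names follow the 5.19-era plug/rq_list API; the other checks in the real blk_mq_get_cached_request() are omitted):

```c
/*
 * Simplified sketch, not the exact upstream diff: detach the cached
 * request from the plug before calling rq_qos_throttle(), which may
 * sleep and cause the plug (and its cached list) to be flushed.
 */
static struct request *get_cached_rq_sketch(struct request_queue *q,
					    struct blk_plug *plug,
					    struct bio *bio)
{
	struct request *rq = rq_list_peek(&plug->cached_rq);

	if (!rq || rq->q != q)
		return NULL;

	/* Pop first, so a later plug flush cannot leave 'rq' stale. */
	plug->cached_rq = rq_list_next(rq);

	rq_qos_throttle(q, bio);	/* may block and flush the plug */
	return rq;
}
```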
-
- 17 June 2022, 4 commits
-
-
Committed by Ming Lei
Commit 364b6181 ("blk-mq: clearing flush request reference in tags->rqs[]") was added to clear the to-be-freed flush request from tags->rqs[] to avoid a use-after-free on the flush rq. Yu Kuai reported that blk_mq_clear_flush_rq_mapping() slows down boot by ~8s, because it runs during SCSI probing, which may create and remove lots of non-present LUNs on megaraid-sas, which uses BLK_MQ_F_TAG_HCTX_SHARED and whose request queues each have lots of hw queues. Improve the situation by not running blk_mq_clear_flush_rq_mapping() if the disk hasn't been added, since no flush request can have been issued in that case. Reviewed-by: Christoph Hellwig <hch@lst.de> Reported-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220616014401.817001-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
q->elevator is referenced in blk_mq_has_sqsched() without any protection: no q_usage_counter is held and no queue SRCU or RCU read lock is taken, so a potential use-after-free may be triggered. Fix the issue by adding a queue flag that records whether the elevator uses single-queue-style dispatch. With that in place, the ELEVATOR_F_MQ_AWARE elevator feature flag isn't needed any more. Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220616014401.817001-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
The elevator can be torn down via the sysfs switch interface or on disk release, so hold ->sysfs_lock before referring to q->elevator; that way a potential use-after-free is avoided. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220616014401.817001-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
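A kernel-style sketch of the locking pattern (the function and the data being read are illustrative, not the exact code the patch touches):

```c
/*
 * q->elevator may be freed by a scheduler switch or disk release,
 * so only dereference it while holding q->sysfs_lock.
 */
static void show_sched_sketch(struct request_queue *q)
{
	mutex_lock(&q->sysfs_lock);
	if (q->elevator)
		pr_info("scheduler: %s\n", q->elevator->type->elevator_name);
	mutex_unlock(&q->sysfs_lock);
}
```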
-
Committed by Bart Van Assche
This patch prevents test nvme/004 from triggering the following: UBSAN: array-index-out-of-bounds in block/blk-mq.h:135:9 index 512 is out of range for type 'long unsigned int [512]' Call Trace: show_stack+0x52/0x58 dump_stack_lvl+0x49/0x5e dump_stack+0x10/0x12 ubsan_epilogue+0x9/0x3b __ubsan_handle_out_of_bounds.cold+0x44/0x49 blk_mq_alloc_request_hctx+0x304/0x310 __nvme_submit_sync_cmd+0x70/0x200 [nvme_core] nvmf_connect_io_queue+0x23e/0x2a0 [nvme_fabrics] nvme_loop_connect_io_queues+0x8d/0xb0 [nvme_loop] nvme_loop_create_ctrl+0x58e/0x7d0 [nvme_loop] nvmf_create_ctrl+0x1d7/0x4d0 [nvme_fabrics] nvmf_dev_write+0xae/0x111 [nvme_fabrics] vfs_write+0x144/0x560 ksys_write+0xb7/0x140 __x64_sys_write+0x42/0x50 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Fixes: 20e4d813 ("blk-mq: simplify queue mapping & schedule with each possisble CPU") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220615210004.1031820-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
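The shape of the fix, as a hedged sketch: cpumask_first_and() returns nr_cpu_ids when the intersection is empty, so that value must be rejected before it is used as a per-CPU index (the label and surrounding context are illustrative):

```c
/* Sketch of the guard in blk_mq_alloc_request_hctx(), not the exact diff. */
cpu = cpumask_first_and(data.hctx->cpumask, cpu_online_mask);
if (cpu >= nr_cpu_ids)		/* no online CPU maps to this hctx */
	goto out_queue_exit;
data.ctx = __blk_mq_get_ctx(q, cpu);
```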
-
- 30 May 2022, 1 commit
-
-
Committed by Haisu Wang
Flush and passthrough requests are not accounted as normal I/O at completion. To reflect slow I/O in iostat, io_ticks is updated from the inflight count when the stats are read, which can yield inconsistent io_ticks results. So skip requests that are not accounted for I/O statistics (flush and passthrough requests) when counting inflight. Fixes: 86d73312 ("block: update io_ticks when io hang") Signed-off-by: Haisu Wang <haisuwang@tencent.com> Reviewed-by: samuelliao <samuelliao@tencent.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220530064059.1120058-1-haisuwang@tencent.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
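An illustrative sketch of the resulting inflight filter, assuming the internal mq_inflight counting helper; blk_do_io_stat() is false for requests that are not accounted, so they no longer feed the count that io_ticks is derived from:

```c
/* Sketch only; field and helper names assume the in-tree mq_inflight code. */
static bool check_inflight_sketch(struct request *rq, void *priv)
{
	struct mq_inflight *mi = priv;

	if (rq->part && blk_do_io_stat(rq) &&
	    (!mi->part->bd_partno || rq->part == mi->part) &&
	    blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT)
		mi->inflight[rq_data_dir(rq)]++;

	return true;
}
```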
-
- 28 May 2022, 3 commits
-
-
Committed by Christoph Hellwig
Let the caller set it together with the end_io_data instead of passing a pointless argument. Note that the target code did in fact already set it and then just overrode it again by calling blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Instead of trying to cast a __bitwise 32-bit integer to a larger integer and then to a pointer, just put a struct with the blk_status_t and the completion on the stack and set the end_io_data to that. Use the opportunity to move the code to where it belongs and drop the rather confusing comments. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
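A hedged sketch of the pattern: an on-stack struct pairs the completion with the status, the end_io handler fills it in through end_io_data, and the caller simply waits (names mirror the idea; the insertion/dispatch step is abbreviated):

```c
struct rq_wait_sketch {
	struct completion done;
	blk_status_t ret;
};

/* end_io callback: record the status and wake the waiter. */
static void end_sync_rq_sketch(struct request *rq, blk_status_t ret)
{
	struct rq_wait_sketch *wait = rq->end_io_data;

	wait->ret = ret;
	complete(&wait->done);
}

/* Caller side: everything lives on the stack, no casts needed. */
static blk_status_t execute_and_wait_sketch(struct request *rq)
{
	struct rq_wait_sketch wait = {
		.done = COMPLETION_INITIALIZER_ONSTACK(wait.done),
	};

	rq->end_io = end_sync_rq_sketch;
	rq->end_io_data = &wait;

	/* ... insert the request and run the hardware queue here ... */

	wait_for_completion_io(&wait.done);
	return wait.ret;
}
```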
-
Committed by Christoph Hellwig
We don't want to plug for synchronous execution, where we immediately wait for the request. Once that is done not a whole lot of code is shared, so just remove __blk_execute_rq_nowait. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220524121530.943123-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 23 May 2022, 1 commit
-
-
Committed by Ming Lei
blk_mq_run_hw_queues() can run when there is no queued request and after the queue has been cleaned up; at that point the tagset may already be freed, since the tagset's lifetime is owned by the driver and it is often freed after blk_cleanup_queue() returns. So don't touch ->tagset when figuring out the current default hctx; use the mapping built into the request queue instead, which avoids the use-after-free on the tagset. This should also be faster than retrieving the mapping from the tagset. Cc: "yukuai (C)" <yukuai3@huawei.com> Cc: Jan Kara <jack@suse.cz> Fixes: b6e68ee8 ("blk-mq: Improve performance of non-mq IO schedulers with multiple HW queues") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220522122350.743103-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 21 May 2022, 1 commit
-
-
Committed by Julia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/20220521111145.81697-29-Julia.Lawall@inria.fr Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 12 May 2022, 1 commit
-
-
Committed by Ming Lei
First, we can't add a request to the plug list in blk_mq_request_bypass_insert(), which may itself be called while flushing the plug list, so nested plugging would result. Second, if a polled passthrough request is inserted via blk_execute_rq(), it can't be added to the plug list either, since I/O polling needs the request to be issued to the driver. Fix both by moving the plugging into blk_execute_rq_nowait(). Cc: Christoph Hellwig <hch@lst.de> Fixes: 1c2d2fff ("block: wire-up support for passthrough plugging") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220512140010.1458645-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
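A kernel-style sketch of where the plugging ends up (the helper names are internal ones and the non-plugged dispatch call is simplified):

```c
/*
 * Sketch: with a plug present, queue the passthrough request on it so a
 * later plug flush can hand the whole batch to ->queue_rqs(); otherwise
 * insert it for immediate dispatch.
 */
static void execute_rq_nowait_sketch(struct request *rq, bool at_head)
{
	struct blk_plug *plug = current->plug;

	if (plug) {
		blk_add_rq_to_plug(plug, rq);	/* batched dispatch later */
		return;
	}
	blk_mq_sched_insert_request(rq, at_head, true, false);
}
```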
-
- 11 May 2022, 1 commit
-
-
Committed by Jens Axboe
Add support for plugging in the passthrough path. When plugging is enabled, requests are added to a plug instead of being dispatched to the driver, and when the plug is finished the whole batch gets dispatched via ->queue_rqs, which turns out to be more efficient. Previously, dispatch happened via ->queue_rq, one request at a time. Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220511054750.20432-3-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 28 April 2022, 1 commit
-
-
Committed by Tejun Heo
This reverts commit 00067077. It has a couple of problems: * bio_issue_time() is stored in bio->bi_issue truncated to 51 bits. This overflows in slightly over 26 days. Setting rq->io_start_time_ns with it means that the I/O duration calculation would yield > 26 days after 26 days of uptime. This, for example, confuses kyber, making it cause high I/O latencies. * rq->io_start_time_ns should record the time that the I/O is issued to the device so that on-device latency can be measured. However, bio_issue_time() is set before the bio goes through the rq-qos controllers (wbt, iolatency, iocost), so when the bio gets throttled in any of those mechanisms, the measured latencies make no sense - on-device latencies end up higher than request-alloc-to-completion latencies. We'll need a smarter way to avoid calling ktime_get_ns() repeatedly back-to-back. For now, let's revert the commit. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v5.16+ Link: https://lore.kernel.org/r/YmmeOLfo5lzc+8yI@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
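For reference, the 51-bit window works out to roughly 26 days; a standalone back-of-the-envelope check (not kernel code):

```c
#include <stdio.h>

int main(void)
{
	/* bio->bi_issue keeps the issue time in 51 bits of nanoseconds. */
	unsigned long long window_ns = 1ULL << 51;

	/* ~2.25e15 ns, i.e. ~26.06 days until the stored time wraps. */
	printf("wrap after %.2f days\n", window_ns / 1e9 / 86400.0);
	return 0;
}
```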
-
- 15 April 2022, 1 commit
-
-
Committed by Christoph Hellwig
When a disk has been marked dead, don't print warnings for I/O errors as they are very much expected. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220323163815.1526998-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 31 March 2022, 1 commit
-
-
Committed by Jakob Koschel
In order to move the list iterator variable into the list_for_each_entry_*() macros in the future, use of the list iterator variable after the loop body should be avoided. To *never* use the list iterator variable after the loop, it was concluded that a separate variable should be used instead of a found boolean [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Link: https://lore.kernel.org/r/20220331091218.641532-1-jakobkoschel@gmail.com [axboe: move lookup to where return value is checked] Signed-off-by: Jens Axboe <axboe@kernel.dk>
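The pattern being adopted, sketched with a made-up structure (the real patch applies the same idea to a lookup loop in blk-mq):

```c
/* Hypothetical lookup, only to illustrate the 'found' variable pattern. */
struct item {
	struct list_head node;
	int id;
};

static struct item *find_item(struct list_head *head, int id)
{
	struct item *found = NULL, *it;

	list_for_each_entry(it, head, node) {
		if (it->id == id) {
			found = it;	/* remember the match ... */
			break;
		}
	}
	/* ... and never touch the iterator 'it' after the loop. */
	return found;
}
```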
-
- 12 March 2022, 1 commit
-
-
Committed by Jens Axboe
We used to sort the plug list if we had multiple queues before dispatching requests to the IO scheduler. This usually isn't needed, but for certain workloads that interleave requests to disks, it's less efficient to process the plug list one-by-one if everything is interleaved. Don't sort the list, but skip through it and flush out entries that have the same target at the same time. Fixes: df87eb0f ("block: get rid of plug list sorting") Reported-and-tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 09 March 2022, 9 commits
-
-
由 Ming Lei 提交于
The queue's top-level debugfs dir is removed in blk_release_queue(), and all hctx debugfs dirs are removed from there as well. Given blk_mq_exit_queue() is only called from blk_cleanup_queue(), it isn't necessary to remove the hctx debugfs dirs from blk_mq_exit_queue(), so drop that. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220308055200.735835-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
To simplify further changes, allow blk_mq_free_rqs to be called twice on a queue. Signed-off-by: Ming Lei <ming.lei@redhat.com> [hch: split out from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
I/O accounting buckets I/O into the read/write/discard categories, into which passthrough I/O does not fit at all. It also accounts to the block_device, which may not even exist for passthrough I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
First, the code becomes cleaner by switching from a plain array to an xarray. Second, the use-after-free on q->queue_hw_ctx can be fixed, since queue_for_each_hw_ctx() may run while an nr_hw_queues update is in progress. With this patch, q->hctx_table is defined as an xarray that shares the request queue's lifetime, so queue_for_each_hw_ctx() can use q->hctx_table to look up an hctx reliably. Reported-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220308073219.91173-7-ming.lei@redhat.com [axboe: fix blk_mq_hw_ctx forward declaration] Signed-off-by: Jens Axboe <axboe@kernel.dk>
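A hedged sketch of what iteration looks like once the table is an xarray owned by the request queue (the macro mirrors the existing queue_for_each_hw_ctx() convention; names are illustrative):

```c
/* Sketch: the iterator becomes a thin wrapper around xa_for_each(). */
#define queue_for_each_hw_ctx_sketch(q, hctx, i)	\
	xa_for_each(&(q)->hctx_table, (i), (hctx))

static void dump_hctxs_sketch(struct request_queue *q)
{
	struct blk_mq_hw_ctx *hctx;
	unsigned long i;	/* xa_for_each() requires unsigned long */

	queue_for_each_hw_ctx_sketch(q, hctx, i)
		pr_debug("hctx %lu: %u sw ctxs\n", i, hctx->nr_ctx);
}
```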
-
Committed by Ming Lei
A use-after-free on q->queue_hw_ctx between queue_for_each_hw_ctx() and blk_mq_update_nr_hw_queues() is otherwise unavoidable, and converting to an xarray fixes the UAF while making the code cleaner. To prepare for converting q->queue_hw_ctx into an xarray, and because xa_for_each() only accepts 'unsigned long' as an index, change the type of the hctx index used by queue_for_each_hw_ctx() to 'unsigned long'. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
The queue map can change when updating nr_hw_queues, so we need to reconfigure the queue's poll capability afterwards. Add a helper for doing this job. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
blk_mq_alloc_and_init_hctx() already takes reuse into account, so there is no need to do it at the call site; this lets us simplify blk_mq_realloc_hw_ctxs(). Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Ming Lei
The current code always uses the default queue map and hw queue index to figure out the NUMA node for a hw queue. This isn't correct, because blk-mq supports three queue maps, and the correct queue map should be used for the given hw queue. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Shin'ichiro Kawasaki
Commit 9d497e29 ("block: don't protect submit_bio_checks by q_usage_counter") moved the blk_mq_attempt_bio_merge and rq_qos_throttle calls out of q_usage_counter protection. However, these functions require q_usage_counter protection. The blk_mq_attempt_bio_merge call without the protection resulted in a blktests block/005 failure with a KASAN null-ptr-deref or use-after-free at bio merge. The rq_qos_throttle call without the protection caused a kernel hang at qos throttle. To fix the failures, move the blk_mq_attempt_bio_merge and rq_qos_throttle calls back under q_usage_counter protection. Fixes: 9d497e29 ("block: don't protect submit_bio_checks by q_usage_counter") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20220308080915.3473689-1-shinichiro.kawasaki@wdc.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 08 March 2022, 1 commit
-
-
Committed by Christoph Hellwig
With the NVMe support for this gone, there are no consumers of these hints left, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220304175556.407719-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 17 February 2022, 5 commits
-
-
Committed by David Jeffery
When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can reset the delay length of an already pending delayed run_work. This creates a scenario where multiple hctxs may have their queues set to run, but if one runs first and finds nothing to do, it can reset the delay of another hctx and stall the other hctx's ability to run requests. To avoid this I/O stall, when an hctx's run_work is already pending, leave it untouched to run at its currently designated time rather than extending its delay. The work will still run, which keeps closed the race that blk_mq_delay_run_hw_queues is needed for, while also avoiding the I/O stall. Signed-off-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220131203337.GA17666@redhat Signed-off-by: Jens Axboe <axboe@kernel.dk>
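A sketch of the guard, assuming the standard workqueue helpers: if the delayed run_work is already queued, its timer is left alone instead of being re-armed:

```c
/* Illustrative loop, not the exact upstream diff. */
static void delay_run_hw_queues_sketch(struct request_queue *q,
				       unsigned long msecs)
{
	struct blk_mq_hw_ctx *hctx;
	unsigned long i;

	queue_for_each_hw_ctx(q, hctx, i) {
		if (blk_mq_hctx_stopped(hctx))
			continue;
		/* An already-pending run_work keeps its current deadline. */
		if (delayed_work_pending(&hctx->run_work))
			continue;
		blk_mq_delay_run_hw_queue(hctx, msecs);
	}
}
```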
-
Committed by Ming Lei
blk_crypto_bio_prep() is called for both bio-based and blk-mq drivers, so move it out of blk-mq.c; that way this kind of handling can be unified. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220216044514.2903784-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
The request must be submitted to the queue it was allocated for, so remove the extra request_queue argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
Fold blk_cloned_rq_check_limits into its only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Christoph Hellwig
The code to stack blk-mq drivers is only used by dm-multipath, and will preferably stay that way. Make it optional and only selected by device mapper, so that the buildbots more easily catch abuses like the one that slipped into the ufs driver in the last merge window. Another positive side effect is that kernel builds without device mapper shrink a little bit as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 12 February 2022, 2 commits
-
-
Committed by Yang Shi
Currently, rasdaemon uses the existing tracepoint block_rq_complete and filters out non-error cases in order to capture block disk errors. But there are a few problems with this approach: 1. Even though the kernel trace filter can do the filtering work, there is still some overhead once this tracepoint is enabled. 2. The filter is merely based on errno, which does not align with the kernel logic for checking errors in print_req_error(). 3. block_rq_complete only provides the dev major and minor to identify the block device, which is not convenient to use in user-space. So introduce a new tracepoint block_rq_error just for the error case. With this patch, rasdaemon can switch to block_rq_error. Since the new tracepoint has a similar implementation to block_rq_complete, move the existing code from TRACE_EVENT block_rq_complete() into a new event class block_rq_completion(), then add events for block_rq_complete and block_rq_error from the newly created event class, per the suggestion from Chaitanya Kulkarni. Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220210225222.260069-1-shy828301@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
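In TRACE_EVENT terms the restructuring looks roughly like this; the prototype is assumed from the existing block_rq_complete tracepoint, and the shared TP_STRUCT__entry/TP_fast_assign/TP_printk body is assumed to live in DECLARE_EVENT_CLASS(block_rq_completion, ...):

```c
/* Sketch: both tracepoints are stamped out from the shared event class. */
DEFINE_EVENT(block_rq_completion, block_rq_complete,
	TP_PROTO(struct request *rq, blk_status_t error, unsigned int nr_bytes),
	TP_ARGS(rq, error, nr_bytes)
);

DEFINE_EVENT(block_rq_completion, block_rq_error,
	TP_PROTO(struct request *rq, blk_status_t error, unsigned int nr_bytes),
	TP_ARGS(rq, error, nr_bytes)
);
```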
-
Committed by Pankaj Raghav
The zone append command needs special handling to update the bi_sector field in the bio struct with the actual position of the data on the device, which is stored in the __sector field of the request struct. Fixes: 5581a5dd ("block: add completion handler for fast path") Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Adam Manzanares <a.manzanares@samsung.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220211093425.43262-2-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
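As a hedged sketch, the completion side propagates the on-device position back into the bio roughly like this:

```c
/* Illustrative helper: report the actual write position for zone append. */
static inline void zone_append_complete_sketch(struct request *rq)
{
	struct bio *bio = rq->bio;

	if (req_op(rq) == REQ_OP_ZONE_APPEND)
		bio->bi_iter.bi_sector = rq->__sector;
}
```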
-
- 04 February 2022, 1 commit
-
-
Committed by Christoph Hellwig
Pass a block_device to bio_clone_fast and __bio_clone_fast and give the functions more suitable names. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 26 January 2022, 1 commit
-
-
Committed by Yu Kuai
If blk_mq_request_issue_directly() fails in blk_insert_cloned_request(), the request will already have been accounted as started. Currently, blk_insert_cloned_request() is only called by dm, and such a request won't be accounted as done by dm. In the normal path, I/O is accounted as started in blk_mq_bio_to_request(), when the request is allocated, and accounted as done in __blk_mq_end_request_acct() whether it succeeded or failed. Thus add a blk_account_io_done() call to fix the problem. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220126012132.3111551-1-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
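The shape of the fix, sketched under the assumption of the internal accounting helpers (blk_account_io_done() balances the start accounting done when the request was set up):

```c
/* Illustrative error leg of blk_insert_cloned_request(), not the exact diff. */
static blk_status_t insert_cloned_rq_sketch(struct request *rq)
{
	blk_status_t ret;

	ret = blk_mq_request_issue_directly(rq, true);
	if (ret != BLK_STS_OK)
		blk_account_io_done(rq, ktime_get_ns());
	return ret;
}
```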
-
- 18 January 2022, 1 commit
-
-
Committed by Christoph Hellwig
bio_clone_fast() sets the cloned bio to have the same ->bi_bdev as the source bio. This means that when request-based dm called setup_clone(), the cloned bio had its ->bi_bdev pointing to the dm device. After commit 0b6e522c ("blk-mq: use ->bi_bdev for I/O accounting"), __blk_account_io_start() started using the request's ->bio->bi_bdev for I/O accounting, if it was set. This caused I/O going to the underlying devices to use the dm device for their I/O accounting. Set up the proper ->bi_bdev in blk_rq_prep_clone based on the whole-device bdev for the queue the request is cloned onto. Fixes: 0b6e522c ("blk-mq: use ->bi_bdev for I/O accounting") Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [hch: the commit message is mostly from a different patch from Benjamin] Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> Link: https://lore.kernel.org/r/20220118070444.1241739-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 16 January 2022, 1 commit
-
-
Committed by Yury Norov
cpumask_first() is a more efficient analogue of the 'next' version when n == -1 (which means start == 0). This patch replaces 'next' with 'first' where things look trivial. There's no cpumask_first_zero() function, so create it. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
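A sketch of the new helper, mirroring how cpumask_first() is built on the find_*_bit primitives (placement and kerneldoc wording are assumptions):

```c
/**
 * cpumask_first_zero - get the first unset cpu in a cpumask
 * @srcp: the cpumask pointer
 *
 * Returns >= nr_cpu_ids if all cpus are set.
 */
static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
{
	return find_first_zero_bit(cpumask_bits(srcp), nr_cpumask_bits);
}
```

A call site such as cpumask_next_zero(-1, mask) can then become cpumask_first_zero(mask).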
-
- 10 January 2022, 1 commit
-
-
Committed by Ming Lei
Commit cc9c884d ("block: call submit_bio_checks under q_usage_counter") uses q_usage_counter to protect submit_bio_checks, to avoid I/O after the disk has been deleted by del_gendisk(). It turns out the protection isn't necessary, because once blk_mq_freeze_queue_wait() in del_gendisk() returns: 1) all in-flight I/O has completed; 2) all new I/O will fail in __bio_queue_enter(), because q_usage_counter is dead and GD_DEAD is set; 3) both the disk and the request queue instance are safe, since the caller of submit_bio() guarantees that the disk can't be closed. Once submit_bio_checks() doesn't need the protection of q_usage_counter, we can move submit_bio_checks before calling blk_mq_submit_bio() and ->submit_bio(). With this change, we no longer need to throttle the queue while holding an allocated request, so no precious driver tag or request is wasted on throttling. Meanwhile, we can unify the bio checks for both bio-based and request-based drivers. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220104134223.590803-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 21 December 2021, 1 commit
-
-
Committed by Keith Busch
The low level drivers don't expect to see new requests after a successful quiesce completes. Check the queue quiesce state within the rcu protected area prior to calling the driver's queue_rqs(). Fixes: 3c67d44d ("block: add mq_ops->queue_rqs hook") Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20211220205919.180191-1-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
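A hedged sketch of the check, assuming the plug's request list is handed to ->queue_rqs() from inside the dispatch RCU/SRCU section:

```c
/* Sketch: bail out if the queue was quiesced before calling the driver. */
static void flush_plug_batch_sketch(struct request_queue *q,
				    struct blk_plug *plug)
{
	if (blk_queue_quiesced(q))
		return;
	q->mq_ops->queue_rqs(&plug->mq_list);
}
```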
-