- 18 2月, 2022 1 次提交
-
-
由 Yu Kuai 提交于
Use bfq_group() instead, which do the same thing. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Acked-by: NPaolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220129015924.3958918-2-yukuai3@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 2月, 2022 20 次提交
-
-
由 Yahu Gao 提交于
The return value is ioprio * BFQ_WEIGHT_CONVERSION_COEFF or 0. What we want is ioprio or 0. Correct this by changing the calculation. Signed-off-by: NYahu Gao <gaoyahu19@gmail.com> Acked-by: NPaolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220107065859.25689-1-gaoyahu19@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 David Jeffery 提交于
When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can reset the delay length for an already pending delayed work run_work. This creates a scenario where multiple hctx may have their queues set to run, but if one runs first and finds nothing to do, it can reset the delay of another hctx and stall the other hctx's ability to run requests. To avoid this I/O stall when an hctx's run_work is already pending, leave it untouched to run at its current designated time rather than extending its delay. The work will still run which keeps closed the race calling blk_mq_delay_run_hw_queues is needed for while also avoiding the I/O stall. Signed-off-by: NDavid Jeffery <djeffery@redhat.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220131203337.GA17666@redhatSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Implement the ->free_disk method to free the virtio_blk structure only once the last gendisk reference goes away instead of keeping a local refcount. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220215094514.3828912-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Implement the ->free_disk method to free the msb_data structure only once the last gendisk reference goes away instead of keeping a local refcount. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220215094514.3828912-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use set_disk_ro to propagate the read-only state to the block layer instead of checking for it in ->open and leaking a reference in case of a read-only device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220215094514.3828912-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Implement the ->free_disk method to free the msb_data structure only once the last gendisk reference goes away instead of keeping a local refcount. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220215094514.3828912-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a method to notify the driver that the gendisk is about to be freed. This allows drivers to tie the lifetime of their private data to that of the gendisk and thus deal with device removal races without expensive synchronization and boilerplate code. A new flag is added so that ->free_disk is only called after a successful call to add_disk, which significantly simplifies the error handling path during probing. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220215094514.3828912-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
Revert commit 4f1e9630 ("blk-throtl: optimize IOPS throttle for large IO scenarios") since we have another easier way to address this issue and get better iops throttling result. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220216044514.2903784-9-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
We need to throttle split bio in case of IOPS limit even though the split bio has been marked as BIO_THROTTLED since block layer accounts split bio actually. If only throughput throttle is setup, no need to throttle any more if BIO_THROTTLED is set since we have accounted & considered the whole bio bytes already. Add one flag of THROTL_TG_HAS_IOPS_LIMIT for serving this purpose. Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220216044514.2903784-8-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
Commit 111be883 ("block-throttle: avoid double charge") marks bio as BIO_THROTTLED unconditionally if __blk_throtl_bio() is called on this bio, then this bio won't be called into __blk_throtl_bio() any more. This way is to avoid double charge in case of bio splitting. It is reasonable for read/write throughput limit, but not reasonable for IOPS limit because block layer provides io accounting against split bio. Chunguang Xu has already observed this issue and fixed it in commit 4f1e9630 ("blk-throtl: optimize IOPS throttle for large IO scenarios"). However, that patch only covers bio splitting in __blk_queue_split(), and we have other kind of bio splitting, such as bio_split() & submit_bio_noacct() and other ways. This patch tries to fix the issue in one generic way by always charging the bio for iops limit in blk_throtl_bio(). This way is reasonable: re-submission & fast-cloned bio is charged if it is submitted to same disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed. This new approach can get much more smooth/stable iops limit compared with commit 4f1e9630 ("blk-throtl: optimize IOPS throttle for large IO scenarios") since that commit can't throttle current split bios actually. Also this way won't cause new double bio iops charge in blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called any more. Reported-by: NNing Li <lining2020x@163.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Chunguang Xu <brookxu@tencent.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220216044514.2903784-7-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
Now submit_bio_checks() is only called by submit_bio_noacct(), so merge it into submit_bio_noacct(). Suggested-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220216044514.2903784-6-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
The bio has been checked already before throttling, so no need to check it again before dispatching it from throttle queue. Add a helper of submit_bio_noacct_nocheck() for this purpose. Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220216044514.2903784-5-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
submit_bio_checks() won't be called outside of block/blk-core.c any more since commit 9d497e29 ("block: don't protect submit_bio_checks by q_usage_counter"), so mark it as one local helper. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220216044514.2903784-4-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
blk_crypto_bio_prep() is called for both bio based and blk-mq drivers, so move it out of blk-mq.c, then we can unify this kind of handling. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220216044514.2903784-3-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
It is more clean & readable to check bio when starting to submit it, instead of just before calling ->submit_bio() or blk_mq_submit_bio(). Also it provides us chance to optimize bio submission without checking bio. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220216044514.2903784-2-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold dm_dispatch_clone_request into it's only caller, and use a switch statement to single dispatch for the handling of the different return values from blk_insert_cloned_request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Both ->start_time_ns and the RQF_IO_STAT are set when the request is allocated using blk_mq_alloc_request by dm-mpath in blk_mq_rq_ctx_init. The block layer also ensures ->start_time_ns is only set when actually needed. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The request must be submitted to the queue it was allocated for, so remove the extra request_queue argument. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold blk_cloned_rq_check_limits into its only caller. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The code to stack blk-mq drivers is only used by dm-multipath, and will preferably stay that way. Make it optional and only selected by device mapper, so that the buildbots more easily catch abuses like the one that slipped in in the ufs driver in the last merged window. Another positive side effects is that kernel builds without device mapper shrink a little bit as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220215100540.3892965-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 2月, 2022 1 次提交
-
-
由 Chengming Zhou 提交于
Don't need to do blkg_iostat_set for top blkg iostat on each CPU, so move it after percpu stat aggregation. Fixes: ef45fe47 ("blk-cgroup: show global disk stats in root cgroup io.stat") Signed-off-by: NChengming Zhou <zhouchengming@bytedance.com> Acked-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220213085902.88884-1-zhouchengming@bytedance.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 2月, 2022 2 次提交
-
-
由 Chaitanya Kulkarni 提交于
Based on the comment present in the bdev_get_queue() bdev->bd_queue can never be NULL. Remove the NULL check for the local variable q that is set from bdev_get_queue() for discard, write_same, and write_zeroes. Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220215115247.11717-2-kch@nvidia.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This document is completely out of date and extremely misleading. In general the existing kerneldoc comment serve as a much better documentation of the still existing functionality, while the history blurbs are pretty much irrelevant today. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220215081047.3693582-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 2月, 2022 5 次提交
-
-
由 Barry Song 提交于
Since commit 7eaceacc ("block: remove per-queue plugging"), kernel has removed blk_run_address_space(), blk_unplug() and sync_buffer(), and moved to on-stack plugging. The document has been obsolete for years. Given that there is no obvious counterparts in the new mechinism to replace old APIs, this patch drops the content directly. Signed-off-by: NBarry Song <song.bao.hua@hisilicon.com> Link: https://lore.kernel.org/r/20220207074931.20067-1-song.bao.hua@hisilicon.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
Partition include/linux/blk-cgroup.h into two parts: one is public part, the other is block layer private part. Suggested by Christoph Hellwig. Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220211101149.2368042-4-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
q->blkg_list is only used by blkcg code, so move it into blkcg_init_queue. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220211101149.2368042-3-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
No one uses THROTL_IOPS_MAX any more, so remove it. Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220211101149.2368042-2-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Yang Shi 提交于
Currently, rasdaemon uses the existing tracepoint block_rq_complete and filters out non-error cases in order to capture block disk errors. But there are a few problems with this approach: 1. Even kernel trace filter could do the filtering work, there is still some overhead after we enable this tracepoint. 2. The filter is merely based on errno, which does not align with kernel logic to check the errors for print_req_error(). 3. block_rq_complete only provides dev major and minor to identify the block device, it is not convenient to use in user-space. So introduce a new tracepoint block_rq_error just for the error case. With this patch, rasdaemon could switch to block_rq_error. Since the new tracepoint has the similar implementation with block_rq_complete, so move the existing code from TRACE_EVENT block_rq_complete() into new event class block_rq_completion(). Then add event for block_rq_complete and block_rq_err respectively from the newly created event class per the suggestion from Chaitanya Kulkarni. Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220210225222.260069-1-shy828301@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 2月, 2022 2 次提交
-
-
由 John Garry 提交于
Since __sbitmap_queue_get_shallow() was introduced in commit c05e6673 ("sbitmap: add sbitmap_get_shallow() operation"), it has not been used. Delete __sbitmap_queue_get_shallow() and rename public __sbitmap_queue_get_shallow() -> sbitmap_queue_get_shallow() as it is odd to have public __foo but no foo at all. Signed-off-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1644322024-105340-1-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
Only the last sbitmap_word can have different depth, and all the others must have same depth of 1U << sb->shift, so not necessary to store it in sbitmap_word, and it can be retrieved easily and efficiently by adding one internal helper of __map_depth(sb, index). Remove 'depth' field from sbitmap_word, then the annotation of ____cacheline_aligned_in_smp for 'word' isn't needed any more. Not see performance effect when running high parallel IOPS test on null_blk. This way saves us one cacheline(usually 64 words) per each sbitmap_word. Cc: Martin Wilck <martin.wilck@suse.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NMartin Wilck <mwilck@suse.com> Reviewed-by: NJohn Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20220110072945.347535-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 2月, 2022 9 次提交
-
-
由 Christoph Hellwig 提交于
Pass a block_device to bio_clone_fast and __bio_clone_fast and give the functions more suitable names. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All callers of __bio_clone_fast initialize the bio first. Move that initialization into __bio_clone_fast instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-13-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace open coded bio_clone_fast implementations with the actual helper. Note that the bio allocated as part of the dm_io structure in alloc_io will only actually be used later in alloc_tio, making this earlier cloning of the information safe. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
__bio_clone_fast should also clone integrity and crypto data, as a clone without those is incomplete. Right now the only caller that can actually support crypto and integrity data (dm) does it manually for the one callchain that supports these, but we better do it properly in the core. Note that all callers except for the above mentioned one also don't need to handle failure at all, given that the integrity and crypto clones are based on mempool allocations that won't fail for sleeping allocations. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold __remap_to_origin_clear_discard into the two callers to prepare for bio cloning refactoring. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Most targets just need a single flush bio. Open code that case in __send_duplicate_bios without the need to add the bio to a list. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Return the clone bio embedded into the tio as that is what the callers actually want. Similar for the free side. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
This simplifies the callers a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move the call to __bio_clone_fast and the assignment of ->len_ptr from the callers into alloc_tio to prepare for changes to the bio clone API. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-