- 10 1月, 2022 2 次提交
-
-
由 Ming Lei 提交于
Commit cc9c884d ("block: call submit_bio_checks under q_usage_counter") uses q_usage_counter to protect submit_bio_checks for avoiding IO after disk is deleted by del_gendisk(). Turns out the protection isn't necessary, because once blk_mq_freeze_queue_wait() in del_gendisk() returns: 1) all in-flight IO has been done 2) all new IO will be failed in __bio_queue_enter() because q_usage_counter is dead, and GD_DEAD is set 3) both disk and request queue instance are safe since caller of submit_bio() guarantees that the disk can't be closed. Once submit_bio_checks() needn't the protection of q_usage_counter, we can move submit_bio_checks before calling blk_mq_submit_bio() and ->submit_bio(). With this change, we needn't to throttle queue with holding one allocated request, then precise driver tag or request won't be wasted in throttling. Meantime we can unify the bio check for both bio based and request based driver. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220104134223.590803-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Yang Li 提交于
Move the 'inline' keyword to the front of 'void'. Remove a warning found by clang(make W=1 LLVM=1) ./include/linux/blk-mq.h:259:1: warning: ‘inline’ is not at beginning of declaration Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NYang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220107005228.103927-1-yang.lee@linux.alibaba.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 1月, 2022 4 次提交
-
-
由 Keith Busch 提交于
If command prep fails, current handling will orphan subsequent requests in the list. Consider a simple example: rqlist = [ 1 -> 2 ] When prep for request '1' fails, it will be appended to the 'requeue_list', leaving request '2' disconnected from the original rqlist and no longer tracked. Meanwhile, rqlist is still pointing to the failed request '1' and will attempt to submit the unprepped command. Fix this by updating the rqlist accordingly using the request list helper functions. Fixes: d62cbcf6 ("nvme: add support for mq_ops->queue_rqs()") Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220105170518.3181469-5-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
When iterating a list, a particular request may need to be moved for special handling. Provide a helper function to achieve that so drivers don't need to reimplement rqlist manipulation. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220105170518.3181469-4-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
While iterating a list, a particular request may need to be removed for special handling. Provide an iterator that can safely handle that. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NKeith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220105170518.3181469-3-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
Move the request list macros to the header file that defines that struct they operate on. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220105170518.3181469-2-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 23 12月, 2021 3 次提交
-
-
由 Lukas Bulwahn 提交于
Commit 5fc11eeb ("block: open code create_task_io_context in set_task_ioprio") introduces a needless assignment 'ioc = task->io_context', as the local variable ioc is not further used before returning. Even after the further fix, commit a957b612 ("block: fix error in handling dead task for ioprio setting"), the assignment still remains needless. Drop this needless assignment in set_task_ioprio(). This code smell was identified with 'make clang-analyzer'. Fixes: 5fc11eeb ("block: open code create_task_io_context in set_task_ioprio") Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211223125300.20691-1-lukas.bulwahn@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
While harmless, the blank line is certainly not intended to be part of the rq_list_for_each() macro. Remove it. Signed-off-by: NKeith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20211222215239.1768164-1-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Randy Dunlap 提交于
Fix all kernel-doc warnings in <linux/bio.h>: include/linux/bio.h:136: warning: Function parameter or member 'nbytes' not described in 'bio_advance' include/linux/bio.h:136: warning: Excess function parameter 'bytes' description in 'bio_advance' include/linux/bio.h:391: warning: No description found for return value of 'bio_next_split' Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Link: https://lore.kernel.org/r/20211222211532.24060-1-rdunlap@infradead.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 12月, 2021 3 次提交
-
-
由 Tetsuo Handa 提交于
ioctl(fd, LOOP_CTL_ADD, 1048576) causes sysfs: cannot create duplicate filename '/dev/block/7:0' message because such request is treated as if ioctl(fd, LOOP_CTL_ADD, 0) due to MINORMASK == 1048575. Verify that all minor numbers for that device fit in the minor range. Reported-by: Nwangyangbo <wangyangbo@uniontech.com> Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/b1b19379-23ee-5379-0eb5-94bf5f79f1b4@i-love.sakura.ne.jpSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Tetsuo Handa 提交于
Since lo_simple_ioctl(LOOP_SET_BLOCK_SIZE) and ioctl(NBD_SET_BLKSIZE) pass user-controlled "unsigned long arg" to blk_validate_block_size(), "unsigned long" should be used for validation. Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/9ecbf057-4375-c2db-ab53-e4cc0dff953d@i-love.sakura.ne.jpSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
One device_add is called disk->ev will be freed by disk_release, so we should free it twice. Fix this by allocating disk->ev after device_add so that the extra local unwinding can be removed entirely. Based on an earlier patch from Tetsuo Handa. Reported-by: Nsyzbot <syzbot+28a66a9fbc621c939000@syzkaller.appspotmail.com> Tested-by: Nsyzbot <syzbot+28a66a9fbc621c939000@syzkaller.appspotmail.com> Fixes: 83cbce95 ("block: add error handling for device_add_disk / add_disk") Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211221161851.788424-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 21 12月, 2021 4 次提交
-
-
由 Ming Lei 提交于
blk_stat_disable_accounting() is added in commit 68497092 ("block: make queue stat accounting a reference"), and called in kyber_exit_sched(). So we have to free q->stats after elevator is unloaded from blk_exit_queue() in blk_release_queue(). Otherwise kernel panic is caused. Fixes: 68497092 ("block: make queue stat accounting a reference") Signed-off-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211221040436.1333880-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Don't combine the task exiting and "already have io_context" case, we need to just abort if the task is marked as dead. Return -ESRCH, which is the documented value for ioprio_set() if the specified task could not be found. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Reported-by: syzbot+8836466a79f4175961b0@syzkaller.appspotmail.com Fixes: 5fc11eeb ("block: open code create_task_io_context in set_task_ioprio") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The low level drivers don't expect to see new requests after a successful quiesce completes. Check the queue quiesce state within the rcu protected area prior to calling the driver's queue_rqs(). Fixes: 3c67d44d ("block: add mq_ops->queue_rqs hook") Signed-off-by: NKeith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20211220205919.180191-1-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Wander Lairson Costa 提交于
The running_trace_lock protects running_trace_list and is acquired within the tracepoint which implies disabled preemption. The spinlock_t typed lock can not be acquired with disabled preemption on PREEMPT_RT because it becomes a sleeping lock. The runtime of the tracepoint depends on the number of entries in running_trace_list and has no limit. The blk-tracer is considered debug code and higher latencies here are okay. Make running_trace_lock a raw_spinlock_t. Signed-off-by: NWander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20211220192827.38297-1-wander@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 12月, 2021 14 次提交
-
-
由 Christoph Hellwig 提交于
Only bfq needs to code to track icq, so make it conditional. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold create_task_io_context into the only remaining caller. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The flow in set_task_ioprio can be simplified by simply open coding create_task_io_context, which removes a refcount roundtrip on the I/O context. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold get_task_io_context into its only caller, and simplify the code as no reference to the I/O context is required to just set the ioprio field. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Keep set_task_ioprio with the other low-level code that accesses the io_context structure. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fold __ioc_clear_queue into ioc_clear_queue and switch to always use plain _irq locking instead of the more expensive _irqsave that is not needed here. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move the code to delay freeing the icqs into a separate helper. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
No caller passes in a NULL pointer, so remove the check. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Factor out a ioc_exit_icqs helper to tear down the icqs and the fold the rest of put_iocontext_active into exit_io_context. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Don't hold a reference to ->refcount for each active reference, but just one for all active references. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Nothing ever looks at ->nr_tasks, so remove it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211209063131.18537-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
This enables the block layer to send us a full plug list of requests that need submitting. The block layer guarantees that they all belong to the same queue, but we do have to check the hardware queue mapping for each request. If errors are encountered, leave them in the passed in list. Then the block layer will handle them individually. This is good for about a 4% improvement in peak performance, taking us from 9.6M to 10M IOPS/core. Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Add a nvme_prep_rq() helper to setup a command, and nvme_queue_rq() is adapted to use this helper. Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We'll need it for batched submit as well. Since we now have a copy helper, get rid of the nvme_submit_cmd() wrapper. Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMax Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 12月, 2021 3 次提交
-
-
由 Jens Axboe 提交于
If we have a list of requests in our plug list, send it to the driver in one go, if possible. The driver must set mq_ops->queue_rqs() to support this, if not the usual one-by-one path is used. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Pointless to maintain a head/tail for the list, as we never need to access the tail. Entries are always LIFO for cache hotness reasons. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
The batched completions only deal with non-partial requests anyway, and it doesn't deal with any requests that have errors. Add a completion handler that assumes it's a full request and that it's all being ended successfully. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 12月, 2021 1 次提交
-
-
由 Jens Axboe 提交于
kyber turns on IO statistics when it is loaded on a queue, which means that even if kyber is then later unloaded, we're still stuck with stats enabled on the queue. Change the account enabled from a bool to an int, and pair the enable call with the equivalent disable call. This ensures that stats gets turned off again appropriately. Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 12月, 2021 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
Add a Context section and rewrite the rest to be clearer. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NNikolay Borisov <nborisov@suse.com> Link: https://lore.kernel.org/r/20211213171113.3097631-1-willy@infradead.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 13 12月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
mtdblock / mtdblock_ro set part_bits to 0 and thus nevever scanned partitions. Restore that behavior by setting the GENHD_FL_NO_PART flag. Fixes: 1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT") Reported-by: NGeert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NGeert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20211206070409.2836165-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 07 12月, 2021 4 次提交
-
-
由 John Garry 提交于
Kashyap reports high CPU usage in blk_mq_queue_tag_busy_iter() and callees using megaraid SAS RAID card since moving to shared tags [0]. Previously, when shared tags was shared sbitmap, this function was less than optimum since we would iter through all tags for all hctx's, yet only ever match upto tagset depth number of rqs. Since the change to shared tags, things are even less efficient if we have parallel callers of blk_mq_queue_tag_busy_iter(). This is because in bt_iter() -> blk_mq_find_and_get_req() there would be more contention on accessing each request ref and tags->lock since they are now shared among all HW queues. Optimise by having separate calls to bt_for_each() for when we're using shared tags. In this case no longer pass a hctx, as it is no longer relevant, and teach bt_iter() about this. Ming suggested something along the lines of this change, apart from a different implementation. [0] https://lore.kernel.org/linux-block/e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com/Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reported-and-tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Fixes: e155b0c2 ("blk-mq: Use shared tags for shared sbitmap support") Link: https://lore.kernel.org/r/1638794990-137490-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 John Garry 提交于
Typedefs busy_iter_fn and busy_tag_iter_fn are now identical, so delete busy_iter_fn to reduce duplication. It would be nicer to delete busy_tag_iter_fn, as the name busy_iter_fn is less specific. However busy_tag_iter_fn is used in many different parts of the tree, unlike busy_iter_fn which is just use in block/, so just take the straightforward path now, so that we could rename later treewide. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Link: https://lore.kernel.org/r/1638794990-137490-3-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 John Garry 提交于
The only user of blk_mq_hw_ctx blk_mq_hw_ctx argument is blk_mq_rq_inflight(). Function blk_mq_rq_inflight() uses the hctx to find the associated request queue to match against the request. However this same check is already done in caller bt_iter(), so drop this check. With that change there are no more users of busy_iter_fn blk_mq_hw_ctx argument, so drop the argument. Reviewed-by Hannes Reinecke <hare@suse.de> Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Tested-by: NKashyap Desai <kashyap.desai@broadcom.com> Link: https://lore.kernel.org/r/1638794990-137490-2-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
blk_mq_run_dispatch_ops() is defined as one macro, and plug->mq_list will be changed when running 'dispatch_ops', so add one local variable for holding request queue. Reported-and-tested-by: NYi Zhang <yi.zhang@redhat.com> Fixes: 4cafe86c ("blk-mq: run dispatch lock once in case of issuing from list") Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-