- 28 1月, 2017 9 次提交
-
-
由 Christoph Hellwig 提交于
And require all drivers that want to support BLOCK_PC to allocate it as the first thing of their private data. To support this the legacy IDE and BSG code is switched to set cmd_size on their queues to let the block layer allocate the additional space. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Simply the boilerplate code needed for bsg nodes a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
This mirrors the blk-mq capabilities to allocate extra drivers-specific data behind struct request by setting a cmd_size field, as well as having a constructor / destructor for it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Return an errno value instead of the passed in queue so that the callers don't have to keep track of two queues, and move the assignment of the request_fn and lock to the caller as passing them as argument doesn't simplify anything. While we're at it also remove two pointless NULL assignments, given that the request structure is zeroed on allocation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
We can't initalize the elevator fields for flushes as flush share space in struct request with the elevator data. But currently we can't communicate that a request is a flush through blk_get_request as we can only pass READ or WRITE, and the low-level code looks at the possible NULL bio to check for a flush. Fix this by allowing to pass any block op and flags, and by checking for the flush flags in __get_request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
This fixes a couple of problems: 1. In the !CONFIG_DEBUG_FS case, the stub definitions were bogus. 2. In the !CONFIG_BLOCK case, blk-mq-debugfs.c shouldn't be compiled at all. Fix the stub definitions and add a CONFIG_BLK_DEBUG_FS Kconfig option. Fixes: 07e4fead ("blk-mq: create debugfs directory tree") Signed-off-by: NOmar Sandoval <osandov@fb.com> Augment Kconfig description. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Use op_is_flush() where applicable. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Instead of letting the caller check this and handle the details of inserting a flush request, put the logic in the scheduler insertion function. This fixes direct flush insertion outside of the usual make_request_fn calls, like from dm via blk_insert_cloned_request(). Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
This centralizes the checks for bios that needs to be go into the flush state machine. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 27 1月, 2017 16 次提交
-
-
由 Jens Axboe 提交于
When we invoke dispatch_requests(), the scheduler empties everything into the passed in list. This isn't always a good thing, since it means that we remove items that we could have potentially merged with. Change the function to dispatch single requests at the time. If we do that, we can backoff exactly at the point where the device can't consume more IO, and leave the rest with the scheduler for better merging and future dispatch decision making. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Tested-by: NHannes Reinecke <hare@suse.com>
-
由 Jens Axboe 提交于
If we have both multiple hardware queues and shared tag map between devices, we need to ensure that we propagate the hardware queue restart bit higher up. This is because we can get into a situation where we don't have any IO pending on a hardware queue, yet we fail getting a tag to start new IO. If that happens, it's not enough to mark the hardware queue as needing a restart, we need to bubble that up to the higher level queue as well. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Tested-by: NHannes Reinecke <hare@suse.com>
-
由 Jens Axboe 提交于
We don't want to hold on to this resource when we have a scheduler attached. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Tested-by: NHannes Reinecke <hare@suse.com>
-
由 Jens Axboe 提交于
Once we mark the queue as needing a restart, re-check if we can get a driver tag. This fixes a theoretical issue where the needed IO completes _after_ blk_mq_get_driver_tag() fails, but before we manage to set the restart bit. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Tested-by: NHannes Reinecke <hare@suse.com>
-
由 Jens Axboe 提交于
We'll use the same criteria for whether we need to run the queue sync or async when we have a scheduler, as we do without one. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com> Tested-by: NHannes Reinecke <hare@suse.com>
-
由 Omar Sandoval 提交于
These counters aren't as out-of-place in sysfs as the other stuff, but debugfs is a slightly better home for them. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
These statistics _might_ be useful to userspace, but it's better not to commit to an ABI for these yet. Also, the dispatched file in sysfs couldn't be cleared, so make it clearable like the others in debugfs. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
These can be used to debug issues like tag leaks and stuck requests. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
These are very tied to the blk-mq tag implementation, so exposing them to sysfs isn't a great idea. Move the debugging information to debugfs and add basic entries for the number of tags and the number of reserved tags to sysfs. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
This is useful for debugging problems where we've gotten stuck with requests in the software queues. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
The request pointers by themselves aren't super useful. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
These lists are only useful for debugging; they definitely don't belong in sysfs. Putting them in debugfs also removes the limitation of a single page of output. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
hctx->state could come in handy for bugs where the hardware queue gets stuck in the stopped state, and hctx->flags is just useful to know. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Omar Sandoval 提交于
In preparation for putting blk-mq debugging information in debugfs, create a directory tree mirroring the one in sysfs: # tree -d /sys/kernel/debug/block /sys/kernel/debug/block |-- nvme0n1 | `-- mq | |-- 0 | | `-- cpu0 | |-- 1 | | `-- cpu1 | |-- 2 | | `-- cpu2 | `-- 3 | `-- cpu3 `-- vda `-- mq `-- 0 |-- cpu0 |-- cpu1 |-- cpu2 `-- cpu3 Also add the scaffolding for the actual files that will go in here, either under the hardware queue or software queue directories. Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
We don't trigger this from the normal IO path, since we always use blocking allocations from there. But Bart saw it testing multipath dm, since that is a heavy user of atomic request allocations in the map and clone path. Reported-by: NBart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
If we come in from blk_mq_alloc_requst() with NOWAIT set in flags, we must ensure that we don't later overwrite that in blk_mq_sched_get_request(). Initialize alloc_data->flags before passing it in. Reported-by: NBart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 1月, 2017 1 次提交
-
-
由 Jens Axboe 提交于
If we have a scheduler attached, we have two sets of tags. We don't want to apply our active queue throttling for the scheduler side of tags, that only applies to driver tags since that's the resource we need to dispatch an IO. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 23 1月, 2017 3 次提交
-
-
由 Markus Elfring 提交于
The script "checkpatch.pl" pointed information out like the following. ERROR: do not use assignment in if condition Thus fix the affected source code place. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Markus Elfring 提交于
The script "checkpatch.pl" pointed information out like the following. ERROR: do not use assignment in if condition Thus fix the affected source code places. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Alexander Potapenko 提交于
KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of uninitialized memory in cfq_init_cfqq(): ================================================================== BUG: KMSAN: use of unitialized memory ... Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff8202ac97>] dump_stack+0x157/0x1d0 lib/dump_stack.c:51 [<ffffffff813e9b65>] kmsan_report+0x205/0x360 ??:? [<ffffffff813eabbb>] __msan_warning+0x5b/0xb0 ??:? [< inline >] cfq_init_cfqq block/cfq-iosched.c:3754 [<ffffffff8201e110>] cfq_get_queue+0xc80/0x14d0 block/cfq-iosched.c:3857 ... origin: [<ffffffff8103ab37>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67 [<ffffffff813e836b>] kmsan_internal_poison_shadow+0xab/0x150 ??:? [<ffffffff813e88ab>] kmsan_poison_slab+0xbb/0x120 ??:? [< inline >] allocate_slab mm/slub.c:1627 [<ffffffff813e533f>] new_slab+0x3af/0x4b0 mm/slub.c:1641 [< inline >] new_slab_objects mm/slub.c:2407 [<ffffffff813e0ef3>] ___slab_alloc+0x323/0x4a0 mm/slub.c:2564 [< inline >] __slab_alloc mm/slub.c:2606 [< inline >] slab_alloc_node mm/slub.c:2669 [<ffffffff813dfb42>] kmem_cache_alloc_node+0x1d2/0x1f0 mm/slub.c:2746 [<ffffffff8201d90d>] cfq_get_queue+0x47d/0x14d0 block/cfq-iosched.c:3850 ... ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) The uninitialized struct cfq_queue is created by kmem_cache_alloc_node() and then passed to cfq_init_cfqq(), which accesses cfqq->ioprio_class before it's initialized. Signed-off-by: NAlexander Potapenko <glider@google.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 21 1月, 2017 1 次提交
-
-
由 Jens Axboe 提交于
Add support for growing the tags associated with a hardware queue, for the scheduler tags. Currently we only support resizing within the limits of the original depth, change that so we can grow it as well by allocating and replacing the existing scheduler tag set. This is similar to how we could increase the software queue depth with the legacy IO stack and schedulers. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
- 19 1月, 2017 5 次提交
-
-
由 Jens Axboe 提交于
The run handler we register for the delayed work requires that the queue be stopped, yet we leave that up to the caller. Let's move it into blk_mq_delay_queue() itself, so that the API is sane. This fixes a stall with SCSI, where it calls blk_mq_delay_queue() without having stopped the queue. Hence the queue is never run. Reported-by: NHannes Reinecke <hare@suse.com> Fixes: 70f4db63 ("blk-mq: add blk_mq_delay_queue") Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
We used to pass in NULL for hctx for reserved tags, but we don't do that anymore. Hence the check for whether hctx is NULL or not is now redundant, kill it. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: a642a158aec6 ("blk-mq-tag: cleanup the normal/reserved tag allocation") Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
We already checked that e is NULL, so no point in calling elevator_put() to free it. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Fixes: dc877dbd088f ("blk-mq-sched: add framework for MQ capable IO schedulers") Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
There's no potential harm in quiescing the queue, but it also doesn't buy us anything. And we can't run the queue async for policy deactivate, since we could be in the path of tearing the queue down. If we schedule an async run of the queue at that time, we're racing with queue teardown AFTER having we've already torn most of it down. Reported-by: NOmar Sandoval <osandov@fb.com> Fixes: 4d199c6f ("blk-cgroup: ensure that we clear the stop bit on quiesced queues") Tested-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Keith Busch 提交于
Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 18 1月, 2017 5 次提交
-
-
由 Jens Axboe 提交于
If we call blk_mq_quiesce_queue() on a queue, we must remember to pair that with something that clears the stopped by on the queues later on. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Add Kconfig entries to manage what devices get assigned an MQ scheduler, and add a blk-mq flag for drivers to opt out of scheduling. The latter is useful for admin type queues that still allocate a blk-mq queue and tag set, but aren't use for normal IO. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
由 Jens Axboe 提交于
This is basically identical to deadline-iosched, except it registers as a MQ capable scheduler. This is still a single queue design. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
由 Jens Axboe 提交于
This adds a set of hooks that intercepts the blk-mq path of allocating/inserting/issuing/completing requests, allowing us to develop a scheduler within that framework. We reuse the existing elevator scheduler API on the registration side, but augment that with the scheduler flagging support for the blk-mq interfce, and with a separate set of ops hooks for MQ devices. We split driver and scheduler tags, so we can run the scheduling independently of device queue depth. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
由 Jens Axboe 提交于
This is in preparation for having two sets of tags available. For that we need a static index, and a dynamically assignable one. Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-