- 31 5月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
No functional changes in this patch, just a prep patch for utilizing this in an IO scheduler. Signed-off-by: NJens Axboe <axboe@kernel.dk> Reviewed-by: NOmar Sandoval <osandov@fb.com>
-
- 30 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Bsg holding a reference to the parent device may result in a crash if a bsg file handle is closed after the parent device driver has unloaded. Holding a reference is not really needed: the parent device must exist between bsg_register_queue and bsg_unregister_queue. Before the device goes away the caller does blk_cleanup_queue so that all in-flight requests to the device are gone and all new requests cannot pass beyond the queue. The queue itself is a refcounted object and it will stay alive with a bsg file. Based on analysis, previous patch and changelog from Anatoliy Glagolev. Reported-by: NAnatoliy Glagolev <glagolig@gmail.com> Reviewed-by: NJames E.J. Bottomley <jejb@linux.vnet.ibm.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 29 5月, 2018 6 次提交
-
-
由 Christoph Hellwig 提交于
The information about a size change in this case just creates confusion. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The BLK_EH_NOT_HANDLED implies nothing happen, but very often that is not what is happening - instead the driver already completed the command. Fix the symbolic name to reflect that a little better. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
This patch simplifies the timeout handling by relying on the request reference counting to ensure the iterator is operating on an inflight and truly timed out request. Since the reference counting prevents the tag from being reallocated, the block layer no longer needs to prevent drivers from completing their requests while the timeout handler is operating on it: a driver completing a request is allowed to proceed to the next state without additional syncronization with the block layer. This also removes any need for generation sequence numbers since the request lifetime is prevented from being reallocated as a new sequence while timeout handling is operating on it. To enables this a refcount is added to struct request so that request users can be sure they're operating on the same request without it changing while they're processing it. The request's tag won't be released for reuse until both the timeout handler and the completion are done with it. Signed-off-by: NKeith Busch <keith.busch@intel.com> [hch: slight cleanups, added back submission side hctx lock, use cmpxchg for completions] Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The block layer had been setting the state to in-flight prior to updating the timer. This is the wrong order since the timeout handler could observe the in-flight state with the older timeout, believing the request had expired when in fact it is just getting started. Signed-off-by: NKeith Busch <keith.busch@intel.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 5月, 2018 2 次提交
-
-
由 Joe Perches 提交于
Convert the S_<FOO> symbolic permissions to their octal equivalents as using octal and not symbolic permissions is preferred by many as more readable. see: https://lkml.org/lkml/2016/8/2/1945 Done with automated conversion via: $ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace <files...> Miscellanea: o Wrapped modified multi-line calls to a single line where appropriate o Realign modified multi-line calls to open parenthesis Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
When the allocation process is scheduled back and the mapped hw queue is changed, fake one extra wake up on previous queue for compensating wake up miss, so other allocations on the previous queue won't be starved. This patch fixes one request allocation hang issue, which can be triggered easily in case of very low nr_request. The race is as follows: 1) 2 hw queues, nr_requests are 2, and wake_batch is one 2) there are 3 waiters on hw queue 0 3) two in-flight requests in hw queue 0 are completed, and only two waiters of 3 are waken up because of wake_batch, but both the two waiters can be scheduled to another CPU and cause to switch to hw queue 1 4) then the 3rd waiter will wait for ever, since no in-flight request is in hw queue 0 any more. 5) this patch fixes it by the fake wakeup when waiter is scheduled to another hw queue Cc: <stable@vger.kernel.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Modified commit message to make it clearer, and make it apply on top of the 4.18 branch. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 23 5月, 2018 1 次提交
-
-
由 Bart Van Assche 提交于
Avoid that complaints similar to the following appear in the kernel log if the number of zones is sufficiently large: fio: page allocation failure: order:9, mode:0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null) Call Trace: dump_stack+0x63/0x88 warn_alloc+0xf5/0x190 __alloc_pages_slowpath+0x8f0/0xb0d __alloc_pages_nodemask+0x242/0x260 alloc_pages_current+0x6a/0xb0 kmalloc_order+0x18/0x50 kmalloc_order_trace+0x26/0xb0 __kmalloc+0x20e/0x220 blkdev_report_zones_ioctl+0xa5/0x1a0 blkdev_ioctl+0x1ba/0x930 block_ioctl+0x41/0x50 do_vfs_ioctl+0xaa/0x610 SyS_ioctl+0x79/0x90 do_syscall_64+0x79/0x1b0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 3ed05a98 ("blk-zoned: implement ioctls") Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Cc: Shaun Tancheff <shaun.tancheff@seagate.com> Cc: Damien Le Moal <damien.lemoal@hgst.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 5月, 2018 1 次提交
-
-
由 huhai 提交于
When dispatch_rq_from_ctx is called, in the vast majority of cases the ctx->rq_list is not empty. Signed-off-by: Nhuhai <huhai@kylinos.cn> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 18 5月, 2018 1 次提交
-
-
由 huhai 提交于
When the number of hardware queues is changed, the drivers will call blk_mq_update_nr_hw_queues() to remap hardware queues. This changes the ctx mappings, but the current code doesn't clear the ->dispatch_from hint. This can result in dispatch_from pointing to a ctx that isn't mapped to the hctx anymore. Fixes: b347689f ("blk-mq-sched: improve dispatching from sw queue") Signed-off-by: Nhuhai <huhai@kylinos.cn> Reviewed-by: NMing Lei <ming.lei@redhat.com> Moved the placement of the clearing to where we clear other items pertaining to the existing mapping, added Fixes line, and reworded the commit message. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 5月, 2018 1 次提交
-
-
由 huhai 提交于
We can use blk_mq_sched_insert_request() even if we don't have an IO scheduler attached, since that case will end up being exactly the same as what blk_mq_queue_io() was doing now. Signed-off-by: Nhuhai <huhai@kylinos.cn> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 5月, 2018 9 次提交
-
-
由 Kent Overstreet 提交于
Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Recently found a bug where a driver left bi_next not NULL and then called bio_endio(), and then the submitter of the bio used bio_copy_data() which was treating src and dst as lists of bios. Fixed that bug by splitting out bio_list_copy_data(), but in case other things are depending on bi_next in weird ways, add a warning to help avoid more bugs like that in the future. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Since a bio can point to userspace pages (e.g. direct IO), this is generally necessary. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Found a bug (with ASAN) where we were passing a bio to bio_copy_data() with bi_next not NULL, when it should have been - a driver had left bi_next set to something after calling bio_endio(). Since the normal case is only copying single bios, split out bio_list_copy_data() to avoid more bugs like this in the future. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Add versions that take bvec_iter args instead of using bio->bi_iter - to be used by bcachefs. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Minor optimization - remove a pointer indirection when using fs_bio_set. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Similarly to mempool_init()/mempool_exit(), take a pointer indirection out of allocation/freeing by allowing biosets to be embedded in other structs. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Kent Overstreet 提交于
Minor performance improvement by getting rid of pointer indirections from allocation/freeing fastpaths. Signed-off-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 14 5月, 2018 5 次提交
-
-
由 Christoph Hellwig 提交于
Same numerical value (for now at least), but a much better documentation of intent. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
We just can't do I/O when doing block layer requests allocations, so use GFP_NOIO instead of the even more limited __GFP_DIRECT_RECLAIM. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
blk_old_get_request already has it at hand, and in blk_queue_bio, which is the fast path, it is constant. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Switch everyone to blk_get_request_flags, and then rename blk_get_request_flags to blk_get_request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 11 5月, 2018 7 次提交
-
-
由 Jens Axboe 提交于
We don't expect the async depth to be smaller than the wake batch count for sbitmap, but just in case, inform sbitmap of what shallow depth kyber may use. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
If our shallow depth is smaller than the wake batching of sbitmap, we can introduce hangs. Ensure that sbitmap knows how low we'll go. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
bfqd->sb_shift was attempted used as a cache for the sbitmap queue shift, but we don't need it, as it never changes. Kill it with fire. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
It doesn't change, so don't put it in the per-IO hot path. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Reserved tags are used for error handling, we don't need to care about them for regular IO. The core won't call us for these anyway. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
It's not useful, they are internal and/or error handling recovery commands. Acked-by: NPaolo Valente <paolo.valente@linaro.org> Reviewed-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Paolo Valente 提交于
When invoked for an I/O request rq, the prepare_request hook of bfq increments reference counters in the destination bfq_queue for rq. In this respect, after this hook has been invoked, rq may still be transformed into a request with no icq attached, i.e., for bfq, a request not associated with any bfq_queue. No further hook is invoked to signal this tranformation to bfq (in general, to the destination elevator for rq). This leads bfq into an inconsistent state, because bfq has no chance to correctly lower these counters back. This inconsistency may in its turn cause incorrect scheduling and hangs. It certainly causes memory leaks, by making it impossible for bfq to free the involved bfq_queue. On the bright side, no transformation can still happen for rq after rq has been inserted into bfq, or merged with another, already inserted, request. Exploiting this fact, this commit addresses the above issue by delaying the preparation of an I/O request to when the request is inserted or merged. This change also gives a performance bonus: a lock-contention point gets removed. To prepare a request, bfq needs to hold its scheduler lock. After postponing request preparation to insertion or merging, no lock needs to be grabbed any longer in the prepare_request hook, while the lock already taken to perform insertion or merging is used to preparare the request as well. Tested-by: NOleksandr Natalenko <oleksandr@natalenko.name> Tested-by: NBart Van Assche <bart.vanassche@wdc.com> Signed-off-by: NPaolo Valente <paolo.valente@linaro.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 5月, 2018 5 次提交
-
-
由 Omar Sandoval 提交于
Currently, struct request has four timestamp fields: - A start time, set at get_request time, in jiffies, used for iostats - An I/O start time, set at start_request time, in ktime nanoseconds, used for blk-stats (i.e., wbt, kyber, hybrid polling) - Another start time and another I/O start time, used for cfq and bfq These can all be consolidated into one start time and one I/O start time, both in ktime nanoseconds, shaving off up to 16 bytes from struct request depending on the kernel config. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Omar Sandoval 提交于
We want this next to blk_account_io_done() for the next change so that we can call ktime_get() only once for both. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Omar Sandoval 提交于
cfq and bfq have some internal fields that use sched_clock() which can trivially use ktime_get_ns() instead. Their timestamp fields in struct request can also use ktime_get_ns(), which resolves the 8 year old comment added by commit 28f4197e ("block: disable preemption before using sched_clock()"). Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Omar Sandoval 提交于
struct blk_issue_stat squashes three things into one u64: - The time the driver started working on a request - The original size of the request (for the io.low controller) - Flags for writeback throttling It turns out that on x86_64, we have a 4 byte hole in struct request which we can fill with the non-timestamp fields from blk_issue_stat, simplifying things quite a bit. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Omar Sandoval 提交于
struct blk_issue_stat is going away, and bio->bi_issue_stat doesn't even use the blk-stats interface, so we can provide a separate implementation specific for bios. The helpers work the same way as the blk-stats helpers. Signed-off-by: NOmar Sandoval <osandov@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-