- 04 11月, 2016 1 次提交
-
-
由 Shaohua Li 提交于
Currently block plug holds up to 16 non-mergeable requests. This makes sense if the request size is small, eg, reduce lock contention. But if request size is big enough, we don't need to worry about lock contention. Holding such request makes no sense and it lows the disk utilization. In practice, this improves 10% throughput for my raid5 sequential write workload. The size (128k) is arbitrary right now, but it makes sure lock contention is small. This probably could be more intelligent, eg, check average request size holded. Since this is mainly for sequential IO, probably not worthy. V2: check the last request instead of the first request, so as long as there is one big size request we flush the plug. Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 28 10月, 2016 2 次提交
-
-
由 Christoph Hellwig 提交于
Now that we don't need the common flags to overflow outside the range of a 32-bit type we can encode them the same way for both the bio and request fields. This in addition allows us to place the operation first (and make some room for more ops while we're at it) and to stop having to shift around the operation values. In addition this allows passing around only one value in the block layer instead of two (and eventuall also in the file systems, but we can do that later) and thus clean up a lot of code. Last but not least this allows decreasing the size of the cmd_flags field in struct request to 32-bits. Various functions passing this value could also be updated, but I'd like to avoid the churn for now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
A lot of the REQ_* flags are only used on struct requests, and only of use to the block layer and a few drivers that dig into struct request internals. This patch adds a new req_flags_t rq_flags field to struct request for them, and thus dramatically shrinks the number of common requests. It also removes the unfortunate situation where we have to fit the fields from the same enum into 32 bits for struct bio and 64 bits for struct request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NShaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 19 10月, 2016 1 次提交
-
-
由 Shaun Tancheff 提交于
Define REQ_OP_ZONE_REPORT and REQ_OP_ZONE_RESET for handling zones of host-managed and host-aware zoned block devices. With with these two new operations, the total number of operations defined reaches 8 and still fits with the 3 bits definition of REQ_OP_BITS. Signed-off-by: NShaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: NDamien Le Moal <damien.lemoal@hgst.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 14 9月, 2016 1 次提交
-
-
由 Stephen Bates 提交于
In order to help determine the effectiveness of polling in a running system it is usful to determine the ratio of how often the poll function is called vs how often the completion is checked. For this reason we add a poll_considered variable and add it to the sysfs entry for io_poll. Signed-off-by: NStephen Bates <sbates@raithlin.com> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 29 8月, 2016 2 次提交
-
-
由 Jens Axboe 提交于
We don't need the larger delayed work struct, since we always run it immediately. Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Jens Axboe 提交于
Add a helper to schedule a regular struct work on a particular CPU. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 17 8月, 2016 1 次提交
-
-
由 Bart Van Assche 提交于
blk_set_queue_dying() can be called while another thread is submitting I/O or changing queue flags, e.g. through dm_stop_queue(). Hence protect the QUEUE_FLAG_DYING flag change with locking. Signed-off-by: NBart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 8月, 2016 1 次提交
-
-
由 Jens Axboe 提交于
Since commit 63a4cc24, bio->bi_rw contains flags in the lower portion and the op code in the higher portions. This means that old code that relies on manually setting bi_rw is most likely going to be broken. Instead of letting that brokeness linger, rename the member, to force old and out-of-tree code to break at compile time instead of at runtime. No intended functional changes in this commit. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 21 7月, 2016 3 次提交
-
-
由 Christoph Hellwig 提交于
I wish the OSD code could simply use blk_rq_map_* helpers like everyone else, but the complex nature of deciding if we have DATA IN and/or DATA OUT buffers might make this impossible (at least for a mere human like me). But using blk_rq_append_bio at least allows sharing the setup code between request with or without dat a buffers, and given that this is the last user of blk_make_request it allows getting rid of that somewhat awkward interface. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NBoaz Harrosh <ooo@electrozaur.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
The target SCSI passthrough backend is much better served with the low-level blk_rq_append_bio construct then the helpers built on top of it, so export it. Also use the opportunity to remove the pointless request_queue argument and make the code flow a little more readable. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
blk_get_request is used for BLOCK_PC and similar passthrough requests. Currently we always need to call blk_rq_set_block_pc or an open coded version of it to allow appending bios using the request mapping helpers later on, which is a somewhat awkward API. Instead move the initialization part of blk_rq_set_block_pc into blk_get_request, so that we always have a safe to use request. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 06 7月, 2016 1 次提交
-
-
由 Sagi Grimberg 提交于
The new NVMe over fabrics target will make use of this outside from a module. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 10 6月, 2016 1 次提交
-
-
由 Jens Axboe 提交于
If we're queuing REQ_PRIO IO and the task is running at an idle IO class, then temporarily boost the priority. This prevents livelocks due to priority inversion, when a low priority task is holding file system resources while attempting to do IO. An example of that is shown below. An ioniced idle task is holding the directory mutex, while a normal priority task is trying to do a directory lookup. [478381.198925] ------------[ cut here ]------------ [478381.200315] INFO: task ionice:1168369 blocked for more than 120 seconds. [478381.201324] Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1 [478381.202278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [478381.203462] ionice D ffff8803692736a8 0 1168369 1 0x00000080 [478381.203466] ffff8803692736a8 ffff880399c21300 ffff880276adcc00 ffff880369273698 [478381.204589] ffff880369273fd8 0000000000000000 7fffffffffffffff 0000000000000002 [478381.205752] ffffffff8177d5e0 ffff8803692736c8 ffffffff8177cea7 0000000000000000 [478381.206874] Call Trace: [478381.207253] [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80 [478381.208175] [<ffffffff8177cea7>] schedule+0x37/0x90 [478381.208932] [<ffffffff8177f5fc>] schedule_timeout+0x1dc/0x250 [478381.209805] [<ffffffff81421c17>] ? __blk_run_queue+0x37/0x50 [478381.210706] [<ffffffff810ca1c5>] ? ktime_get+0x45/0xb0 [478381.211489] [<ffffffff8177c407>] io_schedule_timeout+0xa7/0x110 [478381.212402] [<ffffffff810a8c2b>] ? prepare_to_wait+0x5b/0x90 [478381.213280] [<ffffffff8177d616>] bit_wait_io+0x36/0x50 [478381.214063] [<ffffffff8177d325>] __wait_on_bit+0x65/0x90 [478381.214961] [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80 [478381.215872] [<ffffffff8177d47c>] out_of_line_wait_on_bit+0x7c/0x90 [478381.216806] [<ffffffff810a89f0>] ? wake_atomic_t_function+0x40/0x40 [478381.217773] [<ffffffff811f03aa>] __wait_on_buffer+0x2a/0x30 [478381.218641] [<ffffffff8123c557>] ext4_bread+0x57/0x70 [478381.219425] [<ffffffff8124498c>] __ext4_read_dirblock+0x3c/0x380 [478381.220467] [<ffffffff8124665d>] ext4_dx_find_entry+0x7d/0x170 [478381.221357] [<ffffffff8114c49e>] ? find_get_entry+0x1e/0xa0 [478381.222208] [<ffffffff81246bd4>] ext4_find_entry+0x484/0x510 [478381.223090] [<ffffffff812471a2>] ext4_lookup+0x52/0x160 [478381.223882] [<ffffffff811c401d>] lookup_real+0x1d/0x60 [478381.224675] [<ffffffff811c4698>] __lookup_hash+0x38/0x50 [478381.225697] [<ffffffff817745bd>] lookup_slow+0x45/0xab [478381.226941] [<ffffffff811c690e>] link_path_walk+0x7ae/0x820 [478381.227880] [<ffffffff811c6a42>] path_init+0xc2/0x430 [478381.228677] [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20 [478381.229776] [<ffffffff811c8c57>] path_openat+0x77/0x620 [478381.230767] [<ffffffff81185c6e>] ? page_add_file_rmap+0x2e/0x70 [478381.232019] [<ffffffff811cb253>] do_filp_open+0x43/0xa0 [478381.233016] [<ffffffff8108c4a9>] ? creds_are_invalid+0x29/0x70 [478381.234072] [<ffffffff811c0cb0>] do_open_execat+0x70/0x170 [478381.235039] [<ffffffff811c1bf8>] do_execveat_common.isra.36+0x1b8/0x6e0 [478381.236051] [<ffffffff811c214c>] do_execve+0x2c/0x30 [478381.236809] [<ffffffff811ca392>] ? getname+0x12/0x20 [478381.237564] [<ffffffff811c23be>] SyS_execve+0x2e/0x40 [478381.238338] [<ffffffff81780a1d>] stub_execve+0x6d/0xa0 [478381.239126] ------------[ cut here ]------------ [478381.239915] ------------[ cut here ]------------ [478381.240606] INFO: task python2.7:1168375 blocked for more than 120 seconds. [478381.242673] Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1 [478381.243653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [478381.244902] python2.7 D ffff88005cf8fb98 0 1168375 1168248 0x00000080 [478381.244904] ffff88005cf8fb98 ffff88016c1f0980 ffffffff81c134c0 ffff88016c1f11a0 [478381.246023] ffff88005cf8ffd8 ffff880466cd0cbc ffff88016c1f0980 00000000ffffffff [478381.247138] ffff880466cd0cc0 ffff88005cf8fbb8 ffffffff8177cea7 ffff88005cf8fcc8 [478381.248252] Call Trace: [478381.248630] [<ffffffff8177cea7>] schedule+0x37/0x90 [478381.249382] [<ffffffff8177d08e>] schedule_preempt_disabled+0xe/0x10 [478381.250465] [<ffffffff8177e892>] __mutex_lock_slowpath+0x92/0x100 [478381.251409] [<ffffffff8177e91b>] mutex_lock+0x1b/0x2f [478381.252199] [<ffffffff817745ae>] lookup_slow+0x36/0xab [478381.253023] [<ffffffff811c690e>] link_path_walk+0x7ae/0x820 [478381.253877] [<ffffffff811aeb41>] ? try_charge+0xc1/0x700 [478381.254690] [<ffffffff811c6a42>] path_init+0xc2/0x430 [478381.255525] [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20 [478381.256450] [<ffffffff811c8c57>] path_openat+0x77/0x620 [478381.257256] [<ffffffff8115b2fb>] ? lru_cache_add_active_or_unevictable+0x2b/0xa0 [478381.258390] [<ffffffff8117b623>] ? handle_mm_fault+0x13f3/0x1720 [478381.259309] [<ffffffff811cb253>] do_filp_open+0x43/0xa0 [478381.260139] [<ffffffff811d7ae2>] ? __alloc_fd+0x42/0x120 [478381.260962] [<ffffffff811b95ac>] do_sys_open+0x13c/0x230 [478381.261779] [<ffffffff81011393>] ? syscall_trace_enter_phase1+0x113/0x170 [478381.262851] [<ffffffff811b96c2>] SyS_open+0x22/0x30 [478381.263598] [<ffffffff81780532>] system_call_fastpath+0x12/0x17 [478381.264551] ------------[ cut here ]------------ [478381.265377] ------------[ cut here ]------------ Signed-off-by: NJens Axboe <axboe@fb.com> Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
-
- 09 6月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
Instead of overloading the discard support with the REQ_SECURE flag. Use the opportunity to rename the queue flag as well, and remove the dead checks for this flag in the RAID 1 and RAID 10 drivers that don't claim support for secure erase. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 08 6月, 2016 10 次提交
-
-
由 Mike Christie 提交于
To avoid confusion between REQ_OP_FLUSH, which is handled by request_fn drivers, and upper layers requesting the block layer perform a flush sequence along with possibly a WRITE, this patch renames REQ_FLUSH to REQ_PREFLUSH. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
We don't need bi_rw to be so large on 64 bit archs, so reduce it to unsigned int. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch converts the is_sync helpers to use separate variables for the operation and flags. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch converts the block layer merging code to use separate variables for the operation and flags, and to check req_op for the REQ_OP. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch converts the elevator code to use separate variables for the operation and flags, and to check req_op for the REQ_OP. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch prepares *_get_request/*_put_request and freed_request, to use separate variables for the operation and flags. In the next patches the struct request users will be converted like was done for bios where the op and flags are set separately. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
The bio users should now always be setting up the bio op. This patch has the block layer copy that to the request. Signed-off-by: NMike Christie <mchristi@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This patch converts the simple bi_rw use cases in the block, drivers, mm and fs code to set/get the bio operation using bio_set_op_attrs/bio_op These should be simple one or two liner cases, so I just did them in one patch. The next patches handle the more complicated cases in a module per patch. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
We currently set REQ_WRITE/WRITE for all non READ IOs like discard, flush, writesame, etc. In the next patches where we no longer set up the op as a bitmap, we will not be able to detect a operation direction like writesame by testing if REQ_WRITE is set. This patch converts the drivers and cgroup to use the op_is_write helper. This should just cover the simple cases. I did dm, md and bcache in their own patches because they were more involved. Signed-off-by: NMike Christie <mchristi@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Mike Christie 提交于
This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes that use the same as generic_make_request and how we set the other bio fields. Signed-off-by: NMike Christie <mchristi@redhat.com> Fixed up fs/ext4/crypto.c Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 14 4月, 2016 1 次提交
-
-
由 Jens Axboe 提交于
Now that we converted everything to the newer block write cache interface, kill off the queue flush_flags and queueable flush entries. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 13 4月, 2016 1 次提交
-
-
由 Ming Lin 提交于
We could kmalloc() the payload, so need the offset in page. Signed-off-by: NMing Lin <ming.l@ssi.samsung.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 05 4月, 2016 1 次提交
-
-
由 Kirill A. Shutemov 提交于
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 2月, 2016 1 次提交
-
-
由 Mike Snitzer 提交于
Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower than if an underlying null_blk device were used directly. One of the reasons for this drop in performance is that blk_insert_clone_request() was calling blk_mq_insert_request() with @async=true. This forced the use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues which ushered in ping-ponging between process context (fio in this case) and kblockd's kworker to submit the cloned request. The ftrace function_graph tracer showed: kworker-2013 => fio-12190 fio-12190 => kworker-2013 ... kworker-2013 => fio-12190 fio-12190 => kworker-2013 ... Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to _not_ use kblockd to submit the cloned requests isn't enough to eliminate the observed context switches. In addition to this dm-mq specific blk-core fix, there are 2 DM core fixes to dm-mq that (when paired with the blk-core fix) completely eliminate the observed context switching: 1) don't blk_mq_run_hw_queues in blk-mq request completion Motivated by desire to reduce overhead of dm-mq, punting to kblockd just increases context switches. In my testing against a really fast null_blk device there was no benefit to running blk_mq_run_hw_queues() on completion (and no other blk-mq driver does this). So hopefully this change doesn't induce the need for yet another revert like commit 621739b0 ! 2) use blk_mq_complete_request() in dm_complete_request() blk_complete_request() doesn't offer the traditional q->mq_ops vs .request_fn branching pattern that other historic block interfaces do (e.g. blk_get_request). Using blk_mq_complete_request() for blk-mq requests is important for performance. It should be noted that, like blk_complete_request(), blk_mq_complete_request() doesn't natively handle partial completions -- but the request-based DM-multipath target does provide the required partial completion support by dm.c:end_clone_bio() triggering requeueing of the request via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE. dm-mq fix #2 is _much_ more important than #1 for eliminating the context switches. Before: cpu : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475 After: cpu : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472 With these changes multithreaded async read IOPs improved from ~950K to ~1350K for this dm-mq stacked on null_blk test-case. The raw read IOPs of the underlying null_blk device for the same workload is ~1950K. Fixes: 7fb4898e ("block: add blk-mq support to blk_insert_cloned_request()") Fixes: bfebd1cd ("dm: add full blk-mq support to request-based DM") Cc: stable@vger.kernel.org # 4.1+ Reported-by: NSagi Grimberg <sagig@dev.mellanox.co.il> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Acked-by: NJens Axboe <axboe@kernel.dk>
-
- 19 2月, 2016 1 次提交
-
-
由 Mika Westerberg 提交于
If block device is left runtime suspended during system suspend, resume hook of the driver typically corrects runtime PM status of the device back to "active" after it is resumed. However, this is not enough as queue's runtime PM status is still "suspended". As long as it is in this state blk_pm_peek_request() returns NULL and thus prevents new requests to be processed. Add new function blk_set_runtime_active() that can be used to force the queue status back to "active" as needed. Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Acked-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 05 2月, 2016 1 次提交
-
-
由 Martin K. Petersen 提交于
When a storage device rejects a WRITE SAME command we will disable write same functionality for the device and return -EREMOTEIO to the block layer. -EREMOTEIO will in turn prevent DM from retrying the I/O and/or failing the path. Yiwen Jiang discovered a small race where WRITE SAME requests issued simultaneously would cause -EIO to be returned. This happened because any requests being prepared after WRITE SAME had been disabled for the device caused us to return BLKPREP_KILL. The latter caused the block layer to return -EIO upon completion. To overcome this we introduce BLKPREP_INVALID which indicates that this is an invalid request for the device. blk_peek_request() is modified to return -EREMOTEIO in that case. Reported-by: NYiwen Jiang <jiangyiwen@huawei.com> Suggested-by: NMike Snitzer <snitzer@redhat.com> Reviewed-by: NHannes Reinicke <hare@suse.de> Reviewed-by: NEwan Milne <emilne@redhat.com> Reviewed-by: NYiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 29 12月, 2015 1 次提交
-
-
由 Jens Axboe 提交于
We currently only have an inline/sync helper to restart a stopped queue. If drivers need an async version, they have to roll their own. Add a generic helper instead. Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 23 12月, 2015 2 次提交
-
-
由 Junichi Nomura 提交于
blk_queue_bio() does split then bounce, which makes the segment counting based on pages before bouncing and could go wrong. Move the split to after bouncing, like we do for blk-mq, and the we fix the issue of having the bio count for segments be wrong. Fixes: 54efd50b ("block: make generic_make_request handle arbitrarily sized bios") Cc: stable@vger.kernel.org Tested-by: NArtem S. Tashkinov <t.artem@lycos.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Christoph Hellwig 提交于
Timer context is not very useful for drivers to perform any meaningful abort action from. So instead of calling the driver from this useless context defer it to a workqueue as soon as possible. Note that while a delayed_work item would seem the right thing here I didn't dare to use it due to the magic in blk_add_timer that pokes deep into timer internals. But maybe this encourages Tejun to add a sensible API for that to the workqueue API and we'll all be fine in the end :) Contains a major update from Keith Bush: "This patch removes synchronizing the timeout work so that the timer can start a freeze on its own queue. The timer enters the queue, so timer context can only start a freeze, but not wait for frozen." Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 04 12月, 2015 1 次提交
-
-
由 Ken Xue 提交于
The routines in scsi_pm.c assume that if a runtime-PM callback is invoked for a SCSI device, it can only mean that the device's driver has asked the block layer to handle the runtime power management (by calling blk_pm_runtime_init(), which among other things sets q->dev). However, this assumption turns out to be wrong for things like the ses driver. Normally ses devices are not allowed to do runtime PM, but userspace can override this setting. If this happens, the kernel gets a NULL pointer dereference when blk_post_runtime_resume() tries to use the uninitialized q->dev pointer. This patch fixes the problem by checking q->dev in block layer before handle runtime PM. Since ses doesn't define any PM callbacks and call blk_pm_runtime_init(), the crash won't occur. This fixes Bugzilla #101371. https://bugzilla.kernel.org/show_bug.cgi?id=101371 More discussion can be found from below link. http://marc.info/?l=linux-scsi&m=144163730531875&w=2Signed-off-by: NKen Xue <Ken.Xue@amd.com> Acked-by: NAlan Stern <stern@rowland.harvard.edu> Cc: Xiangliang Yu <Xiangliang.Yu@amd.com> Cc: James E.J. Bottomley <JBottomley@odin.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Michael Terry <Michael.terry@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 02 12月, 2015 1 次提交
-
-
由 Christoph Hellwig 提交于
We already have the reserved flag, and a nowait flag awkwardly encoded as a gfp_t. Add a real flags argument to make the scheme more extensible and allow for a nicer calling convention. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 30 11月, 2015 1 次提交
-
-
由 Hannes Reinecke 提交于
When a cloned request is retried on other queues it always needs to be checked against the queue limits of that queue. Otherwise the calculations for nr_phys_segments might be wrong, leading to a crash in scsi_init_sgtable(). To clarify this the patch renames blk_rq_check_limits() to blk_cloned_rq_check_limits() and removes the symbol export, as the new function should only be used for cloned requests and never exported. Cc: Mike Snitzer <snitzer@redhat.com> Cc: Ewan Milne <emilne@redhat.com> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: NHannes Reinecke <hare@suse.de> Fixes: e2a60da7 ("block: Clean up special command handling logic") Cc: stable@vger.kernel.org # 3.7+ Acked-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 11月, 2015 2 次提交
-
-
由 Wei Tang 提交于
This patch fixes the checkpatch.pl error to blk-exec.c: ERROR: do not initialise globals to 0 or NULL Signed-off-by: NWei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
由 Ilya Dryomov 提交于
Name the cache after the actual name of the struct. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 12 11月, 2015 1 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc warning in blk-core.c: Warning(..//block/blk-core.c:1549): No description found for parameter 'same_queue_rq' Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Reviewed-by: NJeff Moyer <jmoyer@redhat.com> Signed-off-by: NJens Axboe <axboe@fb.com>
-