提交 · c6538499814d8112c5d4d08570a7cf0758e5f8f5 · openeuler / Kernel

12 5月, 2009 1 次提交

block: fix the bio_vec array index out-of-bounds test · af498d7f

由 Kazuhisa Ichikawa 提交于 5月 12, 2009

Current bio_vec array index out-of-bounds test within
__end_that_request_first() does not seem correct.
It checks bio->bi_idx against bio->bi_vcnt, but the subsequent code
uses idx (which is, bio->bi_idx + next_idx) as the array index into
bio_vec array. This means that the test really make sense only at
the first iteration of !(nr_bytes >=bio->bi_size) case (when next_idx
== zero). Fix this by replacing bio->bi_idx with idx.
(This patch applies to 2.6.30-rc4.)
Signed-off-by: NKazuhisa Ichikawa <ki@epsilou.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

af498d7f

24 4月, 2009 1 次提交

block: simplify I/O stat accounting · 42dad764

由 Jerome Marchand 提交于 4月 22, 2009

This simplifies I/O stat accounting switching code and separates it
completely from I/O scheduler switch code.

Requests are accounted according to the state of their request queue
at the time of the request allocation. There is no need anymore to
flush the request queue when switching I/O accounting state.
Signed-off-by: NJerome Marchand <jmarchan@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

42dad764

07 4月, 2009 2 次提交

block: remove unused REQ_UNPLUG · 23853277

由 Jens Axboe 提交于 4月 07, 2009

The request inherits the unplug flag from the bio, but it isn't actually
used. The bio flag stops at __make_request(), which tells it to unplug
after submission. Passing it on to the request doesn't make any sense.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

23853277

block: fix inconsistency in I/O stat accounting code · 26308eab

由 Jerome Marchand 提交于 3月 27, 2009

This forces in_flight to be zero when turning off or on the I/O stat
accounting and stops updating I/O stats in attempt_merge() when
accounting is turned off.
Signed-off-by: NJerome Marchand <jmarchan@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

26308eab

06 4月, 2009 3 次提交

block: Add flag for telling the IO schedulers NOT to anticipate more IO · aeb6fafb

由 Jens Axboe 提交于 4月 06, 2009

By default, CFQ will anticipate more IO from a given io context if the
previously completed IO was sync. This used to be fine, since the only
sync IO was reads and O_DIRECT writes. But with more "normal" sync writes
being used now, we don't want to anticipate for those.

Add a bio/request flag that informs the IO scheduler that this is a sync
request that we should not idle for. Introduce WRITE_ODIRECT specifically
for O_DIRECT writes, and make sure that the other sync writes set this
flag.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aeb6fafb

block: enabling plugging on SSD devices that don't do queuing · 644b2d99

由 Jens Axboe 提交于 4月 06, 2009

For the older SSD devices that don't do command queuing, we do want to
enable plugging to get better merging.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

644b2d99

block: change the request allocation/congestion logic to be sync/async based · 1faa16d2

由 Jens Axboe 提交于 4月 06, 2009

This makes sure that we never wait on async IO for sync requests, instead
of doing the split on writes vs reads.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1faa16d2

03 4月, 2009 1 次提交

blktrace: fix pdu_len when tracing packet command requests · e2494e1b

由 Li Zefan 提交于 4月 02, 2009

Impact: output all of packet commands - not just the first 4 / 8 bytes

Since commit d7e3c324 ("block: add
large command support"), struct request->cmd has been changed from
unsinged char cmd[BLK_MAX_CDB] to unsigned char *cmd.

v1 -> v2: by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>

- make sure rq->cmd_len is always intialized, and then we can use
  rq->cmd_len instead of BLK_MAX_CDB.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Acked-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
LKML-Reference: <49D4507E.2060602@cn.fujitsu.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e2494e1b

26 3月, 2009 1 次提交

block: WARN in __blk_put_request() for potential bio leak · 1cd96c24

由 Boaz Harrosh 提交于 3月 24, 2009

Put a WARN_ON in __blk_put_request if it is about to
leak bio(s). This is a serious bug that can happen in error
handling code paths.

For this to work I have fixed a couple of places in block/ where
request->bio != NULL ownership was not honored. And a small cleanup
at sg_io() while at it.
Signed-off-by: NBoaz Harrosh <bharrosh@panasas.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

1cd96c24

24 3月, 2009 2 次提交
- J
  block: get rid of unused blkdev_free_rq() define · 50e17493
  由 Jens Axboe 提交于 3月 06, 2009
```
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
```
  50e17493
- J
  block: remove various blk_queue_*() setting functions in blk_init_queue_node() · f3b144aa
  由 Jens Axboe 提交于 3月 06, 2009
```
It calls blk_queue_make_request(), which sets the identical set of limits.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
```
  f3b144aa
02 2月, 2009 1 次提交

block: fix oops in blk_queue_io_stat() · fb8ec18c

由 Jens Axboe 提交于 2月 02, 2009

Some initial probe requests don't have disk->queue mapped yet, so we
can't rely on a non-NULL queue in blk_queue_io_stat(). Wrap it in
blk_do_io_stat().
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

fb8ec18c

30 1月, 2009 3 次提交

block: add sysfs file for controlling io stats accounting · bc58ba94

由 Jens Axboe 提交于 1月 23, 2009

This allows us to turn off disk stat accounting completely, for the cases
where the 0.5-1% reduction in system time is important.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

bc58ba94

block: silently error an unsupported barrier bio · cec0707e

由 Jens Axboe 提交于 1月 13, 2009

This fixes a "regression" from 2.6.28, where the barrier probes that file
systems may do would trigger additional end request warnings in dmesg.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

cec0707e

J
block: seperate bio/request unplug and sync bits · 213d9417
由 Jens Axboe 提交于 1月 06, 2009
```
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
```
213d9417

29 12月, 2008 5 次提交

block: don't use plugging on SSD devices · a31a9738

由 Jens Axboe 提交于 10月 17, 2008

We just want to hand the first bits of IO to the device as fast
as possible. Gains a few percent on the IOPS rate.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a31a9738

block: remove duplicate or unused barrier/discard error paths · a7384677

由 Tejun Heo 提交于 11月 28, 2008

* Because barrier mode can be changed dynamically, whether barrier is
  supported or not can be determined only when actually issuing the
  barrier and there is no point in checking it earlier.  Drop barrier
  support check in generic_make_request() and __make_request(), and
  update comment around the support check in blk_do_ordered().

* There is no reason to check discard support in both
  generic_make_request() and __make_request().  Drop the check in
  __make_request().  While at it, move error action block to the end
  of the function and add unlikely() to q existence test.

* Barrier request, be it empty or not, is never passed to low level
  driver and thus it's meaningless to try to copy back req->sector to
  bio->bi_sector on error.  In addition, the notion of failed sector
  doesn't make any sense for empty barrier to begin with.  Drop the
  code block from __end_that_request_first().
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a7384677

block: use cancel_work_sync() instead of kblockd_flush_work() · 64d01dc9

由 Cheng Renquan 提交于 12月 03, 2008

After many improvements on kblockd_flush_work, it is now identical to
cancel_work_sync, so a direct call to cancel_work_sync is suggested.

The only difference is that cancel_work_sync is a GPL symbol,
so no non-GPL modules anymore.
Signed-off-by: NCheng Renquan <crquan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

64d01dc9

block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set · 08bafc03

由 Keith Mannthey 提交于 11月 25, 2008

Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic ideas is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer.  The buffer
layer can then suppress needless printks.

This patch declutters the kernel log by removed the 40-50 (per lun)
buffer io error messages seen during a boot in my multipath setup . It
is a good chance any real errors will be missed in the "noise" it the
logs without this patch.

During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block 5242878
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242872
"
in my logs.

My disk environment is multipath fiber channel using the SCSI_DH_RDAC
code and multipathd.  This topology includes an "active" and "ghost"
path for each lun. IO's to the "ghost" path will never complete and the
SCSI layer, via the scsi device handler rdac code, quick returns the IOs
to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
layer messages.

 I am wanting to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue.  A few runs of
bonnie++ show no noticeable difference in performance in my setup.

Thanks for John Stultz for the quiet_error finalization.
Submitted-by: NKeith Mannthey <kmannth@us.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

08bafc03

block: leave the request timeout timer running even on an empty list · 70ed28b9

由 Jens Axboe 提交于 11月 19, 2008

For sync IO, we'll often do them serialized. This means we'll be touching
the queue timer for every IO, as opposed to only occasionally like we
do for queued IO. Instead of deleting the timer when the last request
is removed, just let continue running. If a new request comes up soon
we then don't have to readd the timer again. If no new requests arrive,
the timer will expire without side effect later.

This improves high iops sync IO by ~1%.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

70ed28b9

03 12月, 2008 2 次提交

block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2

由 Milan Broz 提交于 12月 03, 2008

Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
devices.

When stacking devices (LVM over MD over SCSI) some of the request queue
parameters are not set up correctly in some cases by default, namely
max_segment_size and and seg_boundary mask.

If you create MD device over SCSI, these attributes are zeroed.

Problem become when there is over this mapping next device-mapper mapping
- queue attributes are set in DM this way:

request_queue   max_segment_size  seg_boundary_mask
SCSI                65536             0xffffffff
MD RAID1                0                      0
LVM                 65536                 -1 (64bit)

Unfortunately bio_add_page (resp.  bio_phys_segments) calculates number of
physical segments according to these parameters.

During the generic_make_request() is segment cout recalculated and can
increase bio->bi_phys_segments count over the allowed limit.  (After
bio_clone() in stack operation.)

Thi is specially problem in CCISS driver, where it produce OOPS here

    BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);

(MAXSEGENTRIES is 31 by default.)

Sometimes even this command is enough to cause oops:

  dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10

This command generates bios with 250 sectors, allocated in 32 4k-pages
(last page uses only 1024 bytes).

For LVM layer, it allocates bio with 31 segments (still OK for CCISS),
unfortunatelly on lower layer it is recalculated to 32 segments and this
violates CCISS restriction and triggers BUG_ON().

The patch tries to fix it by:

 * initializing attributes above in queue request constructor
   blk_queue_make_request()

 * make sure that blk_queue_stack_limits() inherits setting

 (DM uses its own function to set the limits because it
 blk_queue_stack_limits() was introduced later.  It should probably switch
 to use generic stack limit function too.)

 * sets the default seg_boundary value in one place (blkdev.h)

 * use this mask as default in DM (instead of -1, which differs in 64bit)

Bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=471639
http://bugzilla.kernel.org/show_bug.cgi?id=8672Signed-off-by: NMilan Broz <mbroz@redhat.com>
Reviewed-by: NAlasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

0e435ac2

block: internal dequeue shouldn't start timer · 53a08807

由 Tejun Heo 提交于 12月 03, 2008

blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
both start the timeout timer.  Barrier code dequeues the original
barrier request but doesn't passes the request itself to lower level
driver, only broken down proxy requests; however, as the original
barrier code goes through the same dequeue path and timeout timer is
started on it.  If barrier sequence takes long enough, this timer
expires but the low level driver has no idea about this request and
oops follows.

Timeout timer shouldn't have been started on the original barrier
request as it never goes through actual IO.  This patch unexports
elv_dequeue_request(), which has no external user anyway, and makes it
operate on elevator proper w/o adding the timer and make
blkdev_dequeue_request() call elv_dequeue_request() and add timer.
Internal users which don't pass the request to driver - barrier code
and end_that_request_last() - are converted to use
elv_dequeue_request().
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

53a08807

26 11月, 2008 2 次提交

blktrace: port to tracepoints, update · 0bfc2455

由 Ingo Molnar 提交于 11月 26, 2008

Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>

0bfc2455

blktrace: port to tracepoints · 5f3ea37c

由 Arnaldo Carvalho de Melo 提交于 10月 30, 2008

This was a forward port of work done by Mathieu Desnoyers, I changed it to
encode the 'what' parameter on the tracepoint name, so that one can register
interest in specific events and not on classes of events to then check the
'what' parameter.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

5f3ea37c

06 11月, 2008 1 次提交

blk: move blk_delete_timer call in end_that_request_last · e78042e5

由 Mike Anderson 提交于 10月 30, 2008

Move the calling  blk_delete_timer to later in end_that_request_last to
address an issue where blkdev_dequeue_request may have add a timer for the
request.
Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

e78042e5

17 10月, 2008 4 次提交

block: remove __generic_unplug_device() from exports · f73e2d13

由 Jens Axboe 提交于 10月 17, 2008

The only out-of-core user is IDE, and that should be using
blk_start_queueing() instead.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

f73e2d13

block: move q->unplug_work initialization · 713ada9b

由 Peter Zijlstra 提交于 10月 16, 2008

modprobe loop; rmmod loop effectively creates a blk_queue and destroys it
which results in q->unplug_work being canceled without it ever being
initialized.

Therefore, move the initialization of q->unplug_work from
blk_queue_make_request() to blk_alloc_queue*().
Reported-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

713ada9b

block: fix current kernel-doc warnings · 496aa8a9

由 Randy Dunlap 提交于 10月 16, 2008

Fix block kernel-doc warnings:

Warning(linux-2.6.27-git4//fs/block_dev.c:1272): No description found for parameter 'path'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'cpu'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'part'
Warning(/var/linsrc/linux-2.6.27-git4//block/genhd.c:544): No description found for parameter 'partno'
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

496aa8a9

block: only call ->request_fn when the queue is not stopped · 80a4b58e

由 Jens Axboe 提交于 10月 14, 2008

Callers should use either blk_run_queue/__blk_run_queue, or
blk_start_queueing() to invoke request handling instead of calling
->request_fn() directly as that does not take the queue stopped
flag into account.

Also add appropriate comments on the above functions to detail
their usage.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

80a4b58e

13 10月, 2008 1 次提交

[SCSI] block: separate failfast into multiple bits. · 6000a368

由 Mike Christie 提交于 8月 19, 2008

Multipath is best at handling transport errors. If it gets a device
error then there is not much the multipath layer can do. It will just
access the same device but from a different path.

This patch breaks up failfast into device, transport and driver errors.
The multipath layers (md and dm mutlipath) only ask the lower levels to
fast fail transport errors. The user of failfast, read ahead, will ask
to fast fail on all errors.

Note that blk_noretry_request will return true if any failfast bit
is set. This allows drivers that do not support the multipath failfast
bits to continue to fail on any failfast error like before. Drivers
like scsi that are able to fail fast specific errors can check
for the specific fail fast type. In the next patch I will convert
scsi.
Signed-off-by: NMike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>

6000a368

09 10月, 2008 10 次提交

block: remove end_{queued|dequeued}_request() · d00e29fd

由 Kiyoshi Ueda 提交于 10月 01, 2008

This patch removes end_queued_request() and end_dequeued_request(),
which are no longer used.

As a results, users of __end_request() became only end_request().
So the actual code in __end_request() is moved to end_request()
and __end_request() is removed.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

d00e29fd

block: add lld busy state exporting interface · ef9e3fac

由 Kiyoshi Ueda 提交于 10月 01, 2008

This patch adds an new interface, blk_lld_busy(), to check lld's
busy state from the block layer.
blk_lld_busy() calls down into low-level drivers for the checking
if the drivers set q->lld_busy_fn() using blk_queue_lld_busy().

This resolves a performance problem on request stacking devices below.

Some drivers like scsi mid layer stop dispatching request when
they detect busy state on its low-level device like host/target/device.
It allows other requests to stay in the I/O scheduler's queue
for a chance of merging.

Request stacking drivers like request-based dm should follow
the same logic.
However, there is no generic interface for the stacked device
to check if the underlying device(s) are busy.
If the request stacking driver dispatches and submits requests to
the busy underlying device, the requests will stay in
the underlying device's queue without a chance of merging.
This causes performance problem on burst I/O load.

With this patch, busy state of the underlying device is exported
via q->lld_busy_fn().  So the request stacking driver can check it
and stop dispatching requests if busy.

The underlying device driver must return the busy state appropriately:
    1: when the device driver can't process requests immediately.
    0: when the device driver can process requests immediately,
       including abnormal situations where the device driver needs
       to kill all requests.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

ef9e3fac

block: Fix blk_start_queueing() to not kick a stopped queue · 336c3d8c

由 Elias Oltmanns 提交于 10月 01, 2008

blk_start_queueing() should act like the generic queue unplugging
and kicking and ignore a stopped queue. Such a queue may not be
run until after a call to blk_start_queue().
Signed-off-by: NElias Oltmanns <eo@nebensachen.de>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

336c3d8c

block: add a queue flag for request stacking support · 4ee5eaf4

由 Kiyoshi Ueda 提交于 9月 18, 2008

This patch adds a queue flag to indicate the block device can be
used for request stacking.

Request stacking drivers need to stack their devices on top of
only devices of which q->request_fn is functional.
Since bio stacking drivers (e.g. md, loop) basically initialize
their queue using blk_alloc_queue() and don't set q->request_fn,
the check of (q->request_fn == NULL) looks enough for that purpose.

However, dm will become both types of stacking driver (bio-based and
request-based).  And dm will always set q->request_fn even if the dm
device is bio-based of which q->request_fn is not functional actually.
So we need something else to distinguish the type of the device.
Adding a queue flag is a solution for that.

The reason why dm always sets q->request_fn is to keep
the compatibility of dm user-space tools.
Currently, all dm user-space tools are using bio-based dm without
specifying the type of the dm device they use.
To use request-based dm without changing such tools, the kernel
must decide the type of the dm device automatically.
The automatic type decision can't be done at the device creation time
and needs to be deferred until such tools load a mapping table,
since the actual type is decided by dm target type included in
the mapping table.

So a dm device has to be initialized using blk_init_queue()
so that we can load either type of table.
Then, all queue stuffs are set (e.g. q->request_fn) and we have
no element to distinguish that it is bio-based or request-based,
even after a table is loaded and the type of the device is decided.

By the way, some stuffs of the queue (e.g. request_list, elevator)
are needless when the dm device is used as bio-based.
But the memory size is not so large (about 20[KB] per queue on ia64),
so I hope the memory loss can be acceptable for bio-based dm users.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

4ee5eaf4

block: add request submission interface · 82124d60

由 Kiyoshi Ueda 提交于 9月 18, 2008

This patch adds blk_insert_cloned_request(), a generic request
submission interface for request stacking drivers.
Request-based dm will use it to submit their clones to underlying
devices.

blk_rq_check_limits() is also added because it is possible that
the lower queue has stronger limitations than the upper queue
if multiple drivers are stacking at request-level.
Not only for blk_insert_cloned_request()'s internal use, the function
will be used by request-based dm when the queue limitation is
modified (e.g. by replacing dm's table).
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

82124d60

block: add request update interface · 32fab448

由 Kiyoshi Ueda 提交于 9月 18, 2008

This patch adds blk_update_request(), which updates struct request
with completing its data part, but doesn't complete the struct
request itself.
Though it looks like end_that_request_first() of older kernels,
blk_update_request() should be used only by request stacking drivers.

Request-based dm will use it in bio->bi_end_io callback to update
the original request when a data part of a cloned request completes.
Followings are additional background information of why request-based
dm needs this interface.

  - Request stacking drivers can't use blk_end_request() directly from
    the lower driver's completion context (bio->bi_end_io or rq->end_io),
    because some device drivers (e.g. ide) may try to complete
    their request with queue lock held, and it may cause deadlock.
    See below for detailed description of possible deadlock:
    <http://marc.info/?l=linux-kernel&m=120311479108569&w=2>

  - To solve that, request-based dm offloads the completion of
    cloned struct request to softirq context (i.e. using
    blk_complete_request() from rq->end_io).

  - Though it is possible to use the same solution from bio->bi_end_io,
    it will delay the notification of bio completion to the original
    submitter.  Also, it will cause inefficient partial completion,
    because the lower driver can't perform the cloned request anymore
    and request-based dm needs to requeue and redispatch it to
    the lower driver again later.  That's not good.

  - So request-based dm needs blk_update_request() to perform the bio
    completion in the lower driver's completion context, which is more
    efficient.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

32fab448

block: blk_cleanup_queue() should call blk_sync_queue() · e3335de9

由 Jens Axboe 提交于 9月 18, 2008

When a driver calls blk_cleanup_queue(), the device should be fully idle.
However, the block layer may have pending plugging timers and the IO
schedulers may have pending work in the work queues. So quisce the device
by waiting for the timer and flushing the work queues.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

e3335de9

block: unify request timeout handling · 242f9dcb

由 Jens Axboe 提交于 9月 14, 2008

Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.

Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.
Signed-off-by: NMike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

242f9dcb

block: update comment on end_request() · 839e96af

由 Jens Axboe 提交于 9月 02, 2008

It refers to functions that no longer exist after the IO completion
changes.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

839e96af

block: don't use bio_has_data() in the completion path · 60540161

由 Jens Axboe 提交于 8月 26, 2008

We should just check for rq->bio, as that is really the information
we are looking for. Even if the bio attached doesn't carry data,
we still need to do IO post processing on it.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

60540161

openeuler / Kernel 12 个月 前同步成功

openeuler / Kernel
12 个月前同步成功