1. 26 Feb 2009, 1 commit
    • block: reduce stack footprint of blk_recount_segments() · 1e428079
      Jens Axboe authored
      blk_recalc_rq_segments() requires a request structure passed in, which
      we don't have from blk_recount_segments(). So the latter allocates one on
      the stack, using > 400 bytes of stack for that. This can cause us to spill
      over one page of stack from ext4 at least:
      
       0)     4560     400   blk_recount_segments+0x43/0x62
       1)     4160      32   bio_phys_segments+0x1c/0x24
       2)     4128      32   blk_rq_bio_prep+0x2a/0xf9
       3)     4096      32   init_request_from_bio+0xf9/0xfe
       4)     4064     112   __make_request+0x33c/0x3f6
       5)     3952     144   generic_make_request+0x2d1/0x321
       6)     3808      64   submit_bio+0xb9/0xc3
       7)     3744      48   submit_bh+0xea/0x10e
       8)     3696     368   ext4_mb_init_cache+0x257/0xa6a [ext4]
       9)     3328     288   ext4_mb_regular_allocator+0x421/0xcd9 [ext4]
      10)     3040     160   ext4_mb_new_blocks+0x211/0x4b4 [ext4]
      11)     2880     336   ext4_ext_get_blocks+0xb61/0xd45 [ext4]
      12)     2544      96   ext4_get_blocks_wrap+0xf2/0x200 [ext4]
      13)     2448      80   ext4_da_get_block_write+0x6e/0x16b [ext4]
      14)     2368     352   mpage_da_map_blocks+0x7e/0x4b3 [ext4]
      15)     2016     352   ext4_da_writepages+0x2ce/0x43c [ext4]
      16)     1664      32   do_writepages+0x2d/0x3c
      17)     1632     144   __writeback_single_inode+0x162/0x2cd
      18)     1488      96   generic_sync_sb_inodes+0x1e3/0x32b
      19)     1392      16   sync_sb_inodes+0xe/0x10
      20)     1376      48   writeback_inodes+0x69/0xb3
      21)     1328     208   balance_dirty_pages_ratelimited_nr+0x187/0x2f9
      22)     1120     224   generic_file_buffered_write+0x1d4/0x2c4
      23)      896     176   __generic_file_aio_write_nolock+0x35f/0x393
      24)      720      80   generic_file_aio_write+0x6c/0xc8
      25)      640      80   ext4_file_write+0xa9/0x137 [ext4]
      26)      560     320   do_sync_write+0xf0/0x137
      27)      240      48   vfs_write+0xb3/0x13c
      28)      192      64   sys_write+0x4c/0x74
      29)      128     128   system_call_fastpath+0x16/0x1b
      
      Split the segment counting out into a __blk_recalc_rq_segments() helper
      to avoid allocating an on-stack request just for checking the physical
      segment count.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
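      To illustrate the refactoring pattern described above, here is a minimal,
      standalone C sketch (toy types and names, not the actual kernel code):
      the counting core is split into a helper that takes only what it needs,
      so the bio-based caller no longer builds a large on-stack request just to
      reuse the request-based path.

      #include <stdio.h>

      struct toy_bio { int vec_count; };

      struct toy_request {            /* large: the old code built one of these */
          char padding[400];          /* ...on the stack just to count segments */
          struct toy_bio *bio;
      };

      /* Core counting logic, now callable without a request. */
      static int toy_recalc_segments(struct toy_bio *bio)
      {
          return bio ? bio->vec_count : 0;   /* stand-in for the real merge logic */
      }

      /* Request-based caller keeps its old interface. */
      static int toy_recalc_rq_segments(struct toy_request *rq)
      {
          return toy_recalc_segments(rq->bio);
      }

      /* Bio-based caller no longer needs a ~400-byte on-stack request. */
      static int toy_recount_segments(struct toy_bio *bio)
      {
          return toy_recalc_segments(bio);
      }

      int main(void)
      {
          struct toy_bio bio = { .vec_count = 3 };
          struct toy_request rq = { .bio = &bio };
          printf("%d %d\n", toy_recount_segments(&bio), toy_recalc_rq_segments(&rq));
          return 0;
      }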
  2. 02 Feb 2009, 1 commit
  3. 30 Jan 2009, 2 commits
  4. 03 Jan 2009, 2 commits
  5. 29 Dec 2008, 7 commits
    • block: get rid of elevator_t typedef · b374d18a
      Jens Axboe authored
      Just use struct elevator_queue everywhere instead.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: simplify empty barrier implementation · 58eea927
      Tejun Heo authored
      Empty barriers required special handling in __elv_next_request() to
      complete them without letting the low-level driver see them.

      With the previous changes, the barrier code is now flexible enough to
      skip the BAR step using the same barrier sequence selection mechanism.
      Drop the special handling and mask off q->ordered in start_ordered().

      Remove the blk_empty_barrier() test, which now has no user.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: make barrier completion more robust · 8f11b3e9
      Tejun Heo authored
      Barrier completion had the following assumptions:

      * start_ordered() couldn't finish the whole sequence properly: if all
        actions are to be skipped, q->ordseq is set correctly but the actual
        completion is never triggered, hanging the barrier request.

      * Drain completion in elv_complete_request() assumed that there is
        always at least one request in the queue when the drain completes.

      Both assumptions are true today, but they need to be removed to improve
      the empty barrier implementation.  This patch makes the following
      changes:
      
      * Make start_ordered() use blk_ordered_complete_seq() to mark skipped
        steps complete and notify __elv_next_request() that it should fetch
        the next request if the whole barrier has completed inside
        start_ordered().
      
      * Make drain completion path in elv_complete_request() check whether
        the queue is empty.  Empty queue also indicates drain completion.
      
      * While at it, convert 0/1 return from blk_do_ordered() to false/true.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: make every barrier action optional · f671620e
      Tejun Heo authored
      In all barrier sequences, the barrier write itself was always assumed
      to be issued and thus didn't have a corresponding control flag.  This
      patch adds QUEUE_ORDERED_DO_BAR and unifies action mask handling in
      start_ordered() such that any barrier action can be skipped.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: reorganize QUEUE_ORDERED_* constants · 313e4299
      Tejun Heo authored
      Separate out the ordering type (drain) and the action masks (preflush,
      postflush, fua) from the visible ordering mode selectors
      (QUEUE_ORDERED_*).  Ordering types are now named QUEUE_ORDERED_BY_*
      while action masks are named QUEUE_ORDERED_DO_*.

      This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
      optional, improving the empty barrier implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
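      As a rough illustration of the BY_*/DO_* split these barrier patches
      describe, here is a standalone C sketch with made-up names and values
      (not the kernel's actual constants): visible modes are composed from an
      ordering type plus per-step action bits, and each step is issued only if
      its bit is set, so any action, including the barrier write itself, can be
      skipped.

      #include <stdio.h>

      enum {
          ORDERED_BY_DRAIN     = 0x01,   /* ordering type            */
          ORDERED_DO_PREFLUSH  = 0x10,   /* per-step action bits     */
          ORDERED_DO_BAR       = 0x20,   /* the barrier write itself */
          ORDERED_DO_POSTFLUSH = 0x40,
      };

      /* Visible modes are composed from a type plus the actions it needs. */
      enum {
          ORDERED_DRAIN       = ORDERED_BY_DRAIN | ORDERED_DO_BAR,
          ORDERED_DRAIN_FLUSH = ORDERED_BY_DRAIN | ORDERED_DO_PREFLUSH |
                                ORDERED_DO_BAR   | ORDERED_DO_POSTFLUSH,
      };

      static void start_ordered_toy(unsigned ordered)
      {
          if (ordered & ORDERED_DO_PREFLUSH)  printf("issue preflush\n");
          if (ordered & ORDERED_DO_BAR)       printf("issue barrier write\n");
          if (ordered & ORDERED_DO_POSTFLUSH) printf("issue postflush\n");
      }

      int main(void)
      {
          start_ordered_toy(ORDERED_DRAIN);        /* barrier write only           */
          start_ordered_toy(ORDERED_DRAIN_FLUSH);  /* preflush + write + postflush */
          return 0;
      }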
    • block: use cancel_work_sync() instead of kblockd_flush_work() · 64d01dc9
      Cheng Renquan authored
      After many improvements to kblockd_flush_work(), it is now identical to
      cancel_work_sync(), so a direct call to cancel_work_sync() is suggested.

      The only difference is that cancel_work_sync() is a GPL-only symbol, so
      non-GPL modules can no longer use it.
      Signed-off-by: Cheng Renquan <crquan@gmail.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add queue flag for paravirt frontend drivers · 88e740f1
      Fernando Luis Vázquez Cao authored
      As is the case with SSD devices, we do not want to idle in AS/CFQ when
      the block device is a paravirt front-end driver. This patch adds a flag
      (QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
      virtio_blk and xen-blkfront to indicate a paravirtualized device.
      Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  6. 06 Dec 2008, 1 commit
  7. 03 Dec 2008, 2 commits
    • block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2
      Milan Broz authored
      Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
      devices.
      
      When stacking devices (LVM over MD over SCSI), some of the request queue
      parameters are not set up correctly by default in some cases, namely
      max_segment_size and the seg_boundary mask.

      If you create an MD device over SCSI, these attributes are zeroed.

      The problem appears when another device-mapper mapping is layered on top
      of this stack - the queue attributes are set in DM this way:
      
      request_queue   max_segment_size  seg_boundary_mask
      SCSI                65536             0xffffffff
      MD RAID1                0                      0
      LVM                 65536                 -1 (64bit)
      
      Unfortunately bio_add_page() (resp. bio_phys_segments()) calculates the
      number of physical segments according to these parameters.

      During generic_make_request() the segment count is recalculated (after
      bio_clone() in the stacking operation) and can increase
      bio->bi_phys_segments beyond the allowed limit.
      
      This is especially a problem in the CCISS driver, where it produces an
      oops here:

          BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);

      (MAXSGENTRIES is 31 by default.)
      
      Sometimes even this command is enough to cause the oops:

        dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10

      This command generates bios with 250 sectors, allocated in 32 4k pages
      (the last page uses only 1024 bytes).

      At the LVM layer the bio is allocated with 31 segments (still OK for
      CCISS); unfortunately, on the lower layer it is recalculated to 32
      segments, which violates the CCISS restriction and triggers the BUG_ON().
      
      The patch tries to fix it by:
      
       * initializing attributes above in queue request constructor
         blk_queue_make_request()
      
       * make sure that blk_queue_stack_limits() inherits these settings

       (DM uses its own function to set the limits because
       blk_queue_stack_limits() was introduced later.  It should probably
       switch to the generic stack limit function too.)

       * set the default seg_boundary value in one place (blkdev.h)

       * use this mask as the default in DM (instead of -1, which differs on
         64-bit)
      
      Bugs related to this:
      https://bugzilla.redhat.com/show_bug.cgi?id=471639
      http://bugzilla.kernel.org/show_bug.cgi?id=8672
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Reviewed-by: Alasdair G Kergon <agk@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
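      A standalone C sketch of the stacking rule the patch aims for, using
      hypothetical field names rather than the real request_queue members:
      limits start from sane defaults instead of zero, and stacking clamps each
      limit to the stricter of the two queues.

      #include <stdio.h>

      #define TOY_MAX_SEGMENT_SIZE  65536u       /* stand-in default (blkdev.h role) */
      #define TOY_SEG_BOUNDARY_MASK 0xffffffffu

      struct toy_limits {
          unsigned int  max_segment_size;
          unsigned long seg_boundary_mask;
      };

      static unsigned long min_ul(unsigned long a, unsigned long b) { return a < b ? a : b; }

      /* Queue constructor: start from sane defaults instead of zero. */
      static void toy_init_limits(struct toy_limits *q)
      {
          q->max_segment_size  = TOY_MAX_SEGMENT_SIZE;
          q->seg_boundary_mask = TOY_SEG_BOUNDARY_MASK;
      }

      /* Stacking: the top queue may never be more permissive than the bottom one. */
      static void toy_stack_limits(struct toy_limits *top, const struct toy_limits *bottom)
      {
          top->max_segment_size  = (unsigned int)min_ul(top->max_segment_size,
                                                        bottom->max_segment_size);
          top->seg_boundary_mask = min_ul(top->seg_boundary_mask,
                                          bottom->seg_boundary_mask);
      }

      int main(void)
      {
          struct toy_limits scsi, md, lvm;
          toy_init_limits(&scsi);
          toy_init_limits(&md);            /* no longer zeroed */
          toy_init_limits(&lvm);
          toy_stack_limits(&md, &scsi);
          toy_stack_limits(&lvm, &md);
          printf("lvm: %u %#lx\n", lvm.max_segment_size, lvm.seg_boundary_mask);
          return 0;
      }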
    • block: internal dequeue shouldn't start timer · 53a08807
      Tejun Heo authored
      blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
      both start the timeout timer.  The barrier code dequeues the original
      barrier request but doesn't pass the request itself to the low-level
      driver, only the broken-down proxy requests; however, the original
      barrier request goes through the same dequeue path, so the timeout timer
      is started on it.  If the barrier sequence takes long enough, this timer
      expires, but the low-level driver has no idea about this request, and an
      oops follows.
      
      The timeout timer shouldn't have been started on the original barrier
      request, as it never goes through actual IO.  This patch unexports
      elv_dequeue_request(), which has no external user anyway, makes it
      operate on the elevator proper without adding the timer, and makes
      blkdev_dequeue_request() call elv_dequeue_request() and then add the
      timer.  Internal users which don't pass the request to the driver - the
      barrier code and end_that_request_last() - are converted to use
      elv_dequeue_request().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
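      A minimal, standalone C sketch of the resulting split (toy types, not the
      kernel's structures): the internal dequeue does elevator bookkeeping
      only, while the public wrapper dequeues and then arms the timer, so
      internal users such as the barrier code never start a timer.

      #include <stdbool.h>
      #include <stdio.h>

      struct toy_request { const char *name; bool queued; bool timer_armed; };

      /* Internal dequeue: elevator bookkeeping only, no timer. */
      static void toy_elv_dequeue(struct toy_request *rq)
      {
          rq->queued = false;
      }

      static void toy_add_timer(struct toy_request *rq)
      {
          rq->timer_armed = true;
      }

      /* Public dequeue used when the request really goes to the driver. */
      static void toy_blkdev_dequeue(struct toy_request *rq)
      {
          toy_elv_dequeue(rq);
          toy_add_timer(rq);
      }

      int main(void)
      {
          struct toy_request normal  = { "normal",  true, false };
          struct toy_request barrier = { "barrier", true, false };

          toy_blkdev_dequeue(&normal);   /* goes to the driver: timer armed   */
          toy_elv_dequeue(&barrier);     /* internal user (barrier): no timer */

          printf("%s timer=%d, %s timer=%d\n",
                 normal.name, normal.timer_armed, barrier.name, barrier.timer_armed);
          return 0;
      }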
  8. 21 Oct 2008, 7 commits
  9. 17 Oct 2008, 1 commit
  10. 13 Oct 2008, 1 commit
    • [SCSI] block: separate failfast into multiple bits. · 6000a368
      Mike Christie authored
      Multipath is best at handling transport errors. If it gets a device
      error, there is not much the multipath layer can do; it will just
      access the same device from a different path.

      This patch breaks up failfast into device, transport and driver errors.
      The multipath layers (md and dm multipath) only ask the lower levels to
      fast fail transport errors. The user of failfast, readahead, will ask
      to fast fail on all errors.
      
      Note that blk_noretry_request will return true if any failfast bit
      is set. This allows drivers that do not support the multipath failfast
      bits to continue to fail on any failfast error like before. Drivers
      like scsi that are able to fail fast specific errors can check
      for the specific fail fast type. In the next patch I will convert
      scsi.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
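      A standalone C sketch of the idea, with illustrative flag values rather
      than the kernel's actual failfast bits: the "any bit set" test preserves
      the old behaviour for drivers that don't care which class failed, while a
      class-aware driver can test the specific bit it handles.

      #include <stdio.h>

      enum {
          FAILFAST_DEV       = 1 << 0,
          FAILFAST_TRANSPORT = 1 << 1,
          FAILFAST_DRIVER    = 1 << 2,
          FAILFAST_ANY       = FAILFAST_DEV | FAILFAST_TRANSPORT | FAILFAST_DRIVER,
      };

      /* Old-style check: any failfast bit means "don't retry". */
      static int toy_noretry(unsigned flags)
      {
          return (flags & FAILFAST_ANY) != 0;
      }

      int main(void)
      {
          unsigned multipath_io = FAILFAST_TRANSPORT;  /* fail fast only on transport errors */
          unsigned readahead_io = FAILFAST_ANY;        /* fail fast on every error class     */

          printf("multipath noretry=%d, readahead noretry=%d\n",
                 toy_noretry(multipath_io), toy_noretry(readahead_io));

          /* A transport-aware driver can test the specific class it handles. */
          if (multipath_io & FAILFAST_TRANSPORT)
              printf("fast-fail transport error back to multipath\n");
          return 0;
      }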
  11. 09 Oct 2008, 15 commits
    • block: gendisk integrity wrapper · b02739b0
      Martin K. Petersen authored
      This is a wrapper for accessing a gendisk's integrity bits.  It allows
      the integrity support in MD to be compiled with BLK_DEV_INTEGRITY off.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
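      A standalone C sketch of the wrapper pattern such a change relies on,
      using hypothetical names and a stand-in config macro: a static-inline
      stub takes over when the option is disabled, so callers like MD build
      either way.

      #include <stdio.h>

      struct toy_integrity { const char *profile; };
      struct toy_gendisk   { struct toy_integrity *integrity; };

      #ifdef TOY_BLK_DEV_INTEGRITY
      /* Real accessor when integrity support is built in. */
      static inline struct toy_integrity *toy_get_integrity(struct toy_gendisk *disk)
      {
          return disk->integrity;
      }
      #else
      /* Stub keeps callers compiling when support is configured out. */
      static inline struct toy_integrity *toy_get_integrity(struct toy_gendisk *disk)
      {
          (void)disk;
          return NULL;
      }
      #endif

      int main(void)
      {
          struct toy_integrity prof = { "T10-DIF" };
          struct toy_gendisk disk = { &prof };
          struct toy_integrity *bi = toy_get_integrity(&disk);

          printf("integrity: %s\n", bi ? bi->profile : "none (configured out)");
          return 0;
      }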
    • block: Switch blk_integrity_compare from bdev to gendisk · ad7fce93
      Martin K. Petersen authored
      The DM and MD integrity support now depends on being able to use
      gendisks instead of block_devices when comparing integrity profiles.
      Change function parameters accordingly.
      
      Also update comparison logic so that two NULL profiles are a valid
      configuration.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1 · b04accc4
      Jens Axboe authored
      We need bdev_get_integrity() to support the pending md/dm patches.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: remove end_{queued|dequeued}_request() · d00e29fd
      Kiyoshi Ueda authored
      This patch removes end_queued_request() and end_dequeued_request(),
      which are no longer used.
      
      As a result, the only remaining user of __end_request() is end_request(),
      so the actual code in __end_request() is moved into end_request() and
      __end_request() is removed.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add lld busy state exporting interface · ef9e3fac
      Kiyoshi Ueda authored
      This patch adds a new interface, blk_lld_busy(), to check the low-level
      driver's busy state from the block layer.
      blk_lld_busy() calls down into the low-level driver for the check if the
      driver has set q->lld_busy_fn() using blk_queue_lld_busy().

      This resolves the performance problem on request stacking devices
      described below.
      
      Some drivers, like the scsi mid layer, stop dispatching requests when
      they detect a busy state on their low-level device (host/target/device).
      This allows other requests to stay in the I/O scheduler's queue
      for a chance of merging.

      Request stacking drivers like request-based dm should follow
      the same logic.
      However, there is no generic interface for the stacked device
      to check whether the underlying device(s) are busy.
      If the request stacking driver dispatches and submits requests to
      a busy underlying device, the requests will stay in
      the underlying device's queue without a chance of merging.
      This causes a performance problem under bursty I/O load.
      
      With this patch, the busy state of the underlying device is exported
      via q->lld_busy_fn(), so the request stacking driver can check it
      and stop dispatching requests while the device is busy.
      
      The underlying device driver must return the busy state appropriately:
          1: when the device driver can't process requests immediately.
          0: when the device driver can process requests immediately,
             including abnormal situations where the device driver needs
             to kill all requests.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
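      A standalone C sketch of the callback protocol described above (toy types
      and names): the busy hook is optional, a missing hook means "not busy",
      and the stacking driver polls it before dispatching.

      #include <stdio.h>

      struct toy_queue {
          /* Optional: set by the low-level driver; 1 = busy, 0 = can accept work. */
          int (*lld_busy_fn)(struct toy_queue *q);
          int outstanding;
      };

      static int toy_lld_busy(struct toy_queue *q)
      {
          if (!q->lld_busy_fn)
              return 0;                 /* no hook registered: assume not busy */
          return q->lld_busy_fn(q);
      }

      /* Example low-level hook: busy once too many requests are in flight. */
      static int scsi_like_busy(struct toy_queue *q)
      {
          return q->outstanding >= 2;
      }

      int main(void)
      {
          struct toy_queue q = { scsi_like_busy, 0 };

          for (int i = 0; i < 4; i++) {
              if (toy_lld_busy(&q)) {
                  printf("request %d: hold back, let it merge upstream\n", i);
                  continue;
              }
              q.outstanding++;
              printf("request %d: dispatched\n", i);
          }
          return 0;
      }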
    • block: add queue flag for SSD/non-rotational devices · a68bbddb
      Jens Axboe authored
      We don't want to idle in AS/CFQ if the device doesn't have a seek
      penalty. So add a QUEUE_FLAG_NONROT to indicate a non-rotational
      device; low-level drivers should set this flag upon discovery of
      an SSD or similar device type.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add a queue flag for request stacking support · 4ee5eaf4
      Kiyoshi Ueda authored
      This patch adds a queue flag to indicate the block device can be
      used for request stacking.
      
      Request stacking drivers need to stack their devices only on top of
      devices whose q->request_fn is functional.
      Since bio stacking drivers (e.g. md, loop) basically initialize
      their queue using blk_alloc_queue() and don't set q->request_fn,
      the check of (q->request_fn == NULL) looks sufficient for that purpose.

      However, dm will become both types of stacking driver (bio-based and
      request-based), and dm will always set q->request_fn even if the dm
      device is bio-based, in which case q->request_fn is not actually
      functional.  So we need something else to distinguish the type of the
      device; adding a queue flag is a solution for that.
      
      The reason why dm always sets q->request_fn is to keep
      compatibility with dm user-space tools.
      Currently, all dm user-space tools use bio-based dm without
      specifying the type of the dm device they use.
      To use request-based dm without changing such tools, the kernel
      must decide the type of the dm device automatically.
      The automatic type decision can't be made at device creation time
      and needs to be deferred until such tools load a mapping table,
      since the actual type is decided by the dm target type included in
      the mapping table.
      
      So a dm device has to be initialized using blk_init_queue()
      so that we can load either type of table.
      Then, all the queue machinery is set up (e.g. q->request_fn) and we have
      nothing to tell whether the device is bio-based or request-based,
      even after a table is loaded and the type of the device is decided.

      By the way, some parts of the queue (e.g. request_list, elevator)
      are unnecessary when the dm device is used as bio-based.
      But the memory size is not so large (about 20 KB per queue on ia64),
      so I hope the memory loss is acceptable for bio-based dm users.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add request submission interface · 82124d60
      Kiyoshi Ueda authored
      This patch adds blk_insert_cloned_request(), a generic request
      submission interface for request stacking drivers.
      Request-based dm will use it to submit its clones to underlying
      devices.

      blk_rq_check_limits() is also added because it is possible that
      the lower queue has stronger limitations than the upper queue
      if multiple drivers are stacking at the request level.
      Beyond blk_insert_cloned_request()'s internal use, the function
      will also be used by request-based dm when the queue limits are
      modified (e.g. by replacing dm's table).
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
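      A standalone C sketch of the limit check the commit motivates, with toy
      limits rather than the real queue fields: a clone is rejected if it
      exceeds the lower queue's limits, since the lower queue may be stricter
      than the one the request was originally built for.

      #include <stdio.h>

      struct toy_limits  { unsigned max_sectors, max_segments; };
      struct toy_request { unsigned nr_sectors, nr_segments; };

      /* Reject a request that exceeds the target queue's limits. */
      static int toy_rq_check_limits(const struct toy_limits *q, const struct toy_request *rq)
      {
          if (rq->nr_sectors > q->max_sectors || rq->nr_segments > q->max_segments)
              return -1;      /* caller must not insert this clone */
          return 0;
      }

      static int toy_insert_cloned_request(const struct toy_limits *q, const struct toy_request *rq)
      {
          if (toy_rq_check_limits(q, rq))
              return -1;
          printf("inserted %u sectors in %u segments\n", rq->nr_sectors, rq->nr_segments);
          return 0;
      }

      int main(void)
      {
          struct toy_limits lower = { .max_sectors = 128, .max_segments = 31 };
          struct toy_request ok   = { .nr_sectors = 64,  .nr_segments = 16 };
          struct toy_request bad  = { .nr_sectors = 250, .nr_segments = 32 };

          printf("ok:  %d\n", toy_insert_cloned_request(&lower, &ok));
          printf("bad: %d\n", toy_insert_cloned_request(&lower, &bad));
          return 0;
      }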
    • block: add request update interface · 32fab448
      Kiyoshi Ueda authored
      This patch adds blk_update_request(), which completes the data part of
      a struct request but doesn't complete the struct request itself.
      Though it looks like end_that_request_first() of older kernels,
      blk_update_request() should be used only by request stacking drivers.

      Request-based dm will use it in its bio->bi_end_io callback to update
      the original request when a data part of a cloned request completes.
      The following is additional background on why request-based dm needs
      this interface.
      
        - Request stacking drivers can't use blk_end_request() directly from
          the lower driver's completion context (bio->bi_end_io or rq->end_io),
          because some device drivers (e.g. ide) may try to complete
          their request with queue lock held, and it may cause deadlock.
          See below for detailed description of possible deadlock:
          <http://marc.info/?l=linux-kernel&m=120311479108569&w=2>
      
        - To solve that, request-based dm offloads the completion of
          cloned struct request to softirq context (i.e. using
          blk_complete_request() from rq->end_io).
      
        - Though it is possible to use the same solution from bio->bi_end_io,
          it will delay the notification of bio completion to the original
          submitter.  Also, it will cause inefficient partial completion,
          because the lower driver can't perform the cloned request anymore
          and request-based dm needs to requeue and redispatch it to
          the lower driver again later.  That's not good.
      
        - So request-based dm needs blk_update_request() to perform the bio
          completion in the lower driver's completion context, which is more
          efficient.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
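      A standalone C sketch of the distinction the commit draws (toy request,
      not the kernel's struct request): an update advances the completed-data
      accounting without finishing the request, and full completion happens
      only once everything is done.

      #include <stdbool.h>
      #include <stdio.h>

      struct toy_request { unsigned total, completed; bool done; };

      /* Account for nbytes of finished data; report whether more is pending. */
      static bool toy_update_request(struct toy_request *rq, unsigned nbytes)
      {
          rq->completed += nbytes;
          return rq->completed < rq->total;   /* true: data still outstanding */
      }

      /* Full completion: only called once the data part has fully finished. */
      static void toy_end_request(struct toy_request *rq)
      {
          rq->done = true;
      }

      int main(void)
      {
          struct toy_request rq = { .total = 4096 };

          /* e.g. called from a clone's bio completion for each finished chunk */
          while (toy_update_request(&rq, 1024))
              printf("partial completion: %u/%u\n", rq.completed, rq.total);

          toy_end_request(&rq);
          printf("done=%d at %u/%u\n", rq.done, rq.completed, rq.total);
          return 0;
      }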
    • block: cleanup some of the integrity stuff in blkdev.h · 9c02f2b0
      Jens Axboe authored
      Don't put functions that are only used in fs/bio-integrity.c in
      blkdev.h; it's much cleaner to just keep them in there. Also kill the
      completely unused bdev_get_tag_size().
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: add fault injection mechanism for faking request timeouts · 581d4e28
      Jens Axboe authored
      Only works for the generic request timer handling. Allows one to
      sporadically ignore request completions, thus exercising the timeout
      handling.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: adjust blkdev_issue_discard for swap · 3e6053d7
      Hugh Dickins authored
      Two mods to blkdev_issue_discard(), thinking ahead to its use on swap:
      
      1. Add gfp_mask argument, so swap allocation can use it where GFP_KERNEL
         might deadlock but GFP_NOIO is safe.
      
      2. Enlarge nr_sects argument from unsigned to sector_t: unsigned long is
         enough to cover a whole swap area, but sector_t suits any partition.
      
      Change sb_issue_discard()'s nr_blocks to sector_t too; but no need seen
      for a gfp_mask there, just pass GFP_KERNEL down to blkdev_issue_discard().
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
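      A small standalone calculation of why widening nr_sects matters: with
      512-byte sectors, a 32-bit sector count tops out at 2 TiB, which is fine
      for a swap area but not for arbitrary partitions, while a 64-bit sector_t
      covers any partition.

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
          uint64_t sector_size = 512;
          uint64_t max_u32     = (uint64_t)UINT32_MAX + 1;   /* 2^32 sectors */

          /* 2^32 sectors * 512 bytes = 2 TiB. */
          printf("32-bit sector count covers %llu bytes (%llu TiB)\n",
                 (unsigned long long)(max_u32 * sector_size),
                 (unsigned long long)(max_u32 * sector_size >> 40));
          return 0;
      }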
    • 11914a53
    • block: unify request timeout handling · 242f9dcb
      Jens Axboe authored
      Right now SCSI and others do their own command timeout handling.
      Move those bits to the block layer.
      
      Instead of having a timer per command, we try to be a bit more clever
      and simply have one per queue. This avoids the overhead of having to
      tear down and set up a timer for each command, so it results in a lot
      less timer fiddling.
      Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
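      A standalone C sketch of the per-queue timer idea (toy bookkeeping, not
      the actual block-layer implementation): every in-flight request records a
      deadline, but only one timer is kept armed, always at the earliest
      outstanding deadline, instead of one timer per command.

      #include <stdio.h>

      #define MAX_INFLIGHT 8

      struct toy_queue {
          long deadlines[MAX_INFLIGHT];   /* 0 = slot free */
          long timer_expiry;              /* single armed timer; 0 = not armed */
      };

      /* Re-arm the one queue timer to the earliest outstanding deadline. */
      static void toy_rearm(struct toy_queue *q)
      {
          long earliest = 0;
          for (int i = 0; i < MAX_INFLIGHT; i++)
              if (q->deadlines[i] && (!earliest || q->deadlines[i] < earliest))
                  earliest = q->deadlines[i];
          q->timer_expiry = earliest;
      }

      static void toy_start_request(struct toy_queue *q, int slot, long now, long timeout)
      {
          q->deadlines[slot] = now + timeout;
          toy_rearm(q);                   /* no per-command timer setup/teardown */
      }

      static void toy_complete_request(struct toy_queue *q, int slot)
      {
          q->deadlines[slot] = 0;
          toy_rearm(q);
      }

      int main(void)
      {
          struct toy_queue q = { {0}, 0 };
          toy_start_request(&q, 0, 100, 30);
          toy_start_request(&q, 1, 105, 30);
          printf("timer at %ld\n", q.timer_expiry);   /* 130: earliest deadline */
          toy_complete_request(&q, 0);
          printf("timer at %ld\n", q.timer_expiry);   /* 135 */
          return 0;
      }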
    • block: add blk_rq_aligned helper function · 87904074
      FUJITA Tomonori authored
      This adds the blk_rq_aligned() helper function to check whether the
      alignment and padding requirements are satisfied for a DMA transfer.
      It also converts blk_rq_map_kern() and __blk_rq_map_user() to use the
      helper function.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
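      A standalone C sketch of the kind of check such a helper performs, with a
      made-up alignment mask: a buffer is acceptable for direct mapping only if
      both its address and its length respect the mask; otherwise the caller
      would fall back to a copy.

      #include <stdio.h>

      /* Aligned if neither the start address nor the length has low bits set. */
      static int toy_rq_aligned(unsigned long addr, unsigned long len, unsigned long align_mask)
      {
          return ((addr | len) & align_mask) == 0;
      }

      int main(void)
      {
          unsigned long mask = 0x1ff;   /* toy: require 512-byte alignment */

          printf("%d\n", toy_rq_aligned(0x10000, 4096, mask));  /* 1: direct mapping OK  */
          printf("%d\n", toy_rq_aligned(0x10001, 4096, mask));  /* 0: misaligned address */
          printf("%d\n", toy_rq_aligned(0x10000, 1000, mask));  /* 0: misaligned length  */
          return 0;
      }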