1. 15 May 2008 (1 commit)
    • Remove blkdev warning triggered by using md · e7e72bf6
      Committed by Neil Brown
      As setting and clearing queue flags now requires that we hold a spinlock
      on the queue, and as blk_queue_stack_limits is called without that lock,
      get the lock inside blk_queue_stack_limits.
      
      For blk_queue_stack_limits to be able to find the right lock, each md
      personality needs to set q->queue_lock to point to the appropriate lock.
      Those personalities which didn't previously use a spin_lock use
      q->__queue_lock.  So always initialise that lock when it is allocated.
      
      With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no
      longer cause warnings as it will be clear that the proper lock is held.
      
      Thanks to Dan Williams for review and fixing the silly bugs.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Alistair John Strachan <alistair@devzero.co.uk>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Jacek Luczak <difrost.kernel@gmail.com>
      Cc: Prakash Punnoor <prakash@punnoor.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
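
      A minimal C sketch of the resulting shape of blk_queue_stack_limits(),
      consistent with the description above (illustrative, not the verbatim
      patch):

        void blk_queue_stack_limits(struct request_queue *t,
                                    struct request_queue *b)
        {
                /* ... limits are copied from b to t here ... */
                if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags)) {
                        unsigned long flags;

                        /* take the lock the md personality installed in
                         * q->queue_lock (or the default __queue_lock) */
                        spin_lock_irqsave(t->queue_lock, flags);
                        queue_flag_clear(QUEUE_FLAG_CLUSTER, t);
                        spin_unlock_irqrestore(t->queue_lock, flags);
                }
        }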
  2. 07 May 2008 (2 commits)
    • block: avoid duplicate calls to get_part() in disk stat code · 28f13702
      Committed by Jens Axboe
      get_part() is fairly expensive, as it loops over partitions (O(N))
      to find the right one.  In lots of normal IO paths we end up looking
      up the partition twice, to make matters even worse.  Change the
      stat add code to accept a passed-in partition instead.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
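
      A sketch of the pattern, with a hypothetical helper name (the real
      stat macros differ in detail): resolve the partition once, then hand
      it to the stat code instead of letting it redo the O(N) scan:

        struct hd_struct *part = get_part(rq->rq_disk, rq->sector);

        /* hypothetical signature: the stat helper now accepts the
         * already-resolved partition... */
        disk_stat_account(rq->rq_disk, part, rq_data_dir(rq), bytes >> 9);
        /* ...rather than calling get_part() again internally */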
    • block: optimize generic_unplug_device() · dbaf2c00
      Committed by Jens Axboe
      Original patch from Mikulas Patocka <mpatocka@redhat.com>
      
      Mike Anderson was doing an OLTP benchmark on a computer with 48 physical
      disks mapped to one logical device via device mapper.
      
      He found that there was a slowdown on request_queue->lock in function
      generic_unplug_device. The slowdown is caused by the fact that when some
      code calls unplug on the device mapper, device mapper calls unplug on all
      physical disks. These unplug calls take the lock, find that the queue is
      already unplugged, release the lock and exit.
      
      With the below patch, performance of the benchmark was increased by 18%
      (the whole OLTP application, not just block layer microbenchmarks).
      
      So I'm submitting this patch for upstream.  I think the patch is
      correct, because when multiple threads call plug and unplug
      simultaneously, it is unspecified whether the queue is plugged or
      unplugged (so the patch can't make this worse).  And the caller that
      plugged the queue should unplug it anyway (if it doesn't, there's a
      3ms timeout).
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
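
      The optimization is to test the plugged bit before taking the queue
      lock; a sketch consistent with the description above (illustrative,
      not the verbatim patch):

        void generic_unplug_device(struct request_queue *q)
        {
                /* lockless fast path: most of the stacked queues are
                 * already unplugged, so don't touch queue_lock at all */
                if (blk_queue_plugged(q)) {
                        spin_lock_irq(q->queue_lock);
                        __generic_unplug_device(q);
                        spin_unlock_irq(q->queue_lock);
                }
        }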
  3. 01 May 2008 (1 commit)
  4. 29 Apr 2008 (5 commits)
  5. 04 Mar 2008 (3 commits)
  6. 19 Feb 2008 (2 commits)
    • block: add request->raw_data_len · 6b00769f
      Committed by Tejun Heo
      With padding and draining moved into it, the block layer may now extend
      requests as directed by queue parameters, so a request now has two
      sizes: the original request size and the extended size, which matches
      the size of the area pointed to by bios and later by sgs.  The latter
      size is what lower layers are primarily interested in when allocating,
      filling up DMA tables and setting up the controller.
      
      Both padding and draining extend the data area to accommodate
      controller characteristics.  As any controller which speaks SCSI can
      handle underflows, feeding it a larger data area is safe.
      
      So, this patch makes the primary data length field, request->data_len,
      indicate the size of the full data area, and adds a separate length
      field, request->raw_data_len, for the unmodified request size.  The
      latter is used for reporting to higher layers (userland) and wherever
      the original request size should be fed to the controller or device.
      Signed-off-by: Tejun Heo <htejun@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
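
      A sketch of the two sizes, with hypothetical pad/drain lengths:

        /* keep the unmodified size for reporting to upper layers */
        rq->raw_data_len = rq->data_len;
        /* padding/draining then grow what the controller will see;
         * pad_len and drain_len are illustrative names only */
        rq->data_len += pad_len + drain_len;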
    • make blk-core.c:request_cachep static again · 5ece6c52
      Committed by Adrian Bunk
      request_cachep needlessly became global.
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
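
      The fix itself is a one-word linkage change in blk-core.c, sketched:

        /* visible only within blk-core.c again */
        static struct kmem_cache *request_cachep;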
  7. 08 Feb 2008 (3 commits)
  8. 01 Feb 2008 (2 commits)
  9. 30 Jan 2008 (6 commits)
  10. 28 Jan 2008 (12 commits)
    • block: implement drain buffers · fa0ccd83
      Committed by James Bottomley
      These DMA drain buffer implementations in drivers are pretty horrible
      to do in terms of manipulating the scatterlist.  Plus they're being
      done at least in drivers/ide and drivers/ata, so we now have code
      duplication.
      
      The one use case for this, as I understand it, is AHCI controllers doing
      PIO mode to mmc devices but translating this to DMA at the controller
      level.
      
      So, what about adding a callback to the block layer that permits
      adding a drain buffer for the problem devices?  The idea is that
      you'd do this in slave_configure after you find one of these devices.
      
      The beauty of doing it in the block layer is that it quietly adds the
      drain buffer to the end of the sg list, so it automatically gets mapped
      (and unmapped) without anything unusual having to be done to the
      scatterlist in drivers/scsi or drivers/ata and without any alteration to
      the transfer length.
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
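
      A hedged usage sketch: a SCSI LLD wiring up a drain buffer from its
      slave_configure, as suggested above.  blk_queue_dma_drain() is the
      helper this patch adds; the predicate, buffer and size below are
      placeholders, and the exact signature may differ:

        static int my_slave_configure(struct scsi_device *sdev)
        {
                /* hypothetical test for "one of these devices" */
                if (device_needs_drain(sdev))
                        blk_queue_dma_drain(sdev->request_queue,
                                            my_drain_buf, MY_DRAIN_SIZE);
                return 0;
        }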
    • block: cfq: make the io context sharing lockless · 4ac845a2
      Committed by Jens Axboe
      The io context sharing introduced a per-ioc spinlock that would protect
      the cfq io context lookup.  That is a regression from the original, since
      we never needed any locking there because the ioc/cic were process private.
      
      The cic lookup is changed from an rbtree construct to a radix tree,
      which lets us use RCU to make the reader side lockless.  That is the
      performance-critical path; modifying the radix tree is only done on
      process creation (when that process first does IO, actually) and on
      process exit (if that process has done IO).
      
      As it so happens, radix trees are also much faster for this type of
      lookup where the key is a pointer. It's a very sparse tree.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
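
      The reader side, sketched: an RCU-protected radix tree lookup keyed
      on a pointer value (field names follow the io_context of this era;
      treat the details as illustrative):

        struct cfq_io_context *cic;

        rcu_read_lock();
        /* the key is the per-queue cfqd pointer cast to an index;
         * the resulting tree is very sparse */
        cic = radix_tree_lookup(&ioc->radix_root, (unsigned long)cfqd);
        rcu_read_unlock();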
    • io context sharing: preliminary support · d38ecf93
      Committed by Jens Axboe
      Detach task state from the ioc; instead, keep track of how many
      processes are accessing the ioc.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
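
      A sketch of the sharing model (helper name from the iocontext code
      of this era; the body is illustrative):

        static inline struct io_context *ioc_task_link(struct io_context *ioc)
        {
                /* refuse to share an ioc whose refcount already hit
                 * zero (it is going away); otherwise count the task */
                if (ioc && atomic_inc_not_zero(&ioc->refcount)) {
                        atomic_inc(&ioc->nr_tasks);
                        return ioc;
                }
                return NULL;
        }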
    • ioprio: move io priority from task_struct to io_context · fd0928df
      Committed by Jens Axboe
      This is where it belongs, and then it doesn't take up space for a
      process that doesn't do IO.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • blk_end_request: cleanup of request completion (take 4) · b8286239
      Committed by Kiyoshi Ueda
      This patch merges complete_request() into end_that_request_last()
      for cleanup.
      
      complete_request() was introduced by an earlier part of this patch set
      so as not to break the existing users of end_that_request_last().

      Since all users are converted to the blk_end_request interfaces and
      end_that_request_last() is no longer exported, the code can be
      merged into end_that_request_last().
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • blk_end_request: cleanup 'uptodate' related code (take 4) · 5450d3e1
      Committed by Kiyoshi Ueda
      This patch converts the 'uptodate' arguments of the no-longer-exported
      interfaces, end_that_request_first/last, to 'error', and removes the
      internal conversions for it in the blk_end_request interfaces.

      Also, this patch removes the no-longer-needed end_io_error().
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
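
      The internal translation being removed followed the usual mapping;
      a sketch (illustrative):

        /* old 'uptodate': 1 = ok, 0 = generic failure, < 0 = -Exx.
         * new 'error':    0 = ok, < 0 = -Exx. */
        int error = 0;

        if (uptodate <= 0)
                error = uptodate ? uptodate : -EIO;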
    • blk_end_request: remove/unexport end_that_request_* (take 4) · 3bcddeac
      Committed by Kiyoshi Ueda
      This patch removes the following functions:
        o end_that_request_first()
        o end_that_request_chunk()
      and stops exporting the functions below:
        o end_that_request_last()
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • blk_end_request: add bidi completion interface (take 4) · e3a04fe3
      Committed by Kiyoshi Ueda
      This patch adds a variant of the interface, blk_end_bidi_request(),
      which completes a bidi request.
      
      A bidi request must be completed as a whole, both rq and rq->next_rq
      at once, so the interface has two arguments for the completion sizes.
      
      As for ->end_io, only rq->end_io is called (rq->next_rq->end_io is not
      called).  So if special completion handling is needed, the handler
      must be set to rq->end_io.
      And the handler must take care of freeing next_rq too, since
      the interface doesn't take care of it if rq->end_io is not NULL.
      
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
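
      A hedged usage sketch, paired with the size helpers exported
      elsewhere in this series:

        /* complete both directions at once; if rq->end_io is set it
         * must also free rq->next_rq */
        if (blk_end_bidi_request(rq, error, blk_rq_bytes(rq),
                                 blk_rq_bytes(rq->next_rq)))
                return;         /* leftover, not fully completed */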
    • blk_end_request: add callback feature (take 4) · e19a3ab0
      Committed by Kiyoshi Ueda
      This patch adds a variant of the interface, blk_end_request_callback(),
      which has a driver callback feature.
      
      Drivers may need to do special work between end_that_request_first()
      and end_that_request_last().
      For such drivers, blk_end_request_callback() allows them to pass
      a callback function which is called between end_that_request_first()
      and end_that_request_last().
      
      This interface is only a fallback for the other blk_end_request
      interfaces.  Drivers should avoid such tricky behavior and use the
      other interfaces as much as possible.

      Currently, only one driver, ide-cd, needs this interface.
      So this interface should/will be removed once the driver has dropped
      such tricky behavior.
      
      o ide-cd (cdrom_newpc_intr())
        In PIO mode, cdrom_newpc_intr() needs to defer end_that_request_last()
        until the device clears DRQ_STAT and raises an interrupt after
        end_that_request_first().
        So end_that_request_first() and end_that_request_last() are called
        separately in cdrom_newpc_intr().
      
        This means blk_end_request_callback() has to return without
        completing the request even if there is no leftover in the request.
        To satisfy this requirement, the callback function has a return
        value so that drivers can tell blk_end_request_callback() to return
        without completing the request.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
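
      A sketch of the callback convention (the callback name is
      illustrative; only blk_end_request_callback() is from the patch):

        /* returning nonzero tells blk_end_request_callback() to stop
         * before end_that_request_last(), keeping rq alive until the
         * device clears DRQ_STAT and interrupts again */
        static int wait_for_drq_cb(struct request *rq)
        {
                return 1;
        }

        /* in the ISR, when PIO data is done but DRQ is still set: */
        blk_end_request_callback(rq, 0, nr_bytes, wait_for_drq_cb);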
    • blk_end_request: changing block layer core (take 4) · 9e6e39f2
      Committed by Kiyoshi Ueda
      This patch converts core parts of block layer to use blk_end_request
      interfaces.  Related 'uptodate' arguments are converted to 'error'.
      
      The 'dequeue' argument was originally introduced for end_dequeued_request(),
      where no attempt should be made to dequeue the request as it is already
      dequeued.
      However, it is not necessary, as this can be checked with
      list_empty(&rq->queuelist): a dequeued request has an empty list and
      a queued request doesn't.
      That check is now done in the blk_end_request interfaces.
      
      As a result of this patch, end_queued_request() and
      end_dequeued_request() become identical.  A future patch will merge
      and rename them and change users of those functions.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
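
      The check that replaces the 'dequeue' flag, sketched:

        /* a request still on the queue has a non-empty queuelist */
        if (!list_empty(&rq->queuelist))
                blkdev_dequeue_request(rq);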
    • blk_end_request: add/export functions to get request size (take 4) · 3b11313a
      Committed by Kiyoshi Ueda
      This patch adds/exports functions to get the size of a request in bytes.
      They are useful because the blk_end_request interfaces take the
      completed I/O size in bytes instead of sectors.
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
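
      A usage sketch of the added helpers:

        unsigned int total = blk_rq_bytes(rq);      /* whole request */
        unsigned int cur   = blk_rq_cur_bytes(rq);  /* current segment */

        /* complete everything that is left, with the queue lock held */
        __blk_end_request(rq, error, total);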
    • blk_end_request: add new request completion interface (take 4) · 336cdb40
      Committed by Kiyoshi Ueda
      This patch adds 2 new interfaces for request completion:
        o blk_end_request()   : called without queue lock
        o __blk_end_request() : called with queue lock held
      
      blk_end_request takes 'error' as an argument instead of the 'uptodate'
      that the current end_that_request_* interfaces take.
      The value is used when the bio is completed; its meanings are:
          0 : success
        < 0 : error
      
      Some device drivers call the generic functions below between
      end_that_request_{first/chunk} and end_that_request_last():
        o add_disk_randomness()
        o blk_queue_end_tag()
        o blkdev_dequeue_request()
      These are called in the blk_end_request interfaces as a part of
      generic request completion, so all device drivers come to call the
      above functions through them.
      To decide whether to call blkdev_dequeue_request(), blk_end_request
      uses list_empty(&rq->queuelist) (a blk_queued_rq() macro is added for it).
      So if drivers use rq->queuelist for their own purposes, they must
      re-initialize it (e.g. with INIT_LIST_HEAD()) before calling
      blk_end_request.
      (Currently, no driver completes a request without re-initializing
       the queuelist after using it, so rq->queuelist can be used for the
       purpose above.)
      
      "Normal" drivers can be converted to use blk_end_request()
      in a standard way shown below.
      
       a) end_that_request_{chunk/first}
          spin_lock_irqsave()
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => blk_end_request()
      
       b) spin_lock_irqsave()
          end_that_request_{chunk/first}
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => spin_lock_irqsave()
             __blk_end_request()
             spin_unlock_irqrestore()
      
       c) spin_lock_irqsave()
          (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
          end_that_request_last()
          spin_unlock_irqrestore()
          => blk_end_request()   or   spin_lock_irqsave()
                                      __blk_end_request()
                                      spin_unlock_irqrestore()
      Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
      Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
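
      Pattern (a) above, written out as a before/after sketch
      (illustrative; the error and byte values depend on the driver):

        /* before: several steps, queue lock taken by the driver */
        end_that_request_first(rq, uptodate, nr_sectors);
        spin_lock_irqsave(q->queue_lock, flags);
        blkdev_dequeue_request(rq);
        end_that_request_last(rq, uptodate);
        spin_unlock_irqrestore(q->queue_lock, flags);

        /* after: one call, no queue lock needed by the caller */
        if (blk_end_request(rq, error, nr_bytes))
                return;         /* leftover, not fully completed */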
  11. 25 Jan 2008 (3 commits)