- 09 Oct 2008: 40 commits
-
Submitted by Chris Lalancette
Until recently, the maximum number of xvd block devices you could attach to a Xen domU was 16. This limitation turned out to be problematic for some users, so it was expanded to handle a much larger number of disks. However, this requires a couple of changes in the way that blkfront scans for disks. This functionality is already present in the Xen linux-2.6.18-xen.hg tree; the attached patch adds this functionality to the mainline xen-blkfront implementation. I successfully tested it on a 2.6.25 tree, and build-tested it on 2.6.27-rc3. Signed-off-by: Chris Lalancette <clalance@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Don't put functions that are only used in fs/bio-integrity.c into blkdev.h; it's much cleaner to just keep them there. Also kill the completely unused bdev_get_tag_size(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
We cannot abort a request if we already raced with the timeout handler, or with the IO completion. So make blk_abort_request() mark the request as complete, and only continue if we succeeded. Found and suggested by Mike Anderson <andmike@linux.vnet.ibm.com>. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
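A minimal sketch of the race guard this describes; the internal helpers blk_mark_rq_complete(), blk_delete_timer() and blk_rq_timed_out() are recalled from the blk-timeout code of that era and should be treated as assumptions:

    /*
     * Sketch: the abort path must first win the race to mark the request
     * complete; if normal completion or the timeout handler got there
     * first, there is nothing left to do.
     */
    void blk_abort_request(struct request *req)
    {
        if (blk_mark_rq_complete(req))  /* lost the race, bail out */
            return;
        blk_delete_timer(req);          /* request no longer pending */
        blk_rq_timed_out(req);          /* hand it to the timeout path */
    }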
-
Submitted by Jens Axboe
Only works for the generic request timer handling. Allows one to sporadically ignore request completions, thus exercising the timeout handling. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Not all callers need (or want!) the mempool backing guarantee; it essentially means that you can only use bio_alloc() for short allocations and not for preallocating some bios at setup or init time. So add bio_kmalloc(), which does the same thing as bio_alloc() except that it uses kmalloc() as the backing instead of the bio mempools. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
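A hedged usage sketch of where each allocator fits; the argument order (gfp mask, then iovec count) mirrors bio_alloc() as it looked at the time and is an assumption:

    #include <linux/bio.h>

    /* Fast-path allocation: short-lived and mempool-backed, so it is
     * guaranteed to make forward progress eventually. */
    static struct bio *fast_path_bio(unsigned int nr_pages)
    {
        return bio_alloc(GFP_NOIO, nr_pages);
    }

    /* Setup/init-time preallocation: may be held indefinitely, so it must
     * not pin the bio mempool; kmalloc-backed and may simply fail. */
    static struct bio *preallocate_bio(unsigned int nr_pages)
    {
        return bio_kmalloc(GFP_KERNEL, nr_pages);
    }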
-
Submitted by Hugh Dickins
Two mods to blkdev_issue_discard(), thinking ahead to its use on swap: 1. Add a gfp_mask argument, so swap allocation can use it where GFP_KERNEL might deadlock but GFP_NOIO is safe. 2. Enlarge the nr_sects argument from unsigned to sector_t: unsigned long is enough to cover a whole swap area, but sector_t suits any partition. Change sb_issue_discard()'s nr_blocks to sector_t too; but no need is seen for a gfp_mask there, just pass GFP_KERNEL down to blkdev_issue_discard(). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
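A sketch of a caller after this change, assuming the four-argument form described above (bdev, start sector, number of sectors, gfp mask):

    #include <linux/blkdev.h>

    /* Sketch: discard an arbitrarily large range. nr_sects is sector_t, so
     * ranges beyond 2^32 sectors are representable, and a reclaim-context
     * caller (such as swap) can pass GFP_NOIO instead of GFP_KERNEL. */
    static int discard_range(struct block_device *bdev, sector_t start,
                             sector_t nr_sects)
    {
        return blkdev_issue_discard(bdev, start, nr_sects, GFP_NOIO);
    }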
-
Submitted by FUJITA Tomonori
blk_rq_unmap_user() in sg_finish_rem_req() can take care of all the cases. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
sg_read_xfer was used to copy data to user space for READ commands. blk_rq_unmap_user does the job, so sg_read_xfer does nothing useful. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
sg_write_xfer was used to copy data from user space for WRITE commands. blk_rq_map_user_iov and blk_rq_map_user do the job, so sg_write_xfer does nothing useful. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
Calling blk_rq_map_user() in a single place is better than in two different places. It makes the code more understandable. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
__sg_start_req() was used temporarily to call blk_get_request() while converting sg to use the block layer. Now sg always calls blk_get_request(), so we can move blk_get_request() into sg_start_req(). We don't need __sg_start_req() anymore. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
It's not used for anything useful after the block layer conversion. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
sg had lots of its own functions for direct IO, but now sg uses the block layer functions for it, and only five lines of direct IO code remain. The SG_ALLOW_DIO_CODE define was used to compile out the direct IO code, but we no longer need it; if someone wants to remove the direct IO code, they can easily do so without the define. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
The old sg_rq_end_io() was used to wrap sg_cmd_done while converting sg to use the block layer (in order to cover the differences between scsi_execute_async and blk_execute_rq_nowait). Now we don't need it, so let's remove it. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Mike Anderson
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Mike Anderson
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
Right now SCSI and others do their own command timeout handling. Move those bits to the block layer. Instead of having a timer per command, we try to be a bit more clever and simply have one per queue. This avoids the overhead of having to tear down and set up a timer for each command, so it will result in a lot less timer fiddling. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
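A rough sketch of how a driver might hook into the per-queue timer; the setter names blk_queue_rq_timed_out()/blk_queue_rq_timeout() and the BLK_EH_* return codes are recalled from this interface and should be treated as assumptions, and device_is_recovering() is a hypothetical driver helper:

    #include <linux/blkdev.h>

    /* Sketch: called by the block layer when a request exceeds the queue's
     * timeout; decide whether to re-arm or let error handling take over. */
    static enum blk_eh_timer_return my_rq_timed_out(struct request *rq)
    {
        if (device_is_recovering(rq->q->queuedata)) /* hypothetical */
            return BLK_EH_RESET_TIMER;  /* give it more time */
        return BLK_EH_NOT_HANDLED;      /* run the error path */
    }

    static void my_setup_queue(struct request_queue *q)
    {
        blk_queue_rq_timed_out(q, my_rq_timed_out);
        blk_queue_rq_timeout(q, 30 * HZ);   /* one timer per queue */
    }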
-
Submitted by Andrew Patterson
We call flush_disk() to make sure the buffer cache for the disk is flushed after a disk resize. There are two resize cases, growing and shrinking. Given that users can shrink and then grow a disk before revalidate_disk() is called, we treat the grow case identically to shrinking. We need to flush the buffer cache after an online shrink because, as James Bottomley puts it, the two use cases for shrinking he can see are: 1. planned: the fs is already shrunk to within the new boundaries and all data is relocated, so invalidating is fine (any dirty buffers that might exist in the shrunk region are there only because they were relocated but not yet written to their original location). 2. unplanned: in this case, the fs is probably toast, so whether we invalidate or not isn't going to make a whole lot of difference; it's still going to try to read or write from sectors beyond the new size and get I/O errors. Immediately invalidating shrunk disks will cause errors for outstanding read/write I/Os beyond the new end of the disk to be generated earlier than if we waited for the normal buffer cache operation. It also removes a potential security hole where we might keep old data around from beyond the end of the shrunk disk if the disk was not invalidated. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Andrew Patterson
We need to be able to flush the buffer cache for more than just the case where a disk is changed, so we factor out the common cache flush code in check_disk_change() into an internal flush_disk() routine. This routine will then be used for both disk changes and disk resizes (in a later patch). Include the disk name in the text indicating that there are busy inodes on the device, and increase the KERN severity of the message. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
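A compressed sketch of what such a factored-out helper does; __invalidate_device() and bd_invalidated are real block-layer facilities of that era, but the exact message text and structure here are assumptions:

    #include <linux/fs.h>
    #include <linux/genhd.h>

    /* Sketch: drop cached state for a device whose media or size changed,
     * warning (with the disk name) if busy inodes prevent a full flush. */
    static void flush_disk_sketch(struct block_device *bdev)
    {
        if (__invalidate_device(bdev))
            printk(KERN_WARNING "VFS: busy inodes on changed media or "
                   "resized disk %s\n",
                   bdev->bd_disk ? bdev->bd_disk->disk_name : "");
        bdev->bd_invalidated = 1;   /* rescan partitions on next open */
    }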
-
Submitted by Andrew Patterson
Modify the SCSI disk driver to call the revalidate_disk() wrapper. This allows us to do some housekeeping, such as accounting for a disk being resized online. The wrapper will call sd_revalidate_disk() at the appropriate time. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Andrew Patterson
Check for a device resize in the rescan_partitions() routine. If the device has been resized, the bdev size is set to match. The rescan_partitions() routine is called when opening the device and when calling the BLKRRPART ioctl. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Andrew Patterson
The revalidate_disk routine now checks whether a disk has been resized by comparing the gendisk capacity to the bdev inode size. If they differ (usually because the disk has been resized underneath the kernel), the bdev inode size is adjusted to match the capacity. Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
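A compressed sketch of that comparison, assuming the usual block helpers (get_capacity(), i_size_read()/i_size_write()) and eliding error handling:

    #include <linux/fs.h>
    #include <linux/genhd.h>

    /* Sketch: if the driver resized the disk underneath us, bring the bdev
     * inode size back in line with the gendisk capacity. */
    static void sync_bdev_size(struct gendisk *disk, struct block_device *bdev)
    {
        loff_t disk_size = (loff_t)get_capacity(disk) << 9; /* sectors -> bytes */

        if (i_size_read(bdev->bd_inode) != disk_size) {
            mutex_lock(&bdev->bd_mutex);
            i_size_write(bdev->bd_inode, disk_size);
            mutex_unlock(&bdev->bd_mutex);
        }
    }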
-
Submitted by Andrew Patterson
This is a wrapper for the lower-level revalidate_disk callbacks such as sd_revalidate_disk(). It allows us to perform pre and post operations when calling them. We will use this wrapper in a later patch to adjust block device sizes after an online resize (a _post_ operation). Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Tejun Heo
seqf can be started multiple times for a read, and the header should be printed only for the initial one. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
With the older SG interface, we don't know a user-space address to transfer data to when executing a SCSI command, so we can't pass a user-space address to blk_rq_map_user. This patch fixes sg to pass a NULL user-space address to blk_rq_map_user so that it just sets up a request and bios with page frames properly, without data transfer. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
This patch changes blk_rq_map_user to accept a NULL user-space buffer with a READ command if rq_map_data is not NULL. Thus a caller can pass page frames to blk_rq_map_user to just set up a request and bios with page frames properly. bio_uncopy_user (called via blk_rq_unmap_user) doesn't copy data to user space for such a request. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
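A hedged sketch of the calling convention this enables, assuming the post-series signature blk_rq_map_user(q, rq, map_data, ubuf, len, gfp_mask):

    #include <linux/blkdev.h>

    /* Sketch: the old sg interface has no user address at this point, so a
     * NULL buffer is passed together with pre-allocated pages in map_data;
     * the request and bios are set up, but nothing is copied back to user
     * space on completion. */
    static int map_with_reserved_pages(struct request_queue *q,
                                       struct request *rq,
                                       struct rq_map_data *map_data,
                                       unsigned int len)
    {
        return blk_rq_map_user(q, rq, map_data, NULL, len, GFP_ATOMIC);
    }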
-
Submitted by Jens Axboe
It refers to functions that no longer exist after the IO completion changes. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Tejun Heo
DEBUG_BLOCK_EXT_DEVT shuffles SCSI and IDE device numbers, so a root device number set using rdev becomes meaningless. Root devices should be explicitly specified using textual names. Warn about it if root can't be found and DEBUG_BLOCK_EXT_DEVT is enabled. Also, add a warning to the help text. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Tejun Heo
bdget_disk() and blk_lookup_devt() never cared whether the specified partition (or disk) is zero-sized or not. I got confused while converting them not to depend on consecutive minor numbers in commit 5a6411b1178baf534aa9138052864dfa89d3eada, and later, when dev0 was added, this broke callers which expected a valid return for zero-sized disk devices. So, they never needed the nr_sects checks in the first place. Kill them. This problem was spotted and debugged by Bartlomiej Zolnierkiewicz. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Jens Axboe
It's a debug option that you would explicitly enable to test this feature; we should default it to 'n' to prevent accidental surprises for now. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Harvey Harrison
Noticed by sparse:
block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static?
block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition
block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types)
block/genhd.c:659:17:    expected unsigned int [unsigned] [usertype] size
block/genhd.c:659:17:    got restricted gfp_t
block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types)
block/genhd.c:659:29:    expected restricted gfp_t [usertype] flags
block/genhd.c:659:29:    got unsigned int
block: kmalloc args reversed. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
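For reference, a minimal reminder of the argument order that sparse effectively caught in genhd.c (kmalloc() takes the size first and the gfp flags second); the buffer size below is illustrative:

    #include <linux/slab.h>

    /* Correct order: size, then allocation flags. The flagged call had
     * the two reversed. */
    static char *alloc_name_buf(void)
    {
        return kmalloc(64, GFP_KERNEL);
    }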
-
Submitted by FUJITA Tomonori
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Douglas Gilbert <dougg@torque.net> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
This adds a blk_rq_aligned helper function to see whether the alignment and padding requirements are satisfied for a DMA transfer. This also converts blk_rq_map_kern and __blk_rq_map_user to use the helper function. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
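A rough reconstruction of what such a helper checks, assuming it folds the queue's DMA alignment and pad mask together; the exact signature is an assumption:

    #include <linux/blkdev.h>

    /* Sketch: a buffer can be mapped for DMA directly only if both its
     * start address and its length respect the queue's alignment and
     * padding masks; otherwise the caller must fall back to a copy. */
    static inline int rq_aligned_sketch(struct request_queue *q,
                                        unsigned long addr, unsigned int len)
    {
        unsigned long mask = queue_dma_alignment(q) | q->dma_pad_mask;

        return !(addr & mask) && !(len & mask);
    }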
-
Submitted by FUJITA Tomonori
bio_copy_kern and bio_copy_user are very similar. This converts bio_copy_kern to use bio_copy_user. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
This patch converts the indirect IO path (including mmap IO and the old struct sg_header) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc.) instead of scsi_execute_async(). [Jens: fixed compile error with SCSI logging enabled] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
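A hedged sketch of the request flow these sg conversions move to, using the 2.6.27-era passthrough interfaces (blk_get_request, REQ_TYPE_BLOCK_PC, blk_execute_rq_nowait); this is illustrative, not sg's actual code, and error handling is minimal:

    #include <linux/blkdev.h>
    #include <linux/string.h>

    /* Sketch: allocate a request, attach the CDB, and let the block layer
     * execute it asynchronously, invoking 'done' on completion. */
    static int issue_cmd_sketch(struct request_queue *q, unsigned char *cmd,
                                unsigned int cmd_len, unsigned int timeout,
                                rq_end_io_fn *done, void *priv)
    {
        struct request *rq;

        rq = blk_get_request(q, READ, GFP_ATOMIC);
        if (!rq)
            return -ENOMEM;

        rq->cmd_type = REQ_TYPE_BLOCK_PC;   /* SCSI passthrough */
        memcpy(rq->cmd, cmd, cmd_len);
        rq->cmd_len = cmd_len;
        rq->timeout = timeout;
        rq->end_io_data = priv;

        blk_execute_rq_nowait(q, NULL, rq, 1, done);
        return 0;
    }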
-
Submitted by FUJITA Tomonori
This patch converts the direct IO path (SG_FLAG_DIRECT_IO) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc.) instead of scsi_execute_async(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
This patch converts the non-data path to use the block layer functions (blk_get_request, blk_execute_rq_nowait, etc.) instead of scsi_execute_async(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by FUJITA Tomonori
This patch introduces struct rq_map_data to enable bio_copy_user_iov() to use reserved pages. Currently, bio_copy_user_iov allocates bounce pages, but drivers/scsi/sg.c wants to allocate pages by itself and use them. struct rq_map_data can be used to pass allocated pages to bio_copy_user_iov. The current users of bio_copy_user_iov simply pass NULL (they don't want to use pre-allocated pages). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
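A hedged sketch of how a caller such as sg might describe its own pages through rq_map_data; the field names (pages, page_order, nr_entries) are recalled from this interface and should be treated as assumptions:

    #include <linux/blkdev.h>

    /* Sketch: point rq_map_data at a caller-owned page array so that
     * bio_copy_user_iov() uses these pages instead of allocating its own
     * bounce pages. */
    static void fill_map_data(struct rq_map_data *md, struct page **pages,
                              int nr_pages, int page_order)
    {
        md->pages = pages;              /* caller-owned page array */
        md->page_order = page_order;    /* each entry spans 2^order pages */
        md->nr_entries = nr_pages;      /* entries in the array */
    }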
-
Submitted by FUJITA Tomonori
Currently, blk_rq_map_user and blk_rq_map_user_iov always do GFP_KERNEL allocations. This adds a gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov so sg can use it (sg always does GFP_ATOMIC allocations). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Submitted by Aaron Carroll
CFQ's detection of queueing devices assumes a non-queuing device and detects if the queue depth reaches a certain threshold. Under some workloads (e.g. synchronous reads), CFQ effectively forces a unit queue depth, thus defeating the detection logic. This leads to poor performance on queuing hardware, since the idle window remains enabled. This patch inverts the sense of the logic: assume a queuing-capable device, and detect if the depth does not exceed the threshold. Signed-off-by: Aaron Carroll <aaronc@gelato.unsw.edu.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
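A sketch of the inverted detection, with illustrative names rather than the actual cfq-iosched code: the device is assumed to be queuing-capable, and the flag is cleared only after enough samples show the in-driver depth never exceeding the threshold:

    #define HW_QUEUE_MIN        5   /* depth threshold */
    #define HW_QUEUE_SAMPLES    32  /* samples before deciding */

    struct hw_tag_state {
        int rq_in_driver;   /* requests currently owned by the driver */
        int depth_peak;
        int samples;
        int hw_tag;         /* starts at 1: assume a queuing device */
    };

    /* Sketch: sampled on each dispatch. */
    static void update_hw_tag(struct hw_tag_state *st)
    {
        if (st->rq_in_driver > st->depth_peak)
            st->depth_peak = st->rq_in_driver;

        if (++st->samples < HW_QUEUE_SAMPLES)
            return;

        /* Only a consistently shallow peak demotes the device. */
        st->hw_tag = st->depth_peak >= HW_QUEUE_MIN;
        st->samples = 0;
        st->depth_peak = 0;
    }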
-