Commits · a3bce90edd8f6cafe3f63b1a943800792e830178 · openeuler / Kernel

09 Oct, 2008 11 commits

block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov · a3bce90e

Committed by FUJITA Tomonori on Aug 28, 2008

Currently, blk_rq_map_user and blk_rq_map_user_iov always do
GFP_KERNEL allocation.

This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
so sg can use it (sg always does GFP_ATOMIC allocation).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a3bce90e

block: inherit CPU completion on bio->rq and rq->rq merges · ab780f1e

Committed by Jens Axboe on Aug 26, 2008

Somewhat incomplete, as we do allow merges of requests and bios
that have different completion CPUs given. This is done on the
assumption that a larger IO is still more beneficial than CPU
locality.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

ab780f1e

block: add support for IO CPU affinity · c7c22e4d

Committed by Jens Axboe on Sep 13, 2008

This patch adds support for controlling the IO completion CPU of
either all requests on a queue, or on a per-request basis. We export
a sysfs variable (rq_affinity) which, if set, migrates completions
of requests to the CPU that originally submitted it. A bio helper
(bio_set_completion_cpu()) is also added, so that queuers can ask
for completion on that specific CPU.

In testing, this has been shown to cut the system time by as much
as 20-40% on synthetic workloads where CPU affinity is desired.

This requires a little help from the architecture, so it'll only
work as designed for archs that are using the new generic smp
helper infrastructure.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c7c22e4d

block: make kblockd_schedule_work() take the queue as parameter · 18887ad9

Committed by Jens Axboe on Jul 28, 2008

Preparatory patch for checking queuing affinity.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

18887ad9

drop vmerge accounting · 5df97b91

Committed by Mikulas Patocka on Aug 15, 2008

Remove hw_segments field from struct bio and struct request. Without virtual
merge accounting they have no purpose.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5df97b91

virtio_blk: use a wrapper function to access io context information of IO requests · 766ca442

Committed by Fernando Luis Vázquez Cao on Aug 14, 2008

struct request has an ioprio member but it is never updated because
currently bios do not hold io context information. The implication of
this is that virtio_blk ends up passing useless information to the
backend driver.

That said, some IO schedulers such as CFQ do store io context
information in struct request, but use private members for that, which
means that that information cannot be directly accessed in an IO
scheduler-independent way.

This patch adds a function to obtain the ioprio of a request. We should
avoid accessing ioprio directly and use this function instead, so that
its users do not have to care about future changes in block layer
structures or what the currently active IO controller is.

This patch does not introduce any functional changes but paves the way
for future clean-ups and enhancements.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

766ca442

Kill REQ_TYPE_FLUSH · 1a8e2bdd

Committed by David Woodhouse on Aug 13, 2008

It was only used by ps3disk, and it should probably have been
REQ_TYPE_LINUX_BLOCK + REQ_LB_OP_FLUSH.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1a8e2bdd

Allow elevators to sort/merge discard requests · e17fc0a1

Committed by David Woodhouse on Aug 09, 2008

But blkdev_issue_discard() still emits requests which are interpreted as
soft barriers, because naïve callers might otherwise issue subsequent
writes to those same sectors, which might cross on the queue (if they're
reallocated quickly enough).

Callers still _can_ issue non-barrier discard requests, but they have to
take care of queue ordering for themselves.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e17fc0a1

Support 'discard sectors' operation in translation layer support core · eae9acd1

Committed by David Woodhouse on Aug 05, 2008

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

eae9acd1

Add 'discard' request handling · fb2dce86

Committed by David Woodhouse on Aug 05, 2008

Some block devices benefit from a hint that they can forget the contents
of certain sectors. Add basic support for this to the block core, along
with a 'blkdev_issue_discard()' helper function which issues such
requests.

The caller doesn't get to provide an end_io function, since
blkdev_issue_discard() will automatically split the request up into
multiple bios if appropriate. Neither does the function wait for
completion -- it's expected that callers won't care about when, or even
_if_, the request completes. It's only a hint to the device anyway. By
definition, the file system doesn't _care_ about these sectors any more.

[With feedback from OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> and
Jens Axboe <jens.axboe@oracle.com>]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fb2dce86

Fix up comments about matching flags between bio and rq · d628eaef

Committed by David Woodhouse on Aug 09, 2008

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d628eaef

11 Sep, 2008 1 commit

block: disable sysfs parts of the disk command filter · 2dc75d3c

Committed by Jens Axboe on Sep 11, 2008

We still have lifetime issues with the sysfs command filter kobject,
so disable it for the 2.6.27 release. We can revisit this and make it work
properly for 2.6.28; for the 2.6.27 release it's too risky.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2dc75d3c

27 Aug, 2008 3 commits

block: remove blk_queue_tag_depth() and blk_queue_tag_queue() · 5168c47b

Committed by Jens Axboe on Aug 26, 2008

They are unused and ->busy doesn't exist anymore.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5168c47b

block: rename blk_scsi_cmd_filter to blk_cmd_filter · 4beab5c6

Committed by FUJITA Tomonori on Jul 26, 2008

Technically, the cmd_filter could be applied to other protocols, though
that is unlikely to happen. Putting SCSI stuff in request_queue is kind of
a layer violation. So let's rename it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

4beab5c6

block: move cmdfilter from gendisk to request_queue · abf54393

Committed by FUJITA Tomonori on Aug 16, 2008

cmd_filter works only for the block layer SG_IO with SCSI block
devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
character devices (such as st). We hit a kernel crash with them.

The problem is that the cmd_filter code accesses gendisk (which holds struct
blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. That works only for
SCSI block device files. With character device files, inode->i_bdev
leads you to struct cdev, so inode->i_bdev->bd_disk->blk_scsi_cmd_filter
isn't safe.

SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
independent of any protocol. We shouldn't change ULDs to expose their
gendisk.

This patch moves struct blk_scsi_cmd_filter from gendisk to
request_queue, a common object that everyone can access.

The user interface doesn't change; users can change the filters via
/sys/block/. gendisk has a pointer to request_queue, so the cmd_filter
code can still reach struct blk_scsi_cmd_filter.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

abf54393

02 Aug, 2008 1 commit

block: add a blk_plug_device_unlocked() that grabs the queue lock · 6c5e0c4d

Committed by Jens Axboe on Aug 01, 2008

blk_plug_device() must be called with the queue lock held, so callers
often just grab and release the lock for that purpose. Add a helper
that does just that.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6c5e0c4d
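
The wrapper pattern described in 6c5e0c4d can be illustrated with a minimal userspace sketch. Everything here is hypothetical: `struct plug_queue`, `demo_plug_device()` and `demo_plug_device_unlocked()` are stand-ins invented for this example, and an integer flag models the queue lock where kernel code would use a real spinlock.

```c
#include <assert.h>

/* Hypothetical stand-in for a request queue; lock_held models the queue lock. */
struct plug_queue {
    int lock_held;
    int plugged;
};

/* Locked variant: the caller must already hold the queue lock. */
void demo_plug_device(struct plug_queue *q)
{
    assert(q->lock_held);
    q->plugged = 1;
}

/* Unlocked variant: takes the lock, plugs the queue, and releases the lock. */
void demo_plug_device_unlocked(struct plug_queue *q)
{
    q->lock_held = 1;   /* a real implementation would acquire a spinlock here */
    demo_plug_device(q);
    q->lock_held = 0;   /* ... and release it here */
}
```

The point of the helper is that callers who do not otherwise need the lock no longer have to open-code the acquire/release pair around the call.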

17 Jul, 2008 1 commit

block: Trivial fix for blk_integrity_rq() · d442cc44

Committed by Martin K. Petersen on Jul 16, 2008

Fail integrity check gracefully when request does not have a bio
attached (BLOCK_PC).
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d442cc44

16 Jul, 2008 1 commit

block: unexport blk_end_sync_rq · 681a561b

Committed by FUJITA Tomonori on Jul 15, 2008

All the users of blk_end_sync_rq have gone (they were converted to use
blk_execute_rq). This unexports blk_end_sync_rq.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

681a561b

04 Jul, 2008 1 commit

block: add blk_queue_update_dma_pad · 27f8221a

Committed by FUJITA Tomonori on Jul 04, 2008

This adds blk_queue_update_dma_pad to prevent LLDs from overwriting
the dma pad mask wrongly (we added blk_queue_update_dma_alignment due
to the same reason).

This also converts libata to use blk_queue_update_dma_pad instead of
blk_queue_dma_pad.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

27f8221a

03 Jul, 2008 8 commits

block: extend queue_flag bitops · e48ec690

Committed by Jens Axboe on Jul 03, 2008

Add test_and_clear and test_and_set.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e48ec690
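
As a rough illustration of what test-and-set and test-and-clear operations on a flags word look like, here is a minimal userspace sketch. The type `struct flag_queue` and both function names are hypothetical and are not taken from the kernel source; the sketch is non-atomic and assumes the caller serializes access (for example by holding the queue lock).

```c
#include <assert.h>

/* Hypothetical stand-in for a request queue; only the flags word is modeled. */
struct flag_queue {
    unsigned long flags;
};

/* Return the previous state of the bit, then set it (non-atomic). */
int demo_flag_test_and_set(unsigned int flag, struct flag_queue *q)
{
    unsigned long mask = 1UL << flag;
    int was_set = (q->flags & mask) != 0;

    q->flags |= mask;
    return was_set;
}

/* Return the previous state of the bit, then clear it (non-atomic). */
int demo_flag_test_and_clear(unsigned int flag, struct flag_queue *q)
{
    unsigned long mask = 1UL << flag;
    int was_set = (q->flags & mask) != 0;

    q->flags &= ~mask;
    return was_set;
}
```

Returning the old value lets a caller both change the flag and learn whether it was the one that changed it, in a single call.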

Add bvec_merge_data to handle stacked devices and ->merge_bvec() · cc371e66

Committed by Alasdair G Kergon on Jul 03, 2008

When devices are stacked, one device's merge_bvec_fn may need to perform
the mapping and then call one or more functions for its underlying devices.

The following bio fields are used:
  bio->bi_sector
  bio->bi_bdev
  bio->bi_size
  bio->bi_rw  using bio_data_dir()

This patch creates a new struct bvec_merge_data holding a copy of those
fields to avoid having to change them directly in the struct bio when
going down the stack only to have to change them back again on the way
back up.  (And then when the bio gets mapped for real, the whole
exercise gets repeated, but that's a problem for another day...)
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cc371e66
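
The idea in cc371e66 of working on a copy of the bio fields, rather than editing the bio and restoring it afterwards, might be sketched as follows. This is a simplified userspace model: `struct demo_bio`, `struct demo_merge_data`, the integer device id, and both helper functions are assumptions made for illustration and do not reproduce the kernel's actual definitions.

```c
#include <assert.h>

/* Hypothetical, simplified bio: only the four fields named in the commit. */
struct demo_bio {
    unsigned long long bi_sector;
    int bi_bdev;            /* device id; a simplification of a device pointer */
    unsigned int bi_size;
    unsigned long bi_rw;
};

/* Copy of those fields that a stacked device is free to modify. */
struct demo_merge_data {
    unsigned long long bi_sector;
    int bi_bdev;
    unsigned int bi_size;
    unsigned long bi_rw;
};

/* Snapshot the relevant bio fields into the merge data. */
void demo_merge_data_from_bio(struct demo_merge_data *bvm,
                              const struct demo_bio *bio)
{
    bvm->bi_sector = bio->bi_sector;
    bvm->bi_bdev = bio->bi_bdev;
    bvm->bi_size = bio->bi_size;
    bvm->bi_rw = bio->bi_rw;
}

/* A stacked device remaps the copy before asking the lower device. */
void demo_remap_to_lower(struct demo_merge_data *bvm, int lower_bdev,
                         unsigned long long start)
{
    bvm->bi_bdev = lower_bdev;
    bvm->bi_sector += start;
}
```

Because only the copy is remapped, the original bio needs no restoring on the way back up the stack.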

block: integrity flags can't use bit ops on unsigned short · b24498d4

Committed by Jens Axboe on Jun 27, 2008

Just use normal open coded bit operations instead, they need not be
atomic.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b24498d4

allow userspace to modify scsi command filter on per device basis · 0b07de85

Committed by Adel Gadllah on Jun 26, 2008

This patch exports the per-gendisk command filter to user space through
sysfs, so it can be changed by the system administrator.
All users of the old cmd filter have been converted to use the new one.

Original patch from Peter Jones.
Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com>
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0b07de85

block: integrity cleanups · 6e2401ad

Committed by Jens Axboe on Jun 18, 2008

- No need to check for NULL bio, we'll get an immediate oops anyway.
- Make bio_integrity() a proper function.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6e2401ad

block: blkdev.h cleanup, move iocontext stuff to iocontext.h · da9cbc87

Committed by Jens Axboe on Jun 30, 2008

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

da9cbc87

block: Block layer data integrity support · 7ba1ba12

Committed by Martin K. Petersen on Jun 30, 2008

Some block devices support verifying the integrity of requests by way
of checksums or other protection information that is submitted along
with the I/O.

This patch implements support for generating and verifying integrity
metadata, as well as correctly merging, splitting and cloning bios and
requests that have this extra information attached.

See Documentation/block/data-integrity.txt for more information.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7ba1ba12

block: kill request_queue_t · 244b4d56

Committed by Jens Axboe on Jun 12, 2008

Everything was moved to struct request_queue a few kernel revisions
ago, maintaining the deprecated typedef to avoid breaking things.
Now the time has come to get rid of that typedef.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

244b4d56

30 Apr, 2008 2 commits

Improve queue_is_locked() · 7663c1e2

Committed by Jens Axboe on Apr 29, 2008

spin_is_locked() doesn't work on UP without spinlock debugging. Make it
safer and just return 1 on UP, so we don't get false positives. The plan
is to kill this debug function during the -rc cycle.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7663c1e2

block: fix queue locking verification · 8f45c1a5

Committed by Linus Torvalds on Apr 29, 2008

The new queue_flag_set/clear() functions verify that the queue is
locked, but in doing so they will actually instead oops if the queue
lock hasn't been initialized at all.

So fix the lock debug test to consider the "no lock" case to be
unlocked.  This way you get a nice WARN_ON_ONCE() instead of a fatal
oops.

Bug introduced by commit 75ad23bc
("block: make queue flags non-atomic").

Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8f45c1a5

29 Apr, 2008 4 commits

block: Skip I/O merges when disabled · ac9fafa1

Committed by Alan D. Brunelle on Apr 29, 2008

The block I/O + elevator + I/O scheduler code spend a lot of time trying
to merge I/Os -- rightfully so under "normal" circumstances. However,
if one were to know that the incoming I/O stream was /very/ random in
nature, the cycles are wasted.

This patch adds a per-request_queue tunable that (when set) disables
merge attempts (beyond the simple one-hit cache check), thus freeing up
a non-trivial amount of CPU cycles.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

ac9fafa1

block: add large command support · d7e3c324

Committed by FUJITA Tomonori on Apr 29, 2008

This patch changes rq->cmd from the static array to a pointer to
support large commands.

We rarely handle large commands. So for optimization, a struct request
still has a static array for a command. rq_init sets rq->cmd pointer
to the static array.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d7e3c324
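
The layout described in d7e3c324 (a command pointer that normally refers to a small array embedded in the request) can be sketched in userspace as below. All names, including `struct demo_request`, `DEMO_MAX_CDB`, `demo_rq_init()` and `demo_rq_set_large_cmd()`, are hypothetical and chosen for this example only; the array size of 16 is likewise an assumption.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_CDB 16   /* assumed size of the embedded command buffer */

/* Hypothetical request: cmd normally points at the embedded array. */
struct demo_request {
    unsigned char inline_cmd[DEMO_MAX_CDB];
    unsigned char *cmd;
    unsigned short cmd_len;
};

/* Zero the request, then point cmd at the embedded buffer. */
void demo_rq_init(struct demo_request *rq)
{
    memset(rq, 0, sizeof(*rq));
    rq->cmd = rq->inline_cmd;
}

/* Switch the request to a caller-owned buffer for a large command. */
void demo_rq_set_large_cmd(struct demo_request *rq, unsigned char *buf,
                           unsigned short len)
{
    rq->cmd = buf;
    rq->cmd_len = len;
}
```

The sketch also shows why a bare memset() is not enough to initialize such a request: zeroing alone leaves `cmd` as a null pointer, so a dedicated init function has to set it.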

block: rename and export rq_init() · 2a4aa30c

Committed by FUJITA Tomonori on Apr 29, 2008

This renames rq_init() to blk_rq_init() and exports it. Any path that hands
the request to the block layer needs to call it to initialize the
request.

This is a preparation for large command support, which needs to
initialize the request in a proper way (that is, just doing a memset()
will not work).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2a4aa30c

block: make queue flags non-atomic · 75ad23bc

Committed by Nick Piggin on Apr 29, 2008

We can save some atomic ops in the IO path, if we clearly define
the rules of how to modify the queue flags.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

75ad23bc

21 Apr, 2008 2 commits

block: fix memory hotplug and bouncing in block layer · 2472892a

Committed by Andi Kleen on Apr 21, 2008

Only noticed this while hacking something else, no test case.

blk_max_low_pfn is initialized once at bootup by the block layer from
max_low_pfn.  But max_low_pfn is not necessarily constant over the runtime of
the system when you consider memory hotplug.  What could happen is that
someone adds memory later, the block layer doesn't get updated, and it then
starts bouncing memory unnecessarily.

Also on 64bit blk_max_low_pfn actually isn't needed because it just disables
bouncing essentially and there is no highmem.  And nobody can pass pfns >
max_low_pfn to the block layer, because those wouldn't have a struct page and
I suspect block layer wouldn't be very happy without that.

So set BLK_BOUNCE_HIGH to infinity (-1ULL) on 64bit.  That avoids the problem
of having to update it on memory hotadd.

On 32bit I kept the same behaviour because at least on i386
memory hotadd only adds HIGHMEM, never lowmem.

BLK_BOUNCE_ANY is always set to infinity on both 32 and 64bit.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2472892a

block: move the padding adjustment to blk_rq_map_sg · f18573ab

Committed by FUJITA Tomonori on Apr 11, 2008

blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule
that req->data_len (the true data length) is equal to sum(bio). It
broke the scsi command completion code.

commit e97a294e was introduced to fix
the above issue. However, the partial completion code doesn't work
with it. The commit is also a layer violation (scsi mid-layer should
not know about the block layer's padding).

This patch moves the padding adjustment to blk_rq_map_sg (suggested by
James). The padding works like the drain buffer. This patch breaks the
rule that req->data_len is equal to sum(sg), however, the drain buffer
already broke it. So this patch just restores the rule that
req->data_len is equal to sum(bio) without breaking anything new.

Now when a low level driver needs padding, blk_rq_map_user and
blk_rq_map_user_iov guarantee there's enough room for padding.
blk_rq_map_sg can safely extend the last entry of a scatter list.

blk_rq_map_sg must extend the last entry of a scatter list only for a
request that got through bio_copy_user_iov. This patch introduces a
new REQ_COPY_USER flag.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

f18573ab

04 Mar, 2008 2 commits

block: separate out padding from alignment · e3790c7d

Committed by Tejun Heo on Mar 04, 2008

Block layer alignment was used for two different purposes - memory
alignment and padding.  This causes problems in lower layers because
drivers which only require memory alignment end up with adjusted
rq->data_len.  Separate out padding such that padding occurs iff
driver explicitly requests it.

Tomo: restore the code to update bio in blk_rq_map_user
      introduced by the commit 40b01b9b
      according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e3790c7d

block: restore the meaning of rq->data_len to the true data length · 7a85f889

Committed by FUJITA Tomonori on Mar 04, 2008

The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and
bsg. This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).

This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9b.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>

7a85f889
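
To make the split between the true data length and the extended length concrete, here is a small userspace sketch of how padding could be tracked separately from `data_len`. `struct len_request`, `demo_pad_len()` and `demo_apply_padding()` are hypothetical names, and the sketch assumes the pad mask has the form 2^n - 1 (for example 3 for 4-byte padding).

```c
#include <assert.h>

/* Hypothetical request carrying both lengths described in the commit. */
struct len_request {
    unsigned int data_len;   /* true data length, never modified by padding */
    unsigned int extra_len;  /* extension due to padding (and drain buffer) */
};

/* Bytes needed to round data_len up to the next (pad_mask + 1) boundary. */
unsigned int demo_pad_len(unsigned int data_len, unsigned int pad_mask)
{
    if ((data_len & pad_mask) == 0)
        return 0;
    return (pad_mask & ~data_len) + 1;
}

/* Record the padding in extra_len and leave data_len untouched. */
void demo_apply_padding(struct len_request *rq, unsigned int pad_mask)
{
    rq->extra_len += demo_pad_len(rq->data_len, pad_mask);
}
```

For a 510-byte request and a pad mask of 3, the sketch records 2 bytes in `extra_len` while `data_len` stays at 510, so code that reports the length back to userspace still sees the original size.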

19 Feb, 2008 2 commits

block: implement request_queue->dma_drain_needed · 2fb98e84

Committed by Tejun Heo on Feb 19, 2008

Draining shouldn't be done for commands where overflow may indicate
data integrity issues.  Add dma_drain_needed callback to
request_queue.  Drain buffer is appended iff this function returns
non-zero.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2fb98e84

block: add request->raw_data_len · 6b00769f

Committed by Tejun Heo on Feb 19, 2008

With padding and draining moved into it, block layer now may extend
requests as directed by queue parameters, so now a request has two
sizes - the original request size and the extended size which matches
the size of area pointed to by bios and later by sgs. The latter size
is what lower layers are primarily interested in when allocating,
filling up DMA tables and setting up the controller.

Both padding and draining extend the data area to accommodate
controller characteristics. As any controller which speaks SCSI can
handle underflows, feeding larger data area is safe.

So, this patch makes the primary data length field, request->data_len,
indicate the size of full data area and add a separate length field,
request->raw_data_len, for the unmodified request size. The latter is
used to report to higher layer (userland) and where the original
request size should be fed to the controller or device.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6b00769f

08 Feb, 2008 1 commit

block: fixup rq_init() a bit · 63a71386

Committed by Jens Axboe on Feb 08, 2008

Rearrange fields in cache order and initialize some fields that
we didn't previously init. Remove init of ->completion_data, it's
part of a union with ->hash. Luckily clearing the rb node is the same
as setting it to null!
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

63a71386

openeuler / Kernel synced successfully about 1 year ago