提交 · 48c0d4d4c04dd520c55e0fd756fa4e7c83de3d13 · FengXiao2002 / Linux Imx

11 7月, 2009 1 次提交

block: fix sg SG_DXFER_TO_FROM_DEV regression · ecb554a8

由 FUJITA Tomonori 提交于 7月 09, 2009

I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use
the block layer mapping API (2.6.28).

Douglas Gilbert explained SG_DXFER_TO_FROM_DEV:

http://www.spinics.net/lists/linux-scsi/msg37135.html

=
The semantics of SG_DXFER_TO_FROM_DEV were:
   - copy user space buffer to kernel (LLD) buffer
   - do SCSI command which is assumed to be of the DATA_IN
     (data from device) variety. This would overwrite
     some or all of the kernel buffer
   - copy kernel (LLD) buffer back to the user space.

The idea was to detect short reads by filling the original
user space buffer with some marker bytes ("0xec" it would
seem in this report). The "resid" value is a better way
of detecting short reads but that was only added this century
and requires co-operation from the LLD.
=

This patch changes the block layer mapping API to support this
semantics. This simply adds another field to struct rq_map_data and
enables __bio_copy_iov() to copy data from user space even with READ
requests.

It's better to add the flags field and kills null_mapped and the new
from_user fields in struct rq_map_data but that approach makes it
difficult to send this patch to stable trees because st and osst
drivers use struct rq_map_data (they were converted to use the block
layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer
mapping API.

zhou sf reported this regiression and tested this patch:

http://www.spinics.net/lists/linux-scsi/msg37128.html
http://www.spinics.net/lists/linux-scsi/msg37168.htmlReported-by: Nzhou sf <sxzzsf@gmail.com>
Tested-by: Nzhou sf <sxzzsf@gmail.com>
Cc: stable@kernel.org
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

ecb554a8

01 7月, 2009 1 次提交

block: Create bip slabs with embedded integrity vectors · 7878cba9

由 Martin K. Petersen 提交于 6月 26, 2009

This patch restores stacking ability to the block layer integrity
infrastructure by creating a set of dedicated bip slabs.  Each bip slab
has an embedded bio_vec array at the end.  This cuts down on memory
allocations and also simplifies the code compared to the original bvec
version.  Only the largest bip slab is backed by a mempool.  The pool is
contained in the bio_set so stacking drivers can ensure forward
progress.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@carl.(none)>

7878cba9

16 6月, 2009 1 次提交

block: remove some includings of blktrace_api.h · e212d6f2

由 Li Zefan 提交于 6月 16, 2009

When porting blktrace to tracepoints, we changed to trace/block.h
for trace prober declarations.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

e212d6f2

13 6月, 2009 1 次提交

trivial: fix typo in bio_alloc kernel doc · 76d93ff3

由 Nikanth Karthikesan 提交于 4月 22, 2009

Fix typo in bio_alloc kernel doc.
Signed-off-by: NNikanth Karthikesan <knikanth@suse.de>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

76d93ff3

11 6月, 2009 1 次提交

fs/bio.c: add missing __user annotation · 0e0c6212

由 Michal Simek 提交于 6月 10, 2009

As reported by sparse:

fs/bio.c:720:13: warning: incorrect type in assignment (different address spaces)
fs/bio.c:720:13:    expected char *iov_addr
fs/bio.c:720:13:    got void [noderef] <asn:1>*
fs/bio.c:724:36: warning: incorrect type in argument 2 (different address spaces)
fs/bio.c:724:36:    expected void const [noderef] <asn:1>*from
fs/bio.c:724:36:    got char *iov_addr
Signed-off-by: NMichal Simek <monstr@monstr.eu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

0e0c6212

10 6月, 2009 1 次提交

tracing/events: convert block trace points to TRACE_EVENT() · 55782138

由 Li Zefan 提交于 6月 09, 2009

TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions
  ...

Cons:

  - no dev_t info for the output of plug, unplug_timer and unplug_io events.
    no dev_t info for getrq and sleeprq events if bio == NULL.
    no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.

    This is mainly because we can't get the deivce from a request queue.
    But this may change in the future.

  - A packet command is converted to a string in TP_assign, not TP_print.
    While blktrace do the convertion just before output.

    Since pc requests should be rather rare, this is not a big issue.

  - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
    has a unique format, which means we have some unused data in a trace entry.

    The overhead is minimized by using __dynamic_array() instead of __array().

I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:

      dd                   dd + ioctl blktrace       dd + TRACE_EVENT (splice)
1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s          7.41s, 42.5 MB/s
2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s          7.43s, 42.4 MB/s
3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s          7.41s, 42.5 MB/s

So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.

And the binary output of TRACE_EVENT is much smaller than blktrace:

 # ls -l -h
 -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
 -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
 -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out

Following are some comparisons between TRACE_EVENT and blktrace:

plug:
  kjournald-480   [000]   303.084981: block_plug: [kjournald]
  kjournald-480   [000]   303.084981:   8,0    P   N [kjournald]

unplug_io:
  kblockd/0-118   [000]   300.052973: block_unplug_io: [kblockd/0] 1
  kblockd/0-118   [000]   300.052974:   8,0    U   N [kblockd/0] 1

remap:
  kjournald-480   [000]   303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
  kjournald-480   [000]   303.085043:   8,0    A   W 102736992 + 8 <- (8,8) 33384

bio_backmerge:
  kjournald-480   [000]   303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
  kjournald-480   [000]   303.085086:   8,0    M   W 102737032 + 8 [kjournald]

getrq:
  kjournald-480   [000]   303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
  kjournald-480   [000]   303.084975:   8,0    G   W 102736984 + 8 [kjournald]

  bash-2066  [001]  1072.953770:   8,0    G   N [bash]
  bash-2066  [001]  1072.953773: block_getrq: 0,0 N 0 + 0 [bash]

rq_complete:
  konsole-2065  [001]   300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
  konsole-2065  [001]   300.053191:   8,0    C   W 103669040 + 16 [0]

  ksoftirqd/1-7   [001]  1072.953811:   8,0    C   N (5a 00 08 00 00 00 00 00 24 00) [0]
  ksoftirqd/1-7   [001]  1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]

rq_insert:
  kjournald-480   [000]   303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
  kjournald-480   [000]   303.084986:   8,0    I   W 102736984 + 8 [kjournald]

Changelog from v2 -> v3:

- use the newly introduced __dynamic_array().

Changelog from v1 -> v2:

- use __string() instead of __array() to minimize the memory required
  to store hex dump of rq->cmd().

- support large pc requests.

- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.

- some cleanups.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

55782138

23 5月, 2009 2 次提交

block: Use accessor functions for queue limits · ae03bf63

由 Martin K. Petersen 提交于 5月 22, 2009

Convert all external users of queue limits to using wrapper functions
instead of poking the request queue variables directly.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

ae03bf63

block: Do away with the notion of hardsect_size · e1defc4f

由 Martin K. Petersen 提交于 5月 22, 2009

Until now we have had a 1:1 mapping between storage device physical
block size and the logical block sized used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case.  The
sector size will be 4KB but the logical block size will remain
512-bytes.  Hence we need to distinguish between the physical block size
and the logical ditto.

This patch renames hardsect_size to logical_block_size.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

e1defc4f

19 5月, 2009 1 次提交

bio: always copy back data for copied kernel requests · 4fc981ef

由 Tejun Heo 提交于 5月 19, 2009

When a read bio_copy_kern() request fails, the content of the bounce
buffer is not copied back.  However, as request failure doesn't
necessarily mean complete failure, the buffer state can be useful.
This behavior is also inconsistent with the user map counterpart and
causes the subtle difference between bounced and unbounced IO causes
confusion.

This patch makes bio_copy_kern_endio() ignore @err and always copy
back data on request completion.
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

4fc981ef

29 4月, 2009 1 次提交

bio: fix memcpy corruption in bio_copy_user_iov() · 69838727

由 FUJITA Tomonori 提交于 4月 28, 2009

st driver uses blk_rq_map_user() in order to just build a request out
of page frames. In this case, map_data->offset is a non zero value and
iov[0].iov_base is NULL. We need to increase nr_pages for that.

Cc: stable@kernel.org
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

69838727

22 4月, 2009 2 次提交

bio: use bio_kmalloc() in copy/map functions · a9e9dc24

由 Tejun Heo 提交于 4月 15, 2009

Impact: remove possible deadlock condition

There is no reason to use mempool backed allocation for map functions.
Also, because kern mapping is used inside LLDs (e.g. for EH), using
mempool backed allocation can lead to deadlock under extreme
conditions (mempool already consumed by the time a request reached EH
and requests are blocked on EH).

Switch copy/map functions to bio_kmalloc().
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a9e9dc24

bio: fix bio_kmalloc() · 451a9ebf

由 Tejun Heo 提交于 4月 15, 2009

Impact: fix bio_kmalloc() and its destruction path

bio_kmalloc() was broken in two ways.

* bvec_alloc_bs() first allocates bvec using kmalloc() and then
  ignores it and allocates again like non-kmalloc bvecs.

* bio_kmalloc_destructor() didn't check for and free bio integrity
  data.

This patch fixes the above problems.  kmalloc patch is separated out
from bio_alloc_bioset() and allocates the requested number of bvecs as
inline bvecs.

* bio_alloc_bioset() no longer takes NULL @bs.  None other than
  bio_kmalloc() used it and outside users can't know how it was
  allocated anyway.

* Define and use BIO_POOL_NONE so that pool index check in
  bvec_free_bs() triggers if inline or kmalloc allocated bvec gets
  there.

* Relocate destructors on top of each allocation function so that how
  they're used is more clear.

Jens Axboe suggested allocating bvecs inline.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

451a9ebf

15 4月, 2009 1 次提交

bio: add documentation to bio_alloc() · 86c824b9

由 Jens Axboe 提交于 4月 15, 2009

Explain that with __GFP_WAIT set it will not fail, and that the caller
must never allocate more than 1 bio at the time.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

86c824b9

30 3月, 2009 1 次提交

trivial: Fix typo in bio_split()'s documentation · c7eee1b8

由 Alberto Bertogli 提交于 1月 25, 2009

Signed-off-by: NAlberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

c7eee1b8

24 3月, 2009 3 次提交

block: add private bio_set for bio integrity allocations · 6d2a78e7

由 Martin K. Petersen 提交于 3月 10, 2009

The integrity bio allocation needs its own bio_set to avoid violating
the mempool allocation rules and risking deadlocks.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

6d2a78e7

block: don't create bio_vec slabs of less than the inline number · a7fcd37c

由 Jens Axboe 提交于 12月 05, 2008

If we don't have CONFIG_BLK_DEV_INTEGRITY set, then we don't have
any external dependencies on the bio_vec slabs. So don't create
the ones that we will inline anyway.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a7fcd37c

block: cleanup bio_alloc_bioset() · 34053979

由 Ingo Molnar 提交于 2月 21, 2009

this warning (which got fixed by commit b2bf9683):

  fs/bio.c: In function ‘bio_alloc_bioset’:
  fs/bio.c:305: warning: ‘p’ may be used uninitialized in this function

Triggered because the code flow in bio_alloc_bioset() is correct
but a bit complex for the compiler to see through.

Streamline it a bit - this also makes the code a tiny bit more compact:

   text	   data	    bss	    dec	    hex	filename
   7540	    256	     40	   7836	   1e9c	bio.o.before
   7539	    256	     40	   7835	   1e9b	bio.o.after

Also remove an older compiler-warnings annotation from this function,
it's not needed.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

34053979

15 3月, 2009 2 次提交

block: fix memory leak in bio_clone() · 059ea331

由 Li Zefan 提交于 3月 09, 2009

If bio_integrity_clone() fails, bio_clone() returns NULL without freeing
the newly allocated bio.
Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

059ea331

block: Add gfp_mask parameter to bio_integrity_clone() · 87092698

由 un'ichi Nomura 提交于 3月 09, 2009

Stricter gfp_mask might be required for clone allocation.
For example, request-based dm may clone bio in interrupt context
so it has to use GFP_ATOMIC.
Signed-off-by: NKiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

87092698

26 2月, 2009 1 次提交

block: fix bogus gcc warning for uninitialized var usage · b2bf9683

由 Jens Axboe 提交于 2月 19, 2009

Newer gcc throw this warning:

        fs/bio.c: In function ?bio_alloc_bioset?:
        fs/bio.c:305: warning: ?p? may be used uninitialized in this function

since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

b2bf9683

18 2月, 2009 1 次提交

fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free · a60e78e5

由 Subhash Peddamallu 提交于 2月 16, 2009

When freeing from bio pool use right ptr to account for bs->front_pad,
instead of bio ptr,
Signed-off-by: NSubhash Peddamallu <subhash.peddamallu@gmail.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a60e78e5

03 1月, 2009 3 次提交

[SCSI] block: make blk_rq_map_user take a NULL user-space buffer for WRITE · 97ae77a1

由 FUJITA Tomonori 提交于 12月 18, 2008

The commit 81882766 (block: make
blk_rq_map_user take a NULL user-space buffer) extended
blk_rq_map_user to accept a NULL user-space buffer with a READ
command. It was necessary to convert sg to use the block layer mapping
API.

This patch extends blk_rq_map_user again for a WRITE command. It is
necessary to convert st and osst drivers to use the block layer
apping API.
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>

97ae77a1

[SCSI] block: fix the partial mappings with struct rq_map_data · 56c451f4

由 FUJITA Tomonori 提交于 12月 18, 2008

This fixes bio_copy_user_iov to properly handle the partial mappings
with struct rq_map_data (which only sg uses for now but st and osst
will shortly). It adds the offset member to struct rq_map_data and
changes blk_rq_map_user to update it so that bio_copy_user_iov can add
an appropriate page frame via bio_add_pc_page().
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>

56c451f4

[SCSI] block: fix bio_add_page misuse with rq_map_data · e623ddb4

由 FUJITA Tomonori 提交于 12月 18, 2008

This fixes bio_add_page misuse in bio_copy_user_iov with rq_map_data,
which only sg uses now.

rq_map_data carries page frames for bio_add_pc_page. bio_copy_user_iov
uses bio_add_pc_page with a larger size than PAGE_SIZE. It's clearly
wrong.
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJames Bottomley <James.Bottomley@HansenPartnership.com>

e623ddb4

29 12月, 2008 5 次提交

bio: get rid of bio_vec clearing · d3f76110

由 Jens Axboe 提交于 12月 23, 2008

We don't need to clear the memory used for adding bio_vec entries,
since nobody should be looking at members unitialized. Any valid
use should be below bio->bi_vcnt, and that members up until that count
must be valid since they were added through bio_add_page().
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

d3f76110

bio: add support for inlining a number of bio_vecs inside the bio · 392ddc32

由 Jens Axboe 提交于 12月 23, 2008

When we go and allocate a bio for IO, we actually do two allocations.
One for the bio itself, and one for the bi_io_vec that holds the
actual pages we are interested in.

This feature inlines a definable amount of io vecs inside the bio
itself, so we eliminate the bio_vec array allocation for IO's up
to a certain size. It defaults to 4 vecs, which is typically 16k
of IO.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

392ddc32

bio: allow individual slabs in the bio_set · bb799ca0

由 Jens Axboe 提交于 12月 10, 2008

Instead of having a global bio slab cache, add a reference to one
in each bio_set that is created. This allows for personalized slabs
in each bio_set, so that they can have bios of different sizes.

This means we can personalize the bios we return. File systems may
want to embed the bio inside another structure, to avoid allocation
more items (and stuffing them in ->bi_private) after the get a bio.
Or we may want to embed a number of bio_vecs directly at the end
of a bio, to avoid doing two allocations to return a bio. This is now
possible.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

bb799ca0

bio: move the slab pointer inside the bio_set · 1b434498

由 Jens Axboe 提交于 10月 22, 2008

In preparation for adding differently sized bios.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

1b434498

bio: only mempool back the largest bio_vec slab cache · 7ff9345f

由 Jens Axboe 提交于 12月 11, 2008

We only very rarely need the mempool backing, so it makes sense to
get rid of all but one of the mempool in a bio_set. So keep the
largest bio_vec count mempool so we can always honor the largest
allocation, and "upgrade" callers that fail.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

7ff9345f

26 11月, 2008 2 次提交

blktrace: port to tracepoints, update · 0bfc2455

由 Ingo Molnar 提交于 11月 26, 2008

Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>

0bfc2455

blktrace: port to tracepoints · 5f3ea37c

由 Arnaldo Carvalho de Melo 提交于 10月 30, 2008

This was a forward port of work done by Mathieu Desnoyers, I changed it to
encode the 'what' parameter on the tracepoint name, so that one can register
interest in specific events and not on classes of events to then check the
'what' parameter.
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

5f3ea37c

09 10月, 2008 9 次提交

block: mark bio_split_pool static · 6feef531

由 Denis ChengRq 提交于 10月 09, 2008

Since all bio_split calls refer the same single bio_split_pool, the bio_split
function can use bio_split_pool directly instead of the mempool_t parameter;

then the mempool_t parameter can be removed from bio_split param list, and
bio_split_pool is only referred in fs/bio.c file, can be marked static.
Signed-off-by: NDenis ChengRq <crquan@gmail.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

6feef531

block: Find bio sector offset given idx and offset · ad3316bf

由 Martin K. Petersen 提交于 10月 01, 2008

Helper function to find the sector offset in a bio given bvec index
and page offset.
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

ad3316bf

block: add bio_kmalloc() · 0a0d96b0

由 Jens Axboe 提交于 9月 11, 2008

Not all callers need (or want!) the mempool backing guarentee, it
essentially means that you can only use bio_alloc() for short allocations
and not for preallocating some bio's at setup or init time.

So add bio_kmalloc() which does the same thing as bio_alloc(), except
it just uses kmalloc() as the backing instead of the bio mempools.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

0a0d96b0

block: make blk_rq_map_user take a NULL user-space buffer · 81882766

由 FUJITA Tomonori 提交于 9月 02, 2008

This patch changes blk_rq_map_user to accept a NULL user-space buffer
with a READ command if rq_map_data is not NULL. Thus a caller can pass
page frames to lk_rq_map_user to just set up a request and bios with
page frames propely. bio_uncopy_user (called via blk_rq_unmap_user)
doesn't copy data to user space with such request.
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

81882766

bio: convert bio_copy_kern to use bio_copy_user · 4d8ab62e

由 FUJITA Tomonori 提交于 8月 28, 2008

bio_copy_kern and bio_copy_user are very similar. This converts
bio_copy_kern to use bio_copy_user.
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

4d8ab62e

block: introduce struct rq_map_data to use reserved pages · 152e283f

由 FUJITA Tomonori 提交于 8月 28, 2008

This patch introduces struct rq_map_data to enable bio_copy_use_iov()
use reserved pages.

Currently, bio_copy_user_iov allocates bounce pages but
drivers/scsi/sg.c wants to allocate pages by itself and use
them. struct rq_map_data can be used to pass allocated pages to
bio_copy_user_iov.

The current users of bio_copy_user_iov simply passes NULL (they don't
want to use pre-allocated pages).
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

152e283f

block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov · a3bce90e

由 FUJITA Tomonori 提交于 8月 28, 2008

Currently, blk_rq_map_user and blk_rq_map_user_iov always do
GFP_KERNEL allocation.

This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
so sg can use it (sg always does GFP_ATOMIC allocation).
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NDouglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

a3bce90e

block: add support for IO CPU affinity · c7c22e4d

由 Jens Axboe 提交于 9月 13, 2008

This patch adds support for controlling the IO completion CPU of
either all requests on a queue, or on a per-request basis. We export
a sysfs variable (rq_affinity) which, if set, migrates completions
of requests to the CPU that originally submitted it. A bio helper
(bio_set_completion_cpu()) is also added, so that queuers can ask
for completion on that specific CPU.

In testing, this has been show to cut the system time by as much
as 20-40% on synthetic workloads where CPU affinity is desired.

This requires a little help from the architecture, so it'll only
work as designed for archs that are using the new generic smp
helper infrastructure.
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

c7c22e4d

drop vmerge accounting · 5df97b91

由 Mikulas Patocka 提交于 8月 15, 2008

Remove hw_segments field from struct bio and struct request. Without virtual
merge accounting they have no purpose.
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

5df97b91