- 19 5月, 2009 4 次提交
-
-
由 Miklos Szeredi 提交于
Unfortunately multiple kmap() within a single thread are deadlockable, so writing out multiple buffers with writev() isn't possible. Change the implementation so that it does a separate write() for each buffer. This actually simplifies the code a lot since the splice_from_pipe() helper can be used. This limitation is caused by HIGHMEM pages, and so only affects a subset of architectures and configurations. In the future it may be worth to implement default_file_splice_write() in a more efficient way on configs that allow it. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
When a read bio_copy_kern() request fails, the content of the bounce buffer is not copied back. However, as request failure doesn't necessarily mean complete failure, the buffer state can be useful. This behavior is also inconsistent with the user map counterpart and causes the subtle difference between bounced and unbounced IO causes confusion. This patch makes bio_copy_kern_endio() ignore @err and always copy back data on request completion. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
In commit c3a4d78c, while introducing rq->resid_len, the default value of residue count was changed from full count to zero. The conversion was done under the assumption that when a request fails residue count wasn't defined. However, Boaz and James pointed out that this wasn't true and the residue count should be preserved for failed requests too. This patchset restores the original behavior by setting rq->resid_len to blk_rq_bytes(rq) on request start and restoring explicit clearing in affected drivers. While at it, take advantage of the fact that rq->resid_len is set to full count where applicable. * ide-cd: rq->resid_len cleared on pc success * mptsas: req->resid_len cleared on success * sas_expander: rsp/req->resid_len cleared on success * mpt2sas_transport: req->resid_len cleared on success * ide-cd, ide-tape, mptsas, sas_host_smp, mpt2sas_transport, ub: take advantage of initial full count to simplify code Boaz Harrosh spotted bug in resid_len initialization. Fixed as suggested. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NBorislav Petkov <petkovbb@googlemail.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
ub_end_rq() always tries to complete full request. The @cmd_len parameter was there because rq->data_len used to be overwritten with residue count. Drop @cmd_len and use __blk_end_request_all(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 18 5月, 2009 3 次提交
-
-
由 Jens Axboe 提交于
drivers/block/virtio_blk.c: In function 'blk_done': drivers/block/virtio_blk.c:53: warning: unused variable 'nr_bytes' Leftover from commit 1cde26f9Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Hannes Reinecke 提交于
Add support for SG_IO passthru to virtio_blk. We add the scsi command block after the normal outhdr, and the scsi inhdr with full status information aswell as the sense buffer before the regular inhdr. [hch: forward ported, added the VIRTIO_BLK_F_SCSI flags, some comments and tested the whole beast] [axboe: updated to use ->resid and not dual-path the byte count] Signed-off-by: NHannes Reinecke <hare@suse.de> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ checkpatch.pl tweak) Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Christoph Hellwig 提交于
request->rq_disk is only set for FS requests or BLOCK_PC requests originating from the generic block layer scsi ioctls. It's not set for requests origination from other soures or internal cache flush commands implemented by the patch I'll send after this. So instead of using it to get at the private data in do_virtblk_request setup queue->queuedata and use it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 14 5月, 2009 1 次提交
-
-
由 Andrew Morton 提交于
fs/splice.c: In function 'default_file_splice_read': fs/splice.c:566: warning: 'error' may be used uninitialized in this function which is sort-of true. The code will in fact return -ENOMEM instead of the kernel_readv() return value. Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 13 5月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
We cannot reliably map more than one page at the time, or we risk deadlocking. Just allocate the pages from low mem instead. Reported-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 12 5月, 2009 2 次提交
-
-
由 Jens Axboe 提交于
Splice is tied to pipes by design, it'll not change. And now that the splice stuff is in splice.h (and note pipe.h), the rest of the comment is out-of-date as well. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Commit c3a4d78c introduced rq->data_len and converted residual count users to it. While converting, it mistakenly converted scsi_end_request() to finish requests with residual count when it wants to do is fully complete the request. Fix it by using blk_end_request_all() instead. This bug was spotted by Boaz Harrosh. Signed-off-by: NTejun Heo <tj@kernel.org> Spotted-by: NBoaz Harrosh <bharrosh@panasas.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 11 5月, 2009 29 次提交
-
-
由 Miklos Szeredi 提交于
If f_op->splice_write() is not implemented, fall back to a plain write. Use vfs_writev() to write from the pipe buffers. This will allow splice on all filesystems and file types. This includes "direct_io" files in fuse which bypass the page cache. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Miklos Szeredi 提交于
If f_op->splice_read() is not implemented, fall back to a plain read. Use vfs_readv() to read into previously allocated pages. This will allow splice and functions using splice, such as the loop device, to work on all filesystems. This includes "direct_io" files in fuse which bypass the page cache. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Miklos Szeredi 提交于
Allow splice(2) to work when both the input and the output is a pipe. Based on the impementation of the tee(2) syscall, but instead of duplicating the buffer references move the buffers from the input pipe to the output pipe. Moving the whole buffer only succeeds if the full length of the buffer is spliced. Otherwise duplicate the buffer, just like tee(2), set the length of the output buffer and advance the offset on the input buffer. Since splice is operating on two pipes, special care needs to be taken with locking to prevent AN ABBA deadlock. Again this is done similarly to the tee(2) syscall, first preparing the input and output pipes so there's data to consume and space for that data, and then doing the move operation while holding both locks. If other processes are doing I/O on the same pipes parallel to the splice, then by the time both inodes are locked there might be no buffers left to move, or no space to move them to. In this case retry the whole operation, including the preparation phase. This could lead to starvation, but I'm not sure if that's serious enough to worry about. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
Let's put the completion related functions back to block/blk-core.c where they have lived. We can also unexport blk_end_bidi_request() and __blk_end_bidi_request(), which nobody uses. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
Let's use blk_end_request_all() instead of blk_end_bidi_request(). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 FUJITA Tomonori 提交于
blk_end_request_all() and __blk_end_request_all() should finish all bytes including bidi, by definition. That's what all bidi users need , bidi requests must be complete as a whole (partial completion is impossible). Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Till now block layer allowed two separate modes of request execution. A request is always acquired from the request queue via elv_next_request(). After that, drivers are free to either dequeue it or process it without dequeueing. Dequeue allows elv_next_request() to return the next request so that multiple requests can be in flight. Executing requests without dequeueing has its merits mostly in allowing drivers for simpler devices which can't do sg to deal with segments only without considering request boundary. However, the benefit this brings is dubious and declining while the cost of the API ambiguity is increasing. Segment based drivers are usually for very old or limited devices and as converting to dequeueing model isn't difficult, it doesn't justify the API overhead it puts on block layer and its more modern users. Previous patches converted all block low level drivers to dequeueing model. This patch completes the API transition by... * renaming elv_next_request() to blk_peek_request() * renaming blkdev_dequeue_request() to blk_start_request() * adding blk_fetch_request() which is combination of peek and start * disallowing completion of queued (not started) requests * applying new API to all LLDs Renamings are for consistency and to break out of tree code so that it's apparent that out of tree drivers need updating. [ Impact: block request issue API cleanup, no functional change ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Miller <mike.miller@hp.com> Cc: unsik Kim <donari75@gmail.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Laurent Vivier <Laurent@lvivier.info> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Stefan Weinhuber <wein@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
gdrom already dequeues and fully completes requests on normal path and the error paths can be easily converted to do so too. Clean it up and dequeue requests on error paths too. While at it remove superflous blk_fs_request() && !blk_rq_sectors() condition check. [ Impact: dequeue in-flight request, cleanup ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
plat-omap/mailbox, floppy, viocd, mspro_block, i2o_block and mmc/card/queue are already pretty close to dequeueing model and can be converted with simple changes. Convert them. While at it, * xen-blkfront: !fs check moved downwards to share dequeue call with normal path. * mspro_block: __blk_end_request(..., blk_rq_cur_byte()) converted to __blk_end_request_cur() * mmc/card/queue: loop of __blk_end_request() converted to __blk_end_request_all() [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Alex Dubov <oakad@yahoo.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
z2ram processes requests one-by-one synchronously and can be easily converted to dequeueing model. Convert it. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
jsflash processes requests one-by-one synchronously from a kthread and can be easily converted to dequeueing model. Convert it. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
mtd_blkdevs processes requests one-by-one synchronously from a kthread and can be easily converted to dequeueing model. Convert it. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
xd processes requests one-by-one synchronously and can be easily converted to dequeueing model. Convert it. While at it, use rq_cur_bytes instead of rq_bytes when checking for sector overflow. This is for for consistency and better behavior for merged requests. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
swim processes requests one-by-one synchronously and can easily be converted to dequeuing model. Convert it. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Laurent Vivier <Laurent@lvivier.info> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Request processing in amiflop is done sequentially in redo_fd_request() proper and redo_fd_request() can easily be converted to track in-flight request. Remove CURRENT, track in-flight request directly and dequeue it when processing starts. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Other than in issue error paths, ps3disk always completely finishes fetched requests. With full completion on error paths, it can be easily converted to dequeueing model. * After L1 r/w call failure, ps3disk_submit_request_sg() now fails the whole request. Issue failure isn't likely to benefit from partial retry anyway and ps3disk uses full failure in completion error path too, so I don't think this amounts to any meaningful functionality loss. * flush completion is converted to _all for consistency. It doesn't make any functional difference. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
pd/pf/pcd have track in-flight request by pd/pf/pcd_req. They can be converted to dequeueing model by updating fetching and completion paths. Convert them. Note that removal of elv_next_request() call from pf_next_buf() doesn't make any functional difference. The path is traveled only during partial completion of a request and elv_next_request() call must return the same request anyway. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Tim Waugh <tim@cyberelk.net> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
xsysace already tracks in-flight request using ace->req. Converting to dequeueing model is mostly a matter of adding dequeueing call after request fetching. The only tricky part is handling CF removal which should complete both in flight and on queue requests. Convert to dequeueing model. While at it, remove explicit blk_rq_cur_bytes() and use __blk_end_request_cur() instead. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
swim3 has at most single request in flight and already tracks it using fd_req. Convert it to dequeuing model by updating request fetching and wrapping completion function. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
ataflop has single request in flight. Till now, whenever it needs to access the in-flight request it called elv_next_request(). This patch makes ataflop track the in-flight request directly and dequeue it when processing starts. The added complexity is minimal and this will help future block layer changes. [ Impact: dequeue in-flight request, one elv_next_request() per request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
hd has at most single request in flight. Till now, whenever it needs to access the in-flight request it called elv_next_request(). This patch makes hd track the in-flight request directly and dequeue it when processing starts. The added complexity is minimal and this will help future block layer changes. [ Impact: dequeue in-flight request, one elv_next_request() per request ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
mg_disk has at most single request in flight per device. Till now, whenever it needs to access the in-flight request it called elv_next_request(). This patch makes mg_disk track the in-flight request directly using mg_host->req and dequeue it when processing starts. q->queuedata is set to mg_host so that mg_host can be determined without fetching request from the queue. [ Impact: dequeue in-flight request, one elv_next_request() per request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: unsik Kim <donari75@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Both request functions in mg_disk simply return when they encounter a !fs request, which means the request will never be cleared from the queue causing queue hang and indefinite retry of the request. Fix it. While at it, flatten condition checks and add unlikely to !fs tests. [ Impact: fix possible queue hang / infinite retry of !fs requests ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: unsik Kim <donari75@gmail.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
ide generally has single request in flight and tracks it using hwif->rq and all state handlers follow the following convention. * ide_started is returned if the request is in flight. * ide_stopped is returned if the queue needs to be restarted. The request might or might not have been processed fully or partially. * hwif->rq is set to NULL, when an issued request completes. So, dequeueing model can be implemented by dequeueing after fetch, requeueing if hwif->rq isn't NULL on ide_stopped return and doing about the same thing on completion / port unlock paths. These changes can be made in ide-io proper. In addition to the above main changes, the following updates are necessary. * ide-cd shouldn't dequeue a request when issuing REQUEST SENSE for it as the request is already dequeued. * ide-atapi uses request queue as stack when issuing REQUEST SENSE to put the REQUEST SENSE in front of the failed request. This now needs to be done using requeueing. [ Impact: dequeue in-flight request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
With the previous changes, the followings are now guaranteed for all requests in any valid state. * blk_rq_sectors() == blk_rq_bytes() >> 9 * blk_rq_cur_sectors() == blk_rq_cur_bytes() >> 9 Clean up accessor usages. Notable changes are * nbd,i2o_block: end_all used instead of explicit byte count * scsi_lib: unnecessary conditional on request type removed [ Impact: cleanup ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
Block low level drivers for some reason have been pretty good at abusing block layer API. Especially struct request's fields tend to get violated in all possible ways. Make it clear that low level drivers MUST NOT access or manipulate rq->sector and rq->data_len directly by prefixing them with double underscores. This change is also necessary to break build of out-of-tree codes which assume the previous block API where internal fields can be manipulated and rq->data_len carries residual count on completion. [ Impact: hide internal fields, block API change ] Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
With recent unification of fields, it's now guaranteed that rq->data_len always equals blk_rq_bytes(). Convert all direct users to accessors. [ Impact: convert direct rq->data_len usages to blk_rq_bytes() ] Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
With recent unification of fields, it's now guaranteed that rq->data_len always equals blk_rq_bytes(). Convert all non-IDE direct users to accessors. IDE will be converted in a separate patch. Boaz: spotted incorrect data_len/resid_len conversion in osd. [ Impact: convert direct rq->data_len usages to blk_rq_bytes() ] Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NSergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Tejun Heo 提交于
struct request has had a few different ways to represent some properties of a request. ->hard_* represent block layer's view of the request progress (completion cursor) and the ones without the prefix are supposed to represent the issue cursor and allowed to be updated as necessary by the low level drivers. The thing is that as block layer supports partial completion, the two cursors really aren't necessary and only cause confusion. In addition, manual management of request detail from low level drivers is cumbersome and error-prone at the very least. Another interesting duplicate fields are rq->[hard_]nr_sectors and rq->{hard_cur|current}_nr_sectors against rq->data_len and rq->bio->bi_size. This is more convoluted than the hard_ case. rq->[hard_]nr_sectors are initialized for requests with bio but blk_rq_bytes() uses it only for !pc requests. rq->data_len is initialized for all request but blk_rq_bytes() uses it only for pc requests. This causes good amount of confusion throughout block layer and its drivers and determining the request length has been a bit of black magic which may or may not work depending on circumstances and what the specific LLD is actually doing. rq->{hard_cur|current}_nr_sectors represent the number of sectors in the contiguous data area at the front. This is mainly used by drivers which transfers data by walking request segment-by-segment. This value always equals rq->bio->bi_size >> 9. However, data length for pc requests may not be multiple of 512 bytes and using this field becomes a bit confusing. In general, having multiple fields to represent the same property leads only to confusion and subtle bugs. With recent block low level driver cleanups, no driver is accessing or manipulating these duplicate fields directly. Drop all the duplicates. Now rq->sector means the current sector, rq->data_len the current total length and rq->bio->bi_size the current segment length. Everything else is defined in terms of these three and available only through accessors. * blk_recalc_rq_sectors() is collapsed into blk_update_request() and now handles pc and fs requests equally other than rq->sector update. This means that now pc requests can use partial completion too (no in-kernel user yet tho). * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer now uses byte count as the primary data length. * blk_rq_pos() is now guranteed to be always correct. In-block users converted. * blk_rq_bytes() is now guaranteed to be always valid as is blk_rq_sectors(). In-block users converted. * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9. More convenient one is used. * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const pointer to request. [ Impact: API cleanup, single way to represent one property of a request ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-