1. 11 May, 2009 7 commits
    • T
      block: drop request->hard_* and *nr_sectors · 2e46e8b2
      Tejun Heo committed
      struct request has had a few different ways to represent some
      properties of a request.  ->hard_* represent block layer's view of the
      request progress (completion cursor) and the ones without the prefix
      are supposed to represent the issue cursor and allowed to be updated
      as necessary by the low level drivers.  The thing is that as block
      layer supports partial completion, the two cursors really aren't
      necessary and only cause confusion.  In addition, manual management of
      request detail from low level drivers is cumbersome and error-prone at
      the very least.
      
      Another interesting set of duplicate fields is rq->[hard_]nr_sectors and
      rq->{hard_cur|current}_nr_sectors against rq->data_len and
      rq->bio->bi_size.  This is more convoluted than the hard_ case.
      
      rq->[hard_]nr_sectors are initialized for requests with bio but
      blk_rq_bytes() uses it only for !pc requests.  rq->data_len is
      initialized for all requests but blk_rq_bytes() uses it only for pc
      requests.  This causes a good amount of confusion throughout block layer
      and its drivers and determining the request length has been a bit of
      black magic which may or may not work depending on circumstances and
      what the specific LLD is actually doing.
      
      rq->{hard_cur|current}_nr_sectors represent the number of sectors in
      the contiguous data area at the front.  This is mainly used by drivers
      which transfer data by walking request segment-by-segment.  This
      value always equals rq->bio->bi_size >> 9.  However, data length for
      pc requests may not be multiple of 512 bytes and using this field
      becomes a bit confusing.
      
      In general, having multiple fields to represent the same property
      leads only to confusion and subtle bugs.  With recent block low level
      driver cleanups, no driver is accessing or manipulating these
      duplicate fields directly.  Drop all the duplicates.  Now rq->sector
      means the current sector, rq->data_len the current total length and
      rq->bio->bi_size the current segment length.  Everything else is
      defined in terms of these three and available only through accessors.
      
      * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
        now handles pc and fs requests equally other than rq->sector update.
        This means that now pc requests can use partial completion too (no
        in-kernel user yet tho).
      
      * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
        now uses byte count as the primary data length.
      
      * blk_rq_pos() is now guaranteed to be always correct.  In-block users
        converted.
      
      * blk_rq_bytes() is now guaranteed to be always valid as is
        blk_rq_sectors().  In-block users converted.
      
      * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
        More convenient one is used.
      
      * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
        pointer to request.
      
      [ Impact: API cleanup, single way to represent one property of a request ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
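The "three fields plus accessors" arrangement described in the commit above can be modelled in userspace roughly as follows. This is only a sketch: `struct request_model`, `struct bio_model` and the `rq_*()` helpers are hypothetical stand-ins for `struct request`, `struct bio` and the `blk_rq_*()` accessors, and they mirror just what the commit message states (sector position, total byte length, current segment length, sectors derived as bytes >> 9).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct bio. */
struct bio_model {
	unsigned int bi_size;		/* bytes in the current segment */
};

/* Hypothetical userspace stand-in for struct request. */
struct request_model {
	uint64_t sector;		/* current sector, 512-byte units */
	unsigned int data_len;		/* current total length in bytes */
	struct bio_model *bio;		/* front bio, may be NULL */
};

/* Current position of the request. */
static inline uint64_t rq_pos(const struct request_model *rq)
{
	return rq->sector;
}

/* Total bytes left in the request. */
static inline unsigned int rq_bytes(const struct request_model *rq)
{
	return rq->data_len;
}

/* Bytes in the contiguous segment at the front. */
static inline unsigned int rq_cur_bytes(const struct request_model *rq)
{
	return rq->bio ? rq->bio->bi_size : 0;
}

/* Sector counts are always derived from the byte counts. */
static inline unsigned int rq_sectors(const struct request_model *rq)
{
	return rq_bytes(rq) >> 9;
}

static inline unsigned int rq_cur_sectors(const struct request_model *rq)
{
	return rq_cur_bytes(rq) >> 9;
}
```

Because every sector count is computed from a byte count, there is no second copy of the length that a driver could leave stale.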
    • T
      ide: convert to rq pos and nr_sectors accessors · 9780e2dd
      Tejun Heo committed
      ide doesn't manipulate request fields anymore and thus all hard and
      their soft equivalents are always equal.  Convert all references to
      accessors.
      
      [ Impact: use pos and nr_sectors accessors ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: convert to pos and nr_sectors accessors · 83096ebf
      Tejun Heo committed
      With recent cleanups, there is no place where low level driver
      directly manipulates request fields.  This means that the 'hard'
      request fields always equal the !hard fields.  Convert all
      rq->sector, nr_sectors and current_nr_sectors references to
      accessors.
      
      While at it, drop superfluous blk_rq_pos() < 0 test in swim.c.
      
      [ Impact: use pos and nr_sectors accessors ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Tested-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
      Acked-by: Mike Miller <mike.miller@hp.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Dario Ballabio <ballabio_dario@emc.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: unsik Kim <donari75@gmail.com>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones · 5b93629b
      Tejun Heo committed
      Implement accessors - blk_rq_pos(), blk_rq_sectors() and
      blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
      and rq->hard_cur_sectors respectively and convert direct references of
      the said fields to the accessors.
      
      This is in preparation of request data length handling cleanup.
      
      Geert	: suggested adding const to struct request * parameter to accessors
      Sergei	: spotted error in patch description
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
      Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: add rq->resid_len · c3a4d78c
      Tejun Heo committed
      rq->data_len served two purposes - the length of data buffer on issue
      and the residual count on completion.  This duality creates some
      headaches.
      
      First of all, block layer and low level drivers can't really determine
      what rq->data_len contains while a request is executing.  It could be
      the total request length or it could be anything else one of the
      lower layers is using to keep track of residual count.  This
      complicates things because blk_rq_bytes() and thus
      [__]blk_end_request_all() rely on rq->data_len for PC commands.
      Drivers which want to report residual count should first cache the
      total request length, update rq->data_len and then complete the
      request with the cached data length.
      
      Secondly, it makes requests default to reporting full residual count,
      ie. reporting that no data transfer occurred.  The residual count is
      an exception not the norm; however, the driver should clear
      rq->data_len to zero to signify the normal cases while leaving it
      alone means no data transfer occurred at all.  This reverse default
      behavior complicates code unnecessarily and renders block PC on some
      drivers (ide-tape/floppy) unusable.
      
      This patch adds rq->resid_len which is used only for residual count.
      
      While at it, remove now unnecessary blk_rq_bytes() caching in
      ide_pc_intr() as rq->data_len is not changed anymore.
      
      Boaz	: spotted missing conversion in osd
      Sergei	: spotted too early conversion to blk_rq_bytes() in ide-tape
      
      [ Impact: cleanup residual count handling, report 0 resid by default ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Borislav Petkov <petkovbb@googlemail.com>
      Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Doug Gilbert <dgilbert@interlog.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Eric Moore <Eric.Moore@lsi.com>
      Cc: Darrick J. Wong <djwong@us.ibm.com>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
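The split between total length and residual count described in the commit above can be illustrated with a small userspace model. It is a sketch under assumptions: `struct request_model` and the helper names are hypothetical, and the only behaviour taken from the commit message is that the residual lives in its own field and defaults to zero, so a driver that does nothing reports a full transfer.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for struct request. */
struct request_model {
	unsigned int data_len;	/* total length in bytes, not touched on completion */
	unsigned int resid_len;	/* bytes NOT transferred; 0 means a full transfer */
};

/* Initialise a request: the default is "no residual". */
static inline void rq_init(struct request_model *rq, unsigned int len)
{
	rq->data_len = len;
	rq->resid_len = 0;
}

/* A driver that saw a short transfer records only the residual. */
static inline void rq_report_transferred(struct request_model *rq,
					 unsigned int transferred)
{
	assert(transferred <= rq->data_len);
	rq->resid_len = rq->data_len - transferred;
}

/* The issuer derives the transferred byte count on completion. */
static inline unsigned int rq_bytes_transferred(const struct request_model *rq)
{
	return rq->data_len - rq->resid_len;
}
```

With a single dual-purpose field, the same driver would first have to cache the total length before overwriting it, which is the bookkeeping the commit removes.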
    • T
      ide-tape: don't initialize rq->sector for rw requests · 9720aef2
      Tejun Heo committed
      rq->sector is set to the tape->first_frame but it's never actually
      used and not even in the correct unit (512 byte sectors).  Don't set
      it.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Borislav Petkov <petkovbb@gmail.com>
      Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      nbd: don't clear rq->sector and nr_sectors unnecessarily · 53d6979a
      Tejun Heo committed
      There's no reason to clear rq->sector and nr_sectors after calling
      blk_rq_init().  They're guaranteed to be clear.  Drop unnecessary
      clearing.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Paul Clements <paul.clements@steeleye.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  2. 28 Apr, 2009 33 commits
    • B
      mg_disk: use defines from <linux/ata.h> · f68adec3
      Bartlomiej Zolnierkiewicz committed
      While at it:
      - remove MG_REG_HEAD_MUST_BE_ON define
      - remove MG_REG_CTRL_INTR_ENABLE define
      - remove MG_REG_HEAD_LBA_MODE define
      - remove unused defines
      
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • B
      mg_disk: fix dependency on libata · 8a11a789
      Bartlomiej Zolnierkiewicz committed
      Add local copies of ata_id_string() and ata_id_c_string() to mg_disk
      so there is no need for the driver to depend on ATA and SCSI.
      
      [ Impact: break dependency on libata by copying ata id string functions ]
      
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
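For readers unfamiliar with why a driver would want its own ID string helpers: ATA IDENTIFY data stores text as 16-bit words with the first character in the high byte, so the string has to be unpacked byte by byte and, for the C-string variant, trimmed and terminated. The sketch below is a userspace approximation of that idea. The names `id_string()` and `id_c_string()` are hypothetical and are not the functions added to mg_disk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Unpack len bytes (len must be even) of text from 16-bit ID words,
 * starting at word index ofs.  The high byte of each word comes first.
 */
static inline void id_string(const uint16_t *id, unsigned char *s,
			     unsigned int ofs, unsigned int len)
{
	assert((len & 1) == 0);
	while (len > 0) {
		*s++ = (unsigned char)(id[ofs] >> 8);
		*s++ = (unsigned char)(id[ofs] & 0xff);
		ofs++;
		len -= 2;
	}
}

/*
 * Same as id_string(), but produce a NUL-terminated C string with
 * trailing spaces removed.  len counts the terminator, so it must be odd.
 */
static inline void id_c_string(const uint16_t *id, unsigned char *s,
			       unsigned int ofs, unsigned int len)
{
	unsigned int n = 0;

	assert(len & 1);
	id_string(id, s, ofs, len - 1);

	/* Find the end of the text, then strip space padding. */
	while (n < len - 1 && s[n] != '\0')
		n++;
	while (n > 0 && s[n - 1] == ' ')
		n--;
	s[n] = '\0';
}
```

Keeping a few lines like these local to a driver is what lets it drop a dependency on a much larger subsystem, at the cost of a small duplication.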
    • T
      mg_disk: clean up request completion paths · a03bb5a3
      Tejun Heo committed
      mg_disk implements its own partial completion.  Convert to standard
      block layer partial completion.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      mg_disk: fold mg_disk.h into mg_disk.c · eec94620
      Tejun Heo committed
      include/linux/mg_disk.h is used only by drivers/block/mg_disk.c.  No
      reason to put it in a separate header.  Fold it into mg_disk.c.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      swim: clean up request completion paths · e138b4e0
      Tejun Heo committed
      swim curiously tries to update request parameters before calling
      __blk_end_request() when __blk_end_request() will do it anyway and
      unnecessarily checks whether current_nr_sectors is zero right after
      fetching.
      
      Drop unnecessary stuff and use standard block layer mechanisms.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Laurent Vivier <Laurent@lvivier.info>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      swim3: clean up request completion paths · 467ca759
      Tejun Heo committed
      swim3 curiously tries to update request parameters before calling
      __blk_end_request() when __blk_end_request() will do it anyway, and it
      updates request for partial completion manually instead of using
      blk_update_request().  Also, it does some spurious checks on rq such
      as testing whether rq->sector is negative or current_nr_sectors is
      zero right after fetching.
      
      Drop unnecessary stuff and use standard block layer mechanisms.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      hd: clean up request completion paths · e091eb67
      Tejun Heo committed
      hd read/write_intr() functions manually manipulate request to
      incrementally complete it, which block layer already supports.  Simply
      use block layer completion routines instead of manual partial
      completion.
      
      While at it, clear unnecessary elv_next_request() check at the tail of
      read_intr().  This also makes read and write_intr() more consistent.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      ubd: drop unnecessary rq->sector manipulation · f81f2f7c
      Tejun Heo committed
      ubd curiously updates rq->sector while issuing the request in multiple
      pieces.  Don't do it and simply use local copy of sector.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      ubd: cleanup completion path · 4d6c84d9
      Tejun Heo committed
      ubd had its own block request partial completion mechanism, which is
      unnecessary as block layer already does it.  Kill ubd_end_request()
      and ubd_finish() and replace them with direct call to
      blk_end_request().
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      sunvdc: kill vdc_end_request() · 04420850
      Tejun Heo committed
      vdc_end_request() is a thin silly wrapper on top of
      __blk_end_request().  Kill it.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      ps3disk: simplify request completion · cd4c34eb
      Tejun Heo committed
      ps3disk_interrupt() always completes requests fully but it uses
      rq->hard_cur_sectors for FLUSH requests for some reason.  Drop them
      and simply use __blk_end_request_all().
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      amiflop,ataflop,xd,mg_disk: clean up unnecessary stuff from block drivers · 5b5c5d12
      Tejun Heo committed
      rq_data_dir() can only be READ or WRITE and rq->sector and nr_sectors
      are always automatically updated after partial request completion.
      Don't worry about rq_data_dir() not being either READ or WRITE or
      manually update sector and nr_sectors.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jörg Dorchain <joerg@dorchain.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: unsik Kim <donari75@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: don't init rq fields unnecessarily · 4c94dece
      Tejun Heo committed
      blk_get_request() always returns properly zeroed requests.  Don't set
      fields to zero/NULL unnecessarily.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: make blk_end_request_cur() return bool · 9fd8d0e1
      Tejun Heo committed
      In the process of mindlessly copying [__]blk_end_request_all(),
      [__]blk_end_request_cur() ended up returning void even though they're
      partial completion functions.  Fix it.
      
      [ Impact: fix braindead API ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • J
      block: include discard requests in IO accounting · c69d4854
      Jens Axboe committed
      We currently don't do merging on discard requests, but we potentially
      could. If we do, then we need to include discard requests in the IO
      accounting, or merging would end up decrementing in_flight IO counters
      for an IO which never incremented them.
      
      So enable accounting for discard requests.
      
      Problem found by Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • J
      block: make blk_do_io_stat() do the full "is this rq accountable" checks · c2553b58
      Jens Axboe committed
      We currently check for file system requests outside of blk_do_io_stat(rq),
      but we may as well just include it.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • T
      block: kill rq->data · 731ec497
      Tejun Heo committed
      Now that all block request data transfer is done via bio, rq->data
      isn't used.  Kill it.
      
      While at it, make the roles of rq->special and buffer clear.
      
      [ Impact: drop now unnecessary field from struct request ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Boaz Harrosh <bharrosh@panasas.com>
    • T
      arm-omap: don't abuse rq->data · ec24751a
      Tejun Heo committed
      omap mailbox uses rq->data as the second opaque pointer to carry
      mbox_msg_t and rq->special message argument which is needed only for
      tx.  Add and use omap_msg_tx_data struct for tx and use rq->special
      for mbox_msg_t for rx such that only rq->special is used as opaque
      pointer.
      
      [ Impact: cleanup rq->data usage, extra kmalloc in msg_send ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
    • T
      block: replace end_request() with [__]blk_end_request_cur() · f06d9a2b
      Tejun Heo committed
      end_request() has been kept around for backward compatibility;
      however, it's about time for it to go away.
      
      * There aren't too many users left.
      
      * Its use of @uptodate is pretty confusing.
      
      * In some cases, newer code ends up using a mixture of end_request() and
        [__]blk_end_request[_all](), which is way too confusing.
      
      So, add [__]blk_end_request_cur() and replace end_request() with it.
      Most conversions are straightforward.  Noteworthy ones are...
      
      * paride/pcd: next_request() updated to take 0/-errno instead of 1/0.
      
      * paride/pf: pf_end_request() and next_request() updated to take
        0/-errno instead of 1/0.
      
      * xd: xd_readwrite() updated to return 0/-errno instead of 1/0.
      
      * mtd/mtd_blkdevs: blktrans_discard_request() updated to return
        0/-errno instead of 1/0.  Unnecessary local variable res
        initialization removed from mtd_blktrans_thread().
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Joerg Dorchain <joerg@dorchain.net>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: Laurent Vivier <Laurent@lvivier.info>
      Cc: Tim Waugh <tim@cyberelk.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      Cc: unsik Kim <donari75@gmail.com>
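The two conventions touched by the commit above, 1/0 "uptodate" values versus 0/-errno error codes and a "complete the current chunk" call that reports whether more work remains, can be sketched in userspace as follows. Everything here is hypothetical: `struct request_model`, `uptodate_to_error()` and `end_request_cur()` only imitate the shape of the interface, and mapping every failure to `-EIO` is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a multi-segment request. */
struct request_model {
	unsigned int nr_segs;	/* total number of segments */
	unsigned int cur;	/* index of the current segment */
	int error;		/* first error seen, 0 if none */
};

/* Old style: 1 means success, 0 means failure.  New style: 0 / -errno. */
static inline int uptodate_to_error(int uptodate)
{
	return uptodate ? 0 : -EIO;	/* -EIO is an assumption of this sketch */
}

/*
 * Complete the current segment with the given error code.
 * Returns true while the request still has segments pending,
 * false once it has been fully completed.
 */
static inline bool end_request_cur(struct request_model *rq, int error)
{
	if (error && !rq->error)
		rq->error = error;
	if (rq->cur < rq->nr_segs)
		rq->cur++;
	return rq->cur < rq->nr_segs;
}
```

A driver loop can then be written as "transfer the current segment, call `end_request_cur()`, fetch a new request only when it returns false", which is the pattern a bool return value makes natural.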
    • T
      block: implement and use [__]blk_end_request_all() · 40cbbb78
      Tejun Heo committed
      There are many [__]blk_end_request() call sites which call it with
      full request length and expect full completion.  Many of them ensure
      that the request actually completes by doing BUG_ON() the return
      value, which is awkward and error-prone.
      
      This patch adds [__]blk_end_request_all() which takes @rq and @error
      and fully completes the request.  BUG_ON() is added to ensure that
      this actually happens.
      
      Most conversions are simple but there are a few noteworthy ones.
      
      * cdrom/viocd: viocd_end_request() replaced with direct calls to
        __blk_end_request_all().
      
      * s390/block/dasd: dasd_end_request() replaced with direct calls to
        __blk_end_request_all().
      
      * s390/char/tape_block: tapeblock_end_request() replaced with direct
        calls to blk_end_request_all().
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mike Miller <mike.miller@hp.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
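The motivation for an "end all" helper can be shown with a small userspace model: instead of each caller passing the full length and then checking the return value itself, one helper does both. This is a sketch with hypothetical names (`struct request_model`, `end_request()`, `end_request_all()`), and `assert()` stands in for the kernel's `BUG_ON()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct request. */
struct request_model {
	unsigned int bytes_left;	/* bytes not yet completed */
};

/*
 * Complete up to nr_bytes of the request.
 * Returns true if part of the request is still pending.
 */
static inline bool end_request(struct request_model *rq, unsigned int nr_bytes)
{
	if (nr_bytes > rq->bytes_left)
		nr_bytes = rq->bytes_left;
	rq->bytes_left -= nr_bytes;
	return rq->bytes_left != 0;
}

/* Complete whatever is left and insist that nothing remains. */
static inline void end_request_all(struct request_model *rq)
{
	bool pending = end_request(rq, rq->bytes_left);

	assert(!pending);	/* stands in for BUG_ON(pending) */
}
```

Call sites shrink to a single statement, and the "must complete fully" check lives in exactly one place.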
    • T
      block: move rq->start_time initialization to blk_rq_init() · b243ddcb
      Tejun Heo committed
      rq->start_time was initialized in init_request_from_bio() so special
      requests didn't have start_time set.  This has been okay as start_time
      has been used only for fs requests; however, there is no indication of
      whether this is actually the case.  Set rq->start_time in blk_rq_init()
      and guarantee that all initialized rq's have their start_time set.  This
      improves consistency at virtually no cost and future changes will make
      use of the timestamp for !bio requests.
      
      [ Impact: rq->start_time is valid for all requests ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: clean up request completion API · 2e60e022
      Tejun Heo committed
      Request completion has gone through several changes and has become a bit
      messy over time.  Clean it up.
      
      1. end_that_request_data() is a thin wrapper around
         end_that_request_data_first() which checks whether bio is NULL
         before doing anything and handles bidi completion.
         blk_update_request() is a thin wrapper around
         end_that_request_data() which clears nr_sectors on the last
         iteration but doesn't use the bidi completion.
      
         Clean it up by moving the initial bio NULL check and nr_sectors
         clearing on the last iteration into end_that_request_data() and
         renaming it to blk_update_request(), which makes blk_end_io() the
         only user of end_that_request_data().  Collapse
         end_that_request_data() into blk_end_io().
      
      2. There are four visible completion variants - blk_end_request(),
         __blk_end_request(), blk_end_bidi_request() and end_request().
         blk_end_request() and blk_end_bidi_request() use blk_end_io()
         as the backend but __blk_end_request() and end_request() use a
         separate implementation in __blk_end_request() due to different
         locking rules.
      
         blk_end_bidi_request() is identical to blk_end_io().  Collapse
         blk_end_io() into blk_end_bidi_request(), separate out request
         update into internal helper blk_update_bidi_request() and add
         __blk_end_bidi_request().  Redefine [__]blk_end_request() as thin
         inline wrappers around [__]blk_end_bidi_request().
      
      3. As the whole request issue/completion usages are about to be
         modified and audited, it's a good chance to convert completion
         functions to return bool which better indicates the intended meaning
         of return values.
      
      4. The function name end_that_request_last() is from the days when it
         was a public interface and slightly confusing.  Give it a proper
         internal name - blk_finish_request().
      
      5. Add description explaining that blk_end_bidi_request() can be safely
         used for uni requests as suggested by Boaz Harrosh.
      
      The only visible behavior change is from #1.  nr_sectors counts are
      cleared after the final iteration no matter which function is used to
      complete the request.  I couldn't find any place where the code
      assumes those nr_sectors counters contain the values for the last
      segment and this change is good as it makes the API much more
      consistent as the end result is now the same whether a request is
      completed using [__]blk_end_request() alone or in combination with
      blk_update_request().
      
      API further cleaned up per Christoph's suggestion.
      
      [ Impact: cleanup, rq->*nr_sectors always updated after req completion ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Christoph Hellwig <hch@infradead.org>
    • T
      block: kill blk_end_request_callback() · 0b302d5a
      Tejun Heo committed
      With recent IDE updates, blk_end_request_callback() doesn't have any
      user now.  Kill it.
      
      [ Impact: removal of unused convoluted interface ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: reorganize request fetching functions · 158dbda0
      Tejun Heo committed
      Impact: code reorganization
      
      elv_next_request() and elv_dequeue_request() are public block layer
      interfaces rather than actual elevator implementation.  They mostly deal
      with how requests interact with block layer and low level drivers at the
      beginning of request processing whereas __elv_next_request() is the
      actual elevator request fetching interface.
      
      Move the two functions to blk-core.c.  This prepares for further
      interface cleanup.
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: reorder request completion functions · 5efccd17
      Tejun Heo committed
      Reorder request completion functions such that
      
      * All request completion functions are located together.
      
      * Functions which are used by only one caller are put right above the
        caller.
      
      * end_request() is put after other completion functions but before
        blk_update_request().
      
      This change is for completion function cleanup which will follow.
      
      [ Impact: cleanup, code reorganization ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: clean up misc stuff after block layer timeout conversion · 2eef33e4
      Tejun Heo committed
      * In blk_rq_timed_out_timer(), else { if } to else if
      
      * In blk_add_timer(), simplify if/else block
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: cleanup REQ_SOFTBARRIER usages · 10732f56
      Tejun Heo committed
      blk_insert_request() doesn't need to worry about REQ_SOFTBARRIER.
      Don't set it.  Combined with recent ide updates, REQ_SOFTBARRIER is
      now only used in elevator proper and for discard requests.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: don't set REQ_NOMERGE unnecessarily · e4025f6c
      Tejun Heo committed
      RQ_NOMERGE_FLAGS already defines which REQ flags aren't
      mergeable.  There is no reason to specify it superfluously.  It only
      adds to confusion.  Don't set REQ_NOMERGE for barriers and requests
      with specific queueing directive.  REQ_NOMERGE is now exclusively used
      by the merging code.
      
      [ Impact: cleanup ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: kill blk_start_queueing() · a7f55792
      Tejun Heo committed
      blk_start_queueing() is identical to __blk_run_queue() except that it
      doesn't check for recursion.  None of the current users depends on
      blk_start_queueing() running request_fn directly.  Replace usages of
      blk_start_queueing() with [__]blk_run_queue() and kill it.
      
      [ Impact: removal of mostly duplicate interface function ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • T
      block: merge blk_invoke_request_fn() into __blk_run_queue() · a538cd03
      Tejun Heo committed
      __blk_run_queue wraps blk_invoke_request_fn() such that it
      additionally removes plug and bails out early if the queue is empty.
      Both extra operations have their own pending mechanisms and don't
      cause any harm correctness-wise when they are done superfluously.
      
      The only user of blk_invoke_request_fn() being blk_start_queue(),
      there isn't much reason to keep both functions around.  Merge
      blk_invoke_request_fn() into __blk_run_queue() and make
      blk_start_queue() use __blk_run_queue() instead.
      
      [ Impact: merge two subtly different internal functions ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • J
      block: implement blkdev_readpages · db2dbb12
      Jeff Moyer committed
      Doing a proper block dev ->readpages() speeds up the crazy dump(8)
      approach of using interleaved process IO.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • B
      block: enable by default support for large devices and files on 32-bit archs · db29a6b4
      Bartlomiej Zolnierkiewicz committed
      Enable by default support for large devices and files (CONFIG_LBD):
      
      - With 1TB disks being commodity hardware it is quite easy to hit the 2TB
        limitation while building RAIDs etc. and many distros have been using
        CONFIG_LBD=y by default already (at least Fedora 10 and openSUSE 11.1).
      
      - This should also prevent a subtle ext4 filesystem compatibility issue:
        mke2fs.ext4 defaults to creating filesystems with huge_files feature
        enabled and such filesystems cannot be later mounted read-write on
        machines with CONFIG_LBD=n (it should be quite easy to hit this issue
        when trying to use a filesystem created using a distro kernel on a system
        running a self-built kernel, think about USB disk enclosures & co.).
      
      While at it:
      
      - Clarify config option help text w.r.t. mounting ext4 filesystems
        (they can be mounted with CONFIG_LBD=n but in the read-only mode).
      
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
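The 2TB figure mentioned in the commit above follows from simple arithmetic, shown here as a small C sketch. It assumes that without CONFIG_LBD a 32-bit architecture indexes 512-byte sectors with a 32-bit integer; the helper name `max_addressable_bytes()` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT_MODEL 9	/* 512-byte sectors, as assumed in this sketch */

/*
 * Largest device size, in bytes, that can be addressed when sector
 * numbers are sector_bits wide: 2^sector_bits sectors of 512 bytes.
 */
static inline uint64_t max_addressable_bytes(unsigned int sector_bits)
{
	/* Keep the result within 64 bits. */
	assert(sector_bits > 0 && sector_bits <= 54);
	return ((uint64_t)1 << sector_bits) << SECTOR_SHIFT_MODEL;
}
```

For 32-bit sector numbers this gives 2^32 * 512 = 2^41 bytes, i.e. 2 TiB, which a RAID built from a few 1TB disks exceeds easily.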