  1. 30 Jan 2009 (2 commits)
  2. 29 Dec 2008 (6 commits)
    • bio: add support for inlining a number of bio_vecs inside the bio · 392ddc32
      Committed by Jens Axboe
      When we go and allocate a bio for IO, we actually do two allocations.
      One for the bio itself, and one for the bi_io_vec that holds the
      actual pages we are interested in.
      
      This feature inlines a configurable number of io vecs inside the bio
      itself, so we eliminate the bio_vec array allocation for IOs up
      to a certain size. It defaults to 4 vecs, which typically covers 16k
      of IO.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
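      A minimal userspace sketch of the idea (types and field names are
      simplified from the kernel's; BIO_INLINE_VECS mirrors the 4-vec
      default): a small trailing array inside the bio itself serves small
      IOs, so only one allocation is needed.

          #include <stdlib.h>

          #define BIO_INLINE_VECS 4

          struct bio_vec {
              void         *bv_page;
              unsigned int  bv_len;
              unsigned int  bv_offset;
          };

          struct bio {
              unsigned short  bi_max_vecs;
              struct bio_vec *bi_io_vec;                        /* inline or heap */
              struct bio_vec  bi_inline_vecs[BIO_INLINE_VECS];  /* trailing storage */
          };

          static struct bio *bio_alloc_sketch(unsigned int nr_vecs)
          {
              struct bio *bio = calloc(1, sizeof(*bio));

              if (!bio)
                  return NULL;
              if (nr_vecs <= BIO_INLINE_VECS) {
                  /* small IO: point at the inline array, skip the 2nd allocation */
                  bio->bi_io_vec = bio->bi_inline_vecs;
                  bio->bi_max_vecs = BIO_INLINE_VECS;
              } else {
                  bio->bi_io_vec = calloc(nr_vecs, sizeof(struct bio_vec));
                  bio->bi_max_vecs = nr_vecs;
              }
              return bio;
          }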
    • bio: allow individual slabs in the bio_set · bb799ca0
      Committed by Jens Axboe
      Instead of having a global bio slab cache, add a reference to one
      in each bio_set that is created. This allows for personalized slabs
      in each bio_set, so that they can have bios of different sizes.
      
      This means we can personalize the bios we return. File systems may
      want to embed the bio inside another structure, to avoid allocating
      more items (and stuffing them in ->bi_private) after they get a bio.
      Or we may want to embed a number of bio_vecs directly at the end
      of a bio, to avoid doing two allocations to return a bio. This is now
      possible.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
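      A sketch of the embedding this enables (struct my_fs_bio and its
      fields are hypothetical; container_of is spelled out so the snippet
      stands alone): the completion handler recovers the enclosing
      structure from the bio pointer instead of going through ->bi_private.

          #include <stddef.h>

          #define container_of(ptr, type, member) \
                  ((type *)((char *)(ptr) - offsetof(type, member)))

          struct bio { int bi_error; /* ... */ };

          /* hypothetical filesystem-private IO context with the bio embedded;
           * its size is what a personalized bio_set slab would be sized for */
          struct my_fs_bio {
              void       *private_ctx;   /* carried for free, no ->bi_private */
              struct bio  bio;
          };

          static void my_fs_end_io(struct bio *bio)
          {
              struct my_fs_bio *fbio = container_of(bio, struct my_fs_bio, bio);

              /* ... complete the IO using fbio->private_ctx ... */
              (void)fbio;
          }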
    • bio: move the slab pointer inside the bio_set · 1b434498
      Committed by Jens Axboe
      In preparation for adding differently sized bios.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • bio: only mempool back the largest bio_vec slab cache · 7ff9345f
      Committed by Jens Axboe
      We only very rarely need the mempool backing, so it makes sense to
      get rid of all but one of the mempools in a bio_set. So keep the
      mempool for the largest bio_vec count, so we can always honor the
      largest allocation, and "upgrade" callers whose allocation fails.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
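      A userspace sketch of the fallback policy (the static reserve stands
      in for the one remaining mempool; names are illustrative): try the
      exact-size allocation first, and "upgrade" a failure to the largest,
      mempool-backed size, which can always be honored.

          #include <stdlib.h>

          #define BVEC_MAX 256   /* largest bio_vec count; only this class keeps a pool */

          struct bio_vec { void *bv_page; unsigned int bv_len, bv_offset; };

          static struct bio_vec reserve[BVEC_MAX];   /* stand-in for the mempool */
          static int reserve_busy;

          static struct bio_vec *bvec_alloc_sketch(unsigned int nr)
          {
              struct bio_vec *bvl = calloc(nr, sizeof(*bvl));  /* exact-size attempt */

              if (bvl)
                  return bvl;
              /* "upgrade": fall back to the largest class; a real mempool_alloc
               * would wait for an element to be freed instead of giving up */
              if (!reserve_busy) {
                  reserve_busy = 1;
                  return reserve;
              }
              return NULL;
          }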
    • block: reorder struct bio to remove padding on 64bit · ba744d5e
      Committed by Richard Kennedy
      Remove 8 bytes of padding from struct bio, which also removes 16 bytes
      from struct bio_pair, making it 248 bytes.  bio_pair then fits into one
      fewer cache line & into a smaller slab.
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
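      The kind of reordering involved, in miniature (illustrative fields,
      not struct bio's actual layout): on a 64-bit target, interleaving
      4-byte fields between 8-byte pointers costs padding that grouping
      same-size fields removes.

          #include <stdio.h>

          struct padded {      /* 32 bytes on LP64: each int is followed by
                                * 4 bytes of padding (interior and tail) */
              void *p;
              int   a;
              void *q;
              int   b;
          };

          struct reordered {   /* 24 bytes on LP64: no interior padding */
              void *p;
              void *q;
              int   a;
              int   b;
          };

          int main(void)
          {
              printf("padded:    %zu bytes\n", sizeof(struct padded));
              printf("reordered: %zu bytes\n", sizeof(struct reordered));
              return 0;
          }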
    • block: Suppress Buffer I/O errors when SCSI REQ_QUIET flag set · 08bafc03
      Committed by Keith Mannthey
      Allow the SCSI request REQ_QUIET flag to be propagated to the buffer
      file system layer. The basic idea is to pass the flag from the SCSI
      request to the bio (block IO) and then to the buffer layer.  The buffer
      layer can then suppress needless printks.
      
      This patch declutters the kernel log by removing the 40-50 (per lun)
      buffer IO error messages seen during a boot in my multipath setup. There
      is a good chance any real errors would be missed in the "noise" in the
      logs without this patch.
      
      During boot I see blocks of messages like
      "
      __ratelimit: 211 callbacks suppressed
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242847
      Buffer I/O error on device sdm, logical block 1
      Buffer I/O error on device sdm, logical block 5242878
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242872
      "
      in my logs.
      
      My disk environment is multipath fibre channel using the SCSI_DH_RDAC
      code and multipathd.  This topology includes an "active" and a "ghost"
      path for each lun. IOs to the "ghost" path will never complete, and the
      SCSI layer, via the scsi device handler rdac code, quickly returns the
      IOs to these paths and sets the REQ_QUIET scsi flag to suppress the
      SCSI layer messages.
      
      I want to extend the QUIET behavior to include the buffer file
      system layer, to deal with these errors as well. I have been running
      this patch for a while now on several boxes without issue.  A few runs
      of bonnie++ show no noticeable difference in performance in my setup.
      
      Thanks to John Stultz for the quiet_error finalization.
      Submitted-by: Keith Mannthey <kmannth@us.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
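      A sketch of the propagation idea (names and the bit value are
      illustrative, not the kernel's exact definitions): the buffer layer's
      error path consults a quiet flag carried down from the request via
      the bio before it prints anything.

          #include <stdio.h>

          #define BIO_RW_QUIET (1UL << 6)   /* illustrative bit position */

          struct bio         { unsigned long bi_rw; };
          struct buffer_head { struct bio *b_bio; unsigned long b_blocknr; };

          static void buffer_io_error_sketch(const struct buffer_head *bh)
          {
              /* REQ_QUIET on the request -> quiet bit on the bio -> no printk */
              if (bh->b_bio && (bh->b_bio->bi_rw & BIO_RW_QUIET))
                  return;
              fprintf(stderr, "Buffer I/O error on device, logical block %lu\n",
                      bh->b_blocknr);
          }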
  3. 06 Nov 2008 (1 commit)
  4. 17 Oct 2008 (1 commit)
    • block: fix nr_phys_segments miscalculation bug · 86771427
      Committed by FUJITA Tomonori
      This fixes the bug reported by Nikanth Karthikesan <knikanth@suse.de>:
      
      http://lkml.org/lkml/2008/10/2/203
      
      The root cause of the bug is that blk_phys_contig_segment
      checks segment sizes against q->max_segment_size incorrectly.
      
      blk_phys_contig_segment checks:
      
      req->biotail->bi_size + next_req->bio->bi_size > q->max_segment_size
      
      But blk_recalc_rq_segments might expect that req->biotail and the
      previous bio in the req are supposed to be merged into one
      segment. blk_recalc_rq_segments might also expect that next_req->bio
      and the next bio in next_req are supposed to be merged into one
      segment. In such a case, we merge two requests that can't be merged
      here. Later, blk_rq_map_sg gives more segments than it should.
      
      We need to keep track of segment size in blk_recalc_rq_segments and
      use it to see if two requests can be merged. This patch implements it
      in a similar way to what we used to do for hw merging (virtual
      merging).
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
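      A sketch of the corrected merge test (the real patch tracks front and
      back segment sizes on struct bio; the names here are simplified): the
      check must use the accumulated physical-segment sizes at the request
      boundaries, not just the boundary bios' own sizes.

          struct request_sketch {
              unsigned int back_seg_size;    /* physical segment ending at the
                                              * request's last bio, as computed
                                              * by blk_recalc_rq_segments */
              unsigned int front_seg_size;   /* physical segment starting at
                                              * the request's first bio */
          };

          /* may the tail segment of 'req' and the head segment of 'next'
           * merge into one segment without exceeding max_segment_size? */
          static int phys_contig_ok_sketch(const struct request_sketch *req,
                                           const struct request_sketch *next,
                                           unsigned int max_segment_size)
          {
              return req->back_seg_size + next->front_seg_size
                          <= max_segment_size;
          }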
  5. 13 Oct 2008 (1 commit)
    • [SCSI] block: separate failfast into multiple bits. · 6000a368
      Committed by Mike Christie
      Multipath is best at handling transport errors. If it gets a device
      error then there is not much the multipath layer can do. It will just
      access the same device but from a different path.
      
      This patch breaks up failfast into device, transport and driver errors.
      The multipath layers (md and dm multipath) only ask the lower levels to
      fast fail transport errors. The user of failfast, readahead, will ask
      to fast fail on all errors.
      
      Note that blk_noretry_request will return true if any failfast bit
      is set. This allows drivers that do not support the multipath failfast
      bits to continue to fail on any failfast error like before. Drivers
      like scsi that are able to fail fast specific errors can check
      for the specific fail fast type. In the next patch I will convert
      scsi.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
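      The shape of the split, as a sketch (the flag names are the ones this
      series introduces; the bit positions here are illustrative):
      blk_noretry_request reports true if any of the three bits is set,
      preserving the old behavior for drivers that don't distinguish them.

          /* illustrative bit positions, not the real blkdev.h values */
          #define REQ_FAILFAST_DEV        (1u << 0)  /* device errors */
          #define REQ_FAILFAST_TRANSPORT  (1u << 1)  /* transport errors */
          #define REQ_FAILFAST_DRIVER     (1u << 2)  /* driver errors */
          #define REQ_FAILFAST_MASK \
                  (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)

          /* any failfast bit set means "do not retry", as before the split */
          static inline int blk_noretry_sketch(unsigned int cmd_flags)
          {
              return (cmd_flags & REQ_FAILFAST_MASK) != 0;
          }

          /* dm/md multipath would then set only REQ_FAILFAST_TRANSPORT,
           * while readahead sets all three bits */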
  6. 09 Oct 2008 (18 commits)
  7. 03 Jul 2008 (3 commits)
  8. 29 Apr 2008 (1 commit)
  9. 21 Apr 2008 (1 commit)
    • block: convert bio_copy_user to bio_copy_user_iov · c5dec1c3
      Committed by FUJITA Tomonori
      This patch enables bio_copy_user to take a struct sg_iovec (renamed
      bio_copy_user_iov). bio_copy_user uses bio_copy_user_iov internally,
      just as bio_map_user uses bio_map_user_iov.
      
      The major changes are:
      
      - adds an sg_iovec array to struct bio_map_data

      - adds __bio_copy_iov, which copies data between a bio and an
        sg_iovec; bio_copy_user_iov and bio_uncopy_user use it.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
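      A userspace sketch of the lockstep copy that __bio_copy_iov performs
      (simplified: a flat buffer stands in for the bio's pages): walk the
      iovec entries in order, copying the minimum of what remains on each
      side at every step.

          #include <string.h>
          #include <sys/uio.h>   /* struct iovec */

          /* copy 'len' bytes from 'data' into the iovec array, in order */
          static size_t copy_to_iov_sketch(const char *data, size_t len,
                                           const struct iovec *iov, int iovcnt)
          {
              size_t done = 0;
              int i;

              for (i = 0; i < iovcnt && done < len; i++) {
                  size_t n = len - done;

                  if (n > iov[i].iov_len)
                      n = iov[i].iov_len;   /* clamp to this iovec entry */
                  memcpy(iov[i].iov_base, data + done, n);
                  done += n;
              }
              return done;   /* bytes actually copied */
          }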
  10. 19 Feb 2008 (1 commit)
    • fs/block_dev.c: remove #if 0'ed code · 86b6c7a7
      Committed by Adrian Bunk
      Commit b2e895db #if 0'ed this code stating:
      
      <--  snip  -->
      
          [PATCH] revert blockdev direct io back to 2.6.19 version
      
          Andrew Vasquez is reporting as-iosched oopses and a 65% throughput
          slowdown due to the recent special-casing of direct-io against
          blockdevs.  We don't know why either of these things are occurring.
      
          The patch minimally reverts us back to the 2.6.19 code for a 2.6.20
          release.
      
      <--  snip  -->
      
      It has since been dead code, and unless someone wants to revive it,
      now is the time to remove it.
      
      This patch also makes bio_release_pages() static again and removes the
      ki_bio_count member from struct kiocb, reverting changes that had been
      done for this dead code.
      Signed-off-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
  11. 16 Oct 2007 (1 commit)
  12. 10 Oct 2007 (1 commit)
  13. 12 Aug 2007 (1 commit)
  14. 30 Apr 2007 (1 commit)
    • [BLOCK] Don't pin lots of memory in mempools · 5972511b
      Committed by Jens Axboe
      Currently we scale the mempool sizes depending on the memory installed
      in the machine, except for the bio pool itself, which sits at a fixed
      256-entry pre-allocation.
      
      There's really no point in "optimizing" this OOM path; we just need
      enough preallocated entries to make progress. A single unit would be
      enough; let's scale it down to 2 just to be on the safe side.
      
      This patch saves ~150kb of pinned kernel memory on a 32-bit box.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
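      In fs/bio.c terms the change boils down to the preallocation constant
      (a sketch of the patch's spirit; the before/after values are taken
      from the message above):

          /* enough preallocated bios to guarantee forward progress under
           * OOM; there is no point in reserving more on this path */
          #define BIO_POOL_SIZE 2   /* previously a fixed 256-entry pool */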
  15. 14 Dec 2006 (1 commit)
    • [PATCH] optimize o_direct on block devices · e61c9018
      Committed by Chen, Kenneth W
      Implement a block device specific .direct_IO method instead of going
      through the generic direct_io_worker for block devices.
      
      direct_io_worker() is fairly complex because it needs to handle
      O_DIRECT on file systems, where it needs to perform block allocation,
      hole detection, extending the file on write, and tons of other corner
      cases.  The end result is that it takes tons of CPU time to submit an
      I/O.
      
      For a block device, block allocation is much simpler, and a tight
      triple loop can be written to iterate over each iovec and each page
      within the iovec in order to construct/prepare bio structures and then
      submit them to the block layer.  This significantly speeds up O_DIRECT
      on block devices.
      
      [akpm@osdl.org: small speedup]
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Zach Brown <zach.brown@oracle.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
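      A runnable miniature of the "tight triple loop" shape described above
      (constants and helpers are illustrative; real code would pin the user
      pages and build bios with bio_add_page/submit_bio): iterate the
      iovecs, then the pages within each iovec, packing pages into a batch
      and submitting whenever one fills.

          #include <stdio.h>
          #include <sys/uio.h>

          #define PAGE_SIZE_SK     4096
          #define BIO_MAX_PAGES_SK 4   /* tiny, to show batching */

          /* stand-in for building a bio and handing it to the block layer */
          static void submit_batch_sketch(int npages)
          {
              printf("submit_bio: %d page(s)\n", npages);
          }

          static void dio_sketch(const struct iovec *iov, int iovcnt)
          {
              int in_bio = 0;

              for (int seg = 0; seg < iovcnt; seg++) {         /* each iovec */
                  size_t left = iov[seg].iov_len;

                  while (left > 0) {                           /* each page */
                      size_t n = left < PAGE_SIZE_SK ? left : PAGE_SIZE_SK;

                      left -= n;
                      if (++in_bio == BIO_MAX_PAGES_SK) {      /* bio full */
                          submit_batch_sketch(in_bio);
                          in_bio = 0;
                      }
                  }
              }
              if (in_bio)
                  submit_batch_sketch(in_bio);   /* flush final partial bio */
          }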