1. 28 Apr, 2009 (1 commit)
  2. 22 Apr, 2009 (1 commit)
    • bio: fix bio_kmalloc() · 451a9ebf
      Tejun Heo authored
      Impact: fix bio_kmalloc() and its destruction path
      
      bio_kmalloc() was broken in two ways.
      
      * bvec_alloc_bs() first allocates a bvec using kmalloc() and then
        ignores it and allocates again as it does for non-kmalloc bvecs.
      
      * bio_kmalloc_destructor() didn't check for and free bio integrity
        data.
      
      This patch fixes the above problems.  The kmalloc path is separated out
      from bio_alloc_bioset() and allocates the requested number of bvecs as
      inline bvecs (sketched at the end of this entry).
      
      * bio_alloc_bioset() no longer takes NULL @bs.  Nothing other than
        bio_kmalloc() used it, and outside users can't know how the bio was
        allocated anyway.
      
      * Define and use BIO_POOL_NONE so that pool index check in
        bvec_free_bs() triggers if inline or kmalloc allocated bvec gets
        there.
      
      * Relocate destructors on top of each allocation function so that how
        they're used is clearer.
      
      Jens Axboe suggested allocating bvecs inline.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
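      A minimal userspace sketch of the fixed allocation shape described above.
      malloc()/calloc()/free() stand in for the kernel allocators, the struct
      layout is a simplified stand-in rather than the real struct bio, and the
      _sketch names are invented for illustration; it is not the patch itself.

      #include <stdlib.h>

      struct bio_vec { void *bv_page; unsigned bv_len, bv_offset; };

      struct bio {
          unsigned short  bi_max_vecs;
          void           *bi_integrity;      /* optional integrity payload */
          struct bio_vec *bi_io_vec;         /* points at the inline array */
          struct bio_vec  bi_inline_vecs[];  /* requested bvecs live here  */
      };

      /* One allocation covers the bio and its nr_iovecs inline bvecs. */
      static struct bio *bio_kmalloc_sketch(unsigned nr_iovecs)
      {
          struct bio *bio = calloc(1, sizeof(*bio) +
                                      nr_iovecs * sizeof(struct bio_vec));
          if (!bio)
              return NULL;
          bio->bi_max_vecs = nr_iovecs;
          bio->bi_io_vec = bio->bi_inline_vecs;   /* no separate bvec allocation */
          return bio;
      }

      /* Paired destructor: release integrity data (the piece the broken
       * destructor forgot), then the bio together with its inline bvecs. */
      static void bio_kmalloc_destructor_sketch(struct bio *bio)
      {
          free(bio->bi_integrity);
          free(bio);
      }

      int main(void)
      {
          struct bio *bio = bio_kmalloc_sketch(4);
          if (bio)
              bio_kmalloc_destructor_sketch(bio);
          return 0;
      }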
  3. 15 Apr, 2009 (1 commit)
  4. 06 Apr, 2009 (1 commit)
  5. 24 Mar, 2009 (1 commit)
  6. 15 Mar, 2009 (1 commit)
  7. 18 Feb, 2009 (1 commit)
  8. 02 Feb, 2009 (2 commits)
  9. 30 Jan, 2009 (4 commits)
  10. 29 Dec, 2008 (6 commits)
    • bio: add support for inlining a number of bio_vecs inside the bio · 392ddc32
      Jens Axboe authored
      When we go and allocate a bio for IO, we actually do two allocations.
      One for the bio itself, and one for the bi_io_vec that holds the
      actual pages we are interested in.
      
      This feature inlines a definable number of io vecs inside the bio
      itself, so we eliminate the bio_vec array allocation for IOs up
      to a certain size. It defaults to 4 vecs, which is typically 16k
      of IO (the arithmetic is sketched at the end of this entry).
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
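      A small, self-contained illustration of the arithmetic behind the default:
      with 4 KiB pages, 4 inline vecs cover 16 KiB of IO, and the vecs occupy a
      trailing array allocated together with the bio. The names here
      (bio_sketch, PAGE_SIZE_SKETCH) and the field layout are assumptions for
      the sketch, not the kernel's definitions.

      #include <stdio.h>

      #define PAGE_SIZE_SKETCH 4096u
      #define BIO_INLINE_VECS  4u    /* the default mentioned in the commit */

      struct bio_vec { void *bv_page; unsigned bv_len, bv_offset; };

      struct bio_sketch {
          unsigned short  bi_vcnt, bi_max_vecs;
          struct bio_vec *bi_io_vec;                        /* -> inline array    */
          struct bio_vec  bi_inline_vecs[BIO_INLINE_VECS];  /* allocated with bio */
      };

      int main(void)
      {
          printf("inline vecs cover up to %u bytes of IO\n",
                 BIO_INLINE_VECS * PAGE_SIZE_SKETCH);        /* 16384 */
          printf("bio + inline vecs as one object: %zu bytes\n",
                 sizeof(struct bio_sketch));
          return 0;
      }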
    • bio: allow individual slabs in the bio_set · bb799ca0
      Jens Axboe authored
      Instead of having a global bio slab cache, add a reference to one
      in each bio_set that is created. This allows for personalized slabs
      in each bio_set, so that they can have bios of different sizes.
      
      This means we can personalize the bios we return. File systems may
      want to embed the bio inside another structure, to avoid allocating
      more items (and stuffing them in ->bi_private) after they get a bio.
      Or we may want to embed a number of bio_vecs directly at the end
      of a bio, to avoid doing two allocations to return a bio. This is now
      possible (see the sketch at the end of this entry).
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
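      A sketch of why a per-bio_set slab is useful: a filesystem can make its
      per-IO object "private state plus embedded bio", so one allocation hands
      back both instead of allocating extra items and parking them in
      ->bi_private. The myfs_io struct and its field names are hypothetical,
      and plain calloc() stands in for the bio_set's private slab.

      #include <stdio.h>
      #include <stdlib.h>

      struct bio { unsigned long bi_flags; void *bi_private; };

      /* Hypothetical filesystem per-IO state with the bio embedded at the end. */
      struct myfs_io {
          int        target_device;
          long long  logical_block;
          struct bio bio;            /* embedded: no second allocation needed */
      };

      int main(void)
      {
          /* A bio_set with its own slab could size objects as sizeof(struct
           * myfs_io); here calloc stands in for that slab. */
          struct myfs_io *io = calloc(1, sizeof(*io));
          if (!io)
              return 1;
          printf("one %zu-byte object holds both the fs state and the bio\n",
                 sizeof(*io));
          free(io);
          return 0;
      }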
    • bio: move the slab pointer inside the bio_set · 1b434498
      Jens Axboe authored
      In preparation for adding differently sized bios.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • bio: only mempool back the largest bio_vec slab cache · 7ff9345f
      Jens Axboe authored
      We only very rarely need the mempool backing, so it makes sense to
      get rid of all but one of the mempools in a bio_set. Keep the mempool
      for the largest bio_vec count so we can always honor the largest
      allocation, and "upgrade" callers whose exact-size allocation fails
      (see the sketch at the end of this entry).
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
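      A sketch of the "upgrade" fallback described above, with invented _sketch
      names and calloc() standing in for both the per-size slab and the single
      mempool-backed reserve: try the exact bvec count first, and on failure
      retry at the one size that is guaranteed to succeed.

      #include <stdlib.h>

      #define BVEC_MAX 256   /* assumed largest bvec count; only it keeps a reserve */

      struct bio_vec { void *bv_page; unsigned bv_len, bv_offset; };

      static struct bio_vec *bvec_alloc_sketch(unsigned *nr)
      {
          /* Exact-fit attempt (a per-size slab in the real code). */
          struct bio_vec *bvl = calloc(*nr, sizeof(*bvl));
          if (bvl)
              return bvl;

          /* Upgrade: only the largest size is mempool-backed, so honor the
           * request from there; the caller gets a bigger array than asked for. */
          *nr = BVEC_MAX;
          return calloc(*nr, sizeof(*bvl));
      }

      int main(void)
      {
          unsigned nr = 16;
          struct bio_vec *bvl = bvec_alloc_sketch(&nr);
          free(bvl);
          return 0;
      }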
    • block: reorder struct bio to remove padding on 64bit · ba744d5e
      Richard Kennedy authored
      Remove 8 bytes of padding from struct bio, which also removes 16 bytes
      from struct bio_pair and makes it 248 bytes.  bio_pair then fits into one
      fewer cache line and into a smaller slab (a padding demonstration follows
      this entry).
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
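      A small demonstration of the kind of hole that field reordering removes on
      a 64-bit (LP64) target. The members below are illustrative, not struct
      bio's real fields: a 4-byte int between 8-byte pointers forces alignment
      padding, and moving the ints together gives the 8-byte saving per object.

      #include <stdio.h>

      struct padded {
          void *a;
          int   b;      /* 4-byte hole follows ...            */
          void *c;
          int   d;      /* ... and 4 bytes of padding here too */
          void *e;
      };

      struct reordered {
          void *a;
          void *c;
          void *e;
          int   b;      /* the two ints now share one 8-byte slot */
          int   d;
      };

      int main(void)
      {
          printf("padded:    %zu bytes\n", sizeof(struct padded));    /* typically 40 */
          printf("reordered: %zu bytes\n", sizeof(struct reordered)); /* typically 32 */
          return 0;
      }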
    • block: Suppress Buffer I/O errors when SCSI REQ_QUIET flag set · 08bafc03
      Keith Mannthey authored
      Allow the scsi request REQ_QUIET flag to be propagated to the buffer
      file system layer. The basic idea is to pass the flag from the scsi
      request to the bio (block IO) and then to the buffer layer.  The buffer
      layer can then suppress needless printks (the plumbing is sketched at
      the end of this entry).

      This patch declutters the kernel log by removing the 40-50 (per lun)
      buffer io error messages seen during a boot in my multipath setup. There
      is a good chance any real errors would be missed in the "noise" in the
      logs without this patch.
      
      During boot I see blocks of messages like
      "
      __ratelimit: 211 callbacks suppressed
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242847
      Buffer I/O error on device sdm, logical block 1
      Buffer I/O error on device sdm, logical block 5242878
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242872
      "
      in my logs.
      
      My disk environment is multipath fiber channel using the SCSI_DH_RDAC
      code and multipathd.  This topology includes an "active" and a "ghost"
      path for each lun. IOs to the "ghost" path will never complete, and the
      SCSI layer, via the scsi device handler rdac code, quickly returns the
      IOs to these paths and sets the REQ_QUIET scsi flag to suppress the scsi
      layer messages.
      
      I want to extend the QUIET behavior to include the buffer file
      system layer to deal with these errors as well. I have been running this
      patch for a while now on several boxes without issue.  A few runs of
      bonnie++ show no noticeable difference in performance in my setup.
      
      Thanks to John Stultz for the quiet_error finalization.
      Submitted-by: Keith Mannthey <kmannth@us.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
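      A userspace sketch of the flag plumbing described above, with simplified
      structs and an illustrative bit value; REQ_QUIET and BIO_QUIET come from
      the patch description, but the field names, helper functions and values
      here are assumptions for the sketch. The request's quiet bit is copied
      onto the bio, and the buffer-layer error path checks it before logging.

      #include <stdio.h>

      #define REQ_QUIET (1u << 0)   /* illustrative bit values */
      #define BIO_QUIET (1u << 0)

      struct request     { unsigned cmd_flags; };
      struct bio         { unsigned bi_flags; };
      struct buffer_head { unsigned long long blocknr; unsigned quiet : 1; };

      /* request -> bio: propagate the suppression hint. */
      static void req_to_bio_quiet(const struct request *rq, struct bio *bio)
      {
          if (rq->cmd_flags & REQ_QUIET)
              bio->bi_flags |= BIO_QUIET;
      }

      /* bio -> buffer layer: only log errors that were not expected. */
      static void buffer_io_error_sketch(const struct buffer_head *bh)
      {
          if (!bh->quiet)
              fprintf(stderr, "Buffer I/O error on logical block %llu\n",
                      bh->blocknr);
      }

      int main(void)
      {
          struct request rq     = { .cmd_flags = REQ_QUIET };
          struct bio bio        = { 0 };
          struct buffer_head bh = { .blocknr = 5242879 };

          req_to_bio_quiet(&rq, &bio);
          bh.quiet = !!(bio.bi_flags & BIO_QUIET);
          buffer_io_error_sketch(&bh);   /* prints nothing: error suppressed */
          return 0;
      }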
  11. 06 Nov, 2008 (1 commit)
  12. 17 Oct, 2008 (1 commit)
    • block: fix nr_phys_segments miscalculation bug · 86771427
      FUJITA Tomonori authored
      This fixes the bug reported by Nikanth Karthikesan <knikanth@suse.de>:
      
      http://lkml.org/lkml/2008/10/2/203
      
      The root cause of the bug is that blk_phys_contig_segment
      miscalculates q->max_segment_size.
      
      blk_phys_contig_segment checks:
      
      req->biotail->bi_size + next_req->bio->bi_size > q->max_segment_size
      
      But blk_recalc_rq_segments might expect that req->biotail and the
      previous bio in the req are supposed to be merged into one
      segment. blk_recalc_rq_segments might also expect that next_req->bio
      and the next bio in the next_req are supposed to be merged into one
      segment. In such a case, we merge two requests that can't be merged
      here. Later, blk_rq_map_sg gives more segments than it should.
      
      We need to keep track of segment size in blk_recalc_rq_segments and
      use it to see if two requests can be merged. This patch implements it
      in a similar way to what we used to do for hw merging (virtual
      merging); the corrected check is sketched at the end of this entry.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
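      A sketch of the corrected merge check, with invented field and function
      names: compare the accumulated size of the front request's last physical
      segment plus the back request's first physical segment against the queue
      limit, instead of only the two boundary bios' bi_size values.

      #include <stdbool.h>
      #include <stdio.h>

      struct request_sketch {
          unsigned first_seg_size;   /* bytes in the first physical segment */
          unsigned last_seg_size;    /* bytes in the last physical segment  */
      };

      /* Can the tail segment of 'front' and the head segment of 'back' be
       * fused into one physical segment without exceeding the limit?  Using
       * only the boundary bios' sizes was the bug: those bios may already sit
       * inside larger merged segments. */
      static bool segments_can_merge(const struct request_sketch *front,
                                     const struct request_sketch *back,
                                     unsigned max_segment_size)
      {
          return front->last_seg_size + back->first_seg_size <= max_segment_size;
      }

      int main(void)
      {
          struct request_sketch front = { .first_seg_size = 4096,
                                          .last_seg_size  = 60 * 1024 };
          struct request_sketch back  = { .first_seg_size = 8192,
                                          .last_seg_size  = 4096 };

          /* 60k + 8k exceeds a 64k max_segment_size, so no merge. */
          printf("mergeable: %d\n",
                 segments_can_merge(&front, &back, 64 * 1024));
          return 0;
      }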
  13. 13 Oct, 2008 (1 commit)
    • [SCSI] block: separate failfast into multiple bits. · 6000a368
      Mike Christie authored
      Multipath is best at handling transport errors. If it gets a device
      error then there is not much the multipath layer can do. It will just
      access the same device but from a different path.
      
      This patch breaks up failfast into device, transport and driver errors.
      The multipath layers (md and dm multipath) only ask the lower levels to
      fast fail transport errors. The other user of failfast, readahead, will
      ask to fast fail on all errors.
      
      Note that blk_noretry_request will return true if any failfast bit
      is set. This allows drivers that do not support the multipath failfast
      bits to continue to fail on any failfast error like before. Drivers
      like scsi that are able to fail fast on specific errors can check
      for the specific failfast type. In the next patch I will convert
      scsi (the bit layout is sketched at the end of this entry).
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
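      A sketch of the split into three bits and the "any bit means no retry"
      rule. The REQ_FAILFAST_* names and blk_noretry_request come from the
      patch description; the bit values, the _sketch suffix and the example
      flag choices are assumptions made for this illustration.

      #include <stdbool.h>
      #include <stdio.h>

      #define REQ_FAILFAST_DEV       (1u << 0)   /* device errors         */
      #define REQ_FAILFAST_TRANSPORT (1u << 1)   /* transport/path errors */
      #define REQ_FAILFAST_DRIVER    (1u << 2)   /* driver errors         */
      #define REQ_FAILFAST_MASK \
              (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)

      struct request_sketch { unsigned cmd_flags; };

      /* Any failfast bit set means "do not retry", matching the old
       * single-bit behavior for drivers that never look at the new bits. */
      static bool blk_noretry_request_sketch(const struct request_sketch *rq)
      {
          return rq->cmd_flags & REQ_FAILFAST_MASK;
      }

      int main(void)
      {
          /* md/dm multipath: only transport errors should fail fast. */
          struct request_sketch mp = { .cmd_flags = REQ_FAILFAST_TRANSPORT };
          /* readahead is speculative, so any error may fail fast. */
          struct request_sketch ra = { .cmd_flags = REQ_FAILFAST_MASK };

          printf("multipath noretry: %d, readahead noretry: %d\n",
                 blk_noretry_request_sketch(&mp),
                 blk_noretry_request_sketch(&ra));
          return 0;
      }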
  14. 09 Oct, 2008 (18 commits)