1. 04 Mar, 2016 1 commit
  2. 04 Dec, 2015 1 commit
    • blk-integrity: empty implementation when disabled · 06c1e390
      Keith Busch committed
      This patch moves the bio_integrity_payload definition outside the
      CONFIG_BLK_DEV_INTEGRITY dependency and provides empty function
      implementations when the kernel configuration disables integrity
      extensions. This simplifies drivers that make use of these to map user
      data so they don't need to repeat the same configuration checks.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      
      Updated by Jens to pass an error pointer return from
      bio_integrity_alloc(), otherwise if CONFIG_BLK_DEV_INTEGRITY isn't
      set, we return a weird ENOMEM from __nvme_submit_user_cmd()
      if a meta buffer is set.
      Signed-off-by: Jens Axboe <axboe@fb.com>
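The "empty implementation when disabled" pattern can be sketched in userspace C as follows. This is only an illustration, not the kernel code: `MOCK_INTEGRITY` stands in for `CONFIG_BLK_DEV_INTEGRITY`, every `mock_*` name is hypothetical, and the error-pointer helpers are simplified stand-ins for the kernel's `ERR_PTR()`, `IS_ERR()` and `PTR_ERR()`. The choice of `-EINVAL` for the disabled stub is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MOCK_MAX_ERRNO 4095

static inline void *mock_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline int mock_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MOCK_MAX_ERRNO);
}

static inline long mock_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* The payload type is visible whether or not the feature is built in. */
struct mock_payload {
    unsigned int nr_vecs;
};

#ifdef MOCK_INTEGRITY
/* Feature enabled: a real allocation that can fail with -ENOMEM. */
static inline struct mock_payload *mock_integrity_alloc(unsigned int nr_vecs)
{
    struct mock_payload *p = malloc(sizeof(*p));

    if (!p)
        return mock_err_ptr(-ENOMEM);
    p->nr_vecs = nr_vecs;
    return p;
}
#else
/* Feature disabled: an empty stub that returns an error pointer, not NULL. */
static inline struct mock_payload *mock_integrity_alloc(unsigned int nr_vecs)
{
    (void)nr_vecs;
    return mock_err_ptr(-EINVAL);
}
#endif

/* A caller needs no #ifdef; it just forwards whatever error it was given. */
static int mock_submit_with_meta(unsigned int nr_vecs)
{
    struct mock_payload *p = mock_integrity_alloc(nr_vecs);

    assert(p != NULL); /* both variants report failure via error pointers */
    if (mock_is_err(p))
        return (int)mock_ptr_err(p);
    free(p);
    return 0;
}
```

Built without `MOCK_INTEGRITY`, `mock_submit_with_meta()` reports the stub's own error code rather than a misleading `-ENOMEM`, which is the behaviour the update from Jens describes.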
  3. 20 Aug, 2015 1 commit
  4. 14 Aug, 2015 1 commit
  5. 29 Jul, 2015 3 commits
    • block: shrink struct bio down to 2 cache lines again · 2c68f6dc
      Jens Axboe committed
      Commit bcf2843b3f8f added ->bi_error to clean up the error passing
      for struct bio, but that ended up adding 4 bytes and a 4-byte hole
      to struct bio. For a clean config, that bumped its size from
      128 bytes to 136 bytes on x86-64.
      
      The ->bi_flags member is currently an unsigned long, but it fits
      easily within an int. Change it to an unsigned int, adjust the
      pool offset code, and move ->bi_error into the new hole. Then
      we end up with a 128-byte bio again.
      
      Change the bio flag set/clear to use cmpxchg to ensure we don't
      lose any flags when manipulating them.
      Signed-off-by: Jens Axboe <axboe@fb.com>
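A reduced illustration of the two ideas in this commit, written as userspace C11 rather than kernel code. The `mock_bio_before` and `mock_bio_after` structs are hypothetical four-member cut-downs of struct bio, chosen only to show how narrowing the flags word lets the error field fill what would otherwise be padding on an LP64 target. The flag helpers use `atomic_compare_exchange_weak()` from `<stdatomic.h>` as a stand-in for the kernel's cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical cut-down layout before the change (LP64: 8 + 8 + 4 + 4 pad + 8). */
struct mock_bio_before {
    void *bi_next;
    unsigned long bi_flags;
    int bi_error;           /* followed by a 4-byte hole on LP64 */
    void *bi_private;
};

/* After: flags narrowed to 32 bits, error sits in the freed space (8 + 4 + 4 + 8). */
struct mock_bio_after {
    void *bi_next;
    unsigned int bi_flags;
    int bi_error;
    void *bi_private;
};

/* Set one flag bit with a compare-and-exchange loop so no concurrent update is lost. */
static void mock_set_flag(_Atomic unsigned int *flags, unsigned int bit)
{
    unsigned int old = atomic_load(flags);

    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(flags, &old, old | (1U << bit)))
        ;
}

/* Clear one flag bit the same way. */
static void mock_clear_flag(_Atomic unsigned int *flags, unsigned int bit)
{
    unsigned int old = atomic_load(flags);

    while (!atomic_compare_exchange_weak(flags, &old, old & ~(1U << bit)))
        ;
}

/* Set bits 1 and 3, then clear bit 1; only bit 3 should remain. */
static unsigned int mock_flag_demo(void)
{
    _Atomic unsigned int flags = 0;

    mock_set_flag(&flags, 1);
    mock_set_flag(&flags, 3);
    mock_clear_flag(&flags, 1);
    return atomic_load(&flags);
}
```

On a typical x86-64 build, `sizeof(struct mock_bio_before)` is 32 and `sizeof(struct mock_bio_after)` is 24; the exact numbers depend on the ABI, so only the "not larger" relation is portable.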
    • block: manipulate bio->bi_flags through helpers · b7c44ed9
      Jens Axboe committed
      Some places use helpers now, others don't. We only have the 'is set'
      helper; add helpers for setting and clearing flags too.
      
      It was a bit of a mess of atomic vs non-atomic access. With
      BIO_UPTODATE gone, we don't have any risk of concurrent access to the
      flags. So relax the restriction and don't make any of them atomic. The
      flags that do have serialization issues (reffed and chained) are
      already handled separately.
      Signed-off-by: Jens Axboe <axboe@fb.com>
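The helper trio described here (test, set, clear, all non-atomic) might look roughly like the sketch below. It is a userspace mock: `struct mock_bio`, the `mock_bio_*` helpers and the bit numbers in the enum are all hypothetical and do not claim to match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

struct mock_bio {
    unsigned int bi_flags;
};

/* Hypothetical bit numbers, chosen only for this sketch. */
enum {
    MOCK_BIO_SEG_VALID = 1,
    MOCK_BIO_CLONED = 2,
};

/* The pre-existing kind of helper: is the flag set? */
static inline bool mock_bio_flagged(const struct mock_bio *bio, unsigned int bit)
{
    return (bio->bi_flags & (1U << bit)) != 0;
}

/* Plain read-modify-write: fine as long as nobody else touches the flags. */
static inline void mock_bio_set_flag(struct mock_bio *bio, unsigned int bit)
{
    bio->bi_flags |= (1U << bit);
}

static inline void mock_bio_clear_flag(struct mock_bio *bio, unsigned int bit)
{
    bio->bi_flags &= ~(1U << bit);
}

/* Set two flags, clear one, and check that only the other survives. */
static bool mock_flag_roundtrip(void)
{
    struct mock_bio bio = { 0 };

    mock_bio_set_flag(&bio, MOCK_BIO_SEG_VALID);
    mock_bio_set_flag(&bio, MOCK_BIO_CLONED);
    mock_bio_clear_flag(&bio, MOCK_BIO_SEG_VALID);

    return !mock_bio_flagged(&bio, MOCK_BIO_SEG_VALID) &&
           mock_bio_flagged(&bio, MOCK_BIO_CLONED);
}
```

Routing every access through helpers like these keeps the representation of the flags word in one place, so a later change to its type or locking rules touches only the helpers.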
    • block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig committed
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and of not being passed along from child to
      parent bio in the ever more popular chaining scenario.  Having both
      mechanisms available has the additional drawback of utterly confusing
      driver authors and introducing bugs where various I/O submitters only
      deal with one of them, and the others have to add boilerplate code to
      deal with both kinds of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
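To make the benefit concrete, here is a small userspace mock of an errno stored in the bio itself and carried from a child to its parent. All `mock_*` names are hypothetical, and the propagation rule (the first error wins) is an assumption of this sketch rather than a statement about the kernel's behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_bio {
    int bi_error;                           /* 0 or a negative errno */
    void (*bi_end_io)(struct mock_bio *);   /* callback takes no error argument */
    void *bi_private;
};

static void mock_bio_endio(struct mock_bio *bio)
{
    if (bio->bi_end_io)
        bio->bi_end_io(bio);
}

/* Child completion: copy the stored error up to the parent, then complete it. */
static void mock_child_end_io(struct mock_bio *child)
{
    struct mock_bio *parent = child->bi_private;

    if (child->bi_error && !parent->bi_error)
        parent->bi_error = child->bi_error;
    mock_bio_endio(parent);
}

/* Parent completion: the submitter reads the errno straight from the bio. */
static void mock_parent_end_io(struct mock_bio *bio)
{
    int *result = bio->bi_private;

    *result = bio->bi_error;
}

/* Complete a child with 'err' and return what the parent's submitter observed. */
static int mock_fail_child(int err)
{
    int seen = 1; /* sentinel: never a valid result here */
    struct mock_bio parent = { 0, mock_parent_end_io, &seen };
    struct mock_bio child = { 0, mock_child_end_io, &parent };

    child.bi_error = err;
    mock_bio_endio(&child);
    return seen;
}
```

Because the error lives in the structure, it survives queueing and can carry any errno value, which neither of the two older mechanisms offered on its own.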
  6. 02 Jun, 2015 1 commit
    • blkcg: implement bio_associate_blkcg() · 1d933cf0
      Tejun Heo committed
      Currently, a bio can only be associated with the io_context and blkcg
      of %current using bio_associate_current().  This is too restrictive
      for cgroup writeback support.  Implement bio_associate_blkcg() which
      associates a bio with the specified blkcg.
      
      bio_associate_blkcg() leaves the io_context unassociated.
      bio_associate_current() is updated so that it considers a bio as
      already associated if it has a blkcg_css, instead of an io_context,
      associated with it.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  7. 22 May, 2015 1 commit
    • block: remove management of bi_remaining when restoring original bi_end_io · 326e1dbb
      Mike Snitzer committed
      Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
      non-chains") regressed all existing callers that followed this pattern:
       1) saving a bio's original bi_end_io
       2) wiring up an intermediate bi_end_io
       3) restoring the original bi_end_io from intermediate bi_end_io
       4) calling bio_endio() to execute the restored original bi_end_io
      
      The regression was due to BIO_CHAIN only ever getting set if
      bio_inc_remaining() is called.  For the above pattern it isn't set until
      step 3 above (step 2 would've needed to establish BIO_CHAIN).  As such
      the first bio_endio(), in step 2 above, never decremented __bi_remaining
      before calling the intermediate bi_end_io -- leaving __bi_remaining with
      the value 1 instead of 0.  When bio_inc_remaining() occurred during step
      3 it brought it to a value of 2.  When the second bio_endio() was
      called, in step 4 above, it should've called the original bi_end_io but
      it didn't because there was an extra reference that wasn't dropped (due
      to atomic operations being optimized away since BIO_CHAIN wasn't set
      upfront).
      
      Fix this issue by removing the __bi_remaining management complexity for
      all callers that use the above pattern -- bio_chain() is the only
      interface that _needs_ to be concerned with __bi_remaining.  For the
      above pattern callers just expect the bi_end_io they set to get called!
      Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
      that aren't associated with the bio_chain() interface.
      
      Also, the bio_inc_remaining() interface has been moved local to bio.c.
      
      Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
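The four-step pattern and the stuck counter can be replayed in a small userspace simulation. This is a simplified model built from the description above, not the kernel source: `mock_endio()` only decrements the counter when a chain flag is set, `mock_inc_remaining()` sets that flag, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_BIO_CHAIN 0x1u

struct mock_bio {
    unsigned int flags;
    int remaining;                              /* models __bi_remaining */
    void (*end_io)(struct mock_bio *);
    void (*saved_end_io)(struct mock_bio *);
    int restore_with_inc;                       /* 1 = old caller pattern */
    int completed;                              /* original end_io calls */
};

/* Raising the count also marks the bio as needing the decrement. */
static void mock_inc_remaining(struct mock_bio *bio)
{
    bio->flags |= MOCK_BIO_CHAIN;
    bio->remaining++;
}

/* The decrement is skipped entirely unless the chain flag is set. */
static void mock_endio(struct mock_bio *bio)
{
    if ((bio->flags & MOCK_BIO_CHAIN) && --bio->remaining > 0)
        return;
    if (bio->end_io)
        bio->end_io(bio);
}

static void mock_original_end_io(struct mock_bio *bio)
{
    bio->completed++;
}

static void mock_intermediate_end_io(struct mock_bio *bio)
{
    bio->end_io = bio->saved_end_io;    /* step 3: restore */
    if (bio->restore_with_inc)
        mock_inc_remaining(bio);        /* old callers re-armed the count */
    mock_endio(bio);                    /* step 4: run the original */
}

/* Returns how many times the original end_io ran. */
static int mock_run(int restore_with_inc)
{
    struct mock_bio bio = {
        .remaining = 1,
        .end_io = mock_original_end_io,
        .restore_with_inc = restore_with_inc,
    };

    bio.saved_end_io = bio.end_io;              /* step 1: save */
    bio.end_io = mock_intermediate_end_io;      /* step 2: interpose */
    mock_endio(&bio);                           /* first completion */
    assert(bio.remaining == 1);                 /* true in both variants here */
    return bio.completed;
}
```

With `mock_run(1)` the counter goes 1, then 2, then back to 1, so the original callback never runs; with `mock_run(0)`, where the caller leaves the counter alone, it runs exactly once.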
  8. 06 May, 2015 2 commits
    • bio: skip atomic inc/dec of ->bi_cnt for most use cases · dac56212
      Jens Axboe committed
      Struct bio has a reference count that controls when it can be freed.
      The most common use case is allocating the bio, which returns with a
      single reference to it, doing IO, and then dropping that single
      reference. We can remove this atomic_dec_and_test() in the completion
      path if nobody else is holding a reference to the bio.
      
      If someone does call bio_get() on the bio, then we flag the bio as
      now having a valid count, and we must properly honor the reference
      count when it's being put.
      Tested-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
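A minimal userspace model of the idea: the put path only pays for an atomic decrement once somebody has taken an extra reference. The `mock_*` names and the `MOCK_BIO_REFFED` value are hypothetical, and the sketch leaves out the memory-ordering details a real implementation would need.

```c
#include <assert.h>
#include <stdatomic.h>

#define MOCK_BIO_REFFED 0x1u

struct mock_bio {
    unsigned int flags;
    atomic_int cnt;
    int freed;
};

/* A fresh bio comes back holding exactly one reference. */
static void mock_bio_init(struct mock_bio *bio)
{
    bio->flags = 0;
    atomic_init(&bio->cnt, 1);
    bio->freed = 0;
}

/* Taking an extra reference marks the count as valid from now on. */
static void mock_bio_get(struct mock_bio *bio)
{
    bio->flags |= MOCK_BIO_REFFED;
    atomic_fetch_add(&bio->cnt, 1);
}

static void mock_bio_put(struct mock_bio *bio)
{
    if (!(bio->flags & MOCK_BIO_REFFED)) {
        bio->freed = 1;     /* fast path: sole owner, no atomic operation */
        return;
    }
    /* Slow path: honor the count; the put that sees 1 frees the bio. */
    if (atomic_fetch_sub(&bio->cnt, 1) == 1)
        bio->freed = 1;
}

/* Count how many puts it takes to free a bio after 'extra_gets' gets. */
static int mock_puts_until_freed(int extra_gets)
{
    struct mock_bio bio;
    int puts = 0;
    int i;

    mock_bio_init(&bio);
    for (i = 0; i < extra_gets; i++)
        mock_bio_get(&bio);
    while (!bio.freed) {
        mock_bio_put(&bio);
        puts++;
    }
    return puts;
}
```

In the common allocate, submit, put sequence, `mock_puts_until_freed(0)` takes the fast path and needs a single put; each extra get adds one put before the bio is freed.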
    • bio: skip atomic inc/dec of ->bi_remaining for non-chains · c4cf5261
      Jens Axboe committed
      Struct bio has an atomic ref count for chained bio's, and we use this
      to know when to end IO on the bio. However, most bio's are not chained,
      so we don't need to always introduce this atomic operation as part of
      ending IO.
      
      Add a helper to elevate the bi_remaining count, and flag the bio as
      now actually needing the decrement at end_io time. Rename the field
      to __bi_remaining to catch any current users of this doing the
      incrementing manually.
      
      For high IOPS workloads, this reduces the overhead of bio_endio()
      substantially.
      Tested-by: Robert Elliott <elliott@hp.com>
      Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  9. 06 Feb, 2015 3 commits
  10. 24 Nov, 2014 1 commit
  11. 04 Oct, 2014 1 commit
  12. 01 Oct, 2014 1 commit
  13. 27 Sep, 2014 7 commits
  14. 02 Jul, 2014 1 commit
    • bio-integrity: add "bip_max_vcnt" into struct bio_integrity_payload · cbcd1054
      Gu Zheng committed
      Commit 08778795 ("block: Fix nr_vecs for inline integrity vectors") from
      Martin introduced the function bip_integrity_vecs() (which returns the
      number of usable vectors) to fix the nr_vecs issue for inline integrity
      vectors reported by David Milburn.
      
      But bip_integrity_vecs() returns the wrong number if the bio is not
      based on any bio_set (bio->bi_pool == NULL), because in that case
      bip_inline_vecs[0] is allocated directly.  So add bip_max_vcnt to
      record the number of vector slots, and clean up
      bip_integrity_vecs().
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  15. 25 Jun, 2014 2 commits
  16. 23 Apr, 2014 1 commit
  17. 09 Apr, 2014 1 commit
  18. 02 Apr, 2014 1 commit
  19. 11 Feb, 2014 1 commit
  20. 10 Feb, 2014 1 commit
  21. 24 Nov, 2013 8 commits
    • block: Kill bio_pair_split() · 4b1faf93
      Kent Overstreet committed
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
    • block: Introduce new bio_split() · 20d0189b
      Kent Overstreet committed
      The new bio_split() can split arbitrary bios - it's not restricted to
      single page bios, like the old bio_split() (previously renamed to
      bio_pair_split()). It also has different semantics - it doesn't allocate
      a struct bio_pair, leaving it up to the caller to handle completions.
      
      Then convert the existing bio_pair_split() users to the new bio_split()
      - and also nvme, which was open coding bio splitting.
      
      (We have to take that BUG_ON() out of bio_integrity_trim() because this
      bio_split() needs to use it, and there's no reason it has to be used on
      bios marked as cloned; BIO_CLONED doesn't seem to have clearly
      documented semantics anyways.)
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Neil Brown <neilb@suse.de>
    • block: Rename bio_split() -> bio_pair_split() · ee67891b
      Kent Overstreet committed
      This is prep work for introducing a more general bio_split().
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: Sage Weil <sage@inktank.com>
    • block: Generic bio chaining · 196d38bc
      Kent Overstreet committed
      This adds a generic mechanism for chaining bio completions. This is
      going to be used for a bio_split() replacement, and it turns out to be
      very useful in a fair amount of driver code - a fair number of drivers
      were implementing this in their own roundabout ways, often painfully.
      
      Note that this means it's no longer valid to call bio_endio() more than
      once on the same bio! This can cause problems for drivers that save/restore
      bi_end_io. Arguably they shouldn't be saving/restoring bi_end_io at all
      - in all but the simplest cases they'd be better off just cloning the
      bio, and immutable biovecs are making bio cloning cheaper. But for now,
      we add a bio_endio_nodec() for these cases.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
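Chained completion can be modelled with a per-bio counter of outstanding completions. The sketch below is a userspace mock under that assumption; `mock_chain()`, `mock_endio()` and the other names are hypothetical, and the counter is a plain int where a real implementation would need an atomic.

```c
#include <assert.h>
#include <stddef.h>

struct mock_bio {
    int remaining;                          /* outstanding completions */
    void (*end_io)(struct mock_bio *);
    void *private_data;
    int done;
};

/* Each call consumes one outstanding completion; the callback runs at zero. */
static void mock_endio(struct mock_bio *bio)
{
    if (--bio->remaining > 0)
        return;
    if (bio->end_io)
        bio->end_io(bio);
}

/* A chained child completes by completing its parent once. */
static void mock_chain_end_io(struct mock_bio *child)
{
    mock_endio(child->private_data);
}

/* The parent will not finish until 'child' has finished as well. */
static void mock_chain(struct mock_bio *child, struct mock_bio *parent)
{
    assert(child->end_io == NULL && child->private_data == NULL);
    child->private_data = parent;
    child->end_io = mock_chain_end_io;
    parent->remaining++;
}

static void mock_parent_done(struct mock_bio *bio)
{
    bio->done++;
}

/* Two children plus the parent's own completion: done only after all three. */
static int mock_chain_demo(void)
{
    struct mock_bio parent = { 1, mock_parent_done, NULL, 0 };
    struct mock_bio a = { 1, NULL, NULL, 0 };
    struct mock_bio b = { 1, NULL, NULL, 0 };

    mock_chain(&a, &parent);
    mock_chain(&b, &parent);    /* parent.remaining is now 3 */

    mock_endio(&a);
    mock_endio(&parent);
    assert(parent.done == 0);   /* child b is still outstanding */
    mock_endio(&b);
    return parent.done;
}
```

The counter is also why a second completion call on the same bio becomes a problem: every call consumes one outstanding completion, so an extra one would unbalance it.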
    • dm: Refactor for new bio cloning/splitting · 1c3b13e6
      Kent Overstreet committed
      We need to convert the dm code to the new bvec_iter primitives which
      respect bi_bvec_done; they also allow us to drastically simplify dm's
      bio splitting code.
      
      Also, it's no longer necessary to save/restore the bvec array -
      driver conversions for immutable bvecs are done, so drivers should never
      be modifying it.
      
      Also kill bio_sector_offset(), dm was the only user and it doesn't make
      much sense anymore.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: dm-devel@redhat.com
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
    • block: Add bio_clone_fast() · 59d276fe
      Kent Overstreet committed
      bio_clone() just got more expensive - however, most users of bio_clone()
      don't actually need to modify the biovec. If they aren't modifying the
      biovec, and they can guarantee that the original bio isn't freed before
      the clone (also true in most cases), we can just point the clone at the
      original bio's biovec.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
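The difference between a copying clone and a "fast" clone that borrows the original's vector array can be sketched as below. This is a userspace mock with hypothetical names (`mock_clone()`, `mock_clone_fast()`, the `owns_vec` field); it only illustrates the ownership trade-off the message describes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct mock_bvec {
    void *page;
    unsigned int len;
    unsigned int offset;
};

struct mock_bio {
    struct mock_bvec *io_vec;
    unsigned short vcnt;
    int owns_vec;   /* 1 if io_vec must be freed together with this bio */
};

/* Full clone: allocate and copy the vector array (the expensive path). */
static int mock_clone(struct mock_bio *dst, const struct mock_bio *src)
{
    size_t bytes = src->vcnt * sizeof(*src->io_vec);

    dst->io_vec = malloc(bytes);
    if (!dst->io_vec)
        return -1;
    memcpy(dst->io_vec, src->io_vec, bytes);
    dst->vcnt = src->vcnt;
    dst->owns_vec = 1;
    return 0;
}

/*
 * Fast clone: point at the original's array. Only valid if the clone never
 * modifies the array and the original outlives the clone.
 */
static void mock_clone_fast(struct mock_bio *dst, const struct mock_bio *src)
{
    dst->io_vec = src->io_vec;
    dst->vcnt = src->vcnt;
    dst->owns_vec = 0;
}

static void mock_release(struct mock_bio *bio)
{
    if (bio->owns_vec)
        free(bio->io_vec);
    bio->io_vec = NULL;
}

/* Returns 1 if the clone shares the source's array, 0 if it has a copy. */
static int mock_clone_shares_vec(int fast)
{
    struct mock_bvec vecs[2] = { { NULL, 4096, 0 }, { NULL, 512, 0 } };
    struct mock_bio src = { vecs, 2, 0 };
    struct mock_bio dst = { NULL, 0, 0 };
    int shared;

    if (fast)
        mock_clone_fast(&dst, &src);
    else if (mock_clone(&dst, &src) != 0)
        return -1;

    assert(dst.vcnt == src.vcnt);
    shared = (dst.io_vec == src.io_vec);
    mock_release(&dst);
    return shared;
}
```

The fast path performs no allocation at all, which is where the saving comes from; the price is the lifetime guarantee stated in the comment on `mock_clone_fast()`.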
    • block: Kill bio_iovec_idx(), __bio_iovec() · f619d254
      Kent Overstreet committed
      bio_iovec_idx() and __bio_iovec() don't have any valid uses anymore -
      previous users have been converted to bio_iovec_iter() or other methods.
      
      __BVEC_END() has to go too - the bvec array can't be used directly for
      the last biovec because we might only be using the first portion of it.
      Instead, we have to iterate over the bvec array with
      bio_for_each_segment(), which checks against the current value of
      bi_iter.bi_size.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
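Why the end of the array is not the end of the I/O can be shown with a simplified iterator that is bounded by a remaining byte count rather than by the number of array slots. The structs and function names below are hypothetical userspace stand-ins, loosely modelled on the size, index and bytes-done fields mentioned in these commits.

```c
#include <assert.h>
#include <stddef.h>

struct mock_bvec {
    unsigned int len;
};

struct mock_iter {
    unsigned int size;  /* bytes still to be covered */
    unsigned int idx;   /* current slot in the vector array */
    unsigned int done;  /* bytes already consumed in the current slot */
};

/*
 * Walk the array until the byte count is exhausted, not until the array ends.
 * Returns the number of segments visited and stores the bytes covered.
 */
static unsigned int mock_count_segments(const struct mock_bvec *vec,
                                        unsigned int nr_vecs,
                                        struct mock_iter it,
                                        unsigned int *bytes)
{
    unsigned int n = 0;

    *bytes = 0;
    while (it.size) {
        unsigned int seg;

        assert(it.idx < nr_vecs);   /* caller must not ask for more than exists */
        seg = vec[it.idx].len - it.done;
        if (seg > it.size)
            seg = it.size;          /* last segment may be a partial slot */

        *bytes += seg;
        n++;

        it.size -= seg;
        it.done += seg;
        if (it.done == vec[it.idx].len) {
            it.idx++;
            it.done = 0;
        }
    }
    return n;
}

/* Three 4096-byte slots; the iterator decides how much of them is in use. */
static unsigned int mock_segments_for(unsigned int size, unsigned int done)
{
    static const struct mock_bvec vecs[3] = { { 4096 }, { 4096 }, { 4096 } };
    struct mock_iter it = { size, 0, done };
    unsigned int bytes;
    unsigned int n = mock_count_segments(vecs, 3, it, &bytes);

    assert(bytes == size);
    return n;
}
```

With 6144 bytes remaining, the walk stops halfway through the second slot and never looks at the third, which is the situation that makes an "end of array" macro misleading.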
    • block: Kill bio_segments()/bi_vcnt usage · 458b76ed
      Kent Overstreet committed
      When we start sharing biovecs, keeping bi_vcnt accurate for splits is
      going to be error prone - and unnecessary, if we refactor some code.
      
      So bio_segments() has to go - but most of the existing users just needed
      to know if the bio had multiple segments, which is easier - add a
      bio_multiple_segments() for them.
      
      (Two of the current uses of bio_segments() are going to go away in a
      couple patches, but the current implementation of bio_segments() is
      unsafe as soon as we start doing driver conversions for immutable
      biovecs - so implement a dumb version for bisectability, it'll go away
      in a couple patches)
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
      Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>