1. 08 Nov, 2015 1 commit
  2. 14 Aug, 2015 1 commit
  3. 29 Jul, 2015 1 commit
    • block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig committed
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up and of not being passed along from child to
      parent bio in the ever more popular chaining scenario.  Having both
      mechanisms available has the additional drawback of confusing driver
      authors and introducing bugs where various I/O submitters deal with only
      one of them, while everyone else has to add boilerplate code to handle
      both kinds of error return.
      
      So add a new bi_error field that stores an errno value directly in
      struct bio, and remove both existing mechanisms to clean all this up
      (the resulting completion convention is sketched after this entry).
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4246a0b6
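      For illustration, a minimal sketch of a driver completion path with the
      new field; the my_* names are hypothetical driver code, not part of the
      kernel:

      ```c
      #include <linux/bio.h>
      #include <linux/printk.h>

      /* Hypothetical completion handler: errors are now read from
       * bio->bi_error rather than from an errno argument or the
       * BIO_UPTODATE flag. */
      static void my_endio(struct bio *bio)
      {
              if (bio->bi_error)
                      pr_err("my_dev: I/O failed: %d\n", bio->bi_error);
              bio_put(bio);
      }

      /* Hypothetical completion site: store the errno in the bio itself,
       * then complete it -- bio_endio() no longer takes an error code. */
      static void my_complete_request(struct bio *bio, int error)
      {
              bio->bi_error = error;
              bio_endio(bio);
      }
      ```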
  4. 17 Jul, 2015 1 commit
  5. 11 Jul, 2015 1 commit
  6. 01 Jul, 2015 2 commits
  7. 02 Jun, 2015 1 commit
    • writeback: separate out include/linux/backing-dev-defs.h · 66114cad
      Tejun Heo committed
      With the planned cgroup writeback support, backing-dev related
      declarations will be more widely used across block and cgroup;
      unfortunately, including backing-dev.h from include/linux/blkdev.h
      makes cyclic include dependency quite likely.
      
      This patch separates out backing-dev-defs.h, which has only the
      essential definitions, and updates blkdev.h to include it.  C files
      that need access to more backing-dev details now include
      backing-dev.h directly.  This takes backing-dev.h off the common
      include dependency chain, making it much easier to use across block
      and cgroup (the resulting include layout is sketched after this
      entry).
      
      v2: fs/fat build failure fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      66114cad
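      A rough sketch of the include layout this produces; the struct body and
      file contents are illustrative only, not the real headers:

      ```c
      /* include/linux/backing-dev-defs.h: only the essential type
       * definitions, safe to include from blkdev.h without pulling in
       * anything that would create an include cycle. */
      struct backing_dev_info {
              /* ... data members only, no helpers needing blkdev.h ... */
      };

      /* include/linux/blkdev.h: now includes only the -defs header. */
      #include <linux/backing-dev-defs.h>  /* was <linux/backing-dev.h> */

      /* A .c file needing the full backing-dev API includes it itself. */
      #include <linux/backing-dev.h>
      ```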
  8. 22 May, 2015 1 commit
    • block: remove management of bi_remaining when restoring original bi_end_io · 326e1dbb
      Mike Snitzer committed
      Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
      non-chains") regressed all existing callers that followed this pattern
      (sketched in code after this entry):
       1) saving a bio's original bi_end_io
       2) wiring up an intermediate bi_end_io
       3) restoring the original bi_end_io from intermediate bi_end_io
       4) calling bio_endio() to execute the restored original bi_end_io
      
      The regression was due to BIO_CHAIN only ever getting set when
      bio_inc_remaining() is called.  For the above pattern it isn't set
      until step 3 (step 2 would've needed to establish BIO_CHAIN).  As such
      the first bio_endio(), in step 2, never decremented __bi_remaining
      before calling the intermediate bi_end_io -- leaving __bi_remaining at
      1 instead of 0.  The bio_inc_remaining() in step 3 then brought it to
      2.  When the second bio_endio() was called, in step 4, it should have
      called the original bi_end_io but didn't, because an extra reference
      was never dropped (the atomic operations had been optimized away since
      BIO_CHAIN wasn't set upfront).
      
      Fix this issue by removing the __bi_remaining management complexity for
      all callers that use the above pattern -- bio_chain() is the only
      interface that _needs_ to be concerned with __bi_remaining.  For the
      above pattern callers just expect the bi_end_io they set to get called!
      Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
      that aren't associated with the bio_chain() interface.
      
      Also, the bio_inc_remaining() interface has been moved local to bio.c.
      
      Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      326e1dbb
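      A minimal sketch of the four-step save/restore pattern, written with
      the single-argument bi_end_io signature introduced by 4246a0b6 above;
      the my_* names and context struct are hypothetical, not kernel code:

      ```c
      #include <linux/bio.h>
      #include <linux/slab.h>

      struct my_ctx {
              bio_end_io_t    *orig_end_io;   /* step 1: saved original */
              void            *orig_private;
      };

      /* step 2: this is wired up as the intermediate bi_end_io */
      static void my_intermediate_endio(struct bio *bio)
      {
              struct my_ctx *ctx = bio->bi_private;

              /* step 3: restore the original completion */
              bio->bi_end_io = ctx->orig_end_io;
              bio->bi_private = ctx->orig_private;
              kfree(ctx);

              /* step 4: after this fix, plain bio_endio() runs the
               * restored bi_end_io -- no __bi_remaining juggling needed */
              bio_endio(bio);
      }
      ```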
  9. 06 May, 2015 1 commit
    • bio: skip atomic inc/dec of ->bi_cnt for most use cases · dac56212
      Jens Axboe committed
      Struct bio has a reference count that controls when it can be freed.
      The most common use case is allocating the bio, which returns with a
      single reference held, doing IO, and then dropping that single
      reference.  We can remove the atomic_dec_and_test() in the completion
      path if nobody else is holding a reference to the bio.

      If someone does call bio_get() on the bio, we flag the bio as having a
      valid reference count and properly honor that count when the bio is
      put (a sketch of the idea follows this entry).
      Tested-by: Robert Elliott <elliott@hp.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      dac56212
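      A simplified sketch of the mechanism, condensed from what the commit
      describes (the real block/bio.c code carries additional debug checks):

      ```c
      /* bio_get() marks the bio as having a "real" reference count. */
      static inline void bio_get(struct bio *bio)
      {
              bio->bi_flags |= (1 << BIO_REFFED);
              smp_mb__before_atomic();
              atomic_inc(&bio->__bi_cnt);
      }

      /* bio_put() only pays for the atomic dec-and-test when that flag
       * is set; the common single-owner case frees the bio directly. */
      void bio_put(struct bio *bio)
      {
              if (!bio_flagged(bio, BIO_REFFED))
                      bio_free(bio);
              else if (atomic_dec_and_test(&bio->__bi_cnt))
                      bio_free(bio);
      }
      ```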
  10. 24 Nov, 2014 1 commit
  11. 05 Oct, 2014 1 commit
    • block: disable entropy contributions for nonrot devices · b277da0a
      Mike Snitzer committed
      Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
      QUEUE_FLAG_NONROT (the driver-side change is sketched after this
      entry).
      
      Historically, all block devices have automatically made entropy
      contributions.  But as previously stated in commit e2e1a148 ("block: add
      sysfs knob for turning off disk entropy contributions"):
          - On SSD disks, the completion times aren't as random as they
            are for rotational drives. So it's questionable whether they
            should contribute to the random pool in the first place.
          - Calling add_disk_randomness() has a lot of overhead.
      
      There are more reliable sources for randomness than non-rotational block
      devices.  From a security perspective it is better to err on the side of
      caution than to allow entropy contributions from unreliable "random"
      sources.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      b277da0a
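      The per-driver change amounts to a pairing like the following; the
      function name is hypothetical, but both queue flags are real:

      ```c
      #include <linux/blkdev.h>

      static void my_driver_init_queue(struct request_queue *q)
      {
              /* Non-rotational device: completion timing is too
               * predictable to feed the entropy pool, so also opt out
               * of add_disk_randomness() contributions. */
              queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
              queue_flag_clear_unlocked(QUEUE_FLAG_ADD_RANDOM, q);
      }
      ```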
  12. 05 Aug, 2014 22 commits
  13. 18 Apr, 2014 1 commit
  14. 19 Mar, 2014 5 commits
    • bcache: remove nested function usage · cb851149
      John Sheu committed
      Uninlined nested functions can cause crashes when using ftrace, as
      they don't follow the normal calling convention and confuse the ftrace
      function graph tracer as it examines the stack.

      Also, nested functions are supported only as a gcc extension and may
      fail to build with other compilers (e.g. llvm).  A before/after
      illustration follows this entry.
      Signed-off-by: John Sheu <john.sheu@gmail.com>
      cb851149
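      A hypothetical illustration (not bcache code) of the pattern being
      removed and its fix: hoist the nested function to file scope and pass
      its captured state explicitly.

      ```c
      /* Before: nested function, a gcc extension.  gcc may emit it via
       * a stack trampoline, which can confuse ftrace's stack walking,
       * and clang/llvm rejects the construct outright. */
      static void sum_before(int *buf, int n, int *total)
      {
              void accumulate(int x)
              {
                      *total += x;    /* captures 'total' from the frame */
              }
              int i;

              for (i = 0; i < n; i++)
                      accumulate(buf[i]);
      }

      /* After: an ordinary static function, state passed as a parameter. */
      static void accumulate(int *total, int x)
      {
              *total += x;
      }

      static void sum_after(int *buf, int n, int *total)
      {
              int i;

              for (i = 0; i < n; i++)
                      accumulate(total, buf[i]);
      }
      ```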
    • bcache: Kill bucket->gc_gen · 3a2fd9d5
      Kent Overstreet committed
      gc_gen was a temporary used to recalculate last_gc, but since we only
      need bucket->last_gc when gc isn't running (gc_mark_valid = 1), we can
      just update last_gc directly (see the sketch after this entry).
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      3a2fd9d5
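      A rough sketch of the shape of the change, with field names taken from
      the commit message; the real bcache structs differ in detail:

      ```c
      #include <stdint.h>

      struct bucket {
              uint8_t gen;
              uint8_t last_gc;        /* gc_gen temporary removed */
      };

      /* While gc runs, gc_mark_valid is 0 and nothing reads last_gc,
       * so gc can write it directly instead of staging the value in
       * gc_gen and copying it over afterwards. */
      static void gc_visit_bucket(struct bucket *b)
      {
              b->last_gc = b->gen;    /* was: b->gc_gen = b->gen; */
      }
      ```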
    • bcache: Kill unused freelist · 2531d9ee
      Kent Overstreet committed
      This was originally added as an optimization that, for various reasons,
      isn't needed anymore, but it adds a lot of nasty corner cases (and it
      was responsible for some recently fixed bugs).  Just get rid of it now.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      2531d9ee
    • bcache: Rework btree cache reserve handling · 0a63b66d
      Kent Overstreet committed
      This changes the bucket allocation reserves to use _real_ reserves -
      separate freelists - instead of watermarks, which if nothing else
      makes the current code saner to reason about, and is going to be
      important in the future when we add support for multiple btrees (the
      separate-freelist idea is sketched after this entry).

      It also adds btree_check_reserve(), which checks (and locks) the
      reserves for both bucket allocation and memory allocation for btree
      nodes.  The old code just assumed that, since (e.g. for btree node
      splits) it had the root locked, no other thread could try to use the
      same reserve.  That was technically OK for memory allocation - we
      should always have a memory reserve, since the btree node cache is
      used as a reserve and we preallocate it - but multiple btrees will
      mean that locking the root is no longer sufficient, and for the
      bucket allocation reserve the old code could technically deadlock.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      0a63b66d
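      A minimal self-contained sketch of the separate-freelist idea (the
      names and struct layout are illustrative, not the bcache code):

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      enum alloc_reserve {
              RESERVE_BTREE,          /* illustrative reserve names */
              RESERVE_PRIO,
              RESERVE_MOVINGGC,
              RESERVE_NONE,
              RESERVE_NR,
      };

      struct freelist {
              long    buckets[16];
              size_t  nr;
      };

      struct cache_reserves {
              /* one fixed-size freelist per reserve, replacing watermark
               * checks against a single shared freelist */
              struct freelist free[RESERVE_NR];
      };

      /* Each caller allocates from its own reserve, so e.g. moving gc
       * can never drain the btree reserve - the property the old
       * watermark scheme could not guarantee. */
      static bool alloc_bucket(struct cache_reserves *c,
                               enum alloc_reserve r, long *bucket)
      {
              struct freelist *fl = &c->free[r];

              if (!fl->nr)
                      return false;
              *bucket = fl->buckets[--fl->nr];
              return true;
      }
      ```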
    • bcache: Kill btree_io_wq · 56b30770
      Kent Overstreet committed
      With the locking rework in the last patch, this shouldn't be needed
      anymore - btree_node_write_work() only takes b->write_lock, which is
      never held for very long.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      56b30770