1. 19 Mar, 2014 1 commit
    • bcache: Fix moving_gc deadlocking with a foreground write · da415a09
      Nicholas Swenson committed
      Deadlock happened because a foreground write slept, waiting for a bucket
      to be allocated. Normally the gc would mark buckets available for invalidation.
      But the moving_gc was stuck waiting for outstanding writes to complete.
      These writes used the bcache_wq, the same queue foreground writes used.
      
      This fix gives moving_gc its own work queue, so it can still finish moving
      even if foreground writes are stuck waiting for allocation. It also makes
      the work queue a parameter to the data_insert path, so moving_gc can use
      its own workqueue for writes.
      Signed-off-by: Nicholas Swenson <nks@daterainc.com>
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
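The dependency cycle described in this commit can be sketched as a small userland model. This is a deliberately simplified illustration, not kernel code: each queue is assumed to have a single worker that handles items strictly in order, whereas real kernel workqueues can run several items concurrently. All names (`toy_queue`, `run_queues`, and so on) are hypothetical. A foreground write blocks until a bucket is free, and a bucket only becomes free when a GC write completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item kinds for this toy model. */
enum work_kind { FOREGROUND_WRITE, GC_WRITE };

/* A FIFO queue served by one worker (a simplifying assumption). */
struct toy_queue {
	enum work_kind items[8];
	size_t head, len;
};

/*
 * Run all queues until none can make progress.
 * Returns the number of items left stuck (0 means everything completed).
 */
static size_t run_queues(struct toy_queue *qs, size_t nq, int free_buckets)
{
	bool progress = true;
	size_t stuck = 0;

	while (progress) {
		progress = false;
		for (size_t i = 0; i < nq; i++) {
			struct toy_queue *q = &qs[i];

			assert(q->head <= q->len);
			if (q->head == q->len)
				continue;	/* queue is empty */

			enum work_kind w = q->items[q->head];

			if (w == FOREGROUND_WRITE && free_buckets == 0)
				continue;	/* worker sleeps, waiting for a bucket */

			/* A GC write frees a bucket; a foreground write consumes one. */
			free_buckets += (w == GC_WRITE) ? 1 : -1;
			q->head++;
			progress = true;
		}
	}

	for (size_t i = 0; i < nq; i++)
		stuck += qs[i].len - qs[i].head;
	return stuck;
}

/* Before the fix: both kinds of write share one queue. */
static size_t shared_queue_demo(void)
{
	struct toy_queue q[1] = { { { FOREGROUND_WRITE, GC_WRITE }, 0, 2 } };

	return run_queues(q, 1, 0);
}

/* After the fix: GC writes get their own queue. */
static size_t separate_queue_demo(void)
{
	struct toy_queue q[2] = {
		{ { FOREGROUND_WRITE }, 0, 1 },
		{ { GC_WRITE }, 0, 1 },
	};

	return run_queues(q, 2, 0);
}
```

In this model `shared_queue_demo()` leaves both items stuck, because the GC write sits behind the sleeping foreground write, while `separate_queue_demo()` drains both queues.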
  2. 30 Jan, 2014 1 commit
    • bcache: fix BUG_ON due to integer overflow with GC_SECTORS_USED · 94717447
      Darrick J. Wong committed
      The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
      an integer overflow error:
      
      BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
      ...
      SET_GC_SECTORS_USED(g, min_t(unsigned,
      	     GC_SECTORS_USED(g) + KEY_SIZE(k),
      	     (1 << 14) - 1));
      BUG_ON(!GC_SECTORS_USED(g));
      
      In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
      While the SET_ code tries to ensure that the field doesn't overflow by
      clamping it to (1<<14)-1 == 16383, this is incorrect because 16383
      requires 14 bits.  Therefore, if GC_SECTORS_USED() + KEY_SIZE() =
      8192, the SET_ statement tries to store 8192 into a 13-bit field.  In
      a 13-bit field, 8192 becomes zero, thus triggering the BUG_ON.
      
      Therefore, create a field width constant and a max value constant, and
      use those to create the bitfield and check the inputs to
      SET_GC_SECTORS_USED.  Arguably the BITMASK() template ought to have
      BUG_ON checks for too-large values, but that's a separate patch.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
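The arithmetic in this commit message can be checked with a short standalone sketch. The constant and function names below are illustrative assumptions and are not necessarily the ones the patch introduces; `store_field` simply masks a value to 13 bits, which is how a fixed-width bitfield setter behaves.

```c
#include <assert.h>

/* Illustrative names; the actual patch may name these differently. */
#define SECTORS_USED_BITS	13
#define SECTORS_USED_MAX	((1u << SECTORS_USED_BITS) - 1)	/* 8191 */

/* Store a value in a 13-bit field: anything above bit 12 is discarded. */
static unsigned store_field(unsigned v)
{
	return v & SECTORS_USED_MAX;
}

static unsigned min_u(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/* Old behaviour: clamps to (1 << 14) - 1, which does not fit in 13 bits. */
static unsigned add_sectors_buggy(unsigned used, unsigned key_size)
{
	assert(used <= SECTORS_USED_MAX);
	return store_field(min_u(used + key_size, (1u << 14) - 1));
}

/* Fixed behaviour: clamps to the largest value the field can hold. */
static unsigned add_sectors_fixed(unsigned used, unsigned key_size)
{
	assert(used <= SECTORS_USED_MAX);
	return store_field(min_u(used + key_size, SECTORS_USED_MAX));
}
```

With `used = 8000` and `key_size = 192` the sum is 8192: the buggy clamp lets it through and the mask turns it into 0, while the fixed clamp stores 8191.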
  3. 09 Jan, 2014 11 commits
  4. 17 Dec, 2013 2 commits
  5. 24 Nov, 2013 2 commits
    • block: Introduce new bio_split() · 20d0189b
      Kent Overstreet committed
      The new bio_split() can split arbitrary bios - it's not restricted to
      single page bios, like the old bio_split() (previously renamed to
      bio_pair_split()). It also has different semantics - it doesn't allocate
      a struct bio_pair, leaving it up to the caller to handle completions.
      
      Then convert the existing bio_pair_split() users to the new bio_split()
      - and also nvme, which was open coding bio splitting.
      
      (We have to take that BUG_ON() out of bio_integrity_trim() because this
      bio_split() needs to use it, and there's no reason it has to be used on
      bios marked as cloned; BIO_CLONED doesn't seem to have clearly
      documented semantics anyways.)
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Neil Brown <neilb@suse.de>
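As a rough illustration of splitting without a pair structure, here is a toy model in plain C. It is not the kernel API: `toy_bio` is a hypothetical stand-in that only tracks a start sector and a length, and the sketch assumes the split hands back the front piece and leaves the remainder in the original, a detail the commit message does not spell out. Completion handling is left out because, as the message says, that is the caller's job.

```c
#include <assert.h>

/* Hypothetical stand-in for a bio: a contiguous run of sectors. */
struct toy_bio {
	unsigned long long sector;	/* first sector of the run */
	unsigned nr_sectors;		/* length of the run */
};

/*
 * Split off the first `sectors` sectors.
 * Returns the front piece; *bio is advanced to cover the remainder.
 * No pair object is allocated, so the caller tracks both pieces itself.
 */
static struct toy_bio toy_bio_split(struct toy_bio *bio, unsigned sectors)
{
	struct toy_bio front;

	/* A split must leave both pieces non-empty. */
	assert(sectors > 0 && sectors < bio->nr_sectors);

	front.sector = bio->sector;
	front.nr_sectors = sectors;

	bio->sector += sectors;
	bio->nr_sectors -= sectors;
	return front;
}

/* Count the pieces produced when no piece may exceed max_sectors. */
static unsigned toy_count_chunks(struct toy_bio bio, unsigned max_sectors)
{
	unsigned chunks = 1;	/* the final remainder */

	while (bio.nr_sectors > max_sectors) {
		(void)toy_bio_split(&bio, max_sectors);
		chunks++;
	}
	return chunks;
}
```

Splitting a 20-sector run starting at sector 100 after 8 sectors yields a front piece covering sectors 100 to 107 and a remainder of 12 sectors starting at 108.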
    • bcache: Kill unaligned bvec hack · ed9c47be
      Kent Overstreet committed
      Bcache has a hack to avoid cloning the biovec if it's all full pages -
      but with immutable biovecs coming this won't be necessary anymore.
      
      For now, we remove the special case and always clone the bvec array so
      that the immutable biovec patches are simpler.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
  6. 11 Nov, 2013 18 commits
  7. 25 Sep, 2013 1 commit
    • bcache: Fix a writeback performance regression · c2a4f318
      Kent Overstreet committed
      Background writeback works by scanning the btree for dirty data and
      adding those keys into a fixed size buffer, then for each dirty key in
      the keybuf writing it to the backing device.
      
      When read_dirty() finishes and it's time to scan for more dirty data, we
      need to wait for the outstanding writeback IO to finish - they still
      take up slots in the keybuf (so that foreground writes can check for
      them to avoid races) - without that wait, we'll continually rescan when
      we'll be able to add at most a key or two to the keybuf, and that takes
      locks that starve foreground IO.  Doh.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 12 Jul, 2013 2 commits
    • bcache: Allocation kthread fixes · 79826c35
      Kent Overstreet committed
      The alloc kthread should've been using try_to_freeze() - and also there
      was the potential for the alloc kthread to get woken up after it had
      shut down, which would have been bad.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Fix a sysfs splat on shutdown · c9502ea4
      Kent Overstreet committed
      If we stopped a bcache device when we were already detaching (or
      something like that), bcache_device_unlink() would try to remove a
      symlink from sysfs that was already gone because the bcache dev kobject
      had already been removed from sysfs.
      
      So keep track of whether we've removed stuff from sysfs.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
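The "keep track of whether we've removed stuff" idea amounts to making the teardown step idempotent. The sketch below shows that pattern in plain C; `toy_device`, its fields, and `toy_device_unlink` are hypothetical names for illustration and do not reflect how the actual patch stores the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state for this illustration. */
struct toy_device {
	bool linked;		/* the symlink currently exists */
	bool unlink_done;	/* teardown has already run once */
	int remove_calls;	/* how often removal was actually attempted */
};

/*
 * Remove the device's symlink at most once.
 * A second call (for example, stop racing with detach) becomes a no-op
 * instead of trying to remove something that is already gone.
 */
static void toy_device_unlink(struct toy_device *d)
{
	assert(d);

	if (d->unlink_done)
		return;

	d->unlink_done = true;
	d->remove_calls++;
	d->linked = false;
}
```

Calling `toy_device_unlink()` twice on the same device attempts the removal only once.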
  9. 27 Jun, 2013 2 commits