1. 11 Nov, 2013 12 commits
    • bcache: Convert gc to a kthread · 72a44517
      Kent Overstreet committed
      We needed a dedicated rescuer workqueue for gc anyways... and gc was
      conceptually a dedicated thread, just one that wasn't running all the
      time. Switch it to a dedicated thread to make the code a bit more
      straightforward.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Convert bucket_wait to wait_queue_head_t · 35fcd848
      Kent Overstreet committed
      At one point we did do fancy asynchronous waiting stuff with
      bucket_wait, but that's all gone (and bucket_wait is used a lot less
      than it used to be). So use the standard primitives.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Convert try_wait to wait_queue_head_t · e8e1d468
      Kent Overstreet committed
      We never waited on c->try_wait asynchronously, so just use the standard
      primitives.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Move keylist out of btree_op · 0b93207a
      Kent Overstreet committed
      Slowly working on pruning struct btree_op - the aim is for it to only
      contain things that are actually necessary for traversing the btree.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Refactor journalling flow control · a34a8bfd
      Kent Overstreet committed
      Making things less asynchronous that don't need to be - bch_journal()
      only has to block when the journal or journal entry is full, which is
      emphatically not a fast path. So make it a normal function that just
      returns when it finishes, to make the code and control flow easier to
      follow.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Clean up keylist code · c2f95ae2
      Kent Overstreet committed
      More random refactoring.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Add explicit keylist arg to btree_insert() · 4f3d4014
      Kent Overstreet committed
      Some refactoring - better to explicitly pass stuff around instead of
      having it all in the "big bag of state", struct btree_op. Going to prune
      struct btree_op quite a bit over time.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Convert btree_insert_check_key() to btree_insert_node() · e7c590eb
      Kent Overstreet committed
      This was the main point of all this refactoring - now,
      btree_insert_check_key() won't fail just because the leaf node happened
      to be full.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Insert multiple keys at a time · 403b6cde
      Kent Overstreet committed
      We'll often end up with a list of adjacent keys to insert -
      because bch_data_insert() may have to fragment the data it writes.
      
      Originally, to simplify things and avoid having to deal with corner
      cases bch_btree_insert() would pass keys from this list one at a time to
      btree_insert_recurse() - mainly because the list of keys might span leaf
      nodes, so it was easier this way.
      
      With the btree_insert_node() refactoring, it's now a lot easier to just
      pass down the whole list and have btree_insert_recurse() iterate over
      leaf nodes until it's done.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
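The "pass down the whole list and iterate over leaf nodes" idea from the commit above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct leaf`, `max_key` and `insert_keys()` are hypothetical names, keys are plain integers, the input list is sorted, and splitting is left out entirely. It is not the bcache implementation.

```c
#include <stddef.h>

/* Hypothetical leaf: owns every key up to and including max_key. */
struct leaf {
    unsigned max_key;
    unsigned keys[8];
    size_t nr;
};

/*
 * Insert a sorted list of keys by walking the leaves in order.
 * Each leaf consumes every pending key that falls in its range, then
 * the walk moves on to the next leaf. Keys are simply appended, which
 * keeps a leaf sorted only if it was empty or held smaller keys.
 * Returns the number of keys inserted; a full leaf stops the walk,
 * where a real btree would split instead.
 */
size_t insert_keys(struct leaf *leaves, size_t nr_leaves,
                   const unsigned *keys, size_t nr_keys)
{
    size_t done = 0;

    for (size_t i = 0; i < nr_leaves && done < nr_keys; i++) {
        struct leaf *l = &leaves[i];

        while (done < nr_keys && keys[done] <= l->max_key) {
            if (l->nr == sizeof(l->keys) / sizeof(l->keys[0]))
                return done;    /* leaf full: splitting not modelled */
            l->keys[l->nr++] = keys[done++];
        }
    }
    return done;
}
```

With two leaves covering keys up to 10 and up to 20, a list such as 3, 9, 12, 15 ends up split two and two across the leaves in a single call, rather than needing one call per key.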
    • bcache: Add btree_insert_node() · 26c949f8
      Kent Overstreet committed
      The flow of control in the old btree insertion code was rather
      backwards: we'd recurse down the btree (in btree_insert_recurse()), and
      then, if we needed to split, the keys to be inserted into the parent node
      would effectively be returned up to btree_insert_recurse(), which would
      notice there was more work to do and finish the insertion.
      
      The main problem with this was that the full logic for btree insertion
      could only be used by calling btree_insert_recurse; if you'd gotten to a
      btree leaf some other way and had a key to insert, if it turned out that
      node needed to be split you were SOL.
      
      This inverts the flow of control so btree_insert_node() does _full_
      btree insertion, including splitting - and takes a (leaf) btree node to
      insert into as a parameter.
      
      This means we can now _correctly_ handle cache misses - for cache
      misses, we need to insert a fake "check" key into the btree when we
      discover we have a cache miss - while we still have the btree locked.
      Previously, if the btree node was full inserting a cache miss would just
      fail.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
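To illustrate what "full insertion, including splitting, into a given leaf" means in the commit above, here is a small userspace model. Everything in it is an assumption made for illustration: `struct node`, `LEAF_CAP`, `node_insert_sorted()` and `insert_node()` are hypothetical, keys are integers, and updating the parent after a split is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LEAF_CAP 4  /* hypothetical capacity, tiny so splits are easy to trigger */

struct node {
    unsigned keys[LEAF_CAP];
    size_t nr;
};

/* Insert key into a node that is known to have room, keeping keys sorted. */
static void node_insert_sorted(struct node *n, unsigned key)
{
    size_t i = n->nr;

    while (i > 0 && n->keys[i - 1] > key) {
        n->keys[i] = n->keys[i - 1];
        i--;
    }
    n->keys[i] = key;
    n->nr++;
}

/*
 * Insert key into the given leaf. If the leaf is full, its upper half
 * is moved to *sibling first, and the key then goes to whichever half
 * covers it. Returns true if a split happened; the caller would then
 * have to insert a pointer to *sibling into the parent.
 */
bool insert_node(struct node *leaf, struct node *sibling, unsigned key)
{
    bool split = false;

    if (leaf->nr == LEAF_CAP) {
        size_t half = LEAF_CAP / 2;

        memcpy(sibling->keys, leaf->keys + half,
               (LEAF_CAP - half) * sizeof(leaf->keys[0]));
        sibling->nr = LEAF_CAP - half;
        leaf->nr = half;
        split = true;
    }

    if (split && key > leaf->keys[leaf->nr - 1])
        node_insert_sorted(sibling, key);
    else
        node_insert_sorted(leaf, key);

    return split;
}
```

The point of the shape is that a caller who already holds a leaf, for example on a cache miss, can insert into it directly and a full leaf no longer means failure.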
    • bcache: Explicitly track btree node's parent · d6fd3b11
      Kent Overstreet committed
      This is prep work for the reworked btree insertion code.
      
      The way we set b->parent is ugly and hacky... the problem is, when
      btree_split() or garbage collection splits or rewrites a btree node, the
      parent changes for all its (potentially already cached) children.
      
      I may change this later and add some code to look through the btree node
      cache and find all our cached child nodes and change the parent pointer
      then...
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
    • bcache: Fix dirty_data accounting · 1fa8455d
      Kent Overstreet committed
      Dirty data accounting wasn't quite right - firstly, we were adding the key we're
      inserting after it could have merged with another dirty key already in the
      btree, and secondly we could sometimes pass the wrong offset to
      bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
      important when tracking dirty data by stripe.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
  2. 25 Sep, 2013 2 commits
  3. 11 Sep, 2013 1 commit
    • drivers: convert shrinkers to new count/scan API · 7dc19d5a
      Dave Chinner committed
      Convert the driver shrinkers to the new API.  Most changes are compile
      tested only because I either don't have the hardware or it's staging
      stuff.
      
      FWIW, the md and android code is pretty good, but the rest of it makes me
      want to claw my eyes out.  The amount of broken code I just encountered is
      mind boggling.  I've added comments explaining what is broken, but I fear
      that some of the code would be best dealt with by being dragged behind the
      bike shed, buried in mud up to its neck and then run over repeatedly
      with a blunt lawn mower.
      
      Special mention goes to the zcache/zcache2 drivers.  They can't co-exist
      in the build at the same time, they are under different menu options in
      menuconfig, they only show up when you've got the right set of mm
      subsystem options configured and so even compile testing is an exercise in
      pulling teeth.  And that doesn't even take into account the horrible,
      broken code...
      
      [glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Glauber Costa <glommer@openvz.org>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: Carlos Maiolino <cmaiolino@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: J. Bruce Fields <bfields@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Hellstrom <thellstrom@vmware.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 12 Jul, 2013 1 commit
    • bcache: Fix GC_SECTORS_USED() calculation · 29ebf465
      Kent Overstreet committed
      Part of the job of garbage collection is to add up however many sectors
      of live data it finds in each bucket, but that doesn't work very well if
      it doesn't reset GC_SECTORS_USED() when it starts. Whoops.
      
      This wouldn't have broken anything horribly, but allocation tries to
      preferentially reclaim buckets that are mostly empty and that's not
      gonna work with an incorrect GC_SECTORS_USED() value.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
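The bug described above comes down to accumulating into a counter that is never cleared. A minimal userspace sketch, assuming a hypothetical `struct bucket` whose `sectors_used` field stands in for GC_SECTORS_USED() and a hypothetical list of live extents:

```c
#include <stddef.h>

struct bucket {
    unsigned sectors_used;  /* stand-in for GC_SECTORS_USED() */
};

struct extent {
    size_t bucket;      /* index of the bucket holding the live data */
    unsigned sectors;   /* live sectors found there */
};

/* One gc pass: clear the per-bucket totals, then add up the live data. */
void gc_count_sectors(struct bucket *buckets, size_t nr_buckets,
                      const struct extent *live, size_t nr_live)
{
    /* The reset step: without it, every pass adds to the last pass's total. */
    for (size_t i = 0; i < nr_buckets; i++)
        buckets[i].sectors_used = 0;

    for (size_t i = 0; i < nr_live; i++)
        buckets[live[i].bucket].sectors_used += live[i].sectors;
}
```

Running the pass twice over the same extents gives the same totals both times. Drop the first loop and the totals grow with every pass, so buckets that are mostly empty stop looking that way to the allocator.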
  5. 02 Jul, 2013 4 commits
  6. 27 Jun, 2013 7 commits
    • bcache: Write out full stripes · 72c27061
      Kent Overstreet committed
      Now that we're tracking dirty data per stripe, we can add two
      optimizations for raid5/6:
      
       * If a stripe is already dirty, force writes to that stripe to
         writeback mode - to help build up full stripes of dirty data
      
       * When flushing dirty data, preferentially write out full stripes first
         if there are any.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
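The two optimizations listed in the commit above can be pictured with a small policy sketch. It assumes a hypothetical array of per-stripe dirty sector counts and hypothetical helper names (`should_writeback()`, `next_stripe_to_flush()`); how bcache actually scans for full stripes is not stated in the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* A write that lands in an already-dirty stripe is forced to writeback mode. */
bool should_writeback(const int *stripe_dirty, size_t stripe)
{
    return stripe_dirty[stripe] > 0;
}

/*
 * Pick the next stripe to flush: the first completely dirty stripe if
 * there is one, otherwise the first stripe with any dirty data.
 * Returns nr_stripes when nothing is dirty.
 */
size_t next_stripe_to_flush(const int *stripe_dirty, size_t nr_stripes,
                            int stripe_size)
{
    size_t first_dirty = nr_stripes;

    for (size_t i = 0; i < nr_stripes; i++) {
        if (stripe_dirty[i] == stripe_size)
            return i;   /* full stripe: prefer it */
        if (stripe_dirty[i] > 0 && first_dirty == nr_stripes)
            first_dirty = i;
    }
    return first_dirty;
}
```

Preferring full stripes matters on raid5/6 because a full-stripe write lets parity be computed from the new data alone, without first reading back the rest of the stripe.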
    • bcache: Track dirty data by stripe · 279afbad
      Kent Overstreet committed
      To make background writeback aware of raid5/6 stripes, we first need to
      track the amount of dirty data within each stripe - we do this by
      breaking up the existing sectors_dirty into per-stripe atomic_ts.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
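Per-stripe accounting means a dirty range has to be split at stripe boundaries, which is also why passing the right offset matters. A rough userspace sketch, assuming plain `int` counters in place of the kernel's atomic_t and a hypothetical `stripe_dirty_add()` helper (the name and signature are illustrative only):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add nr_sectors of dirty data (negative to subtract) starting at
 * sector offset, splitting the range at stripe boundaries so each
 * stripe's counter only sees the part that falls inside it.
 */
void stripe_dirty_add(int *stripe_dirty, size_t nr_stripes,
                      unsigned stripe_size, uint64_t offset, int nr_sectors)
{
    size_t stripe = (size_t)(offset / stripe_size);
    unsigned stripe_offset = (unsigned)(offset % stripe_size);

    while (nr_sectors != 0 && stripe < nr_stripes) {
        int room = (int)(stripe_size - stripe_offset);
        int s = nr_sectors;

        /* Clamp to what fits in the rest of this stripe, in either direction. */
        if (s > room)
            s = room;
        if (s < -room)
            s = -room;

        stripe_dirty[stripe] += s;
        nr_sectors -= s;
        stripe_offset = 0;
        stripe++;
    }
}
```

For example, with 8-sector stripes, 5 dirty sectors starting at sector 6 put 2 sectors in stripe 0 and 3 in stripe 1. The same range with a wrong offset would charge the wrong stripes even though the total stayed correct.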
    • bcache: Initialize sectors_dirty when attaching · 444fc0b6
      Kent Overstreet committed
      Previously, dirty_data wouldn't get initialized until the first garbage
      collection... which was a bit of a problem for background writeback (as
      the PD controller keys off of it) and also confusing for users.
      
      This is also prep work for making background writeback aware of raid5/6
      stripes.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
    • bcache: Rip out pkey()/pbtree() · 85b1492e
      Kent Overstreet committed
      Old gcc doesn't like the struct hack, and it is kind of ugly. So finish
      off the work to convert pr_debug() statements to tracepoints, and delete
      pkey()/pbtree().
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
    • bcache: Fix/revamp tracepoints · c37511b8
      Kent Overstreet committed
      The tracepoints were reworked to be more sensible, and a null pointer
      deref in one of them was fixed.
      
      Converted some of the pr_debug()s to tracepoints - this is partly a
      performance optimization; it used to be that without DEBUG or
      CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it
      was changed to an empty inline function.
      
      Some of the pr_debug() statements had rather expensive function calls as
      part of the arguments, so this code was getting run unnecessarily even
      on non debug kernels - in some fast paths, too.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
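The performance point in the commit above rests on a general C rule: arguments to an empty macro are discarded by the preprocessor, while arguments to an empty function are still evaluated at the call site. The sketch below shows that difference in plain userspace C. `debug_macro()`, `debug_func()` and `expensive_format()` are hypothetical stand-ins, not the kernel's pr_debug() definitions.

```c
static int calls;   /* counts how often the "expensive" helper runs */

static int expensive_format(void)
{
    calls++;
    return 42;
}

/* Empty macro: the arguments vanish during preprocessing and never run. */
#define debug_macro(fmt, ...) do { } while (0)

/* Empty function: the body does nothing, but each argument is still evaluated. */
static inline void debug_func(const char *fmt, int value)
{
    (void)fmt;
    (void)value;
}

/* Returns how many times the helper ran for one macro-style debug call. */
int calls_with_macro(void)
{
    calls = 0;
    debug_macro("%d", expensive_format());
    return calls;
}

/* Returns how many times the helper ran for one function-style debug call. */
int calls_with_func(void)
{
    calls = 0;
    debug_func("%d", expensive_format());
    return calls;
}
```

Because `expensive_format()` has a visible side effect, the compiler cannot drop the call in the function case, which is the situation the commit describes for fast paths on non-debug kernels.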
    • bcache: Refactor btree io · 57943511
      Kent Overstreet committed
      The most significant change is that btree reads are now done
      synchronously, instead of asynchronously and doing the post read stuff
      from a workqueue.
      
      This was originally done because we can't block on IO under
      generic_make_request(). But - we already have a mechanism to punt cache
      lookups to workqueue if needed, so if we just use that we don't have to
      deal with the complexity of doing things asynchronously.
      
      The main benefit is this makes the locking situation saner; we can hold
      our write lock on the btree node until we're finished reading it, and we
      don't need that btree_node_read_done() flag anymore.
      
      Also, for writes, btree_write() was broken out into btree_node_write()
      and btree_leaf_dirty() - the old code with the boolean argument was dumb
      and confusing.
      
      The prio_blocked mechanism was improved a bit too: now the only counter
      is in struct btree_write, and we don't mess with transferring a count from
      struct btree anymore.
      
      This required changing garbage collection to block prios at the start
      and unblock when it finishes, which is cleaner than what it was doing
      anyways (the old code had mostly the same effect, but was doing it in a
      convoluted way).
      
      And the btree iter btree_node_read_done() uses was converted to a real
      mempool.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
    • bcache: Convert allocator thread to kthread · 119ba0f8
      Kent Overstreet committed
      Using a workqueue when we just want a single thread is a bit silly.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
  7. 01 May, 2013 1 commit
  8. 09 Apr, 2013 2 commits
  9. 29 Mar, 2013 1 commit
  10. 26 Mar, 2013 2 commits
    • bcache: Style/checkpatch fixes · b1a67b0f
      Kent Overstreet committed
      Took out some nested functions, and fixed some more checkpatch
      complaints.
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: linux-bcache@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: Build fixes from test robot · 07e86ccb
      Kent Overstreet committed
      config: make ARCH=i386 allmodconfig
      
      All error/warnings:
      
         drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
      >> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
      --
         drivers/md/bcache/debug.c: In function 'bch_pbtree':
      >> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
      --
         drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
      >> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
      --
         drivers/md/bcache/closure.o: In function `closure_debug_init':
      >> (.init.text+0x0): multiple definition of `init_module'
      >> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: linux-bcache@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
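The -Wformat warnings quoted above appear because size_t is not `long` on every architecture (on i386 it is an `unsigned int`). The commit message does not say how the warnings were fixed; one conventional way to print a size_t portably is the `%zu` conversion, shown here in a userspace sketch with a hypothetical `format_size()` helper.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t with %zu, which matches size_t on every architecture,
 * unlike %li or %lu, which only match where size_t happens to be long.
 * Returns the number of characters snprintf() would have written.
 */
int format_size(char *buf, size_t len, size_t value)
{
    return snprintf(buf, len, "%zu", value);
}
```

Casting the argument to `unsigned long` and keeping `%lu` also silences the warning, which is why the exact change cannot be inferred from the build log alone.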
  11. 24 Mar, 2013 1 commit