Commits · 72a44517f3ca3725dc86081d105457df46448679 · openanolis / cloud-kernel

11 Nov, 2013 · 12 commits

bcache: Convert gc to a kthread · 72a44517

Committed by Kent Overstreet on Oct 24, 2013

We needed a dedicated rescuer workqueue for gc anyways... and gc was
conceptually a dedicated thread, just one that wasn't running all the
time. Switch it to a dedicated thread to make the code a bit more
straightforward.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

72a44517

bcache: Convert bucket_wait to wait_queue_head_t · 35fcd848

Committed by Kent Overstreet on Jul 24, 2013

At one point we did do fancy asynchronous waiting stuff with
bucket_wait, but that's all gone (and bucket_wait is used a lot less
than it used to be). So use the standard primitives.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

35fcd848

bcache: Convert try_wait to wait_queue_head_t · e8e1d468

Committed by Kent Overstreet on Jul 24, 2013

We never waited on c->try_wait asynchronously, so just use the standard
primitives.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

e8e1d468

bcache: Move keylist out of btree_op · 0b93207a

Committed by Kent Overstreet on Jul 24, 2013

Slowly working on pruning struct btree_op - the aim is for it to only
contain things that are actually necessary for traversing the btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

0b93207a

bcache: Refactor journalling flow control · a34a8bfd

Committed by Kent Overstreet on Oct 24, 2013

Making things less asynchronous that don't need to be - bch_journal()
only has to block when the journal or journal entry is full, which is
emphatically not a fast path. So make it a normal function that just
returns when it finishes, to make the code and control flow easier to
follow.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

a34a8bfd

bcache: Clean up keylist code · c2f95ae2

Committed by Kent Overstreet on Jul 24, 2013

More random refactoring.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

c2f95ae2

bcache: Add explicit keylist arg to btree_insert() · 4f3d4014

Committed by Kent Overstreet on Sep 10, 2013

Some refactoring - better to explicitly pass stuff around instead of
having it all in the "big bag of state", struct btree_op. Going to prune
struct btree_op quite a bit over time.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

4f3d4014

bcache: Convert btree_insert_check_key() to btree_insert_node() · e7c590eb

Committed by Kent Overstreet on Sep 10, 2013

This was the main point of all this refactoring - now,
btree_insert_check_key() won't fail just because the leaf node happened
to be full.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

e7c590eb

bcache: Insert multiple keys at a time · 403b6cde

Committed by Kent Overstreet on Jul 24, 2013

We'll often end up with a list of adjacent keys to insert -
because bch_data_insert() may have to fragment the data it writes.

Originally, to simplify things and avoid having to deal with corner
cases bch_btree_insert() would pass keys from this list one at a time to
btree_insert_recurse() - mainly because the list of keys might span leaf
nodes, so it was easier this way.

With the btree_insert_node() refactoring, it's now a lot easier to just
pass down the whole list and have btree_insert_recurse() iterate over
leaf nodes until it's done.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

403b6cde

bcache: Add btree_insert_node() · 26c949f8

Committed by Kent Overstreet on Sep 10, 2013

The flow of control in the old btree insertion code was rather
backwards: we'd recurse down the btree (in btree_insert_recurse()), and
then, if we needed to split, the keys to be inserted into the parent node
would effectively be returned up to btree_insert_recurse(), which would
notice there was more work to do and finish the insertion.

The main problem with this was that the full logic for btree insertion
could only be used by calling btree_insert_recurse; if you'd gotten to a
btree leaf some other way and had a key to insert, if it turned out that
node needed to be split you were SOL.

This inverts the flow of control so btree_insert_node() does _full_
btree insertion, including splitting - and takes a (leaf) btree node to
insert into as a parameter.

This means we can now _correctly_ handle cache misses - for cache
misses, we need to insert a fake "check" key into the btree when we
discover we have a cache miss - while we still have the btree locked.
Previously, if the btree node was full inserting a cache miss would just
fail.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

26c949f8

bcache: Explicitly track btree node's parent · d6fd3b11

Committed by Kent Overstreet on Jul 24, 2013

This is prep work for the reworked btree insertion code.

The way we set b->parent is ugly and hacky... the problem is, when
btree_split() or garbage collection splits or rewrites a btree node, the
parent changes for all its (potentially already cached) children.

I may change this later and add some code to look through the btree node
cache and find all our cached child nodes and change the parent pointer
then...
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

d6fd3b11

bcache: Fix dirty_data accounting · 1fa8455d

Committed by Kent Overstreet on Nov 10, 2013

Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

1fa8455d

25 Sep, 2013 · 2 commits

bcache: Fix a shrinker deadlock · a698e08c

Committed by Kent Overstreet on Sep 23, 2013

GFP_NOIO means we could be getting called recursively - mca_alloc() ->
mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
Whoops.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a698e08c
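
The deadlock described above comes from taking a lock unconditionally in a path that may be re-entered while that lock is already held. A common way out is to fall back to a trylock when blocking is unsafe. The following is a minimal userspace sketch of that pattern, not the kernel code: `toy_lock`, `shrink_scan` and the `may_block` flag are hypothetical names, and a C11 `atomic_flag` stands in for the kernel mutex.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock standing in for a kernel mutex (hypothetical, illustration only). */
struct toy_lock {
    atomic_flag held;
};

static bool toy_trylock(struct toy_lock *l)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Hypothetical shrinker scan callback. may_block is false when the caller
 * might already hold the lock (for example, reclaim entered from an
 * allocation made under it). Returns the number of objects freed, or -1 if
 * no progress can be made right now.
 */
long shrink_scan(struct toy_lock *l, bool may_block, long nr_to_scan)
{
    if (may_block) {
        while (!toy_trylock(l))
            ;   /* spin; a real mutex would sleep here */
    } else if (!toy_trylock(l)) {
        return -1;  /* lock busy: back off instead of deadlocking */
    }

    long freed = nr_to_scan;  /* pretend every scanned object was freed */

    toy_unlock(l);
    return freed;
}
```

Returning a "no progress" value lets the caller try another shrinker or retry later, which is cheaper than risking a self-deadlock.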

bcache: Correct printf()-style format length modifier · 61cbd250

Committed by Geert Uytterhoeven on Sep 23, 2013

Fix

drivers/md/bcache/btree.c: In function ‘bch_btree_node_read’:
drivers/md/bcache/btree.c:259: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘size_t’
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

61cbd250
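
The warning above arises because `size_t` is not `unsigned long` on every architecture. Standard C provides the `z` length modifier for exactly this case. The sketch below is plain userspace C, and `format_size` is a hypothetical helper written only for illustration; it is not taken from the commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t portably. "%zu" matches size_t on every ABI, whereas
 * "%lu" is only correct where size_t happens to be unsigned long.
 * Returns the number of characters written (excluding the terminator).
 */
int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "%zu", value);
}
```

Calling `format_size(buf, sizeof(buf), 4096)` writes the string `4096` into `buf` without any cast.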

11 Sep, 2013 · 1 commit

drivers: convert shrinkers to new count/scan API · 7dc19d5a

Committed by Dave Chinner on Aug 28, 2013

Convert the driver shrinkers to the new API.  Most changes are compile
tested only because I either don't have the hardware or it's staging
stuff.

FWIW, the md and android code is pretty good, but the rest of it makes me
want to claw my eyes out.  The amount of broken code I just encountered is
mind boggling.  I've added comments explaining what is broken, but I fear
that some of the code would be best dealt with by being dragged behind the
bike shed, buried in mud up to its neck and then run over repeatedly
with a blunt lawn mower.

Special mention goes to the zcache/zcache2 drivers.  They can't co-exist
in the build at the same time, they are under different menu options in
menuconfig, they only show up when you've got the right set of mm
subsystem options configured and so even compile testing is an exercise in
pulling teeth.  And that doesn't even take into account the horrible,
broken code...

[glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7dc19d5a

12 Jul, 2013 · 1 commit

bcache: Fix GC_SECTORS_USED() calculation · 29ebf465

Committed by Kent Overstreet on Jul 11, 2013

Part of the job of garbage collection is to add up however many sectors
of live data it finds in each bucket, but that doesn't work very well if
it doesn't reset GC_SECTORS_USED() when it starts. Whoops.

This wouldn't have broken anything horribly, but allocation tries to
preferentially reclaim buckets that are mostly empty and that's not
gonna work with an incorrect GC_SECTORS_USED() value.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

29ebf465
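
The bug described above is the classic "accumulate without resetting" mistake. The sketch below shows the intended shape of a counting pass in plain C. It is a simplified model, not the bcache code: `struct bucket`, `gc_start` and `gc_mark` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-bucket state: sectors of live data found by the last GC. */
struct bucket {
    unsigned sectors_used;
};

/* Begin a GC pass: clear stale counts so this pass starts from zero. */
void gc_start(struct bucket *buckets, size_t nr_buckets)
{
    for (size_t i = 0; i < nr_buckets; i++)
        buckets[i].sectors_used = 0;
}

/* Called for each live extent the pass finds in a bucket. */
void gc_mark(struct bucket *b, unsigned sectors)
{
    b->sectors_used += sectors;
}
```

Without the reset in `gc_start`, every pass would add to the previous total, so buckets would look fuller than they are and an allocator that prefers mostly empty buckets would pick poorly.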

02 Jul, 2013 · 4 commits

bcache: Use standard utility code · 8e51e414

Committed by Kent Overstreet on Jun 06, 2013

Some of bcache's utility code has made it into the rest of the kernel,
so drop the bcache versions.

Bcache used to have a workaround for allocating from a bio set under
generic_make_request() (if you allocated more than once, the bios you
already allocated would get stuck on current->bio_list when you
submitted, and you'd risk deadlock) - bcache would mask out __GFP_WAIT
when allocating bios under generic_make_request() so that allocation
could fail and it could retry from workqueue. But bio_alloc_bioset() has
a workaround now, so we can drop this hack and the associated error
handling.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

8e51e414

bcache: Delete fuzz tester · f3059a54

Committed by Kent Overstreet on May 15, 2013

This code has rotted and it hasn't been used in ages anyways.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

f3059a54

bcache: Document shrinker reserve better · 36c9ea98

Committed by Kent Overstreet on Jun 03, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

36c9ea98

bcache: FUA fixes · e49c7c37

Committed by Kent Overstreet on Jun 26, 2013

Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node
writes have... weird ordering requirements.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

e49c7c37

27 Jun, 2013 · 7 commits

bcache: Write out full stripes · 72c27061

Committed by Kent Overstreet on Jun 05, 2013

Now that we're tracking dirty data per stripe, we can add two
optimizations for raid5/6:

 * If a stripe is already dirty, force writes to that stripe to
   writeback mode - to help build up full stripes of dirty data

 * When flushing dirty data, preferentially write out full stripes first
   if there are any.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

72c27061

bcache: Track dirty data by stripe · 279afbad

Committed by Kent Overstreet on Jun 05, 2013

To make background writeback aware of raid5/6 stripes, we first need to
track the amount of dirty data within each stripe - we do this by
breaking up the existing sectors_dirty into per-stripe atomic_t counters.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

279afbad
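
Per-stripe dirty tracking means a single write can touch several counters when it crosses stripe boundaries. The sketch below models that split in userspace C with C11 atomics in place of the kernel's `atomic_t`. All names (`struct dirty_map`, `sectors_dirty_add`, `stripe_is_full`) are hypothetical and the layout is an assumption made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one dirty-sector counter per stripe. */
struct dirty_map {
    uint64_t stripe_size;      /* sectors per stripe, must be > 0 */
    uint64_t nr_stripes;
    atomic_int *sectors_dirty; /* array of nr_stripes counters */
};

void dirty_map_init(struct dirty_map *d, atomic_int *counters,
                    uint64_t nr_stripes, uint64_t stripe_size)
{
    d->stripe_size = stripe_size;
    d->nr_stripes = nr_stripes;
    d->sectors_dirty = counters;
    for (uint64_t i = 0; i < nr_stripes; i++)
        atomic_init(&counters[i], 0);
}

/*
 * Add (or, for negative nr_sectors, subtract) dirty sectors starting at
 * offset, splitting the range at every stripe boundary it crosses.
 */
void sectors_dirty_add(struct dirty_map *d, uint64_t offset, int nr_sectors)
{
    uint64_t stripe = offset / d->stripe_size;
    uint64_t stripe_offset = offset % d->stripe_size;
    int sign = nr_sectors < 0 ? -1 : 1;
    uint64_t remaining = (uint64_t)(nr_sectors < 0 ? -(int64_t)nr_sectors
                                                   : (int64_t)nr_sectors);

    while (remaining && stripe < d->nr_stripes) {
        uint64_t room = d->stripe_size - stripe_offset;
        uint64_t chunk = remaining < room ? remaining : room;

        atomic_fetch_add(&d->sectors_dirty[stripe], sign * (int)chunk);
        remaining -= chunk;
        stripe_offset = 0;  /* later stripes are entered at their start */
        stripe++;
    }
}

/* A stripe is "full" when every one of its sectors is dirty. */
bool stripe_is_full(struct dirty_map *d, uint64_t stripe)
{
    return (uint64_t)atomic_load(&d->sectors_dirty[stripe]) == d->stripe_size;
}
```

With a stripe size of 8 sectors, adding 12 dirty sectors at offset 6 credits 2, 8 and 2 sectors to stripes 0, 1 and 2. Passing the wrong offset would credit the wrong stripes, which is why the offset matters once accounting is per stripe.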

bcache: Initialize sectors_dirty when attaching · 444fc0b6

Committed by Kent Overstreet on May 11, 2013

Previously, dirty_data wouldn't get initialized until the first garbage
collection... which was a bit of a problem for background writeback (as
the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6
stripes.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

444fc0b6

bcache: Rip out pkey()/pbtree() · 85b1492e

Committed by Kent Overstreet on May 14, 2013

Old gcc doesn't like the struct hack, and it is kind of ugly. So finish
off the work to convert pr_debug() statements to tracepoints, and delete
pkey()/pbtree().
Signed-off-by: Kent Overstreet <koverstreet@google.com>

85b1492e

bcache: Fix/revamp tracepoints · c37511b8

Committed by Kent Overstreet on Apr 26, 2013

The tracepoints were reworked to be more sensible, and fixed a null
pointer deref in one of the tracepoints.

Converted some of the pr_debug()s to tracepoints - this is partly a
performance optimization; it used to be that with DEBUG or
CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it
was changed to an empty inline function.

Some of the pr_debug() statements had rather expensive function calls as
part of the arguments, so this code was getting run unnecessarily even
on non debug kernels - in some fast paths, too.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

c37511b8

bcache: Refactor btree io · 57943511

Committed by Kent Overstreet on Apr 25, 2013

The most significant change is that btree reads are now done
synchronously, instead of asynchronously and doing the post read stuff
from a workqueue.

This was originally done because we can't block on IO under
generic_make_request(). But - we already have a mechanism to punt cache
lookups to workqueue if needed, so if we just use that we don't have to
deal with the complexity of doing things asynchronously.

The main benefit is this makes the locking situation saner; we can hold
our write lock on the btree node until we're finished reading it, and we
don't need that btree_node_read_done() flag anymore.

Also, for writes, btree_write() was broken out into btree_node_write()
and btree_leaf_dirty() - the old code with the boolean argument was dumb
and confusing.

The prio_blocked mechanism was improved a bit too; now the only counter
is in struct btree_write, and we don't mess with transferring a count from
struct btree anymore.

This required changing garbage collection to block prios at the start
and unblock when it finishes, which is cleaner than what it was doing
anyways (the old code had mostly the same effect, but was doing it in a
convoluted way)

And the btree iter btree_node_read_done() uses was converted to a real
mempool.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

57943511

bcache: Convert allocator thread to kthread · 119ba0f8

Committed by Kent Overstreet on Apr 24, 2013

Using a workqueue when we just want a single thread is a bit silly.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

119ba0f8

01 May, 2013 · 1 commit

bcache: Allocator cleanup/fixes · 86b26b82

Committed by Kent Overstreet on Apr 30, 2013

The main fix is that bch_allocator_thread() wasn't waiting on
garbage collection to finish (if invalidate_buckets had set
ca->invalidate_needs_gc); we need that to make sure the allocator
doesn't spin and potentially block gc from finishing.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

86b26b82

09 Apr, 2013 · 2 commits

bcache: Add missing #include <linux/prefetch.h> · cd953ed0

Committed by Geert Uytterhoeven on Mar 27, 2013

m68k/allmodconfig:

drivers/md/bcache/bset.c: In function ‘bset_search_tree’:
drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’

drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’:
drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <koverstreet@google.com>

cd953ed0

bcache: Sparse fixes · c19ed23a

Committed by Kent Overstreet on Mar 26, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>

c19ed23a

29 Mar, 2013 · 1 commit

bcache: Don't export utility code, prefix with bch_ · 169ef1cf

Committed by Kent Overstreet on Mar 28, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

169ef1cf

26 Mar, 2013 · 2 commits

bcache: Style/checkpatch fixes · b1a67b0f

Committed by Kent Overstreet on Mar 25, 2013

Took out some nested functions, and fixed some more checkpatch
complaints.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b1a67b0f

bcache: Build fixes from test robot · 07e86ccb

Committed by Kent Overstreet on Mar 25, 2013

config: make ARCH=i386 allmodconfig

All error/warnings:

   drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
>> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/debug.c: In function 'bch_pbtree':
>> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
>> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/closure.o: In function `closure_debug_init':
>> (.init.text+0x0): multiple definition of `init_module'
>> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

07e86ccb

24 Mar, 2013 · 1 commit

bcache: A block layer cache · cafe5635

Committed by Kent Overstreet on Mar 23, 2013

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

cafe5635

openanolis / cloud-kernel · synced successfully almost 2 years ago
