提交 · 0a45114534766058193eb2605c136562a4f7bcc8 · openeuler / Kernel

09 1月, 2014 5 次提交

bcache: Use a mempool for mergesort temporary space · 0a451145

由 Kent Overstreet 提交于 12月 18, 2013

It was a single element mempool before, it's slightly cleaner to just use a real
mempool.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

0a451145

bcache: Btree verify code improvements · 78b77bf8

由 Kent Overstreet 提交于 12月 17, 2013

Used this fixed code to find and fix the bug fixed by
a4d885097b0ac0cd1337f171f2d4b83e946094d4.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

78b77bf8

bcache: kill index() · 88b9f8c4

由 Kent Overstreet 提交于 12月 17, 2013

That was a terrible name for a macro, add some better helpers to replace it.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

88b9f8c4

bcache: Rework allocator reserves · 78365411

由 Kent Overstreet 提交于 12月 17, 2013

We need a reserve for allocating buckets for new btree nodes - and now that
we've got multiple btrees, it really needs to be per btree.

This reworks the reserves so we've got separate freelists for each reserve
instead of watermarks, which seems to make things a bit cleaner, and it adds
some code so that btree_split() can make sure the reserve is available before it
starts.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

78365411

K
bcache: kill closure locking usage · cb7a583e
由 Kent Overstreet 提交于 12月 16, 2013
```
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
```
cb7a583e

17 12月, 2013 2 次提交

bcache: New writeback PD controller · 16749c23

由 Kent Overstreet 提交于 11月 11, 2013

The old writeback PD controller could get into states where it had throttled all
the way down and take way too long to recover - it was too complicated to really
understand what it was doing.

This rewrites a good chunk of it to hopefully be simpler and make more sense,
and it also pays more attention to units which should make the behaviour a bit
easier to understand.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

16749c23

bcache: bugfix - moving_gc now moves only correct buckets · 981aa8c0

由 Nicholas Swenson 提交于 11月 07, 2013

Removed gc_move_threshold because picking buckets only by
threshold could lead moving extra buckets (ei. if there are
buckets at the threshold that aren't supposed to be moved
do to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.
Signed-off-by: NNicholas Swenson <nks@daterainc.com>
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

981aa8c0

24 11月, 2013 2 次提交

block: Introduce new bio_split() · 20d0189b

由 Kent Overstreet 提交于 11月 23, 2013

The new bio_split() can split arbitrary bios - it's not restricted to
single page bios, like the old bio_split() (previously renamed to
bio_pair_split()). It also has different semantics - it doesn't allocate
a struct bio_pair, leaving it up to the caller to handle completions.

Then convert the existing bio_pair_split() users to the new bio_split()
- and also nvme, which was open coding bio splitting.

(We have to take that BUG_ON() out of bio_integrity_trim() because this
bio_split() needs to use it, and there's no reason it has to be used on
bios marked as cloned; BIO_CLONED doesn't seem to have clearly
documented semantics anyways.)
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Neil Brown <neilb@suse.de>

20d0189b

bcache: Kill unaligned bvec hack · ed9c47be

由 Kent Overstreet 提交于 11月 22, 2013

Bcache has a hack to avoid cloning the biovec if it's all full pages -
but with immutable biovecs coming this won't be necessary anymore.

For now, we remove the special case and always clone the bvec array so
that the immutable biovec patches are simpler.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

ed9c47be

11 11月, 2013 18 次提交

bcache: Bypass torture test · 5ceaaad7

由 Kent Overstreet 提交于 9月 10, 2013

More testing ftw! Also, now verify mode doesn't break if you read dirty
data.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

5ceaaad7

K
bcache: Fix sysfs splat on shutdown with flash only devs · c4d951dd
由 Kent Overstreet 提交于 8月 21, 2013
```
Whoops.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
```
c4d951dd

bcache: Better full stripe scanning · 48a915a8

由 Kent Overstreet 提交于 10月 31, 2013

The old scanning-by-stripe code burned too much CPU, this should be
better.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

48a915a8

K
bcache: Move spinlock into struct time_stats · 65d22e91
由 Kent Overstreet 提交于 7月 31, 2013
```
Minor cleanup.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
```
65d22e91

bcache: Kill sequential_merge option · 8aee1220

由 Kent Overstreet 提交于 7月 30, 2013

It never really made sense to expose this, so just kill it.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

8aee1220

bcache: Incremental gc · a1f0358b

由 Kent Overstreet 提交于 9月 10, 2013

Big garbage collection rewrite; now, garbage collection uses the same
mechanisms as used elsewhere for inserting/updating btree node pointers,
instead of rewriting interior btree nodes in place.

This makes the code significantly cleaner and less fragile, and means we
can now make garbage collection incremental - it doesn't have to hold a
write lock on the root of the btree for the entire duration of garbage
collection.

This means that there's less of a latency hit for doing garbage
collection, which means we can gc more frequently (and do a better job
of reclaiming from the cache), and we can coalesce across more btree
nodes (improving our space efficiency).
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

a1f0358b

bcache: Debug code improvements · 280481d0

由 Kent Overstreet 提交于 10月 24, 2013

Couple changes:
 * Consolidate bch_check_keys() and bch_check_key_order(), and move the
   checks that only check_key_order() could do to bch_btree_iter_next().

 * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in
   when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to
   flip on the EDEBUG checks at runtime.

 * Dropped an old not terribly useful check in rw_unlock(), and
   refactored/improved a some of the other debug code.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

280481d0

bcache: Pull on disk data structures out into a separate header · 81ab4190

由 Kent Overstreet 提交于 10月 31, 2013

Now, the on disk data structures are in a header that can be exported to
userspace - and having them all centralized is nice too.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

81ab4190

K
bcache: Move sector allocator to alloc.c · 2599b53b
由 Kent Overstreet 提交于 7月 24, 2013
```
Just reorganizing things a bit.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
```
2599b53b

bcache: Add btree_map() functions · 48dad8ba

由 Kent Overstreet 提交于 9月 10, 2013

Lots of stuff has been open coding its own btree traversal - which is
generally pretty simple code, but there are a few subtleties.

This adds new new functions, bch_btree_map_nodes() and
bch_btree_map_keys(), which do the traversal for you. Everything that's
open coding btree traversal now (with the exception of garbage
collection) is slowly going to be converted to these two functions;
being able to write other code at a higher level of abstraction  is a
big improvement w.r.t. overall code quality.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

48dad8ba

bcache: Convert writeback to a kthread · 5e6926da

由 Kent Overstreet 提交于 7月 24, 2013

This simplifies the writeback flow control quite a bit - previously, it
was conceptually two coroutines, refill_dirty() and read_dirty(). This
makes the code quite a bit more straightforward.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

5e6926da

bcache: Convert gc to a kthread · 72a44517

由 Kent Overstreet 提交于 10月 24, 2013

We needed a dedicated rescuer workqueue for gc anyways... and gc was
conceptually a dedicated thread, just one that wasn't running all the
time. Switch it to a dedicated thread to make the code a bit more
straightforward.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

72a44517

bcache: Convert bucket_wait to wait_queue_head_t · 35fcd848

由 Kent Overstreet 提交于 7月 24, 2013

At one point we did do fancy asynchronous waiting stuff with
bucket_wait, but that's all gone (and bucket_wait is used a lot less
than it used to be). So use the standard primitives.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

35fcd848

bcache: Convert try_wait to wait_queue_head_t · e8e1d468

由 Kent Overstreet 提交于 7月 24, 2013

We never waited on c->try_wait asynchronously, so just use the standard
primitives.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

e8e1d468

bcache: Convert btree_insert_check_key() to btree_insert_node() · e7c590eb

由 Kent Overstreet 提交于 9月 10, 2013

This was the main point of all this refactoring - now,
btree_insert_check_key() won't fail just because the leaf node happened
to be full.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

e7c590eb

bcache: Stripe size isn't necessarily a power of two · 2d679fc7

由 Kent Overstreet 提交于 8月 17, 2013

Originally I got this right... except that the divides didn't use
do_div(), which broke 32 bit kernels. When I went to fix that, I forgot
that the raid stripe size usually isn't a power of two... doh
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

2d679fc7

bcache: Add on error panic/unregister setting · 77c320eb

由 Kent Overstreet 提交于 7月 11, 2013

Works kind of like the ext4 setting, to panic or remount read only on
errors.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

77c320eb

bcache: Use blkdev_issue_discard() · 49b1212d

由 Kent Overstreet 提交于 7月 24, 2013

The old asynchronous discard code was really a relic from when all the
allocation code was asynchronous - now that allocation runs out of a
dedicated thread there's no point in keeping around all that complicated
machinery.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

49b1212d

25 9月, 2013 1 次提交

bcache: Fix a writeback performance regression · c2a4f318

由 Kent Overstreet 提交于 9月 23, 2013

Background writeback works by scanning the btree for dirty data and
adding those keys into a fixed size buffer, then for each dirty key in
the keybuf writing it to the backing device.

When read_dirty() finishes and it's time to scan for more dirty data, we
need to wait for the outstanding writeback IO to finish - they still
take up slots in the keybuf (so that foreground writes can check for
them to avoid races) - without that wait, we'll continually rescan when
we'll be able to add at most a key or two to the keybuf, and that takes
locks that starves foreground IO. Doh.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c2a4f318

12 7月, 2013 2 次提交

bcache: Allocation kthread fixes · 79826c35

由 Kent Overstreet 提交于 7月 10, 2013

The alloc kthread should've been using try_to_freeze() - and also there
was the potential for the alloc kthread to get woken up after it had
shut down, which would have been bad.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>

79826c35

bcache: Fix a sysfs splat on shutdown · c9502ea4

由 Kent Overstreet 提交于 7月 10, 2013

If we stopped a bcache device when we were already detaching (or
something like that), bcache_device_unlink() would try to remove a
symlink from sysfs that was already gone because the bcache dev kobject
had already been removed from sysfs.

So keep track of whether we've removed stuff from sysfs.
Signed-off-by: NKent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

c9502ea4

27 6月, 2013 7 次提交

bcache: Write out full stripes · 72c27061

由 Kent Overstreet 提交于 6月 05, 2013

Now that we're tracking dirty data per stripe, we can add two
optimizations for raid5/6:

 * If a stripe is already dirty, force writes to that stripe to
   writeback mode - to help build up full stripes of dirty data

 * When flushing dirty data, preferentially write out full stripes first
   if there are any.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

72c27061

bcache: Track dirty data by stripe · 279afbad

由 Kent Overstreet 提交于 6月 05, 2013

To make background writeback aware of raid5/6 stripes, we first need to
track the amount of dirty data within each stripe - we do this by
breaking up the existing sectors_dirty into per stripe atomic_ts
Signed-off-by: NKent Overstreet <koverstreet@google.com>

279afbad

bcache: Initialize sectors_dirty when attaching · 444fc0b6

由 Kent Overstreet 提交于 5月 11, 2013

Previously, dirty_data wouldn't get initialized until the first garbage
collection... which was a bit of a problem for background writeback (as
the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6
stripes.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

444fc0b6

bcache: Improve lazy sorting · 6ded34d1

由 Kent Overstreet 提交于 5月 11, 2013

The old lazy sorting code was kind of hacky - rewrite in a way that
mathematically makes more sense; the idea is that the size of the sets
of keys in a btree node should increase by a more or less fixed ratio
from smallest to biggest.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

6ded34d1

bcache: Fix/revamp tracepoints · c37511b8

由 Kent Overstreet 提交于 4月 26, 2013

The tracepoints were reworked to be more sensible, and fixed a null
pointer deref in one of the tracepoints.

Converted some of the pr_debug()s to tracepoints - this is partly a
performance optimization; it used to be that with DEBUG or
CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it
was changed to an empty inline function.

Some of the pr_debug() statements had rather expensive function calls as
part of the arguments, so this code was getting run unnecessarily even
on non debug kernels - in some fast paths, too.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

c37511b8

bcache: Refactor btree io · 57943511

由 Kent Overstreet 提交于 4月 25, 2013

The most significant change is that btree reads are now done
synchronously, instead of asynchronously and doing the post read stuff
from a workqueue.

This was originally done because we can't block on IO under
generic_make_request(). But - we already have a mechanism to punt cache
lookups to workqueue if needed, so if we just use that we don't have to
deal with the complexity of doing things asynchronously.

The main benefit is this makes the locking situation saner; we can hold
our write lock on the btree node until we're finished reading it, and we
don't need that btree_node_read_done() flag anymore.

Also, for writes, btree_write() was broken out into btree_node_write()
and btree_leaf_dirty() - the old code with the boolean argument was dumb
and confusing.

The prio_blocked mechanism was improved a bit too, now the only counter
is in struct btree_write, we don't mess with transfering a count from
struct btree anymore.

This required changing garbage collection to block prios at the start
and unblock when it finishes, which is cleaner than what it was doing
anyways (the old code had mostly the same effect, but was doing it in a
convoluted way)

And the btree iter btree_node_read_done() uses was converted to a real
mempool.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

57943511

bcache: Convert allocator thread to kthread · 119ba0f8

由 Kent Overstreet 提交于 4月 24, 2013

Using a workqueue when we just want a single thread is a bit silly.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

119ba0f8

15 5月, 2013 1 次提交

bcache: Fix error handling in init code · f59fce84

由 Kent Overstreet 提交于 5月 15, 2013

This code appears to have rotted... fix various bugs and do some
refactoring.
Signed-off-by: NKent Overstreet <koverstreet@google.com>

f59fce84

21 4月, 2013 1 次提交

bcache: Take data offset from the bdev superblock. · 2903381f

由 Kent Overstreet 提交于 4月 11, 2013

Add a new superblock version, and consolidate related defines.
Signed-off-by: NGabriel de Perthuis <g2p.code+bcache@gmail.com>
Signed-off-by: NKent Overstreet <koverstreet@google.com>

2903381f

29 3月, 2013 1 次提交

bcache: Don't export utility code, prefix with bch_ · 169ef1cf

由 Kent Overstreet 提交于 3月 28, 2013

Signed-off-by: NKent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

169ef1cf

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功