Commits · bb45185de2e90af63a7bc48855de6f870cc216fc · openanolis / cloud-kernel

29 Mar, 2013 7 commits

drbd: fix spurious warning about bitmap being locked from detach · bb45185d

Committed by Philipp Reisner on Mar 27, 2013

Introduced in "drbd: always write bitmap on detach",
the bitmap bulk writeout on detach was indicating
it expected exclusive bitmap access.

Where I meant to say: expect no more modifications,
but testing/counting is still allowed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: drop now useless duplicate state request from invalidate · 0b2dafcd

Committed by Philipp Reisner on Mar 27, 2013

Patch best viewed with git diff --ignore-space-change.

Now that we attempt the fallback to local bitmap operation
only when disconnected, we can safely drop the extra "silent"
state request from both invalidate and invalidate-remote.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: fix effective error returned when refusing an invalidate · 5c4f13d9

Committed by Philipp Reisner on Mar 27, 2013

Since commit
  drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
trying to invalidate a disconnected Primary returned an error code
that did not really match the situation:
"Refusing to be Outdated while Connected"

Insert two more specific conditions into is_valid_state(),
changing that to "Need access to UpToDate data",
respectively "Need a connection to start verify or resync".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: move invalidating the whole bitmap out of after_state_ch() · 9376d9f8

Committed by Philipp Reisner on Mar 27, 2013

To avoid other state change requests, after passing through
sanitize_state(), being mistaken for an invalidate,
move the "set all bits as out-of-sync" into the invalidate path.

Make invalidate and invalidate-remote behave consistently wrt.
current connection state (need either an established replication link,
or really be disconnected). Also mention that in the documentation.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: abort start of resync early, if it raced with connection breakage · a700471b

Committed by Philipp Reisner on Mar 27, 2013

We've seen a spurious full resync, because a connection breakage
raced with drbd_start_resync(, C_SYNC_TARGET),
and the resulting state change request intended to start the resync
ended up looking like a local invalidate.

Fix:
Double check the state inside the lock,
and don't even request that state change,
if we had connection or IO problems.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: reset ap_in_flight counter for new connections · 2d56a974

Committed by Philipp Reisner on Mar 27, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

idr: document exit conditions on idr_for_each_entry better · b949be58

Committed by George Spelvin on Mar 27, 2013

And some manual common subexpression elimination which may help the
compiler produce smaller code.
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

26 Mar, 2013 4 commits

bcache: Fix for the build fixes · 29177b89

Committed by Kent Overstreet on Mar 25, 2013

Commit 82a84eaf7e51ba3da0c36cbc401034a4e943492d left a return 0 in
closure_debug_init(). Whoops.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

aoe: get rid of cached bv variable in bufinit() · 2124469e

Committed by Jens Axboe on Mar 25, 2013

Less error prone if we just kill it; it's only used once
anyway.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bcache: Style/checkpatch fixes · b1a67b0f

Committed by Kent Overstreet on Mar 25, 2013

Took out some nested functions, and fixed some more checkpatch
complaints.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bcache: Build fixes from test robot · 07e86ccb

Committed by Kent Overstreet on Mar 25, 2013

config: make ARCH=i386 allmodconfig

All error/warnings:

   drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
>> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/debug.c: In function 'bch_pbtree':
>> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
>> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/closure.o: In function `closure_debug_init':
>> (.init.text+0x0): multiple definition of `init_module'
>> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

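The -Wformat warnings in the build log above come from printing a size_t with '%li' or '%lu', which only happens to match on targets where size_t is long. A minimal sketch of the portable alternative, using the standard '%zu' conversion; the helper name format_size() is hypothetical and is not taken from the bcache sources, and the actual patch may have chosen a different fix:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats a size_t with %zu, which matches size_t
 * on both 32-bit and 64-bit targets (unlike %li or %lu). */
static int format_size(char *buf, size_t buflen, size_t value)
{
	int n;

	assert(buf != NULL && buflen > 0);
	n = snprintf(buf, buflen, "size=%zu", value);

	/* snprintf returns the length it wanted to write; treat
	 * truncation or an output error as failure. */
	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return n;
}
```

On success the helper returns the number of characters written, and -1 if the buffer is too small.
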
25 Mar, 2013 1 commit

Merge branch 'bcache-for-upstream' of... · e226e341

Committed by Jens Axboe on Mar 24, 2013

Merge branch 'bcache-for-upstream' of http://evilpiepirate.org/git/linux-bcache into for-3.10/drivers

24 Mar, 2013 5 commits

bcache: A block layer cache · cafe5635

Committed by Kent Overstreet on Mar 23, 2013

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.
Signed-off-by: Kent Overstreet <koverstreet@google.com>

Export __lockdep_no_validate__ · ea6749c7

Committed by Kent Overstreet on Dec 27, 2012

Hack, but bcache needs a way around lockdep for locking during garbage
collection - we need to keep multiple btree nodes locked for coalescing
and rw_lock_nested() isn't really sufficient or appropriate here.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>

Export blk_fill_rwbs() · 9ca8f8e5

Committed by Kent Overstreet on Apr 13, 2012

Exported so it can be used by bcache's tracepoints.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>

Export get_random_int() · 1f8e8ed0

Committed by Kent Overstreet on Apr 09, 2012

Needed for bcache - need a cheap source of random numbers for perturbing
IO sizes, for rate limiting IO to the SSD.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: "Theodore Ts'o" <tytso@mit.edu>

Revert "rw_semaphore: remove up/down_read_non_owner" · 84759c6d

Committed by Kent Overstreet on Sep 21, 2011

This reverts commit 11b80f45.

Bcache needs rw semaphores for cache coherency in writeback mode -
writes have to take a read lock on a per cache device rw sem, and
release it when the bio completes.

But since this is for bios it's naturally not in the context of the
process that originally took the lock.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Christoph Hellwig <hch@infradead.org>
CC: David Howells <dhowells@redhat.com>

23 Mar, 2013 18 commits

drbd: adjust upper limit for activity log extents · 5bbcf5e6

Committed by Lars Ellenberg on Mar 19, 2013

Now that the on-disk activity-log ring buffer size is adjustable,
the maximum active set can become larger, and is now limited by
the use of 16bit "labels".

This increases the maximum working set from 6433 to 65534 extents,
each of which covers an area of 4MiB.
This means that if you use the maximum, you'd have to resync
more than 250 GiB after an unclean Primary shutdown.
With capable backend storage and replication links,
this is entirely feasible.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

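The "more than 250 GiB" figure in the commit message above follows directly from the numbers it gives: 65534 extents of 4 MiB each is 262136 MiB, just under 256 GiB. A small sketch of that arithmetic; the macro and function names are made up for illustration and do not come from the drbd sources:

```c
#include <assert.h>
#include <stdint.h>

#define AL_EXTENT_SIZE_MIB 4u      /* each extent covers 4 MiB (from the commit text) */
#define AL_EXTENTS_MAX     65534u  /* new upper limit (from the commit text) */

/* Worst-case area, in MiB, that may need a resync after an unclean
 * Primary shutdown when 'extents' activity log extents are in use. */
static uint64_t al_resync_mib(uint32_t extents)
{
	assert(extents <= AL_EXTENTS_MAX);
	return (uint64_t)extents * AL_EXTENT_SIZE_MIB;
}
```

With the old limit of 6433 extents the same calculation gives 25732 MiB, roughly 25 GiB.
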
drbd: try hard to max out the updates per AL transaction · 45ad07b3

Committed by Lars Ellenberg on Mar 19, 2013

There may have been more incoming requests while we were preparing
the current transaction. Try to consolidate more updates into this
transaction until we make no more progress.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: move start io accounting before activity log transaction · 7e8c288f

Committed by Lars Ellenberg on Mar 19, 2013

The IO accounting of the drbd "queue depth" was misleading.
We only started IO accounting once we had already written the activity log.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: consolidate as many updates as possible into one AL transaction · 08a1ddab

Committed by Lars Ellenberg on Mar 19, 2013

Depending on current IO depth, try to consolidate as many updates
as possible into one activity log transaction.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

lru_cache: introduce lc_get_cumulative() · cbe5e610

Committed by Lars Ellenberg on Mar 22, 2013

New helper to be able to consolidate more updates
into a single transaction.
Without this, we can only grab a single refcount
on an updated element while preparing a transaction.

lc_get_cumulative - like lc_get; also finds to-be-changed elements
  @lc: the lru cache to operate on
  @enr: the label to look up

  Unlike lc_get this also returns the element for @enr, if it belongs to
  a pending transaction, so the return values are like for lc_get(),
  plus:

  pointer to an element already on the "to_be_changed" list.
	  In this case, the cache was already marked %LC_DIRTY.

  Caller needs to make sure that the pending transaction is completed,
  before proceeding to actually use this element.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

Fixed up by Jens to export lc_get_cumulative().
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: queue writes on submitter thread, unless they pass the activity log fastpath · 779b3fe4

Committed by Lars Ellenberg on Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: split out some helper functions to drbd_al_begin_io · 6c3c4355

Committed by Lars Ellenberg on Mar 19, 2013

To make the code easier to follow,
use an explicit find_active_resync_extent(),
and add a "nonblock" parameter to _al_get().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: split drbd_al_begin_io into fastpath, prepare, and commit · b5bc8e08

Committed by Lars Ellenberg on Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: prepare to queue write requests on a submit worker · 113fef9e

Committed by Lars Ellenberg on Mar 22, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: split __drbd_make_request in before and after drbd_al_begin_io · 6d9febe2

Committed by Lars Ellenberg on Mar 19, 2013

This is in preparation to be able to defer requests that need to wait
for an activity log transaction to a submitter workqueue.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: drbd_al_begin_io: short circuit to reduce latency · ebfd5d8f

Committed by Lars Ellenberg on Mar 19, 2013

A request hitting an already "hot" extent should proceed right away,
even if some other requests need to wait for pending transactions.

Without that short-circuit, several simultaneous make_request contexts
race for committing the transaction, possibly penalizing the innocent.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: Clarify when activity log I/O is delegated to the worker thread · 56392d2f

Committed by Lars Ellenberg on Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: read meta data early, base on-disk offsets on super block · c04ccaa6

Committed by Lars Ellenberg on Mar 19, 2013

We used to calculate all on-disk meta data offsets, and then compare
the stored offsets, basically treating them as magic numbers.

Now with the activity log striping, the activity log size is no longer
fixed. We need to first read the super block, then base the activity
log and bitmap offsets on the stored offsets/al stripe settings.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: mechanically rename la_size to la_size_sect · cccac985

Committed by Lars Ellenberg on Mar 19, 2013

Make it obvious that this value is in units of 512-byte sectors.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: use the cached meta_dev_idx · 68e41a43

Committed by Lars Ellenberg on Mar 19, 2013

Now that we have the cached meta_dev_idx member,
we can get rid of a few rcu_read_lock() sections and rcu_dereference().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: prepare for new striped layout of activity log · 3a4d4eb3

Committed by Lars Ellenberg on Mar 19, 2013

Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k.
The intended use case is activity log on RAID 0 or similar.
Logically consecutive transactions will advance their on-disk position
by al_stripe_size_4k 4kB (transaction sized) blocks.

Right now, these are still asserted to be the backward compatible
values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB).

Also introduce a caching member for meta_dev_idx in the in-core
structure: even though it is initially passed in via the rcu-protected
disk_conf structure, it cannot change without a detach/attach cycle.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

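One way to read the striping description above is as a round-robin mapping: consecutive logical transaction slots land on consecutive stripes, so their on-disk positions differ by al_stripe_size_4k blocks. The sketch below illustrates that reading only. The function al_slot_to_block_4k() and the exact formula are assumptions made for illustration; the commit text does not spell out the real calculation.

```c
#include <assert.h>

/* Hypothetical mapping from a logical transaction slot to an on-disk
 * position, both counted in 4 kB blocks. Consecutive slots go to
 * consecutive stripes; each stripe holds al_stripe_size_4k blocks. */
static unsigned int al_slot_to_block_4k(unsigned int slot,
					unsigned int al_stripes,
					unsigned int al_stripe_size_4k)
{
	unsigned int total;

	assert(al_stripes > 0 && al_stripe_size_4k > 0);

	/* The log is a ring buffer: wrap around after the last slot. */
	total = al_stripes * al_stripe_size_4k;
	slot %= total;

	return (slot % al_stripes) * al_stripe_size_4k + slot / al_stripes;
}
```

With the backward compatible values al_stripes = 1 and al_stripe_size_4k = 8, this degenerates to a plain sequential ring of eight 4 kB blocks (32 kB).
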
drbd: cleanup ondisk meta data layout calculations and defines · ae8bf312

Committed by Lars Ellenberg on Mar 19, 2013

Add a comment about our meta data layout variants,
and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT)
to make it clear that they are shorthand for fixed constants,
and not to be arbitrarily redefined as one may see fit.

Properly pad struct meta_data_on_disk to 4kB,
and initialize to zero not only the first 512 bytes,
but all of it in drbd_md_sync().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: cleanup bogus assert message · 9114d795

Committed by Lars Ellenberg on Mar 19, 2013

This fixes ASSERT( mdev->state.disk == D_FAILED ) in drivers/block/drbd/drbd_main.c

When we detach from local disk, we let the local refcount hit zero twice.

First, we transition to D_FAILED, so we won't give out new references
to incoming requests; we still may give out *internal* references, though.
Once the refcount hits zero [1] while in D_FAILED, we queue a transition
to D_DISKLESS to our worker. We need to queue it, because we may be in
atomic context when putting the reference.
Once the transition to D_DISKLESS actually happened [2] from worker context,
we don't give out new internal references either.

Between hitting zero the first time [1] and the actual transition to
D_DISKLESS [2], there may be a few very short lived internal get/put,
so we may hit zero more than once while being in D_FAILED, or even see a
race where an internal get_ldev() happened while D_FAILED, but the
corresponding put_ldev() happens just after the transition to D_DISKLESS.

That's why we have the additional test_and_set_bit(GO_DISKLESS,);
and that's why the assert was placed wrong.
Since there was exactly one code path left to drbd_go_diskless(),
and that checks already for D_FAILED, drop that assert,
and fold in the drbd_queue_work().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

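The "hit zero more than once, but queue the transition only once" pattern described above can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the drbd code: struct ldev_ref and its helpers are hypothetical, atomic_flag_test_and_set() stands in for the kernel's test_and_set_bit(GO_DISKLESS, ...), and a plain counter stands in for drbd_queue_work(). The disk state check (D_FAILED) is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the local-disk reference state. */
struct ldev_ref {
	atomic_int  count;        /* outstanding references */
	atomic_flag go_diskless;  /* set once the transition has been queued */
	int         queued;       /* how often the transition was queued */
};

static void ldev_ref_init(struct ldev_ref *r, int initial)
{
	atomic_init(&r->count, initial);
	atomic_flag_clear(&r->go_diskless);
	r->queued = 0;
}

static void ldev_get(struct ldev_ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

static void ldev_put(struct ldev_ref *r)
{
	/* atomic_fetch_sub returns the previous value. */
	int prev = atomic_fetch_sub(&r->count, 1);

	assert(prev > 0);

	/* Only the first put that reaches zero queues the transition;
	 * later zero crossings find the flag already set. */
	if (prev == 1 && !atomic_flag_test_and_set(&r->go_diskless))
		r->queued++;  /* stands in for queueing the D_DISKLESS work */
}
```

A short-lived get/put pair after the first zero crossing reaches zero again, but the flag keeps the work from being queued a second time.
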
18 Mar, 2013 4 commits

Linux 3.9-rc3 · a937536b

Committed by Linus Torvalds on Mar 17, 2013

perf,x86: fix link failure for non-Intel configs · 6c4d3bc9

Committed by David Rientjes on Mar 17, 2013

Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:

	arch/x86/power/built-in.o: In function `restore_processor_state':
	(.text+0x45c): undefined reference to `perf_restore_debug_store'

Fix it by defining the dummy function appropriately.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

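The usual way to avoid this kind of link failure is to provide an empty inline stub when the config option that supplies the real function is off. A minimal sketch of that pattern follows. DEMO_CPU_SUP_INTEL and the demo_ function names are hypothetical stand-ins for CONFIG_CPU_SUP_INTEL, perf_restore_debug_store() and restore_processor_state(); the actual header layout in the kernel may differ.

```c
/* Hypothetical stand-in for CONFIG_CPU_SUP_INTEL: when it is defined,
 * the real implementation is expected to live in another object file. */
#ifdef DEMO_CPU_SUP_INTEL
void demo_restore_debug_store(void);
#else
/* Dummy version: callers still compile and link, but nothing happens. */
static inline void demo_restore_debug_store(void) { }
#endif

/* Caller that is built regardless of the config option. */
static int demo_restore_processor_state(void)
{
	demo_restore_debug_store();
	return 0;
}
```

Because the stub is static inline, the caller needs no #ifdef of its own and the compiler can drop the call entirely.
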
perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2

Committed by Linus Torvalds on Mar 17, 2013

Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") fixed a crash when doing PEBS performance profiling
after resuming, but in using init_debug_store_on_cpu() to restore the
DS_AREA MSR it also resulted in a new WARN_ON() triggering.

init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
cross-calls to do the MSR update. Which is not really valid at the
early resume stage, and the warning is quite reasonable. Now, it all
happens to _work_, for the simple reason that smp_call_function_single()
ends up just doing the call directly on the CPU when the CPU number
matches, but we really should just do the wrmsr() directly instead.

This duplicates the wrmsr() logic, but hopefully we can just remove the
wrmsr_on_cpu() version eventually.
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 08637024

Committed by Linus Torvalds on Mar 17, 2013

Pull btrfs fixes from Chris Mason:
 "Eric's rcu barrier patch fixes a long standing problem with our
  unmount code hanging on to devices in workqueue helpers.  Liu Bo
  nailed down a difficult assertion for in-memory extent mappings."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix warning of free_extent_map
  Btrfs: fix warning when creating snapshots
  Btrfs: return as soon as possible when edquot happens
  Btrfs: return EIO if we have extent tree corruption
  btrfs: use rcu_barrier() to wait for bdev puts at unmount
  Btrfs: remove btrfs_try_spin_lock
  Btrfs: get better concurrency for snapshot-aware defrag work


16 Mar, 2013 1 commit

Btrfs: fix warning of free_extent_map · 3b277594

Committed by Liu Bo on Mar 15, 2013

Users report that an extent map's list is still linked when it's actually
going to be freed from cache.

The story is that

a) when we're going to drop an extent map and may split this large one into
smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means
that it's on the list to be logged, then the smaller ems split from it will also
be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.

b) we'll keep ems from unlinking the list and freeing when they are flagged with
EXTENT_FLAG_LOGGING, because the log code holds one reference.

The end result is the warning, but the truth is that we set the flag
EXTENT_FLAG_LOGGING only during fsync.

So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.
Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
