Commits · ccae7868b0c5697508a541c531cf96b361d62c1c · openeuler / raspberrypi-kernel

30 Oct, 2012 7 commits

drbd: always write bitmap on detach · a2a3c74f

Authored by Lars Ellenberg on Sep 22, 2012

If we detach due to local read-error (which sets a bit in the bitmap),
stay Primary, and then re-attach (which re-reads the bitmap from disk),
we potentially lose the "out-of-sync" (or, "bad block") information in
the bitmap.

Always (try to) write out the changed bitmap pages before going diskless.

That way, we don't lose the bit for the bad block,
the next resync will fetch it from the peer, and rewrite
it locally, which may result in block reallocation in some
lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to)
mark the "we need a full sync" bit in our super block,
if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection),
and there actually are no bad blocks at all, so we don't need to
re-fetch from the peer, there is no "auto-healing" necessary.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
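
The fallback chain described in the "always write bitmap on detach" message can be summarized as a small decision function. The sketch below is plain userspace C written only for illustration: the enums, the function name, and the boolean inputs that stand in for real IO results are hypothetical and are not taken from the DRBD sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the error that triggered the detach. */
enum sketch_io_error { SKETCH_READ_ERROR, SKETCH_WRITE_ERROR };

/* Hypothetical outcomes of the "before going diskless" sequence. */
enum sketch_detach_outcome {
	SKETCH_BITMAP_SAVED,     /* changed bitmap pages reached the disk */
	SKETCH_COVERED_BY_AL,    /* write error: the activity log covers it */
	SKETCH_FULL_SYNC_MARKED, /* bitmap lost, "need full sync" bit saved */
	SKETCH_NOTHING_SAVED     /* neither bitmap nor superblock made it */
};

/*
 * bitmap_write_ok and superblock_write_ok stand in for the results of the
 * real IO attempts; in this sketch the caller simply passes them in.
 */
static enum sketch_detach_outcome
sketch_before_diskless(enum sketch_io_error err, bool bitmap_write_ok,
		       bool superblock_write_ok)
{
	assert(err == SKETCH_READ_ERROR || err == SKETCH_WRITE_ERROR);

	if (bitmap_write_ok)
		return SKETCH_BITMAP_SAVED;
	if (err == SKETCH_WRITE_ERROR)
		return SKETCH_COVERED_BY_AL;
	if (superblock_write_ok)
		return SKETCH_FULL_SYNC_MARKED;
	return SKETCH_NOTHING_SAVED;
}
```

Under these assumptions, a read error with a failed bitmap writeout but a successful superblock write ends in `SKETCH_FULL_SYNC_MARKED`, which corresponds to the "we need a full sync" case in the message.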

drbd: prepare for more than 32 bit flags · 06f10adb

Authored by Lars Ellenberg on Sep 22, 2012

 - struct drbd_conf { ... unsigned long flags; ... }
 + struct drbd_conf { ... unsigned long drbd_flags[N]; ... }

And introduce wrapper functions for test/set/clear bit operations
on this member.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
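
The idea of moving from a single flags word to an array with wrapper functions can be sketched in userspace C. This is an illustration under stated assumptions: `sketch_conf`, the `sketch_*_flag()` helpers, and the flag count of 40 are hypothetical, and the operations shown are plain (non-atomic) bit manipulation, whereas kernel code would be expected to use atomic bit primitives.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical flag count; anything above 32 no longer fits a 32-bit long. */
#define SKETCH_NR_FLAGS 40u
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define SKETCH_FLAG_WORDS \
	((SKETCH_NR_FLAGS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

struct sketch_conf {
	unsigned long drbd_flags[SKETCH_FLAG_WORDS];
};

/* Set flag number nr: pick the word, then the bit inside that word. */
static void sketch_set_flag(struct sketch_conf *c, unsigned int nr)
{
	assert(nr < SKETCH_NR_FLAGS);
	c->drbd_flags[nr / SKETCH_BITS_PER_LONG] |=
		1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Clear flag number nr. */
static void sketch_clear_flag(struct sketch_conf *c, unsigned int nr)
{
	assert(nr < SKETCH_NR_FLAGS);
	c->drbd_flags[nr / SKETCH_BITS_PER_LONG] &=
		~(1UL << (nr % SKETCH_BITS_PER_LONG));
}

/* Report whether flag number nr is currently set. */
static bool sketch_test_flag(const struct sketch_conf *c, unsigned int nr)
{
	assert(nr < SKETCH_NR_FLAGS);
	return (c->drbd_flags[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

Because callers only go through the wrappers, the number of words behind `drbd_flags` can grow later without touching any call site.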

drbd: wait for meta data IO completion even with failed disk, unless force-detached · 44edfb0d

Authored by Lars Ellenberg on Sep 27, 2012

The intention of force-detach is to be able to deal with a completely
unresponsive lower level IO stack, which does not even deliver error
completions anymore, but no completion at all.

In all other cases, we must still wait for the meta data IO completion.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: a few more GFP_KERNEL -> GFP_NOIO · 8b45a5c8

Authored by Lars Ellenberg on Sep 20, 2012

This has not yet been observed, but conceivably, when using GFP_KERNEL
allocations from drbd_md_sync(), drbd_flush_after_epoch() or
receive_SyncParam(), we could trigger additional IO to our own device,
or another device in a criss-cross setup, and end up in a local
deadlock, or potentially a distributed deadlock in a criss-cross setup
involving the peer blocked in a similar way waiting for us to make
progress.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: Avoid NetworkFailure state during disconnect · 599377ac

Authored by Philipp Reisner on Aug 17, 2012

Disconnecting is a cluster wide state change. In case the peer node agrees
to the state transition, it sends back the fact on the meta-data connection
and closes both sockets.

In case the node that initiated the state transfer sees the closing
action on the data-socket, before the P_STATE_CHG_REPLY packet, it was
going into one of the network failure states.

At least with the fencing option set to something other than "dont-care",
the unclean shutdown of the connection causes a short IO freeze or
a fence operation.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: introduce stop-sector to online verify · 02b91b55

Authored by Lars Ellenberg on Jun 28, 2012

We now can schedule only a specific range of sectors for online verify,
or interrupt a running verify without interrupting the connection.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

drbd: Protect accesses to the uuid set with a spinlock · 9f2247bb

Authored by Philipp Reisner on Aug 16, 2012

There is at least the worker context, the receiver context, the context of
receiving netlink packets and processes reading a sysfs attribute that access
the uuids.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

16 Aug, 2012 1 commit

drbd: Write all pages of the bitmap after an online resize · d1aa4d04

Authored by Philipp Reisner on Aug 08, 2012

We need to write the whole bitmap after we moved the meta data
due to an online resize operation.

With the support for one-petabyte devices, bitmap IO was optimized
to only write out touched pages. This optimization must be turned
off when writing the bitmap after an online resize.

This issue was introduced with drbd-8.3.10.

The impact of this bug is that after an online resize, the next
resync could become larger than expected.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
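
The difference between the optimized "touched pages only" writeout and the full writeout needed after a resize can be sketched as follows. This is a userspace illustration, not the DRBD implementation: the page count, the per-page `page_dirty` array, and `sketch_bm_write()` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, tiny bitmap: only the per-page "touched" state is modeled. */
#define SKETCH_BM_PAGES 8

struct sketch_bitmap {
	bool page_dirty[SKETCH_BM_PAGES]; /* touched since the last writeout */
};

/*
 * Returns how many pages would be submitted for writeout.
 * write_all = false: lazy mode, untouched pages are skipped.
 * write_all = true:  every page is written, e.g. after the meta data moved.
 */
static size_t sketch_bm_write(struct sketch_bitmap *bm, bool write_all)
{
	size_t i, submitted = 0;

	assert(bm);
	for (i = 0; i < SKETCH_BM_PAGES; i++) {
		if (!write_all && !bm->page_dirty[i])
			continue;
		bm->page_dirty[i] = false;
		submitted++;
	}
	return submitted;
}
```

After the meta data area has moved, the on-disk copy at the new location holds none of the old pages, so skipping "untouched" pages would leave stale content there; that is why the sketch needs the `write_all` switch.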

24 Jul, 2012 5 commits

drbd: fix max_bio_size to be unsigned · db141b2f

Authored by Lars Ellenberg on Jun 25, 2012

We capped our max_bio_size (and likewise max_hw_sectors) with
min_t(int, lower level limit, our limit);
unfortunately, some drivers, e.g. the kvm virtio block driver, initialize their
limits to "-1U", and that is of course a smaller "int" value than our limit.

Impact: we started to issue 16 MB resync requests,
which led to a protocol error and a reconnect loop.

Fix all relevant constants and parameters to be unsigned int.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
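
The signedness problem with `min_t(int, ...)` can be reproduced in a few lines of userspace C. The macro below is a simplified stand-in for the kernel's `min_t()`, and `SKETCH_OUR_LIMIT` is a hypothetical value picked for the example; the actual DRBD constants do not appear in this log. The conversion of `-1U` to `int` is implementation-defined in C and is assumed here to yield -1, as it does on common two's complement compilers.

```c
#include <assert.h>
#include <limits.h>

/* Simplified stand-in for min_t(): cast both operands, then compare. */
#define SKETCH_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical "our limit" in bytes. */
#define SKETCH_OUR_LIMIT (128u * 1024u)

/* Buggy variant: a lower-level limit of -1U turns into int -1 and "wins". */
static unsigned int sketch_cap_as_int(unsigned int lower_level_limit)
{
	assert(SKETCH_OUR_LIMIT <= (unsigned int)INT_MAX);
	return (unsigned int)SKETCH_MIN_T(int, lower_level_limit,
					  SKETCH_OUR_LIMIT);
}

/* Fixed variant: compare as unsigned, so -1U is simply a very large limit. */
static unsigned int sketch_cap_as_unsigned(unsigned int lower_level_limit)
{
	return SKETCH_MIN_T(unsigned int, lower_level_limit,
			    SKETCH_OUR_LIMIT);
}
```

With a lower-level limit of `-1U`, the `int` variant hands back the huge value unchanged, so the cap has no effect, while the unsigned variant returns `SKETCH_OUR_LIMIT`.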

drbd: flush drbd work queue before invalidate/invalidate remote · 7ee1fb93

Authored by Lars Ellenberg on Jun 19, 2012

If you do back to back wait-sync/invalidate on a Primary in a tight loop,
during application IO load, you could trigger a race:
  kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync'
	but 'write from resync_finished' still pending?

Fix this by changing the order of the drbd_queue_work() and
the wake_up() in dec_ap_pending(), and adding the additional
drbd_flush_workqueue() before requesting the full sync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: report congestion if we are waiting for some userland callback · c2ba686f

Authored by Lars Ellenberg on Jun 14, 2012

If the drbd worker thread is synchronously waiting for some userland
callback, we don't want some casual pageout to block on us.
Have drbd_congested() report congestion in that case.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: differentiate between normal and forced detach · 383606e0

Authored by Lars Ellenberg on Jun 14, 2012

Aborting local requests (not waiting for completion from the lower level
disk) is dangerous: if the master bio has been completed to upper
layers, data pages may be re-used for other things already.
If local IO is still pending and later completes,
this may cause crashes or corrupt unrelated data.

Only abort local IO if explicitly requested.
Intended use case is a lower level device that turned into a tarpit,
not completing io requests, not even doing error completion.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: cleanup, remove two unused global flags · d2645801

Authored by Lars Ellenberg on Jun 18, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

09 May, 2012 16 commits

drbd: introduce a bio_set to allocate housekeeping bios from · 9476f39d

Authored by Lars Ellenberg on Feb 23, 2011

Don't rely on availability of bios from the global fs_bio_set,
we should use our own bio_set for meta data IO.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: add page pool to be used for meta data IO · 4281808f

Authored by Lars Ellenberg on Feb 23, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: allow bitmap to change during writeout from resync_finished · 0e8488ad

Authored by Lars Ellenberg on Apr 25, 2012

Symptom: messages similar to
 "FIXME asender in bm_change_bits_to,
  bitmap locked for 'write from resync_finished' by worker"

If a resync or verify is finished (or aborted), a full bitmap writeout
is triggered.  If we have ongoing local IO, the bitmap may still change
during that writeout, pending and not yet processed acks may cause bits
to be cleared, while new writes may cause bits to be set.

To fix this, introduce the drbd_bm_write_copy_pages() variant.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: fix resend/resubmit of frozen IO · ba280c09

Authored by Lars Ellenberg on Apr 25, 2012

DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith),
or because we lost access to data (on-no-data-accessible suspend-io).

Resuming from there (re-connect, or re-attach, or explicit admin
intervention) should "just work".

Unfortunately, since the commit
  drbd: Implemented real timeout checking for request processing time
if the re-attach/re-connect did not happen within the timeout
(if so configured), the request_timer_fn() would time out and
detach/disconnect again virtually immediately.

This change tracks the most recent attach and connect, and does not
time out within <configured timeout interval> after attach/connect.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
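
The grace period after attach/connect described in the frozen-IO fix can be expressed as a small predicate. This is a hedged sketch in userspace C: the function name, its parameters, the use of abstract non-wrapping "ticks", and the convention that a timeout of 0 means "not configured" are all assumptions made for this example and are not taken from `request_timer_fn()`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper; all times are abstract ticks assumed not to wrap.
 * A request counts as timed out only if it is older than the timeout AND
 * the most recent attach/connect is at least one timeout interval ago.
 */
static bool sketch_request_timed_out(unsigned long now,
				     unsigned long req_start,
				     unsigned long last_attach_or_connect,
				     unsigned long timeout)
{
	assert(now >= req_start && now >= last_attach_or_connect);

	if (timeout == 0)
		return false; /* assumption: 0 means no timeout configured */
	if (now - last_attach_or_connect < timeout)
		return false; /* still inside the grace period */
	return now - req_start >= timeout;
}
```

A request that was frozen long ago therefore does not trigger a detach or disconnect right after a re-attach or re-connect; it gets one full timeout interval to complete first.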

drbd: Delay/reject other state changes while establishing a connection · 197296ff

Authored by Philipp Reisner on Mar 26, 2012

Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: remove unused static helper function · 7ffcaa71

Authored by Lars Ellenberg on Mar 08, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: remove some very outdated comments · a5d214f6

Authored by Lars Ellenberg on Mar 08, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: remove now unused seq_num member from struct drbd_request · 671a74e7

Authored by Lars Ellenberg on Mar 08, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Consider the disk-timeout also for meta-data IO operations · 7caacb69

Authored by Philipp Reisner on Dec 14, 2011

If the backing device is already frozen during attach, we failed
to recognize that. The current disk-timeout code works on top
of the drbd_request objects. During attach we do not allow IO
and therefore never generate a drbd_request object but block
before that in drbd_make_request().

This patch adds the timeout to all drbd_md_sync_page_io().

Before this patch we used to go from D_ATTACHING directly
to D_DISKLESS if IO failed during attach. We can no longer
do this since we have to stay in D_FAILED until all IO
ops issued to the backing device returned.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: send intermediate state change results to the peer · f479ea06

Authored by Lars Ellenberg on Oct 27, 2011

DRBD state changes schedule after_state_ch() actions to a worker thread,
which decides, based on the old and new states of that change, whether to send
an informational state update packet (P_STATE) to the peer.
If it decides to drbd_send_state(), it would however always send the
_current_ state, which, if a second state change happens before the
after_state_ch() of the first ran, may "fast-forward" the peer's view
about this node.  In most cases that is harmless, but sometimes this can
confuse DRBD, for example into not actually starting a necessary resync
if you do a very tight detach/attach loop on a Connected Secondary.

Fix this by always sending the "new" state of the respective state
transition which scheduled this after_state_ch() work.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
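
The "fast-forward" effect and its fix can be illustrated with a work item that carries the old and new state of its own transition. The following userspace C sketch uses a drastically simplified one-field state and hypothetical names (`sketch_state`, `sketch_after_state_work`, `sketch_state_to_send_*`); it is not the DRBD state machine.

```c
#include <assert.h>

/* Drastically simplified state: a single number for the disk state. */
struct sketch_state { int disk; };

struct sketch_dev { struct sketch_state cur; };

/* Work item scheduled per transition; it remembers that transition. */
struct sketch_after_state_work {
	struct sketch_state os; /* old state of this transition */
	struct sketch_state ns; /* new state of this transition */
};

/* Apply a state change and return the work item describing it. */
static struct sketch_after_state_work
sketch_change_state(struct sketch_dev *dev, struct sketch_state ns)
{
	struct sketch_after_state_work w;

	assert(dev);
	w.os = dev->cur;
	w.ns = ns;
	dev->cur = ns;
	return w;
}

/* Old behaviour: send whatever the device state is when the work runs. */
static struct sketch_state
sketch_state_to_send_old(const struct sketch_dev *dev,
			 const struct sketch_after_state_work *w)
{
	(void)w;
	return dev->cur;
}

/* Fixed behaviour: send the new state of the transition being processed. */
static struct sketch_state
sketch_state_to_send_fixed(const struct sketch_dev *dev,
			   const struct sketch_after_state_work *w)
{
	(void)dev;
	return w->ns;
}
```

If a detach is immediately followed by an attach before the first work item runs, the old variant reports the attached state twice and the peer never sees the intermediate detach; the fixed variant reports each transition's own result.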

drbd: Allow new IOs while the local disk is in FAILED state · 5ca1de03

Authored by Philipp Reisner on Jun 28, 2011

The last bunch of commits prepared the 'detach from tar pit' feature.
With that, we can stay in disk state FAILED for a long time. We need
to accept new IO requests during that time.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Implemented wait_until_done_or_disk_failure() · 0c464425

Authored by Philipp Reisner on Jun 26, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Replaced md_io_mutex by an atomic: md_io_in_use · e1711731

Authored by Philipp Reisner on Jun 27, 2011

The new function drbd_md_get_buffer() aborts waiting for the buffer
in case the disk fails in the meantime.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
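
Replacing a mutex with an atomic "in use" counter makes it possible to stop waiting when a second condition becomes true. The sketch below shows that pattern with C11 atomics in userspace; the struct, the `disk_failed` flag, and both helper functions are hypothetical, and the busy loop merely stands in for the sleeping wait a kernel implementation would presumably use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device: one "buffer in use" marker and one failure flag. */
struct sketch_mdev {
	atomic_int md_io_in_use;  /* 0 = buffer free, 1 = buffer taken */
	atomic_bool disk_failed;  /* set once the backing disk has failed */
};

/*
 * Try to take the meta data buffer. Returns true on success, or false if
 * the buffer is busy and the disk has failed in the meantime.
 */
static bool sketch_md_get_buffer(struct sketch_mdev *m)
{
	assert(m);
	for (;;) {
		int expected = 0;

		if (atomic_compare_exchange_strong(&m->md_io_in_use,
						   &expected, 1))
			return true;
		if (atomic_load(&m->disk_failed))
			return false;
		/* A real implementation would sleep here, not spin. */
	}
}

/* Release the buffer so the next caller can take it. */
static void sketch_md_put_buffer(struct sketch_mdev *m)
{
	assert(m);
	atomic_store(&m->md_io_in_use, 0);
}
```

With a plain mutex, a waiter would stay blocked for as long as the holder is stuck on a dead disk; with the atomic plus a failure check, the waiter can give up and report the failure instead.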

drbd: moved md_io into mdev · cc94c650

Authored by Philipp Reisner on Jun 26, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Keep a reference to barrier acked requests · 6d7e32f5

Authored by Philipp Reisner on Mar 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

DRBD: Fix comparison always false warning due to long/long long compare · 5f138ce0

Authored by David Howells on Jun 15, 2011

Fix warnings of the following nature in the drbd header:

In file included from drivers/block/drbd/drbd_bitmap.c:32:
drivers/block/drbd/drbd_int.h: In function 'drbd_get_syncer_progress':
drivers/block/drbd/drbd_int.h:2234: warning: comparison is always false due to limited range of data

where mdev->rs_total (an unsigned long) is being compared to 1ULL << 32, which
is always false on a 32-bit machine.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

13 Jan, 2012 1 commit

module_param: make bool parameters really bool (drivers & misc) · 90ab5ee9

Authored by Rusty Russell on Jan 13, 2012

module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

15 Sep, 2011 2 commits

Remove unneeded version.h includes from drivers/block/ · e5de0630

Authored by Jesper Juhl on Aug 01, 2011

It was pointed out by 'make versioncheck' that some includes of
linux/version.h are not needed in drivers/block/.
This patch removes them.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

drbd: Use angle brackets for system includes · 1d273b92

Authored by Joe Perches on Jun 03, 2011

Use the normal include style.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

12 Sep, 2011 1 commit

block: remove support for bio remapping from ->make_request · 5a7bbad2

Authored by Christoph Hellwig on Sep 12, 2011

There is very little benefit in letting a ->make_request
instance update the bio's device and sector and loop around it in
__generic_make_request when we can achieve the same through calling
generic_make_request from the driver and letting the loop in
generic_make_request handle it.

Note that various drivers got the return value from ->make_request wrong and
returned non-zero values for errors.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

24 May, 2011 6 commits

drbd: fix warning · 0ddf72be

Authored by Andrew Morton on May 23, 2011

In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type

Forward declarations of enums do not work.

Fix it unpleasantly by moving the prototype.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

drbd: Fix spelling · 24c4830c

Authored by Bart Van Assche on May 21, 2011

Found these with the help of ispell -l.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

drbd: fix schedule in atomic · 9a0d9d03

Authored by Lars Ellenberg on May 02, 2011

An administrative detach used to request a state change directly to D_DISKLESS,
first suspending IO to avoid the last put_ldev() occurring from an endio handler,
potentially in irq context.

This is not enough on the receiving side (typically secondary), we may miss
some peer_req on the way to local disk, which then may do the last put_ldev()
from their drbd_peer_request_endio().

This patch makes the detach always go through the intermediate D_FAILED state.
We may consider renaming it to D_DETACHING.

An alternative approach would be to create yet another work item to be scheduled
on the worker, do the destructor work from there, and get the timing right.

manually picked commit 564040f from the drbd 8.4 branch.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Take a more conservative approach when deciding max_bio_size · 99432fcc

Authored by Philipp Reisner on May 20, 2011

The old (optimistic) implementation could shrink the bio size
on a primary device.

Shrinking the bio size on a primary device is bad, since
we might still get BIOs with the old (bigger) size shortly after
we published the new size.

The new implementation is more conservative, and eventually
increases the max_bio_size on a primary device (which is valid).
It does so, when it knows the local limit AND the remote limit.

 We cache the last seen max_bio_size of the peer in the meta
 data, and rely on that, to make the operation of single
 nodes more efficient.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Only downgrade the disk state in case of disk failures · d2e17807

Authored by Philipp Reisner on Mar 14, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

drbd: Fix for application IO with the on-io-error=pass-on policy · 738a84b2

Authored by Philipp Reisner on Mar 03, 2011

In case a write fails on the local disk, go into D_INCONSISTENT
disk state. That causes future reads of that block to be shipped
to the peer.

Read retry remote was already in place.

Actually, the documentation needs to get fixed now, since the
application is still shielded from the error (as long as we have
only a single disk failing). The difference to detach is that
we keep the disk, and therefore might keep all the other, still
working sectors up to date.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

23 May, 2011 1 commit

Add appropriate <linux/prefetch.h> include for prefetch users · 70c71606

Authored by Paul Gortmaker on May 22, 2011

After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout.  Give
them an explicit include via the following $0.02 script.

 =========================================
 #!/bin/bash
 MANUAL=""
 for i in `git grep -l 'prefetch(.*)' .` ; do
 	grep -q '<linux/prefetch.h>' $i
 	if [ $? = 0 ] ; then
 		continue
 	fi

 	(	echo '?^#include <linux/?a'
 		echo '#include <linux/prefetch.h>'
 		echo .
 		echo w
 		echo q
 	) | ed -s $i > /dev/null 2>&1
 	if [ $? != 0 ]; then
 		echo $i needs manual fixup
 		MANUAL="$i $MANUAL"
 	fi
 done
 echo ------------------- 8\<----------------------
 echo vi $MANUAL
 =========================================
Signed-off-by: Paul <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some
  non-network drivers and the fib_trie.c case    - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>