提交 · 5de738272e38f7051c7a44c42631b71a0e2a1e80 · openeuler / raspberrypi-kernel

09 5月, 2012 40 次提交

drbd: Ensure that data_size is not 0 before using data_size-1 as index · 5de73827

由 Philipp Reisner 提交于 3月 28, 2012

This could be exploited by a peer which runs modified code.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

5de73827

drbd: Delay/reject other state changes while establishing a connection · 197296ff

由 Philipp Reisner 提交于 3月 26, 2012

Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

197296ff

drbd: move put_ldev from __req_mod() to the endio callback · 46385c84

由 Lars Ellenberg 提交于 1月 16, 2012

One invocation in the endio handler is good enough,
we don't need mention it for each of the different ways
it calls __req_mod().
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

46385c84

drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE · d64957c9

由 Lars Ellenberg 提交于 3月 23, 2012

Just because this request happened during a resync does
not mean it may pretend to have been barrier-acked.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

d64957c9

drbd: fix READ_RETRY_REMOTE_CANCELED to not complete if device is suspended · 41c4a003

由 Lars Ellenberg 提交于 1月 11, 2012

READ_RETRY_REMOTE_CANCELED needs to be grouped with the other _CANCELED
cases, not with CONNECTION_LOST_WHILE_PENDING, as that would complete
(fail) the bio even if the device became suspended.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

41c4a003

drbd: make OOS_HANDED_TO_NETWORK its own case · 6d49e101

由 Lars Ellenberg 提交于 1月 11, 2012

OOS_HANDED_TO_NETWORK should not be grouped with the various
*_CANCELED/*_FAILED cases.
Also, not only clear the RQ_NET_QUEUED flag, but also mark it RQ_NET_DONE,
so it can be distinguished from a local-only request even after that.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

6d49e101

drbd: don't pretend that barrier_nr == 0 was special · c088b2d9

由 Lars Ellenberg 提交于 3月 23, 2012

We used to have a barrier implementation where barrier_nr 0 was
reserved. That is long gone. Just use the full sequence space.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

c088b2d9

drbd: remove unused static helper function · 7ffcaa71

由 Lars Ellenberg 提交于 3月 08, 2012

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

7ffcaa71

drbd: remove some very outdated comments · a5d214f6

由 Lars Ellenberg 提交于 3月 08, 2012

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

a5d214f6

drbd: missing wakeup after drbd_rs_del_all · 1abc2af2

由 Lars Ellenberg 提交于 3月 08, 2012

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

1abc2af2

drbd: remove now unused seq_num member from struct drbd_request · 671a74e7

由 Lars Ellenberg 提交于 3月 08, 2012

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

671a74e7

drbd: fix potential data corruption and protocol error · 001a8868

由 Lars Ellenberg 提交于 3月 08, 2012

We assumed only bios with bi_idx == 0 would end up
in drbd_make_request().

That is wrong.

At least device mapper, in __clone_and_map(), may submit
clones only covering a partial bio, but sharing
the original bvec, by adjusting bi_idx and relevant
other bio members of the clone.

We used __bio_for_each_segment() in various places,
even though that is documented as
 * drivers should not use the __ version unless they _really_ want to
 * run through the entire bio and not just pending pieces

Impact: we would send the full bio bvec, even for the clone
with bi_idx > 0, which will cause data corruption on the
peer (because we submit wrong data at the clone offset),
and will cause a DRBD protocol error, disconnect/reconnect
and resync (thus fixing the corruption),
because the next package header would be expected right
in the middle of the sent data, causing DRBD magic mismatch.

Fix: drop the assert, and use bio_for_each_segment()
instead of the __ version.

Conflicts:

	drbd/drbd_tracing.c
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

001a8868

drbd: Fix a potential write ordering issue on SyncTarget nodes · b6a370ba

由 Philipp Reisner 提交于 2月 19, 2012

If a SyncTarget node gets a P_RS_DATA_REPLY before a P_DATA packet
for the same sector, it simply submits these two IO requests.

  This is be possible because on the SyncSource node, the data of the
  P_RS_DATA_REPLY packet was read from disk.  Immediately after that a
  write request from upper layers came in.

The disk scheduler or even the "hardware" queues on the disk drive might
reorder these writes.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

b6a370ba

drbd: Fix a potential race that could case data inconsistency · fc28845b

由 Philipp Reisner 提交于 1月 16, 2012

When we have a write request and a state change C_WF_BITMAP_S -> C_SYNC_SOURCE
at the same time, and it happens that the line

	remote = remote && drbd_should_do_remote(s);

stills sees C_WF_BITMAP_S, and

	send_oos = rw == WRITE && drbd_should_send_oos(s);

already sees C_SYNC_SOURCE both are 0.

This causes the write to not be mirrored, but marked as out-of-sync on the
Sync_Source node.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

fc28845b

drbd: add missing part_round_stats to _drbd_start_io_acct · 031a7c17

由 Lars Ellenberg 提交于 1月 21, 2012

Without this, iostat frequently sees bogus svctime and >= 100% "utilization".
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

031a7c17

drbd: Fix module refcount leak in drbd_accept() · 47a4f1c1

由 Lars Ellenberg 提交于 1月 12, 2012

drbd_accept was modelled after kernel_accept
with drbd commit 53eb779 in July 2008.

Only, kernel_accept was then broken, and only fixed later
with kernel commit 1b08534e in Dec 2008:
net: Fix module refcount leak in kernel_accept()

Impact: protocol families provided as modules, e.g. ipv6 or ib_sdp,
would soon have their reference count become negative, preventing
them from being unloaded (likely), or worse, hit zero without actually
being unused, allowing them to be unloaded while still in use (unlikely,
but if triggered, causing a kernel crash).
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

47a4f1c1

drbd: Consider the disk-timeout also for meta-data IO operations · 7caacb69

由 Philipp Reisner 提交于 12月 14, 2011

If the backing device is already frozen during attach, we failed
to recognize that. The current disk-timeout code works on top
of the drbd_request objects. During attach we do not allow IO
and therefore never generate a drbd_request object but block
before that in drbd_make_request().

This patch adds the timeout to all drbd_md_sync_page_io().

Before this patch we used to go from D_ATTACHING directly
to D_DISKLESS if IO failed during attach. We can no longer
do this since we have to stay in D_FAILED until all IO
ops issued to the backing device returned.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

7caacb69

drbd: Do not send state packets while lower than C_CONNECTED cstate · 4afc433c

由 Philipp Reisner 提交于 12月 13, 2011

I.e. in C_WF_REPORT_PARAMS or in C_WF_CONNECTION.
Sending may already work in these cstates, but the peer still expects
the HandShake / ConnectionFeatures packet.

Actually triggered by the Testuite on kugel.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

4afc433c

drbd: fix race between disconnect and receive_state · 545752d5

由 Lars Ellenberg 提交于 12月 05, 2011

If the asender thread, or request_timer_fn(), or some other part of
the code, decided to drop the connection (because of timeout or other),
but the receiver just now was processing a P_STATE packet, there was a
chance that receive_state() would do a hard state change
"re-establishing" an already failed connection without additional handshake.

Log excerpt:
  Remote failed to finish a request within ko-count * timeout
  peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
  asender terminated
  ...
  peer( Unknown -> Secondary ) conn( Timeout -> Connected ) pdsk( DUnknown -> UpToDate ) peer_isp( 0 -> 1 )
  ...
  Connection closed
  peer( Secondary -> Unknown ) conn( Connected -> Unconnected ) pdsk( UpToDate -> DUnknown ) peer_isp( 1 -> 0 )
  receiver terminated

Impact:
while the connection state is erroneously "Connected",
requests may be queued and even sent,
which would never be acknowledged,
and may have been missed by the cleanup.
These requests would never be completed.

The next drbd_suspend_io() will then lock up,
waiting forever for these requests to complete.

Fixed in several code paths:
  Make sure the connection state is NetworkFailure or worse
  before starting the cleanup in drbd_disconnect().
  This should make sure the cleanup won't miss any requests.

  Disallow receive_state() to "upgrade" the connection state
  from an error state. This will make sure the "illegal" state
  transition won't happen.

  For all connection failure states,
  relax the safe-guard in sanitize_state() again
  to silently mask out those state changes
  (e.g. Timeout -> Connected becomes Timeout -> Timeout).
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

545752d5

drbd: fix potential spinlock deadlock · 763eb636

由 Lars Ellenberg 提交于 11月 02, 2011

drbd_try_clear_on_disk_bm() has a sanity check for the number of blocks
left to be resynced (rs_left) in the current resync extent.
If it detects a mismatch, it complains, and forces a disconnect using
drbd_force_state(mdev, NS(conn, C_DISCONNECTING));

Unfortunately, this may be called while holding the req_lock,
and drbd_force_state() want's to aquire that lock itself. Deadlock.

Don't force a disconnect, but fix up rs_left by recounting and
reassigning the number of dirty blocks in that extent.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

763eb636

drbd: Fixed an obvious copy-n-paste mistake · e89868a0

由 Philipp Reisner 提交于 11月 09, 2011

This bug might have caused troubles if disk-barriers and the ahead-behind
more are enabled at the same time.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

e89868a0

drbd: send intermediate state change results to the peer · f479ea06

由 Lars Ellenberg 提交于 10月 27, 2011

DRBD state changes schedule after_state_ch() actions to a worker thread,
which decides on the old and new states of that change, whether to send
an informational state update packet (P_STATE) to the peer.
If it decides to drbd_send_state(), it would however always send the
_curent_ state, which, if a second state change happens before the
after_state_ch() of the first ran, may "fast-forward" the peer's view
about this node.  In most cases that is harmless, but sometimes this can
confuse DRBD, for example into not actually starting a necessary resync
if you do a very tight detach/attach loop on a Connected Secondary.

Fix this by always sending the "new" state of the respective state
transition which scheduled this after_state_ch() work.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

f479ea06

drbd: fix spurious meta data IO "error" · a2e91381

由 Lars Ellenberg 提交于 10月 06, 2011

When detaching, even cleanly detaching due to administrator request,
we always go through D_FAILED before we become D_DISKLESS.

Don't let that state change race with an in-flight meta data IO,
or that one might think it actually experienced an IO error.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

a2e91381

drbd: Fixed a race condition between detach and start of resync · aaae506d

由 Philipp Reisner 提交于 10月 06, 2011

drbd_state_lock() is only there to serialize cluster wide state
changes. Testing the local disk state needs to happen while
holding the global_state_lock.

Otherwise you might see something like this (Oct 6 on kugel)
14:20:24 drbd0: conn( WFSyncUUID -> Connected ) disk( Inconsistent -> Failed )
14:20:24 drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
14:20:24 drbd0: conn( Connected -> SyncTarget ) disk( Failed -> Inconsistent )
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

aaae506d

drbd: fix harmless race to not trigger an ASSERT · 6a9a92f4

由 Lars Ellenberg 提交于 10月 06, 2011

We have one pre-allocated page to do certain synchronous meta data IO with,
using it is serialized like so:
	drbd_md_get_buffer();
	drbd_md_sync_page_io();
	drbd_md_sync_page_io();
	...
	drbd_md_put_buffer();

In drbd_md_sync_page_io() there is an
	ASSERT(atomic_read(&mdev->md_io_in_use) == 1);

We want to be able to timeout on unresponsive lower level devices, so we
can "detach" in that case. Inside drbd_md_sync_page_io() we grab an extra
reference, to not have a dangling pointer in case a delayed IO eventually
does still complete, even after we "detached" already.

We need to put the extra reference before we signal completion from the
completion handler, or the second drbd_md_sync_page_io() above may
trigger the assert (reference count still 2).
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

6a9a92f4

P
drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero · 5ba3dac5
由 Philipp Reisner 提交于 10月 05, 2011
```
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
```
5ba3dac5

drbd: drbd_nl_resize(): Fix missing put_ldev() on error path · 7b4e4d31

由 Andreas Gruenbacher 提交于 9月 28, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

7b4e4d31

drbd: fix "stalled" empty resync · 40424e4a

由 Lars Ellenberg 提交于 9月 26, 2011

With sync-after dependencies, given "lucky" timing of pause/unpause
events, and the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.

Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

40424e4a

drbd: Bugfix for the connection behavior · 1e86ac48

由 Philipp Reisner 提交于 8月 04, 2011

If we get into the C_BROKEN_PIPE cstate once, the state engine set the
thi->t_state of the receiver thread to restarting.  But with the while loop
in drbdd_init() a new connection gets established. After the call into
drbdd() returns immediately since the thi->t_state is not RUNNING.  The
restart of drbd_init() then resets thi->t_state to RUNNING.

I.e. after entering C_BROKEN_PIPE once, the next successful established
connection gets wasted.

The two parts of the fix:
  * Do not cause the thread to restart if we detect the issue
    with the sockets while we are in C_WF_CONNECTION.

  * Make sure that all actions that would have set us to C_BROKEN_PIPE
    happen before the state change to C_WF_REPORT_PARAMS.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

1e86ac48

drbd: Cleanup all epoch objects upon connection loss · 80f9fd55

由 Philipp Reisner 提交于 7月 18, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

80f9fd55

drbd: detach must not try to abort non-local requests from drbd-8.4 · fd2491f4

由 Philipp Reisner 提交于 7月 18, 2011

Cherry picked form 8.4
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

fd2491f4

drbd: Consider that the no-data-condition could be in connected state · 79f16f5d

由 Philipp Reisner 提交于 7月 15, 2011

...when the peer has inconsistent data. In that case we failed to
clear the susp_nod flag. When the local disk was attached again
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

79f16f5d

drbd: Fixed current UUID generation · bca482e9

由 Philipp Reisner 提交于 7月 15, 2011

Now, the new edition of the clause only fires if a diskless
peer gets promoted.

This is a fixup for "drbd: Delayed creation of current-UUID".
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

bca482e9

drbd: change some GFP_KERNEL to GFP_NOIO · 22f46ce2

由 Lars Ellenberg 提交于 7月 11, 2011

Bitmap IO may happend in the context of an application write,
in the generic block IO path.  We need to use GFP_NOIO.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

22f46ce2

drbd: Implemented the disk-timeout option · dfa8bedb

由 Philipp Reisner 提交于 6月 29, 2011

When the disk-timeout is active, and it expires for a single request,
we consider the local disk as D_FAILED. Note: With this change,
I made both timeout based state transitions HARD state transitions.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

dfa8bedb

drbd: Force flag for the detach operation · 02ee8f95

由 Philipp Reisner 提交于 3月 14, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

02ee8f95

drbd: Allow new IOs while the local disk in in FAILED state · 5ca1de03

由 Philipp Reisner 提交于 6月 28, 2011

The last bunch of commits prepared the 'detach from tar pit' feature.
With that we can be for long time in disk state FAILED. We need
to accept new IO requests during that time.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

5ca1de03

P
drbd: Bitmap IO functions can now return prematurely if the disk breaks · 9e58c4da
由 Philipp Reisner 提交于 6月 27, 2011
```
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
```
9e58c4da

drbd: Added a kref to bm_aio_ctx · d1f3779b

由 Philipp Reisner 提交于 6月 27, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

d1f3779b

drbd: Hold a reference to ldev while doing meta-data IO · b2057629

由 Philipp Reisner 提交于 6月 27, 2011

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

b2057629