Commits · 24c4830c8ec3cbc904d84c213126a35f41a4e455 · openeuler / Kernel

24 May, 2011 11 commits

Committed by Bart Van Assche on May 21, 2011

Found these with the help of ispell -l.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

24c4830c

drbd: fix schedule in atomic · 9a0d9d03

Committed by Lars Ellenberg on May 02, 2011

An administrative detach used to request a state change directly to D_DISKLESS,
first suspending IO to avoid the last put_ldev() occurring from an endio handler,
potentially in irq context.

This is not enough on the receiving side (typically secondary): we may miss
some peer_req on the way to local disk, which then may do the last put_ldev()
from their drbd_peer_request_endio().

This patch makes the detach always go through the intermediate D_FAILED state.
We may consider renaming it to D_DETACHING.

An alternative approach would be to create yet another work item to be scheduled
on the worker, do the destructor work from there, and get the timing right.

Manually picked commit 564040f from the drbd 8.4 branch.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9a0d9d03

drbd: Take a more conservative approach when deciding max_bio_size · 99432fcc

Committed by Philipp Reisner on May 20, 2011

The old (optimistic) implementation could shrink the bio size
on a primary device.

Shrinking the bio size on a primary device is bad, since
we might still get BIOs with the old (bigger) size shortly after
we published the new size.

The new implementation is more conservative, and eventually
increases the max_bio_size on a primary device (which is valid).
It does so when it knows the local limit AND the remote limit.

We cache the last seen max_bio_size of the peer in the meta
data, and rely on that to make the operation of single
nodes more efficient.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

99432fcc
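
The rule described in 99432fcc can be illustrated with a small userspace sketch. This is not DRBD's code: the function name, its parameters, and the convention that a limit of 0 means "not known yet" are all assumptions made for the example. It only shows the idea that the published size may grow on a primary once both limits are known, but must not shrink there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not taken from DRBD: choose the max_bio_size to
 * publish. A limit of 0 is assumed to mean "not known yet". */
static unsigned int pick_max_bio_size(unsigned int current_size,
                                      unsigned int local_limit,
                                      unsigned int peer_limit,
                                      bool is_primary)
{
    unsigned int candidate;

    assert(current_size > 0);

    /* Assumption: without both limits, keep what is already published. */
    if (local_limit == 0 || peer_limit == 0)
        return current_size;

    /* The usable size is bounded by the smaller of the two limits. */
    candidate = local_limit < peer_limit ? local_limit : peer_limit;

    /* On a primary, BIOs of the old size may still arrive after a new
     * size is published, so growing is fine but shrinking is not. */
    if (is_primary && candidate < current_size)
        return current_size;

    return candidate;
}
```

With these assumptions, a primary that starts at 4096 and later learns a peer limit of 65536 would move up to 65536, while a primary already at 131072 would stay there.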

drbd: Fixed state transitions after async outdate-peer-handler returned · 21423fa7

Committed by Philipp Reisner on May 17, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

21423fa7

drbd: Disallow the peer_disk_state to be D_OUTDATED while connected · fa7d9396

Committed by Philipp Reisner on May 17, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

fa7d9396

drbd: Fix for the connection problems on high latency links · a8e40792

Committed by Philipp Reisner on May 13, 2011

It seems that the real cause of all the issues was that
we did not notice in drbd_try_connect() when the other
guy closes one socket if the round trip time gets higher
than 100ms. That 100ms was hard coded!
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

a8e40792

drbd: fix potential activity log refcount imbalance in error path · 76727f68

Committed by Lars Ellenberg on May 16, 2011

It is no longer sufficient to trigger on local WRITE,
we need to check on (rq_state & RQ_IN_ACT_LOG)
before calling drbd_al_complete_io also in the error path.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

76727f68
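
The refcount imbalance fixed in 76727f68 can be modelled in a few lines of userspace C. Only `rq_state` and `RQ_IN_ACT_LOG` come from the commit message; the bit values, `RQ_SKETCH_WRITE`, `struct al_sketch`, and both functions are hypothetical stand-ins, with a plain counter in place of the activity log.

```c
#include <assert.h>

#define RQ_SKETCH_WRITE (1u << 0) /* hypothetical: request is a local write */
#define RQ_IN_ACT_LOG   (1u << 1) /* name from the commit; bit value assumed */

/* Stand-in for the activity log: just a reference counter. */
struct al_sketch {
    int refs;
};

/* Take a reference and remember on the request that we did so. */
static void al_sketch_begin_io(struct al_sketch *al, unsigned int *rq_state)
{
    al->refs++;
    *rq_state |= RQ_IN_ACT_LOG;
}

/* Completion (success or error path): release only what was taken.
 * Keying on "is a write" instead would drop a reference that a write
 * failing before al_sketch_begin_io() never acquired. */
static void request_done_sketch(struct al_sketch *al, unsigned int rq_state)
{
    if (rq_state & RQ_IN_ACT_LOG) {
        assert(al->refs > 0);
        al->refs--;
    }
}
```

A write that fails before it enters the log carries no `RQ_IN_ACT_LOG` bit, so completing it leaves the counter untouched.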

drbd: Only downgrade the disk state in case of disk failures · d2e17807

Committed by Philipp Reisner on Mar 14, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d2e17807

drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int · f36af18c

Committed by Lars Ellenberg on Mar 09, 2011

If there is no replication traffic within the idle timeout
(ping-int seconds), DRBD will send a P_PING,
and adjust the timeout to ping-timeout.

If there is no P_PING_ACK received within this ping-timeout,
DRBD finally drops the connection, and tries to re-establish it.

To decide which timeout was active, we compared the current timeout
with the ping-timeout, and dropped the connection, if that was the case.

By default, ping-int is 10 seconds, ping-timeout is 500 ms.

Unfortunately, if you configure ping-timeout to be the same as ping-int,
expiry of the idle-timeout had been mistaken for a missing ping ack,
and caused an immediate reconnection attempt.

Fix:
Allow both timeouts to be equal, use a local variable
to store which timeout is active.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

f36af18c
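
The bug and fix in f36af18c can be reproduced with a minimal userspace model. All names below are hypothetical and the timeouts are plain numbers in a common unit; the sketch only contrasts "compare the armed timeout with ping-timeout" against "remember in a flag which timeout is active".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rx_action { RX_SEND_PING, RX_DROP_CONNECTION };

/* Hypothetical receiver state, all times in the same unit. */
struct rx_timeouts {
    long ping_int;            /* idle timeout */
    long ping_timeout;        /* time allowed for the ping ack */
    long armed;               /* timeout currently set on the socket */
    bool ping_timeout_active; /* the flag introduced by the fix */
};

/* Old logic: guess the active timeout from its value. */
static enum rx_action on_timeout_by_value(struct rx_timeouts *t)
{
    assert(t != NULL);
    if (t->armed == t->ping_timeout)
        return RX_DROP_CONNECTION;
    t->armed = t->ping_timeout;
    return RX_SEND_PING;
}

/* Fixed logic: track the active timeout explicitly. */
static enum rx_action on_timeout_by_flag(struct rx_timeouts *t)
{
    assert(t != NULL);
    if (t->ping_timeout_active)
        return RX_DROP_CONNECTION;
    t->ping_timeout_active = true;
    t->armed = t->ping_timeout;
    return RX_SEND_PING;
}
```

When `ping_int` equals `ping_timeout`, the value-based check drops the connection on the first idle expiry, while the flag-based check first sends a ping and only drops the connection if the second timeout also expires.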

drbd: fix potential distributed deadlock · 53ea4331

Committed by Lars Ellenberg on Mar 08, 2011

We limit ourselves to a configurable maximum number of pages used as
temporary bio pages.

If the configured "max_buffers" is not big enough to match the bandwidth
of the respective deployment, a distributed deadlock could be triggered
by e.g. fast online verify and heavy application IO.

TCP connections would block on congestion, because both receivers
would wait on pages to become available.

Fortunately the respective senders in this case would be able to give
back some pages already. So do that.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

53ea4331

drbd: Fix for application IO with the on-io-error=pass-on policy · 738a84b2

Committed by Philipp Reisner on Mar 03, 2011

In case a write fails on the local disk, go into D_INCONSISTENT
disk state. That causes future reads of that block to be shipped
to the peer.

Read retry remote was already in place.

Actually the documentation needs to get fixed now, since the
application is still shielded from the error (as long as we have
only a single disk failing). The difference to detach is that
we keep the disk, and therefore might keep all the other, still
working sectors up to date.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

738a84b2

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Committed by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

28 Mar, 2011 1 commit

drbd: fix up merge error · 7e599e6e

Committed by Linus Torvalds on Mar 28, 2011

In commit 95a0f10c ("drbd: store in-core bitmap little endian,
regardless of architecture") drbd had made the sane choice to use
little-endian bitmap functions everywhere.  However, it used the
horrible old functions names from <asm-generic/bitops/le.h>, that were
never really meant to be exported.

In the meantime, things got cleaned up, and in commit c4945b9e
("asm-generic: rename generic little-endian bitops functions") we
renamed the LE bitops to something sane, exactly so that they could be
used in random code without people gouging their eyes out when seeing
the crazy jumble of letters that were the old internal names.

As a result the drbd thing merged cleanly (commit 8d49a775: "Merge
branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block"),
since there was no data conflict - but the end result obviously doesn't
actually compile.
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7e599e6e
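
For readers unfamiliar with why 7e599e6e refers to little-endian bitmap functions, here is a portable sketch of what a "little-endian regardless of architecture" bitmap layout means. The two functions are illustrative userspace helpers with made-up names, not the kernel's bitops: bit `nr` always lives in byte `nr / 8` at position `nr % 8`, so the in-memory image is the same on big-endian and little-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper: test bit nr in a byte-addressed, little-endian
 * bitmap. Working on bytes avoids any dependence on host word order. */
static int test_bit_le_sketch(size_t nr, const uint8_t *bitmap)
{
    assert(bitmap != NULL);
    return (bitmap[nr / 8] >> (nr % 8)) & 1;
}

/* Illustrative helper: set bit nr in the same layout. */
static void set_bit_le_sketch(size_t nr, uint8_t *bitmap)
{
    assert(bitmap != NULL);
    bitmap[nr / 8] |= (uint8_t)(1u << (nr % 8));
}
```

Setting bit 9 in a zeroed buffer sets the value 0x02 in the second byte, whatever the host byte order is.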

17 Mar, 2011 1 commit

drbd: need include for bitops functions declarations · f0ff1357

Committed by Stephen Rothwell on Mar 17, 2011

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

f0ff1357

10 Mar, 2011 26 commits

drbd: drop code present under #ifdef which is relevant to 2.6.28 and below · 03567812

Committed by Or Gerlitz on Jan 13, 2011

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

03567812

drbd: Fixed handling of read errors on a 'VerifyS' node · 7961243b

Committed by Philipp Reisner on Mar 02, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7961243b

drbd: Fixed handling of read errors on a 'VerifyT' node · 8f21420e

Committed by Philipp Reisner on Mar 01, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

8f21420e

drbd: Implemented real timeout checking for request processing time · 7fde2be9

Committed by Philipp Reisner on Mar 01, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7fde2be9

drbd: Remove unused function atodb_endio() · c5a91619

Committed by Andreas Gruenbacher on Jan 25, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

c5a91619

drbd: improve log message if received sector offset exceeds local capacity · fdda6544

Committed by Lars Ellenberg on Jan 24, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

fdda6544

drbd: kill dead code · e99dc367

Committed by Lars Ellenberg on Jan 24, 2011

This code became obsolete and unused last December with
 drbd: bitmap keep track of changes vs on-disk bitmap
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

e99dc367

drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails · 10f6d992

Committed by Lars Ellenberg on Jan 24, 2011

Just deal with it more gracefully, if we fail to add even a single page
to an empty bio. We used to BUG_ON() there, but it has been observed in
some Xen deployment, so we need to handle that case more robustly now.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

10f6d992

drbd: Removed left over, now wrong comments · 039312b6

Committed by Philipp Reisner on Jan 21, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

039312b6

drbd: serialize admin requests for new verify run with pending bitmap io · 873b0d5f

Committed by Lars Ellenberg on Jan 21, 2011

This is an addendum to
 drbd: serialize admin requests for new resync with pending bitmap io

It avoids a race that could trigger "FIXME" assert log messages.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

873b0d5f

drbd: fix potential imbalance of ap_in_flight · e636db5b

Committed by Lars Ellenberg on Jan 21, 2011

When we receive a barrier ack, we walk the ring list of drbd requests
in the transfer log of the respective epoch, do some housekeeping,
and free those objects.

We tried to keep epochs of mirrored and unmirrored drbd requests
separate, and assert that no local-only requests are present in a
barrier_acked epoch.

It turns out that this has quite a number of corner cases and would
add bloated code without functional benefit.

We now revert the (insufficient) commits
 drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions
 drbd: Ensure that an epoch contains only requests of one kind
and instead fix the processing of barrier acks to cope with
a mix of local-only and mirrored requests.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

e636db5b

drbd: silence some noisy log messages during disconnect · 0ddc5549

Committed by Lars Ellenberg on Jan 21, 2011

If we fail to send the information that we lost our disk,
we have no connection, and no disk: no access to data anymore.
That is either expected (deconfiguration), or there will be so much
noise in the logs that "Sending state failed" is not useful at all.
Drop it.

If the reason for a shorter than expected receive was a signal,
which we sent because we already decided to disconnect,
these additional log messages are confusing and useless.

This patch follows this pattern:
 - dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);
 + if (!signal_pending(current))
 + 	dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);

Also make them all dev_warn for consistency.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

0ddc5549

drbd: describe bitmap locking for bulk operation in finer detail · 20ceb2b2

Committed by Lars Ellenberg on Jan 21, 2011

Now that we no longer endian-swap the bitmap in place, we allow
selected bitmap operations (testing bits, sometimes even setting bits)
during some bulk operations.

This caused us to hit a lot of FIXME asserts similar to
	FIXME asender in drbd_bm_count_bits,
	bitmap locked for 'write from resync_finished' by worker
which is now nonsense: looking at the bitmap is perfectly legal
as long as it is not being resized.

This cosmetic patch defines some flags to describe expectations in finer
detail, so the asserts in e.g. bm_change_bits_to() can be skipped if
appropriate.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

20ceb2b2
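
The idea in 20ceb2b2 of describing a bitmap lock "in finer detail" can be sketched with a small set of flags. The flag names and values here are hypothetical and not the ones defined by the patch; the sketch only shows how a lock holder can state which kinds of access remain legal, so that an assert fires only for access the holder actually forbids.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags: each bit forbids one kind of bitmap access while
 * a bulk operation holds the bitmap lock. */
enum bm_lock_sketch {
    BM_SKETCH_NO_CLEAR = 1 << 0,
    BM_SKETCH_NO_SET   = 1 << 1,
    BM_SKETCH_NO_TEST  = 1 << 2,

    /* E.g. a resize: nothing else may touch the bitmap. */
    BM_SKETCH_LOCK_ALL = BM_SKETCH_NO_CLEAR | BM_SKETCH_NO_SET |
                         BM_SKETCH_NO_TEST,
    /* E.g. a bulk writeout: looking at bits is fine, changing is not. */
    BM_SKETCH_TEST_ALLOWED = BM_SKETCH_NO_CLEAR | BM_SKETCH_NO_SET,
    /* Bits may be tested and set, but not cleared. */
    BM_SKETCH_SET_ALLOWED = BM_SKETCH_NO_CLEAR
};

/* Returns true if the given single access kind is legal under the
 * lock flags chosen by the current lock holder. */
static bool bm_access_allowed(unsigned int lock_flags, unsigned int access)
{
    assert(access == BM_SKETCH_NO_CLEAR || access == BM_SKETCH_NO_SET ||
           access == BM_SKETCH_NO_TEST);
    return (lock_flags & access) == 0;
}
```

Under `BM_SKETCH_TEST_ALLOWED`, counting bits from another thread would be accepted silently, while an attempt to set a bit would still be flagged.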

drbd: log UUIDs whenever they change · 62b0da3a

Committed by Lars Ellenberg on Jan 20, 2011

All decisions about sync, sync direction, and whether or not to
allow a connect or attach are based on our set of UUIDs to tag a
data generation.

Log changes to the UUIDs whenever they occur;
logging "new current UUID P:Q:R:S" is more useful
than "Creating new current UUID".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

62b0da3a

drbd: We can not process BIOs with a size of 0 · d07c9c10

Committed by Philipp Reisner on Jan 20, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d07c9c10

drbd: Provide hints with the error message when clearing the sync pause flag · cd88d030

Committed by Philipp Reisner on Jan 20, 2011

When the user clears the sync-pause flag and sync stays in the pause
state, give the user hints as to why it is still paused.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

cd88d030

drbd: queue bitmap writeout more intelligently · 79a30d2d

Committed by Lars Ellenberg on Jan 20, 2011

The "lazy writeout" of cleared bitmap pages happens during resync, and
should happen again once the resync finishes cleanly, or is aborted.

If resync finished cleanly, or was aborted because of peer disk
failure, we trigger the writeout from worker context in the after
state change work.

If resync was aborted because of connection failure, we should not
immediately trigger bitmap writeout, but rather postpone the
writeout to after the connection cleanup happened.  We now do it
in the receiver context from drbd_disconnect().

If resync was aborted because of local disk failure, well, there
is nothing to write to anymore.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

79a30d2d

drbd: don't pointlessly queue bitmap send, if we lost connection · 54b956ab

Committed by Lars Ellenberg on Jan 20, 2011

This is a minor optimization and cleanup,
and also considerably reduces some harmless (but noisy) race with
the connection cleanup code.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

54b956ab

drbd: serialize admin requests for new resync with pending bitmap io · 194bfb32

Committed by Lars Ellenberg on Jan 18, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

194bfb32

drbd: only generate and send a new sync uuid after a successful state change · 6c922ed5

Committed by Lars Ellenberg on Jan 12, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6c922ed5

drbd: cleaned up __set_current_state() followed by schedule_timeout() calls · 20ee6390

Committed by Philipp Reisner on Jan 18, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

20ee6390

drbd: Ensure that an epoch contains only requests of one kind · 6a35c45f

Committed by Philipp Reisner on Jan 17, 2011

The assert in drbd_req.c:755 forces us to have only requests of
one kind in an epoch. The two kinds we distinguish here are:
local-only or mirrored.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6a35c45f

drbd: Fixed P_NEG_ACK processing for protocol A and B · 2deb8336

Committed by Philipp Reisner on Jan 17, 2011

Protocol A has no P_WRITE_ACKs, but has P_NEG_ACKs.
The master bio might already be completed, therefore the
request is no longer in the collision hash.
=> Do not try to validate block_id as request

In Protocol B we might already have got a P_RECV_ACK
but then get a P_NEG_ACK afterwards.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2deb8336

drbd: Killed an assert that is no longer valid · 94f2b05f

Committed by Philipp Reisner on Jan 17, 2011

The point is that drbd_disconnect() can be called with a cstate of
WFConnection.

That happens if the user issues "drbdsetup disconnect" while the
drbd_connect() function executes. Then drbdd_init() will call
drbdd(), which in turn will return without receiving any
packets. Then drbdd_init() will end up calling drbd_disconnect()
with a cstate of WFConnection.

Bottom line: This assertion is wrong as it is, and we do not
see value in fixing it. => Removing it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

94f2b05f

drbd: Do not drop net config if sending in drbd_send_protocol() fails · 148efa16

Committed by Philipp Reisner on Jan 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

148efa16

drbd: Work on the Ahead -> SyncSource transition · 370a43e7

Committed by Philipp Reisner on Jan 14, 2011

The test for rs_pending_cnt == 0 was too weak. Use a test for
unacked_cnt == 0 instead, and move it into the worker.

This works because unacked_cnt is already increased when a
P_RS_DATA_REQ comes in.

Also use a timer to make Ahead -> SyncSource -> Ahead cycles
slower...
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

370a43e7
