Commits · 94f2b05f03fbc605f83ae501682c85ff4535bb6d · openeuler / Kernel

10 Mar, 2011 33 commits

drbd: Killed an assert that is no longer valid · 94f2b05f

Committed by Philipp Reisner on Jan 17, 2011

The point is that drbd_disconnect() can be called with a cstate of
WFConnection.

That happens if the user issues "drbdsetup disconnect" while the
drbd_connect() function executes. Then drbdd_init() will call
drbdd(), which in turn will return without receiving any
packets. Then drbdd_init() will end up calling drbd_disconnect()
with a cstate of WFConnection.

Bottom line: This assertion is wrong as it is, and we do not
see value in fixing it. => Removing it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

94f2b05f

drbd: Do not drop net config if sending in drbd_send_protocol() fails · 148efa16

Committed by Philipp Reisner on Jan 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

148efa16

drbd: Work on the Ahead -> SyncSource transition · 370a43e7

Committed by Philipp Reisner on Jan 14, 2011

The test for rs_pending_cnt == 0 was too weak. Test for
unacked_cnt == 0 instead, since unacked_cnt is already increased
when a P_RS_DATA_REQ comes in. Moved that test into the worker.

Also using a timer to make Ahead -> SyncSource -> Ahead cycles
slower...
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

370a43e7

drbd: Do not full sync if a P_SYNC_UUID packet gets lost · 4a23f264

Committed by Philipp Reisner on Jan 11, 2011

See also commit from 2009-08-15
"drbd_uuid_compare(): Do not full sync in case a P_SYNC_UUID packet gets lost."

We saw cases where the History UUIDs were not as expected. So the
detection of the special case did not trigger. With the sync UUID
no longer being a random number, but deducible from the previous
bitmap UUID, the detection of this special case becomes more
reliable.

The SyncUUID now is the previous bitmap UUID + 0x1000000000000.

Rule 5a:
Cs = H1p & H1p + Offset = Bp
  Connection was lost before SyncUUID Packet came through.
  Correct (peer) UUIDs:
   Bp = H1p
   H1p = H2p
   H2p = 0
  Become Sync target.

Rule 7a:
Cp = H1s & H1s + Offset = Bs
  Connection was lost before SyncUUID Packet came through.
  Correct (own) UUIDs:
   Bs = H1s
   H1s = H2s
   H2s = 0
  Become Sync source.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4a23f264
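
Rules 5a and 7a in the 4a23f264 message share one shape: one node's current UUID equals the other node's first history UUID, and that history UUID plus the offset equals the bitmap UUID. A minimal userspace sketch of that check and of the correction step follows. The struct and function names are hypothetical and are not taken from the DRBD source; only the offset value comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset quoted in the commit message: SyncUUID = previous bitmap UUID + this. */
#define SYNC_UUID_OFFSET 0x1000000000000ULL

/* Hypothetical container for one node's UUIDs (C, B, H1, H2 in the message). */
struct uuid_set {
	uint64_t current;
	uint64_t bitmap;
	uint64_t history1;
	uint64_t history2;
};

/*
 * Shared check for rules 5a and 7a.
 * "other_current" is the current UUID of the opposite node,
 * "u" is the UUID set that has to be corrected.
 * Returns true if the special case was detected and "u" was corrected.
 */
static bool lost_sync_uuid_fixup(uint64_t other_current, struct uuid_set *u)
{
	if (other_current != u->history1)
		return false;
	if (u->history1 + SYNC_UUID_OFFSET != u->bitmap)
		return false;

	/* Correction listed in the rules: B = H1, H1 = H2, H2 = 0. */
	u->bitmap = u->history1;
	u->history1 = u->history2;
	u->history2 = 0;
	return true;
}
```

For rule 5a the call would be `lost_sync_uuid_fixup(self.current, &peer)` and the caller becomes sync target; for rule 7a it would be `lost_sync_uuid_fixup(peer.current, &self)` and the caller becomes sync source.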

drbd: Be more careful with SyncSource -> Ahead transitions · da0a7816

Committed by Philipp Reisner on Dec 23, 2010

We may not get from SyncSource to Ahead if we have sent some
P_RS_DATA_REPLY packets to the peer and are waiting for
P_WRITE_ACK.

Again, this is not relevant for properly tuned systems, but makes
sure that a system that is not tuned does not get diverging bitmaps.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

da0a7816

drbd: No longer answer P_RS_DATA_REQUEST packets when in C_AHEAD mode · d612d309

Committed by Philipp Reisner on Dec 27, 2010

If the sync source node replies to a P_RS_DATA_REQUEST packet
when it is already in ahead mode, i.e. those two packets
crossed each other on the wire, that may lead to diverging
bitmaps.

  This never happens in a well-tuned system. In a well-tuned
  system the resync controller has reduced the resync speed
  to zero long before we got into ahead mode.

But of course we have to be prepared for the not-well-tuned system
as well, because diverging bitmaps = non-terminating resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d612d309

drbd: add debugging assert to make sure the protocol is clean · f735e363

Committed by Lars Ellenberg on Dec 17, 2010

We expect to only receive the recently introduced "set out of sync"
packets in specific states. If we receive them in different states, that
may confuse the resync process to the point where it won't terminate, or
think it made negative progress.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

f735e363

drbd: receive_bitmap_plain: Get rid of ugly and useless enum · 2c46407d

Committed by Andreas Gruenbacher on Dec 11, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2c46407d

drbd: receive_bitmap: Missing free_page() on error path · 78fcbdae

Committed by Andreas Gruenbacher on Dec 10, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

78fcbdae

drbd: receive_bitmap: Avoid casting enum drbd_state_rv to int · de1f8e4a

Committed by Andreas Gruenbacher on Dec 10, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

de1f8e4a

drbd: receive_bitmap: Fix the wrong return value · 4114be81

Committed by Andreas Gruenbacher on Dec 10, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4114be81

drbd: Use the standard bool, true, and false keywords · 81e84650

Committed by Andreas Gruenbacher on Dec 09, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

81e84650

drbd: This code is dead now · 6184ea21

Committed by Andreas Gruenbacher on Dec 09, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6184ea21

drbd: Another small enum drbd_state_rv cleanup · bb437946

Committed by Andreas Gruenbacher on Dec 09, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

bb437946

drbd: Be more explicit about functions that return an enum drbd_state_rv · bf885f8a

Committed by Andreas Gruenbacher on Dec 08, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

bf885f8a

drbd: Get rid of unnecessary macros (2) · 0cf9d27e

Committed by Andreas Gruenbacher on Dec 07, 2010

The FAULT_ACTIVE macro just wraps the drbd_insert_fault macro for no
apparent reason.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

0cf9d27e

drbd: fix incomplete error message · 220df4d0

Committed by Lars Ellenberg on Dec 09, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

220df4d0

drbd: Removed an unnecessary #undef · 7e458c32

Committed by Andreas Gruenbacher on Dec 08, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7e458c32

drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmap · 3719094e

Committed by Philipp Reisner on Nov 10, 2010

* C_STARTING_SYNC_S, C_STARTING_SYNC_T: In these states the bitmap gets
  written to disk. Locking out of app-IO is done by using the
  drbd_queue_bitmap_io() and drbd_bitmap_io() functions these days.
  It is no longer necessary to lock out app-IO based on the connection
  state.
  App-IO that may come in after the BITMAP_IO flag got cleared, but before
  the state transition to C_SYNC_(SOURCE|TARGET), does not get mirrored. It
  sets a bit in the local bitmap that is already set, and therefore changes
  nothing.

* C_WF_BITMAP_S: In this state we send updates (P_OUT_OF_SYNC packets).
  With that we make sure they have the same number of bits when going
  into the C_SYNC_(SOURCE|TARGET) connection state.

* C_UNCONNECTED: The receiver starts, no need to lock out IO.

* C_DISCONNECTING: in drbd_disconnect() we had a wait_event()
  to wait until ap_bio_cnt reaches 0. Removed that.

* C_TIMEOUT, C_BROKEN_PIPE, C_NETWORK_FAILURE
  C_PROTOCOL_ERROR, C_TEAR_DOWN: Same as C_DISCONNECTING

* C_WF_REPORT_PARAMS: IO still possible since that is still
  like C_WF_CONNECTION.

And we do not need to send barriers in C_WF_BITMAP_S connection state.

Allow concurrent accesses to the bitmap when receiving the bitmap.
Everything gets ORed anyways.

A drbd_free_tl_hash() is in after_state_chg_work(). At that point
all the work items of the last connections must have been processed.

Introduced a call to drbd_free_tl_hash() into drbd_free_mdev()
for paranoia reasons.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

3719094e

drbd: Implemented priority inheritance for resync requests · e3555d85

Committed by Philipp Reisner on Nov 07, 2010

We only issue resync requests if there is no significant application IO
going on, i.e. application IO has higher priority than resync IO.

If application IO cannot be started because the resync process locked
a resync_lru entry, start the IO operations necessary to release the
lock ASAP.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

e3555d85

drbd: When proxy's buffer drained off go into regular resync mode · c4752ef1

Committed by Philipp Reisner on Oct 27, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

c4752ef1

drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNC · 73a01a18

Committed by Philipp Reisner on Oct 27, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

73a01a18

drbd: Implemented two new connection states Ahead/Behind · 67531718

Committed by Philipp Reisner on Oct 27, 2010

In this connection mode, the ahead node no longer replicates
application IO. The behind node's disk becomes outdated.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

67531718

drbd: Renamed write_flags_to_bio() to wire_flags_to_bio() · 688593c5

Committed by Lars Ellenberg on Nov 17, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

688593c5

drbd: properly use max_hw_sectors to limit the our bio size · 1816a2b4

Committed by Lars Ellenberg on Nov 11, 2010

To ease tracking of bios in some hash tables, we want them
not to cross certain boundaries (128k, used to be 32k).
We limit the maximum bio size using queue parameters.

Historically some defines and variables we use there have been named
max_segment_size, which was misguided. Rename them to max_bio_size,
and use [blk_]queue_max_hw_sectors where appropriate.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1816a2b4
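
To make the 128k boundary in 1816a2b4 concrete, here is a small userspace sketch of the arithmetic: the size cap expressed in 512-byte sectors, and a check for whether a request would straddle a boundary. The macro and function names are illustrative assumptions and do not appear in the commit.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_BIO_SIZE (128u * 1024u)	/* boundary named in the message */
#define SKETCH_SECTOR_SHIFT 9			/* 512-byte sectors */

/* The cap in sectors, the unit a max_hw_sectors-style queue limit uses. */
static unsigned int sketch_max_bio_sectors(void)
{
	return SKETCH_MAX_BIO_SIZE >> SKETCH_SECTOR_SHIFT;
}

/* True if [sector, sector + size_bytes) spans two 128k-aligned regions. */
static bool sketch_crosses_boundary(uint64_t sector, uint32_t size_bytes)
{
	uint64_t start, end;

	if (size_bytes == 0)
		return false;
	start = sector << SKETCH_SECTOR_SHIFT;
	end = start + size_bytes - 1;
	return start / SKETCH_MAX_BIO_SIZE != end / SKETCH_MAX_BIO_SIZE;
}
```

A full-size request starting at sector 0 stays inside one region, while the same request starting one sector later does not.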

drbd: detect modification of in-flight buffers · 470be44a

Committed by Lars Ellenberg on Nov 10, 2010

With data-integrity digest enabled, double-check on the sending side
for modifications by upper layers of buffers under write back,
so we can tell it apart from corruption on the "wire".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

470be44a
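
The idea in 470be44a can be sketched in userspace as: take a digest of the buffer before sending, take it again afterwards, and treat a mismatch as a local modification rather than wire corruption. The sketch below uses FNV-1a purely as a stand-in digest; it is not the algorithm DRBD would be configured with, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in digest (32-bit FNV-1a), chosen only to keep the sketch short. */
static uint32_t sketch_digest(const unsigned char *buf, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Re-check on the sending side: "digest_before" was computed when the
 * buffer was handed over for sending. A mismatch now means an upper layer
 * changed the buffer while it was in flight.
 */
static bool sketch_modified_in_flight(uint32_t digest_before,
				      const unsigned char *buf, size_t len)
{
	return sketch_digest(buf, len) != digest_before;
}
```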

drbd: further converge progress display of resync and online-verify · 5f9915bb

Committed by Lars Ellenberg on Nov 09, 2010

Show progressbar and ETA always, with proc_details >= 1 also show the
current sector position for both resync and online-verify on both nodes.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

5f9915bb

drbd: use the resync controller for online-verify requests as well · 2649f080

Committed by Lars Ellenberg on Nov 05, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2649f080

drbd: advance progress step marks for online-verify · ea5442af

Committed by Lars Ellenberg on Nov 05, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

ea5442af

drbd: initialize online-verify progress tracking on verify target · de228bba

Committed by Lars Ellenberg on Nov 05, 2010

For partial (resumed) online verify, initialize the resync step marks
once we know what the online verify start sector is.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

de228bba

drbd: improve online-verify progress tracking · 30b743a2

Committed by Lars Ellenberg on Nov 05, 2010

For a partial (resumed) online-verify, initialize rs_total not to total
bits, but to number of bits to check in this run, to match the meaning
rs_total has for actual resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

30b743a2

block: kill off REQ_UNPLUG · 721a9602

Committed by Jens Axboe on Mar 09, 2011

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

721a9602

block: remove per-queue plugging · 7eaceacc

Committed by Jens Axboe on Mar 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

7eaceacc

28 Nov, 2010 1 commit

drbd: don't recvmsg with zero length · c13f7e1a

Committed by Lars Ellenberg on Oct 29, 2010

This should fix a performance degradation we observed recently.

If we don't expect any subheader, we should not call into the tcp stack,
as that may add considerable latency if there is no data available at
this point.

For a synthetic synchronous write load with single outstanding writes,
this additional latency when processing the "unplug remote" packet
added up to a performance degradation factor >= 10.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

c13f7e1a
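
The fix in c13f7e1a boils down to a guard before the receive call. A userspace analogue using the POSIX `recv()` call might look like the sketch below; the wrapper name `recv_subheader` is hypothetical, and the kernel code uses its own receive helpers rather than this function.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Read an optional subheader of "len" bytes.
 * When no subheader is expected (len == 0), return immediately instead of
 * entering the socket layer, where the call could stall waiting for data.
 */
static ssize_t recv_subheader(int fd, void *buf, size_t len)
{
	if (len == 0)
		return 0;
	return recv(fd, buf, len, MSG_WAITALL);
}
```

Because of the early return, a zero-length request never touches the descriptor, so it cannot add latency even when no data is pending.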

18 Nov, 2010 1 commit

BKL: remove extraneous #include <smp_lock.h> · 451a3c24

Committed by Arnd Bergmann on Nov 17, 2010

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

451a3c24

23 Oct, 2010 1 commit

drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code · 2451fc3b

Committed by Philipp Reisner on Aug 24, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2451fc3b

22 Oct, 2010 2 commits

drbd: fix potential data divergence after multiple failures · 6719fb03

Committed by Lars Ellenberg on Oct 18, 2010

If we get an IO-error during an activity log transaction,
if we failed to write the bitmap of the evicted extent,
we must not write the transaction itself.
If we failed to write the transaction,
we must not even submit the corresponding bio,
as its extent is not yet marked in the activity log.

Otherwise, if this was a disconnected Primary (degraded cluster), which
now lost its disk as well, and we later re-attach the same backend
storage, we possibly "forget" to resync some parts of the disk that
potentially have been changed.

On the receiving side, when receiving from a peer with unhealthy disk,
checking for pdsk == D_DISKLESS is not enough, we need to set out of
sync and do AL transactions for everything pdsk < D_INCONSISTENT on the
receiving side.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6719fb03
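
The ordering rule in the first paragraph of 6719fb03 can be shown with a tiny simulation: a failed bitmap write stops before the transaction, and a failed transaction write stops before the bio. Everything in this sketch (the struct, its fields, the function name) is a hypothetical model for illustration, not DRBD's activity log code.

```c
#include <stdbool.h>

/* Simulated outcome flags and counters for one activity log update. */
struct al_sim {
	bool bitmap_write_fails;	/* writing the evicted extent's bitmap fails */
	bool transaction_write_fails;	/* writing the AL transaction fails */
	int transactions_written;
	int bios_submitted;
};

/* Returns true only if the data bio was submitted. */
static bool al_sim_begin_io(struct al_sim *s)
{
	if (s->bitmap_write_fails)
		return false;	/* must not write the transaction itself */

	if (s->transaction_write_fails)
		return false;	/* extent not marked in the AL: must not submit */
	s->transactions_written++;

	s->bios_submitted++;
	return true;
}
```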

drbd: fix potential deadlock on detach · 82f59cc6

Committed by Lars Ellenberg on Oct 16, 2010

If we have contention in drbd_al_begin_io (heavy random IO),
an administrative request to detach the disk may deadlock
for similar reasons as the recently fixed deadlock if detaching
because of IO-error.

The approach taken here is to either go through the intermediate
cleanup state D_FAILED, or first lock out application io,
don't just go directly to D_DISKLESS.

We need an additional state bit (WAS_IO_ERROR) to distinguish
the -> D_FAILED because of IO-error from other failures.

Sanitize D_ATTACHING -> D_FAILED to D_ATTACHING -> D_DISKLESS.
If only attaching, ldev may be missing still, but would be referenced
from within the after_state_ch for -> D_FAILED, potentially
dereferencing a NULL pointer.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

82f59cc6

15 Oct, 2010 2 commits

drbd: add some more explicit drbd_md_sync · 856c50c7

Committed by Lars Ellenberg on Oct 14, 2010

It sometimes may take a while for the after state change work to be
scheduled, which does drbd_md_sync. At convenient places, we should do
explicit drbd_md_sync to have the new state information on disk as soon
as possible.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

856c50c7

drbd: cleanup useless leftover warn/error printk's · 0f8488e1

Committed by Lars Ellenberg on Oct 13, 2010

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

0f8488e1

openeuler / Kernel synced successfully more than 1 year ago
