Commits · 1f3e509b761d6d8f91acbf7da39624d086e1f2eb · openanolis / cloud-kernel

08 Nov, 2012 · 40 commits

drbd: pull prepare_listen_socket() out of drbd_wait_for_connect() · 1f3e509b

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1f3e509b

drbd: Remove drbd_accept() and use kernel_accept() instead · 7e0f096b

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7e0f096b

drbd: Move the call to listen() out of drbd_accept() · 2820fd39

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2820fd39

drbd: grammar fix in log message · 1882e22d

Committed by Lars Ellenberg on May 07, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1882e22d

drbd: fix spelling, remove boring development log message · 3ea35df8

Committed by Philipp Reisner on Apr 06, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

3ea35df8

drbd: Ensure that data_size is not 0 before using data_size-1 as index · e4bad1bc

Committed by Philipp Reisner on Apr 06, 2012

This could be exploited by a peer which runs modified code.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

e4bad1bc
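
The guard described in this commit can be illustrated with a minimal sketch. It is not the DRBD code: `name_is_terminated()` is a hypothetical helper, and it assumes the size is an unsigned type, so that `data_size - 1` would wrap around to a huge index when `data_size` is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not taken from DRBD): returns true only if the
 * peer-supplied buffer of length data_size ends in a NUL byte. */
bool name_is_terminated(const char *buf, size_t data_size)
{
    assert(buf != NULL);

    /* With an unsigned size, data_size - 1 wraps to SIZE_MAX when
     * data_size is 0, so reject that case before indexing. */
    if (data_size == 0)
        return false;

    return buf[data_size - 1] == '\0';
}
```

A caller would reject the packet whenever the helper returns false, instead of trusting a length chosen by the peer.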

drbd: Delay/reject other state changes while establishing a connection · a1096a6e

Committed by Philipp Reisner on Apr 06, 2012

Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

a1096a6e

drbd: Fixed processing of disk-barrier, disk-flushes and disk-drain · 27eb13e9

Committed by Philipp Reisner on Mar 30, 2012

Since drbd_bump_write_ordering() is called in the attaching
process while the disk state is D_ATTACHING, it was not
considering these three flags during attach.

A call to this function was missing from drbd_adm_disk_opts().

Fixed both issues.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

27eb13e9

drbd: ignore volume number for drbd barrier packet exchange · 9ed57dcb

Committed by Lars Ellenberg on Mar 26, 2012

Transfer log epochs, and therefore P_BARRIER packets,
are per resource, not per volume.
We must not associate them with "some random volume".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9ed57dcb

drbd: fix potential deadlock during "restart" of conflicting writes · 2312f0b3

Committed by Lars Ellenberg on Nov 24, 2011

w_restart_write(), run from worker context, calls __drbd_make_request()
and further drbd_al_begin_io(, delegate=true), which then
potentially deadlocks.  The previous patch moved a BUG_ON to expose
such call paths, which would now be triggered.

Also, if we call __drbd_make_request() from resource worker context,
like w_restart_write() did, and that should block for whatever reason
(!drbd_state_is_stable(), resource suspended, ...),
we potentially deadlock the whole resource, as the worker
is needed for state changes and other things.

Create a dedicated retry workqueue for this instead.

Also make sure that inc_ap_bio()/dec_ap_bio() are properly paired,
even if do_retry() needs to retry itself,
in case __drbd_make_request() returns != 0.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2312f0b3

drbd: Fix a potential write ordering issue on SyncTarget nodes · d93f6302

Committed by Lars Ellenberg on Mar 26, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d93f6302

drbd: Fix module refcount leak in drbd_accept() · dd9b3604

Committed by Philipp Reisner on Feb 23, 2012

drbd_accept was modelled after kernel_accept
with drbd commit 53eb779 in July 2008.

Only, kernel_accept was then broken, and only fixed later
with kernel commit 1b08534e in Dec 2008:
net: Fix module refcount leak in kernel_accept()

Impact: protocol families provided as modules, e.g. ipv6 or ib_sdp,
would soon have their reference count become negative, preventing
them from being unloaded (likely), or worse, hit zero without actually
being unused, allowing them to be unloaded while still in use (unlikely,
but if triggered, causing a kernel crash).
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

dd9b3604

drbd: Fixed compat issue with disconnecting 8.4 from a primary 8.3 · 4d0fc3fd

Committed by Philipp Reisner on Jan 20, 2012

For compatibility reasons 8.4 has to send P_STATE_CHG_REQ (instead
of P_CONN_ST_CHG_REQ) when disconnecting.

In the receiving code path we failed to convert the old
answer (P_STATE_CHG_REPLY) back to 8.4 logic. Therefore
the CL_ST_CHG_SUCCESS or CL_ST_CHG_FAIL bit in the flags word
of mdev got set, while the state code was waiting for
the CONN_WD_ST_CHG_OKAY or CONN_WD_ST_CHG_FAIL bits in tconn.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4d0fc3fd

drbd: Restore late assigning of tconn->data.sock and meta.sock · 7da35862

Committed by Philipp Reisner on Dec 19, 2011

With commit from Mon Mar 28 16:33:12 2011 +0200
"drbd: drbd_connect(): Initialize struct drbd_socket before sending anything"

tconn->data.sock and tconn->meta.sock get assigned early, in
conn_connect.

The early assignment can trigger an OOPS, because the socket may be released
without acquiring the mutex protecting the socket. Another thread (worker)
might use setsockopt() on the socket while it gets free()ed.

Restored the (proven) 8.3 behavior of assigning these sockets after the two
connections are established.

Credits for reporting the issue are going to Arne Redlich.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7da35862

drbd: fix race between disconnect and receive_state · b8853dbd

Committed by Philipp Reisner on Dec 13, 2011

If the asender thread, or request_timer_fn(), or some other part of
the code, decided to drop the connection (because of timeout or other),
but the receiver just now was processing a P_STATE packet, there was a
chance that receive_state() would do a hard state change
"re-establishing" an already failed connection without additional handshake.

Log excerpt:
  Remote failed to finish a request within ko-count * timeout
  peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
  asender terminated
  ...
  peer( Unknown -> Secondary ) conn( Timeout -> Connected ) pdsk( DUnknown -> UpToDate ) peer_isp( 0 -> 1 )
  ...
  Connection closed
  peer( Secondary -> Unknown ) conn( Connected -> Unconnected ) pdsk( UpToDate -> DUnknown ) peer_isp( 1 -> 0 )
  receiver terminated

Impact:
while the connection state is erroneously "Connected",
requests may be queued and even sent,
which would never be acknowledged,
and may have been missed by the cleanup.
These requests would never be completed.

The next drbd_suspend_io() will then lock up,
waiting forever for these requests to complete.

Fixed in several code paths:
  Make sure the connection state is NetworkFailure or worse
  before starting the cleanup in drbd_disconnect().
  This should make sure the cleanup won't miss any requests.

  Disallow receive_state() to "upgrade" the connection state
  from an error state. This will make sure the "illegal" state
  transition won't happen.

  For all connection failure states,
  relax the safe-guard in sanitize_state() again
  to silently mask out those state changes
  (e.g. Timeout -> Connected becomes Timeout -> Timeout).

 Note by Philipp Reisner:
  The 3rd chunk described as "relax the safe-guard..."
  is not there in 8.4 as it is relaxed to the maximum in
  8.4 already
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b8853dbd
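
The idea of masking out an "upgrade" from a failed connection state (e.g. Timeout -> Connected becomes Timeout -> Timeout) can be sketched as follows. This is a simplified illustration, not the DRBD state engine: the enum `conn_state` and the helpers `is_failure_state()` and `sanitize_conn()` are hypothetical names, and the set of failure states shown is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of connection states for illustration. */
enum conn_state {
    CS_TIMEOUT,
    CS_BROKEN_PIPE,
    CS_NETWORK_FAILURE,
    CS_WF_CONNECTION,
    CS_CONNECTED
};

/* Assumption: these three states count as "connection failure" states. */
static bool is_failure_state(enum conn_state s)
{
    return s == CS_TIMEOUT || s == CS_BROKEN_PIPE || s == CS_NETWORK_FAILURE;
}

/* Returns the state that is actually applied. A request to jump from a
 * failure state straight to CS_CONNECTED is silently replaced by the
 * current state, so the failed connection is never "re-established"
 * without a new handshake. */
enum conn_state sanitize_conn(enum conn_state cur, enum conn_state requested)
{
    if (is_failure_state(cur) && requested == CS_CONNECTED)
        return cur;
    return requested;
}
```

With this sketch, `sanitize_conn(CS_TIMEOUT, CS_CONNECTED)` yields `CS_TIMEOUT`, while a regular transition such as `CS_WF_CONNECTION` to `CS_CONNECTED` passes through unchanged.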

drbd: Load balancing of read requests · 380207d0

Committed by Philipp Reisner on Nov 11, 2011

New config option for the disk section "read-balancing", with
the values: prefer-local, prefer-remote, round-robin, when-congested-remote.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

380207d0
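
How the four policies named above might steer a read request can be sketched in a few lines. This is only an illustration and not the DRBD implementation: the type and function names are hypothetical, the strict alternation used for round-robin is an assumption, and so is the reading of when-congested-remote as "go remote only while the local backing device is congested".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy names mirroring the config values in the commit. */
enum read_balancing {
    RB_PREFER_LOCAL,
    RB_PREFER_REMOTE,
    RB_ROUND_ROBIN,
    RB_WHEN_CONGESTED_REMOTE
};

struct rb_ctx {
    enum read_balancing policy;
    unsigned int counter;     /* used by round-robin only */
    bool local_congested;     /* assumed to be updated by the caller */
};

/* Returns true if the next read should be sent to the peer,
 * false if it should be served from the local disk. */
bool read_goes_remote(struct rb_ctx *ctx)
{
    assert(ctx != NULL);

    switch (ctx->policy) {
    case RB_PREFER_LOCAL:
        return false;
    case RB_PREFER_REMOTE:
        return true;
    case RB_ROUND_ROBIN:
        /* Alternate: local, remote, local, remote, ... */
        return (ctx->counter++ & 1u) != 0;
    case RB_WHEN_CONGESTED_REMOTE:
        return ctx->local_congested;
    }
    return false;
}
```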

drbd: Get rid of "ASSERTION FAILED: tconn->current_epoch->list not empty" · d10b4ea3

Committed by Philipp Reisner on Nov 30, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d10b4ea3

drbd: add missing rcu locks around recently introduced idr_for_each · 615e087f

Committed by Lars Ellenberg on Nov 17, 2011

Recent commit
 drbd: Move write_ordering from mdev to tconn
introduced a new idr_for_each loop over all volumes,
but did not take necessary rcu locks or krefs.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

615e087f

drbd: Fix the WO=drain implementation for multiple volumes · 77fede51

Committed by Philipp Reisner on Nov 10, 2011

Wait until IO is drained in all volumes.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

77fede51

drbd: Switch drbd_may_finish_epoch() from mdev to tconn · 1e9dd291

Committed by Philipp Reisner on Nov 10, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1e9dd291

drbd: Move list of epochs from mdev to tconn · 12038a3a

Committed by Philipp Reisner on Nov 09, 2011

This is necessary since the transfer_log on the sending side is also
per tconn.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

12038a3a

drbd: Prepare epochs per connection · 1d2783d5

Committed by Philipp Reisner on Nov 10, 2011

An epoch object needs a pointer to the mdev it was received for.
This is necessary to be able to send the barrier ack packet for
the same volume as the original barrier packet was assigned to.

This prepares the next step, in which the (receiver side)
epoch list is moved from the device (mdev) to the connection (tconn)
object.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1d2783d5

drbd: Move write_ordering from mdev to tconn · 4b0007c0

Committed by Philipp Reisner on Nov 09, 2011

This is necessary in order to prepare the move of the (receiver side)
epoch list from the device (mdev) to the connection (tconn) objects.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4b0007c0

drbd: Fixed an obvious copy-n-paste mistake · 36baf611

Committed by Philipp Reisner on Nov 10, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

36baf611

drbd: Fixes from the drbd-8.3 branch · 43de7c85

Committed by Philipp Reisner on Nov 10, 2011

* drbd-8.3:
  drbd: O_SYNC gives EIO on ramdisks for some kernels (eg. RHEL6).
  drbd: send intermediate state change results to the peer
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

43de7c85

drbd: Silenced compiler warnings · 376694a0

Committed by Philipp Reisner on Nov 07, 2011

Since version 4.6.1 gcc warns about variables that get
a value assigned, but which are never read later on.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

376694a0

drbd: fix "stalled" empty resync · 9bcd2521

Committed by Philipp Reisner on Sep 29, 2011

With sync-after dependencies, given "lucky" timing of pause/unpause
events, the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.

Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9bcd2521
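
The widened end-of-resync condition described in this commit can be written as a small predicate. This is a sketch under assumptions, not the DRBD code: the enum `disk_state` and the function `peer_disk_change_ends_resync()` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of peer disk states for illustration. */
enum disk_state {
    DS_INCONSISTENT,
    DS_OUTDATED,
    DS_CONSISTENT,
    DS_UP_TO_DATE
};

/* Returns true if a change of the peer disk state from `prev` to `next`
 * should be taken as the end of a resync. Before the fix only
 * Inconsistent -> UpToDate qualified; the fix also accepts
 * Consistent -> UpToDate. */
bool peer_disk_change_ends_resync(enum disk_state prev, enum disk_state next)
{
    if (next != DS_UP_TO_DATE)
        return false;
    return prev == DS_INCONSISTENT || prev == DS_CONSISTENT;
}
```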

drbd: Consider the discard-my-data flag for all volumes [bugz 359] · 08b165ba

Committed by Philipp Reisner on Sep 05, 2011

...not only for the first volume
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

08b165ba

drbd: Cleanup all epoch objects upon connection loss · 85d73513

Committed by Philipp Reisner on Jul 18, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

85d73513

drbd: Bugfix for the connection behavior · 823bd832

Committed by Philipp Reisner on Nov 08, 2012

If we get into the C_BROKEN_PIPE cstate once, the state engine sets the
thi->t_state of the receiver thread to restarting.  But with the while loop
in drbdd_init() a new connection gets established. After that, the call into
drbdd() returns immediately since the thi->t_state is not RUNNING.  The
restart of drbd_init() then resets thi->t_state to RUNNING.

I.e. after entering C_BROKEN_PIPE once, the next successfully established
connection gets wasted.

The two parts of the fix:
  * Do not cause the thread to restart if we detect the issue
    with the sockets while we are in C_WF_CONNECTION.

  * Make sure that all actions that would have set us to C_BROKEN_PIPE
    happen before the state change to C_WF_REPORT_PARAMS.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

823bd832

drbd: Fix the data-integrity-alg setting · 7d4c782c

Committed by Andreas Gruenbacher on Jul 17, 2011

The last data-integrity-alg fix made data integrity checking work when the
algorithm was changed for an established connection, but the common case of
configuring the algorithm before connecting was still broken. Fix that.
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7d4c782c

drbd: receive_protocol(): We cannot change our own data-integrity-alg setting here · accdbcc5

Committed by Andreas Gruenbacher on Jul 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

accdbcc5

drbd: Be consistent in reporting incompatibilities in P_PROTOCOL settings · d505d9be

Committed by Andreas Gruenbacher on Jul 15, 2011

Refer to the settings by the names which drbdsetup and drbd.conf are using.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d505d9be

drbd: receive_protocol(): Make the program flow less confusing · fbc12f45

Committed by Andreas Gruenbacher on Jul 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

fbc12f45

drbd: receive_protocol(): Give variables more easily searchable names · b792c35c

Committed by Andreas Gruenbacher on Jul 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b792c35c

drbd: Print memory address in hex instead of decimal in error message · 5af172ed

Committed by Andreas Gruenbacher on Jul 15, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

5af172ed

drbd: Fixed removal of volumes/devices from connected resources · 369bea63

Committed by Philipp Reisner on Jul 06, 2011

When removing a volume/device we need to switch the connection
status of the peer back into WFReportParams.

  Before this fix it was left in Connected state. That means that
  the peer device continued to inform us about state changes, etc...
  But we deleted that minor -> protocol error.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

369bea63

drbd: detach from frozen backing device · cdfda633

Committed by Philipp Reisner on Jul 05, 2011

* drbd-8.3:
  documentation: Documented detach's --force and disk's --disk-timeout
  drbd: Implemented the disk-timeout option
  drbd: Force flag for the detach operation
  drbd: Allow new IOs while the local disk is in FAILED state
  drbd: Bitmap IO functions can not return prematurely if the disk breaks
  drbd: Added a kref to bm_aio_ctx
  drbd: Hold a reference to ldev while doing meta-data IO
  drbd: Keep a reference to the bio until the completion handler finished
  drbd: Implemented wait_until_done_or_disk_failure()
  drbd: Replaced md_io_mutex by an atomic: md_io_in_use
  drbd: moved md_io into mdev
  drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
  drbd: Keep a reference to barrier acked requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

cdfda633

drbd: Improve the "unexpected packet" error messages · 2fcb8f30

Committed by Andreas Gruenbacher on Jul 03, 2011

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2fcb8f30

drbd: Rename --dry-run to --tentative · 6dff2902

Committed by Andreas Gruenbacher on Jun 28, 2011

drbdadm already has a --dry-run option, so this option cannot directly be
passed through to drbdsetup.  Rename the drbdsetup option to resolve this
conflict.

For backward compatibility, make --dry-run an alias of --tentative.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6dff2902
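
Keeping an old option name alive as an alias can be done by mapping it to the new name before the option is evaluated. The sketch below shows one way to do that; it is not how drbdsetup parses its arguments, and `struct option_alias`, `aliases` and `canonical_option()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table: old spelling on the left, new on the right. */
struct option_alias {
    const char *alias;
    const char *canonical;
};

static const struct option_alias aliases[] = {
    { "--dry-run", "--tentative" },
};

/* Returns the canonical spelling of an option, or the argument itself
 * if it is not a known alias. */
const char *canonical_option(const char *arg)
{
    assert(arg != NULL);

    for (size_t i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
        if (strcmp(arg, aliases[i].alias) == 0)
            return aliases[i].canonical;
    }
    return arg;
}
```

After this normalization the rest of the parser only needs to know `--tentative`, while existing scripts that still pass `--dry-run` keep working.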

openanolis / cloud-kernel · synced successfully more than 1 year ago