- 09 Nov, 2012 6 commits
-
-
Committed by Lars Ellenberg
To avoid confusion with REQ_DISCARD aka TRIM, rename our "discard concurrent write acks" from P_DISCARD_WRITE to P_SUPERSEDED. At the same time, rename the drbd request event DISCARD_WRITE to CONFLICT_RESOLVED. It already triggers either successful completion or restart of the request, depending on our RQ_POSTPONED flag. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Don't drop a request from the transfer log just because it was NEG_ACKED. We need it around to be able to verify P_BARRIER_ACKs against the transfer log. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Almost all code paths calling start_new_tl_epoch() guarded it with if (... current_tle_writes > 0 ... ). Just move that inside start_new_tl_epoch(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
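The refactoring described above can be illustrated with a minimal userspace sketch. The struct `tl_state` and the boolean return value are assumptions made for this illustration; only the names `start_new_tl_epoch()`, `current_tle_writes` and `current_tle_nr` come from the commit messages in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the transfer-log state. */
struct tl_state {
	unsigned int current_tle_nr;     /* number of the currently open epoch */
	unsigned int current_tle_writes; /* writes queued in that epoch so far */
};

/*
 * Close the current epoch and open the next one.
 * The "is there anything to close?" check lives inside the function,
 * so callers do not need to guard the call themselves.
 * Returns true if a new epoch was actually started (illustration only).
 */
static bool start_new_tl_epoch(struct tl_state *tl)
{
	assert(tl);
	if (tl->current_tle_writes == 0)
		return false; /* empty epoch: nothing to close */
	tl->current_tle_writes = 0;
	tl->current_tle_nr++;
	return true;
}
```

With the guard inside, a caller can invoke `start_new_tl_epoch()` unconditionally; on an empty epoch the call is a no-op.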
-
Committed by Philipp Reisner
Requests of an acked epoch are stored on the barrier_acked_requests list. If the private bio of such a request completes while IO on the drbd device is suspended [req_mod(completed_ok)], the request stays there. When thawing IO because the fence_peer handler returned, we use tl_clear() to apply the connection_lost_while_pending event to all requests on the transfer-log and the barrier_acked_requests list. Up to now the connection_lost_while_pending event was not applied to requests on the barrier_acked_requests list. Fixed that. I.e. now the connection_lost_while_pending and resend events are applied to requests on the barrier_acked_requests list. For that it is necessary that the resend event finishes (local only) READS correctly. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
DRBD has a concept of request epochs or reorder-domains, which are separated on the wire by P_BARRIER packets. Older DRBD is not able to handle zero-sized requests at all, so we need to map empty flushes to these drbd barriers. These are the equivalent of empty flushes, and by default trigger flushes on the receiving side anyways (unless not supported or explicitly disabled), so there is no need to handle this differently in newer drbd either. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
In 8.4, we may have bios spanning two activity log extents. Fix up drbd_al_begin_io() and drbd_al_complete_io() to deal with zero-sized bios. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
- 08 Nov, 2012 34 commits
-
-
Committed by Lars Ellenberg
Aborting local requests (not waiting for completion from the lower level disk) is dangerous: if the master bio has been completed to upper layers, data pages may be re-used for other things already. If local IO is still pending and later completes, this may cause crashes or corrupt unrelated data. Only abort local IO if explicitly requested. Intended use case is a lower level device that turned into a tarpit, not completing io requests, not even doing error completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
We must not look at mdev->actlog, unless we have a get_ldev() reference. It also does not make much sense to try to disconnect or pull-ahead of the peer, if we don't have good local data. Only even consider congestion policies, if our local disk is D_UP_TO_DATE. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Sometimes, a lower level block device turns into a tar-pit, not completing requests at all, not even doing error completion. We can force-detach from such a tar-pit block device, either by disk-timeout, or by drbdadm detach --force. Queueing for retry only from the request destruction path (kref hit 0) makes it impossible to retry affected read requests from the peer, until the local IO completion happened, as the locally submitted bio holds a reference on the drbd request object. If we can only complete READs when the local completion finally happens, we would not need to force-detach in the first place. Instead, queue for retry where we otherwise had done the error completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. This looks cleaner to me, and also gets rid of the other ugly if-inside-case-fall-through. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. The logic for when to get or put a reference is in mod_rq_state(). To not get confused in the freeze/thaw respectively resend/restart paths, or when cleaning up requests waiting for P_BARRIER_ACK, this also introduces additional state flags: RQ_COMPLETION_SUSP, and RQ_EXP_BARR_ACK. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. completion_ref will count pending events necessary for completion. kref is for destruction. This only introduces these new members of struct drbd_request; a followup patch will make actual use of them. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
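The split between "count for completion" and "count for destruction" can be pictured with a small userspace model. This is only a sketch under assumptions: `struct model_request`, the plain `int` counters and the helper names are invented for illustration and are not the kernel's `struct drbd_request` or its `kref` API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a request carrying two independent counters. */
struct model_request {
	int completion_ref;    /* pending events that block completion of the master bio */
	int kref;              /* holders that still need the request object itself */
	bool master_completed; /* set once the upper layers have been notified */
	bool destroyed;        /* set once the object could be freed */
};

static void req_init(struct model_request *req, int pending_events, int holders)
{
	req->completion_ref = pending_events;
	req->kref = holders;
	req->master_completed = false;
	req->destroyed = false;
}

/* One pending event finished; complete to upper layers when none remain. */
static void put_completion_ref(struct model_request *req)
{
	assert(req->completion_ref > 0);
	if (--req->completion_ref == 0)
		req->master_completed = true;
}

/* One holder let go; the object may only go away when no holder remains. */
static void put_kref(struct model_request *req)
{
	assert(req->kref > 0);
	if (--req->kref == 0)
		req->destroyed = true;
}
```

In this model a request can be completed towards the upper layers while something (for example a still-pending local bio) keeps the object alive, which is the situation the force-detach commit in this log is concerned with.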
-
Committed by Lars Ellenberg
The previous commit causes __drbd_make_request() to always return 0. Change it to void. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. READs will be interesting to at most one connection, WRITEs should be interesting for all established connections. Introduce some helper functions to hopefully make this easier to follow. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. DRBD requests (struct drbd_request) are already on the per resource transfer log list, and carry their epoch number. We do not need to additionally link them on other ring lists in other structs. The drbd sender thread can itself recognize when to send a P_BARRIER, by tracking the currently processed epoch, and how many writes have been processed for that epoch. If the epoch of the request to be processed does not match the currently processed epoch, and any writes have been processed in it, a P_BARRIER for this last processed epoch is sent out first. The new epoch then becomes the currently processed epoch. To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK, the sender thread also needs to handle the case when the current epoch was closed already, but no new requests are queued yet, and send out P_BARRIER as soon as possible. This is done by comparing the per resource "current transfer log epoch" (tconn->current_tle_nr) with the per connection "currently processed epoch number" (tconn->send.current_epoch_nr), while waiting for new requests to be processed in wait_for_work(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
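The epoch tracking described above can be modelled in a few lines of userspace C. This is a sketch, not the kernel code: `struct sender_state`, the `seen_any_write_yet` flag, the `barriers_sent` counter (standing in for actually sending P_BARRIER) and all function names are assumptions for illustration; only `current_epoch_nr` and `current_tle_nr` are named in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the sender-side bookkeeping. */
struct sender_state {
	bool seen_any_write_yet;           /* nothing to close before the first write */
	unsigned int current_epoch_nr;     /* epoch the sender is currently processing */
	unsigned int current_epoch_writes; /* writes processed in that epoch */
	unsigned int barriers_sent;        /* stands in for sending P_BARRIER */
};

/* If we move on to a different epoch, close the last processed one first. */
static void maybe_send_barrier(struct sender_state *s, unsigned int epoch)
{
	assert(s);
	if (!s->seen_any_write_yet)
		return;
	if (s->current_epoch_nr != epoch) {
		if (s->current_epoch_writes)
			s->barriers_sent++; /* only non-empty epochs need a barrier */
		s->current_epoch_nr = epoch;
		s->current_epoch_writes = 0;
	}
}

/* Process one write that belongs to the given epoch. */
static void process_write(struct sender_state *s, unsigned int epoch)
{
	if (!s->seen_any_write_yet) {
		s->seen_any_write_yet = true;
		s->current_epoch_nr = epoch;
		s->current_epoch_writes = 0;
	}
	maybe_send_barrier(s, epoch);
	s->current_epoch_writes++;
}

/* Idle path: the resource-wide epoch advanced, but no request is queued. */
static void idle_check(struct sender_state *s, unsigned int current_tle_nr)
{
	maybe_send_barrier(s, current_tle_nr);
}
```

The idle path is what keeps a closed epoch from waiting indefinitely for its barrier when no further request arrives to trigger it.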
-
Committed by Lars Ellenberg
Cherry-picked and adapted from drbd 9 devel branch. In 8.4, we don't distinguish between "resource work" and "connection work" yet, we have one worker for both, as we still have only one connection. We only ever used the "data.work", no need to keep the "meta.work" around. Move tconn->data.work to tconn->sender_work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Cherry-picked from drbd 9 devel branch. In preparation of multiple connections, the "barrier number" or "epoch number" needs to be tracked per-resource, not per connection. The sequence number space will not be reset anymore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Meanwhile, this is used to restart failed READ requests as well. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
The commit "drbd: simplify retry path of failed READ requests" simplified it too much: it just did not do anything for local read errors. Add the missing req_may_be_completed_not_susp() to the READ_COMPLETED_WITH_ERROR case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith), or because we lost access to data (on-no-data-accessible suspend-io). Resuming from there (re-connect, or re-attach, or explicit admin intervention) should "just work". Unfortunately, if the re-attach/re-connect did not happen within the timeout, since the commit "drbd: Implemented real timeout checking for request processing time", if so configured, the request_timer_fn() would time out and detach/disconnect virtually immediately. This change tracks the most recent attach and connect, and does not time out within <configured timeout interval> after attach/connect. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
complete_conflicting_writes() should not cause -EIO. It should not time out either, or care for connection states. Connection timeout is detected elsewhere, and its cleanup path is supposed to remove any pending requests or peer_requests from the write_requests tree. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
If a local or remote READ request fails, just push it back to the retry workqueue. It will re-enter __drbd_make_request, and be re-assigned to a suitable local or remote path, or failed, if we do not have access to good data anymore. This obsoletes w_read_retry_remote(), and eliminates two goto...retry blocks in __req_mod(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
In preparation for multiple connections and reference counting, separate the code paths for completion of the master bio and destruction of the request object. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
w_restart_write(), run from worker context, calls __drbd_make_request() and further drbd_al_begin_io(, delegate=true), which then potentially deadlocks. The previous patch moved a BUG_ON to expose such call paths, which would now be triggered. Also, if we call __drbd_make_request() from resource worker context, like w_restart_write() did, and that should block for whatever reason (!drbd_state_is_stable(), resource suspended, ...), we potentially deadlock the whole resource, as the worker is needed for state changes and other things. Create a dedicated retry workqueue for this instead. Also make sure that inc_ap_bio()/dec_ap_bio() are properly paired, even if do_retry() needs to retry itself, in case __drbd_make_request() returns != 0. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
When we have a write request and a state change C_WF_BITMAP_S -> C_SYNC_SOURCE at the same time, and it happens that the line remote = remote && drbd_should_do_remote(s); still sees C_WF_BITMAP_S, and send_oos = rw == WRITE && drbd_should_send_oos(s); already sees C_SYNC_SOURCE, then both are 0. This causes the write to not be mirrored, but marked as out-of-sync on the Sync_Source node. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
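The race can be reproduced in a reduced userspace model. Everything here is an assumption for illustration: the two-value `enum conn_state`, the simplified predicates (the real `drbd_should_do_remote()` and `drbd_should_send_oos()` look at more state), and the snapshot helper, which shows one general way to avoid this class of race and is not necessarily what the commit does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of connection states for this sketch. */
enum conn_state { WF_BITMAP_S, SYNC_SOURCE };

/* Simplified predicates; stand-ins for the real helpers. */
static bool should_do_remote(enum conn_state s) { return s == SYNC_SOURCE; }
static bool should_send_oos(enum conn_state s)  { return s == WF_BITMAP_S; }

struct write_plan {
	bool remote;   /* mirror the write to the peer */
	bool send_oos; /* only tell the peer the block is out of sync */
};

/* Each decision is made from the state value it happened to observe. */
static struct write_plan plan_write(enum conn_state seen_by_remote,
				    enum conn_state seen_by_oos)
{
	struct write_plan p;
	p.remote = should_do_remote(seen_by_remote);
	p.send_oos = should_send_oos(seen_by_oos);
	return p;
}

/* Derive both decisions from one snapshot, so they cannot disagree. */
static struct write_plan plan_write_snapshot(enum conn_state snapshot)
{
	struct write_plan p = plan_write(snapshot, snapshot);
	assert(p.remote != p.send_oos); /* exactly one holds in this model */
	return p;
}
```

Calling `plan_write(WF_BITMAP_S, SYNC_SOURCE)` models the interleaving from the commit message: both flags come out false, so the write is neither mirrored nor announced as out-of-sync.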
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Without this, iostat frequently sees bogus svctime and >= 100% "utilization". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
...and not all volumes of the resource. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
With Linux-3.2, generic_make_request() will no longer loop over the request function until it finally returns 0. Move this loop into our drbd_make_request() function. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
ap_in_flight only counts writes. NEG_ACKED is an action on a request that might be called for reads and writes. This bug was there forever, but it becomes much more relevant with the read balancing code. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
New config option for the disk section "read-balancing", with the values: prefer-local, prefer-remote, round-robin, when-congested-remote. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
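How the four values might steer a read can be sketched as a small decision function. This is a userspace model under assumptions: the enum, `struct rb_state`, the `local_congested` input and the function name are invented here, and the behaviour attached to each value is inferred from the option names rather than taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* The four policy values listed in the commit message. */
enum read_balancing {
	PREFER_LOCAL,
	PREFER_REMOTE,
	ROUND_ROBIN,
	WHEN_CONGESTED_REMOTE
};

/* Hypothetical per-device state needed by the round-robin policy. */
struct rb_state {
	bool rr_toggle; /* flips on every round-robin decision */
};

/* Returns true if the read should be served by the peer (assumed semantics). */
static bool read_from_peer(enum read_balancing policy, struct rb_state *st,
			   bool local_congested)
{
	assert(st);
	switch (policy) {
	case PREFER_REMOTE:
		return true;
	case ROUND_ROBIN:
		st->rr_toggle = !st->rr_toggle; /* alternate local and remote */
		return st->rr_toggle;
	case WHEN_CONGESTED_REMOTE:
		return local_congested; /* go remote only if the local disk is busy */
	case PREFER_LOCAL:
	default:
		return false;
	}
}
```

Keeping the decision in one function with explicit inputs makes each policy easy to check in isolation.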
-
Committed by Philipp Reisner
That is necessary since the whole transfer log is per connection (tconn) and not per device (mdev). This bug caused list corruption on the worker list. When a barrier is queued for sending in the context of one device, another device did not see the CREATE_BARRIER bit, and queued the same object again -> list corruption. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-