- 30 10月, 2012 7 次提交
-
-
由 Lars Ellenberg 提交于
If we detach due to local read-error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or, "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block, the next resync will fetch it from the peer, and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer, there is no "auto-healing" necessary. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Lars Ellenberg 提交于
- struct drbd_conf { ... unsigned long flags; ... } + struct drbd_conf { ... unsigned long drbd_flags[N]; ... } And introduce wrapper functions for test/set/clear bit operations on this member. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Lars Ellenberg 提交于
The intention of force-detach is to be able to deal with a completely unresponsive lower level IO stack, which does not even deliver error completions anymore, but no completion at all. In all other cases, we must still wait for the meta data IO completion. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Lars Ellenberg 提交于
This has not yet been observed, but conceivably, when using GFP_KERNEL allocations from drbd_md_sync(), drbd_flush_after_epoch() or receive_SyncParam(), we could trigger additional IO to our own device, or an other device in a criss-cross setup, and end up in a local deadlock, or potentially a distributed deadlock in a criss-cross setup involving the peer blocked in a similar way waiting for us to make progress. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Philipp Reisner 提交于
Disconnecting is a cluster wide state change. In case the peer node agrees to the state transition, it sends back the fact on the meta-data connection and closes both sockets. In case the node node that initiated the state transfer sees the closing action on the data-socket, before the P_STATE_CHG_REPLY packet, it was going into one of the network failure states. At least with the fencing option set to something else thatn "dont-care", the unclean shutdown of the connection causes a short IO freeze or a fence operation. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Lars Ellenberg 提交于
We now can schedule only a specific range of sectors for online verify, or interrupt a running verify without interrupting the connection. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Philipp Reisner 提交于
There is at least the worker context, the receiver context, the context of receiving netlink packts and processes reading a sysfs attribute that access the uuids. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 16 8月, 2012 1 次提交
-
-
由 Philipp Reisner 提交于
We need to write the whole bitmap after we moved the meta data due to an online resize operation. With the support for one peta byte devices bitmap IO was optimized to only write out touched pages. This optimization must be turned off when writing the bitmap after an online resize. This issue was introduced with drbd-8.3.10. The impact of this bug is that after an online resize, the next resync could become larger than expected. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 24 7月, 2012 5 次提交
-
-
由 Lars Ellenberg 提交于
We capped our max_bio_size respectively max_hw_sectors with min_t(int, lower level limit, our limit); unfortunately, some drivers, e.g. the kvm virtio block driver, initialize their limits to "-1U", and that is of course a smaller "int" value than our limit. Impact: we started to request 16 MB resync requests, which lead to protocol error and a reconnect loop. Fix all relevant constants and parameters to be unsigned int. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If you do back to back wait-sync/invalidate on a Primary in a tight loop, during application IO load, you could trigger a race: kernel: block drbd6: FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending? Fix this by changing the order of the drbd_queue_work() and the wake_up() in dec_ap_pending(), and adding the additional drbd_flush_workqueue() before requesting the full sync. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If the drbd worker thread is synchronously waiting for some userland callback, we don't want some casual pageout to block on us. Have drbd_congested() report congestion in that case. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Aborting local requests (not waiting for completion from the lower level disk) is dangerous: if the master bio has been completed to upper layers, data pages may be re-used for other things already. If local IO is still pending and later completes, this may cause crashes or corrupt unrelated data. Only abort local IO if explicitly requested. Intended use case is a lower level device that turned into a tarpit, not completing io requests, not even doing error completion. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 09 5月, 2012 16 次提交
-
-
由 Lars Ellenberg 提交于
Don't rely on availability of bios from the global fs_bio_set, we should use our own bio_set for meta data IO. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Symptom: messages similar to "FIXME asender in bm_change_bits_to, bitmap locked for 'write from resync_finished' by worker" If a resync or verify is finished (or aborted), a full bitmap writeout is triggered. If we have ongoing local IO, the bitmap may still change during that writeout, pending and not yet processed acks may cause bits to be cleared, while new writes may cause bits to be to be set. To fix this, introduce the drbd_bm_write_copy_pages() variant. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith), or because we lost access to data (on-no-data-accessible suspend-io). Resuming from there (re-connect, or re-attach, or explicit admin intervention) should "just work". Unfortunately, if the re-attach/re-connect did not happen within the timeout, since the commit drbd: Implemented real timeout checking for request processing time if so configured, the request_timer_fn() would timeout and detach/disconnect virtually immediately. This change tracks the most recent attach and connect, and does not timeout within <configured timeout interval> after attach/connect. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Changes to the role and disk state should be delayed or rejected while we establish a connection. This is necessary, since the peer will base its resync decision on the UUIDs and the state we sent in the drbd_connect() function. The most prominent example for this race is becoming primary after sending state and UUIDs and before the state changes to C_WF_CONNECTION. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
If the backing device is already frozen during attach, we failed to recognize that. The current disk-timeout code works on top of the drbd_request objects. During attach we do not allow IO and therefore never generate a drbd_request object but block before that in drbd_make_request(). This patch adds the timeout to all drbd_md_sync_page_io(). Before this patch we used to go from D_ATTACHING directly to D_DISKLESS if IO failed during attach. We can no longer do this since we have to stay in D_FAILED until all IO ops issued to the backing device returned. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
DRBD state changes schedule after_state_ch() actions to a worker thread, which decides on the old and new states of that change, whether to send an informational state update packet (P_STATE) to the peer. If it decides to drbd_send_state(), it would however always send the _curent_ state, which, if a second state change happens before the after_state_ch() of the first ran, may "fast-forward" the peer's view about this node. In most cases that is harmless, but sometimes this can confuse DRBD, for example into not actually starting a necessary resync if you do a very tight detach/attach loop on a Connected Secondary. Fix this by always sending the "new" state of the respective state transition which scheduled this after_state_ch() work. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
The last bunch of commits prepared the 'detach from tar pit' feature. With that we can be for long time in disk state FAILED. We need to accept new IO requests during that time. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
The new function drbd_md_get_buffer() aborts waiting for the buffer in case the disk failes in the meantime. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 David Howells 提交于
Fix warnings of the following nature in the drbd header: In file included from drivers/block/drbd/drbd_bitmap.c:32: drivers/block/drbd/drbd_int.h: In function 'drbd_get_syncer_progress': drivers/block/drbd/drbd_int.h:2234: warning: comparison is always false due to limited range of data where mdev->rs_total (an unsigned long) is being compared to 1ULL << 32, which is always false on a 32-bit machine. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
-
- 13 1月, 2012 1 次提交
-
-
由 Rusty Russell 提交于
module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: NMauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 15 9月, 2011 2 次提交
-
-
由 Jesper Juhl 提交于
It was pointed out by 'make versioncheck' that some includes of linux/version.h are not needed in drivers/block/. This patch removes them. Signed-off-by: NJesper Juhl <jj@chaosbits.net> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
由 Joe Perches 提交于
Use the normal include style. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 12 9月, 2011 1 次提交
-
-
由 Christoph Hellwig 提交于
There is very little benefit in allowing to let a ->make_request instance update the bios device and sector and loop around it in __generic_make_request when we can archive the same through calling generic_make_request from the driver and letting the loop in generic_make_request handle it. Note that various drivers got the return value from ->make_request and returned non-zero values for errors. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NNeilBrown <neilb@suse.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 24 5月, 2011 6 次提交
-
-
由 Andrew Morton 提交于
In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NLars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Bart Van Assche 提交于
Found these with the help of ispell -l. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
-
由 Lars Ellenberg 提交于
An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occuring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet an other work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
The old (optimistic) implementation could shrink the bio size on an primary device. Shrinking the bio size on a primary device is bad. Since there we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so, when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that, to make the operation of single nodes more efficient. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
In case a write failes on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 23 5月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
After discovering that wide use of prefetch on modern CPUs could be a net loss instead of a win, net drivers which were relying on the implicit inclusion of prefetch.h via the list headers showed up in the resulting cleanup fallout. Give them an explicit include via the following $0.02 script. ========================================= #!/bin/bash MANUAL="" for i in `git grep -l 'prefetch(.*)' .` ; do grep -q '<linux/prefetch.h>' $i if [ $? = 0 ] ; then continue fi ( echo '?^#include <linux/?a' echo '#include <linux/prefetch.h>' echo . echo w echo q ) | ed -s $i > /dev/null 2>&1 if [ $? != 0 ]; then echo $i needs manual fixup MANUAL="$i $MANUAL" fi done echo ------------------- 8\<---------------------- echo vi $MANUAL ========================================= Signed-off-by: NPaul <paul.gortmaker@windriver.com> [ Fixed up some incorrect #include placements, and added some non-network drivers and the fib_trie.c case - Linus ] Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-