- 10 3月, 2011 8 次提交
-
-
由 Lars Ellenberg 提交于
commit e2041475e6ddb081734d161f6421977323f5a9b9 drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmap Contained a bad chunk that tried to optimize away drbd barriers during bitmap exchange, but accidentally dropped them for normal mode as well. Impact: depending on activity log size and access pattern, activity log extents may not be recycled in time, causeing IO to block indefinetely. Fix: skip drbd barriers only if there is no connection to send them on, or the request being completed has not been on the network at all. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
* C_STARTING_SYNC_S, C_STARTING_SYNC_T In these states the bitmap gets written to disk. Locking out of app-IO is done by using the drbd_queue_bitmap_io() and drbd_bitmap_io() functions these days. It is no longer necessary to lock out app-IO based on the connection state. App-IO that may come in after the BITMAP_IO flag got cleared before the state transition to C_SYNC_(SOURCE|TARGET) does not get mirrored, sets a bit in the local bitmap, that is already set, therefore changes nothing. * C_WF_BITMAP_S In this state we send updates (P_OUT_OF_SYNC packets). With that we make sure they have the same number of bits when going into the C_SYNC_(SOURCE|TARGET) connection state. * C_UNCONNECTED: The receiver starts, no need to lock out IO. * C_DISCONNECTING: in drbd_disconnect() we had a wait_event() to wait until ap_bio_cnt reaches 0. Removed that. * C_TIMEOUT, C_BROKEN_PIPE, C_NETWORK_FAILURE C_PROTOCOL_ERROR, C_TEAR_DOWN: Same as C_DISCONNECTING * C_WF_REPORT_PARAMS: IO still possible since that is still like C_WF_CONNECTION. And we do not need to send barriers in C_WF_BITMAP_S connection state. Allow concurrent accesses to the bitmap when receiving the bitmap. Everything gets ORed anyways. A drbd_free_tl_hash() is in after_state_chg_work(). At that point all the work items of the last connections must have been processed. Introduced a call to drbd_free_tl_hash() into drbd_free_mdev() for paranoia reasons. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Since inc_ap_bio() might sleep already Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
To ease tracking of bios in some hash tables, we want it to not cross certain boundaries (128k, used to be 32k). We limit the maximum bio size using queue parameters. Historically some defines and variables we use there have been named max_segment_size, which was misguided. Rename them to max_bio_size, and use [blk_]queue_max_hw_sectors where appropriate. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Jens Axboe 提交于
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 23 10月, 2010 1 次提交
-
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 22 10月, 2010 3 次提交
-
-
由 Philipp Reisner 提交于
That assertion's condition needed adjustment for today's semantics Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If we don't rate limit it, and you happen to log err level messages via serial console, an IO error on a disconnected Primary may cause serious unresponsiveness. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
If we get an IO-error during an activity log transaction, if we failed to write the bitmap of the evicted extent, we must not write the transaction itself. If we failed to write the transaction, we must not even submit the corresponding bio, as its extent is not yet marked in the activity log. Otherwise, if this was a disconneted Primary (degraded cluster), which now lost its disk as well, and we later re-attach the same backend storage, we possibly "forget" to resync some parts of the disk that potentially have been changed. On the receiving side, when receiving from a peer with unhealthy disk, checking for pdsk == D_DISKLESS is not enough, we need to set out of sync and do AL transactions for everything pdsk < D_INCONSISTENT on the receiving side. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 15 10月, 2010 3 次提交
-
-
由 Philipp Reisner 提交于
There are three ways to get IO suspended: * Loss of any access to data * Fence-peer-handler running * User requested to suspend IO Track those in different bits, so that one condition clearing its state bit does not interfere with the other two conditions. Only when the user resumes IO he overrules all three bits. The fact is hidden from the user, he sees only a single suspend bit. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
When the complete device is marked as out of sync, we can disable updates of the on disk AL. Currently AL updates are only disabled if one uses the "invalidate-remote" command on an unconnected, primary device, or when at attach time all bits in the bitmap are set. As of now, AL updated do not get disabled when a all bits becomes set due to application writes to an unconnected DRBD device. While this is a missing feature, it is not considered important, and might get added later. BTW, after initializing a "one legged" DRBD device drbdadm create-md resX drbdadm -- --force primary resX AL updates also get disabled, until the first connect. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
The commit 288f422e drbd: Track all IO requests on the TL, not writes only moved a list_add_tail(req, ) into a region where req may have just been freed due to conflict detection. Fix this by adding a proper cleanup section for that code path. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 14 10月, 2010 9 次提交
-
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
When no data is accessible (no connection to the peer, nor a local disk) allow the user to select to freeze all IO operations instead of getting IO errors. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
If IO was frozen for a temporal network outage, resend the content of the transfer-log into the newly established connection. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
With that the drbd_fail_pending_reads() function becomes obsolete. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 08 8月, 2010 1 次提交
-
-
由 Christoph Hellwig 提交于
Remove the current bio flags and reuse the request flags for the bio, too. This allows to more easily trace the type of I/O from the filesystem down to the block driver. There were two flags in the bio that were missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've renamed two request flags that had a superflous RW in them. Note that the flags are in bio.h despite having the REQ_ name - as blkdev.h includes bio.h that is the only way to go for now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 01 6月, 2010 3 次提交
-
-
由 Philipp Reisner 提交于
The "Local READ/WRITE failed" messages are too verbose. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Lars Ellenberg 提交于
"canceled" w_read_retry_remote never completed, if they have been canceled after drbd_disconnect connection teardown cleanup has already run (or we are currently not connected anyways). Fixed by not queueing a remote retry if we already know it won't work (pdsk not uptodate), and cleanup ourselves on "cancel", in case we hit a race with drbd_disconnect. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NJens Axboe <jaxboe@fusionio.com>
-
- 18 5月, 2010 3 次提交
-
-
由 Philipp Reisner 提交于
If we detect late (= after grabing mdev->req_lock) that IO got frozen, we return 1 to generic_make_request(), which simply will retry to make a request for that bio. In the subsequent call of generic_make_request() into drbd_make_request_26() we sleep in inc_ap_bio(). Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
Now that the peer may handle multi-bio EEs, we can ignore the peer's limit, and concentrate on the limits of the local IO stack. This is safe accross drbd protocol versions, as our queue_max_sectors() will be adjusted accordingly. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
由 Lars Ellenberg 提交于
The condition does not fit the commend (I may well be Primary, even if I lost the disk earlier and now the connection). And this is catched below anyways, where it also gets logged. Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 04 12月, 2009 1 次提交
-
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 04 11月, 2009 1 次提交
-
-
由 Lars Ellenberg 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-
- 28 10月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 05 10月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 02 10月, 2009 3 次提交
-
-
由 Jens Axboe 提交于
They should be reimplemented in the current scheme. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Lars Ellenberg 提交于
It is force-included on the gcc command line since at least 2.6.15. Explicit include lines seem to break compilation now in certain configurations. Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NKamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: NSam Ravnborg <sam@ravnborg.org>
-
由 Philipp Reisner 提交于
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com> Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
-