- 06 December 2012, 5 commits
-
-
Committed by Lars Ellenberg
We no longer need the connector. But we need libcrc32c.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
This was introduced when moving the code over from the 8.3 codebase with commit 328e0f12.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
drbd_set_role(, R_PRIMARY, ) does the state change to Primary, some more housekeeping, and possibly generates a new UUID set. All of this is done while holding the "state_mutex". The connection handshake involves sending various state information, including the current data generation UUID set, and two connection state changes from C_WF_CONNECTION to C_WF_REPORT_PARAMS and further to a number of different outcomes, resync being one of them. If the connection handshake happens between the state change to Primary and the generation of the new UUIDs, the resync decision based on the old UUID set may be confused, depending on circumstances. Make sure that, before we do the handshake, any promotion to the Primary role will either be complete (including the housekeeping stuff), or can see, and serialize with, the ongoing handshake, based on the "STATE_SENT" bit, which is set when we start the handshake, and cleared only when we leave C_WF_REPORT_PARAMS again.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
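Conceptually, the serialization described above follows the pattern sketched below. This is a minimal, self-contained illustration with made-up names (struct my_dev, STATE_SENT, the helper functions), not the actual drbd code:

```c
#include <linux/bitops.h>
#include <linux/mutex.h>
#include <linux/wait.h>

#define STATE_SENT 0			/* illustrative flag bit */

struct my_dev {
	struct mutex state_mutex;	/* serializes role changes */
	wait_queue_head_t state_wait;
	unsigned long flags;
};

/* Promotion to Primary: wait until no handshake state is "on the wire". */
static void promote_to_primary(struct my_dev *dev)
{
	mutex_lock(&dev->state_mutex);
	wait_event(dev->state_wait, !test_bit(STATE_SENT, &dev->flags));
	/* ... state change to Primary, housekeeping, new UUID set ... */
	mutex_unlock(&dev->state_mutex);
}

/* Handshake: block promotions from the moment state/UUIDs are sent. */
static void connection_handshake(struct my_dev *dev)
{
	set_bit(STATE_SENT, &dev->flags);
	/* ... send current state and data generation UUIDs,
	 * C_WF_CONNECTION -> C_WF_REPORT_PARAMS -> some outcome ... */
	clear_bit(STATE_SENT, &dev->flags);	/* left C_WF_REPORT_PARAMS */
	wake_up(&dev->state_wait);
}
```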
-
Committed by Lars Ellenberg
We need to propagate the configuration into the flag bits, or it won't be effective.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Smatch complained about this redundant check. The check was introduced in 2006-09-13. On 2007-07-24 the body of the function was enclosed by get_ldev()/put_ldev() reference counting. Since then the check has been useless and misleading.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
- 01 December 2012, 1 commit
-
-
Committed by Jens Axboe
Compiling drbd yields:
drivers/block/drbd/drbd_state.c: In function ‘_conn_request_state’:
drivers/block/drbd/drbd_state.c:1804:5: error: macro "wait_event_lock_irq" passed 4 arguments, but takes just 3
drivers/block/drbd/drbd_state.c:1801:3: error: ‘wait_event_lock_irq’ undeclared (first use in this function)
drivers/block/drbd/drbd_state.c:1801:3: note: each undeclared identifier is reported only once for each function it appears in
drivers/block/drbd/drbd_state.c: At top level:
drivers/block/drbd/drbd_state.c:1734:1: warning: ‘_conn_rq_cond’ defined but not used [-Wunused-function]
This is due to drbd having copied the MD definition of wait_event_lock_irq() as well. Kill the copies.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 November 2012, 1 commit
-
-
Committed by Lukas Czerner
Currently there is no limit on the number of requests in the loop bio list. This can lead to some nasty situations when the caller spawns tons of bio requests, taking a huge amount of memory. This is even more obvious with discard, where blkdev_issue_discard() will submit all bios for the range and wait for them to finish afterwards. On really big loop devices with a slow backing file system this can lead to an OOM situation, as reported by Dave Chinner. With this patch we will wait in loop_make_request() if the number of bios in the loop bio list would exceed 'nr_congestion_on'. We wake up the process as we process the bios from the list. Some threshold hysteresis is in place to avoid high-frequency oscillation.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
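As a rough illustration of the throttling with hysteresis; field names such as bio_count and the two watermarks are assumptions, not the exact loop driver code:

```c
#include <linux/bio.h>
#include <linux/spinlock.h>
#include <linux/wait.h>

struct my_loop {
	spinlock_t lock;
	wait_queue_head_t req_wait;
	unsigned int bio_count;
	unsigned int nr_congestion_on;	/* block submitters above this */
	unsigned int nr_congestion_off;	/* wake them below this (hysteresis) */
};

/* Submission side: throttle the caller when the backlog grows too large. */
static void my_loop_add_bio(struct my_loop *lo, struct bio *bio)
{
	wait_event(lo->req_wait, lo->bio_count < lo->nr_congestion_on);
	spin_lock_irq(&lo->lock);
	lo->bio_count++;
	/* ... queue the bio on the loop bio list ... */
	spin_unlock_irq(&lo->lock);
}

/* Worker side: wake waiters only once we are well below the on-threshold. */
static void my_loop_bio_done(struct my_loop *lo)
{
	spin_lock_irq(&lo->lock);
	lo->bio_count--;
	spin_unlock_irq(&lo->lock);
	if (lo->bio_count < lo->nr_congestion_off)
		wake_up(&lo->req_wait);
}
```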
-
- 27 November 2012, 2 commits
-
-
Committed by Roger Pau Monne
Free the page allocated for the persistent grant.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Committed by Roger Pau Monne
Move the code that frees persistent grants from the red-black tree into a function. This will make it easier for other consumers to move this to a common place.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
- 23 November 2012, 10 commits
-
-
Committed by Selvan Mani
Hi Jens, another tiny patch. Removed __packed before the struct smart_attr and added __packed at the end of the structure to fix a padding issue.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
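For illustration only, with placeholder fields (the real smart_attr layout lives in the mtip32xx driver), the attribute placement difference looks roughly like this:

```c
#include <linux/compiler.h>
#include <linux/types.h>

/* Attribute in front of a bare struct definition is ignored by the compiler
 * (with a warning), so the struct stays padded: */
__packed struct smart_attr_unpacked {
	u8  attr_id;
	u16 flags;
	u8  cur;
};

/* Attribute at the end of the definition actually removes the padding: */
struct smart_attr_packed {
	u8  attr_id;
	u16 flags;
	u8  cur;
} __packed;
```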
-
Committed by Ed Cashin
Calling the request handler directly on a plugged queue defeats the performance improvements provided by the plugging mechanism. Use the __blk_run_queue function instead of calling the request handler directly, so that we don't interfere with the block layer's ability to plug the queue.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
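A sketch of the pattern under the pre-multiqueue block layer assumed by this commit; __blk_run_queue() must be called with the queue lock held, and the helper name is hypothetical:

```c
#include <linux/blkdev.h>

/* Hypothetical helper: kick the queue without bypassing plugging. */
static void kick_queue(struct request_queue *q)
{
	unsigned long flags;

	spin_lock_irqsave(q->queue_lock, flags);
	__blk_run_queue(q);	/* instead of calling q->request_fn(q) directly */
	spin_unlock_irqrestore(q->queue_lock, flags);
}
```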
-
Committed by Wei Yongjun
The dereference of port should be moved below the NULL test. The dpatch engine was used to auto-generate this patch. (https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
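The fixed ordering, reduced to a minimal sketch (the struct and function here are placeholders around the real 'port' pointer):

```c
struct driver_data;
struct my_port { struct driver_data *dd; };

static void handle_port(struct my_port *port)
{
	struct driver_data *dd;

	if (!port)		/* NULL test first ... */
		return;
	dd = port->dd;		/* ... dereference only afterwards */
	(void)dd;		/* placeholder for the real work */
}
```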
-
Committed by Jens Axboe
If we're building a 32-bit kernel and CONFIG_LBDAF isn't set, sector_t is 32 bits wide. The shifts by 32 and 40 are thus larger than the width of the type. Cast the sector offset to a u64 to avoid these warnings.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
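A minimal illustration of the fix; the function is hypothetical, the point is widening before shifting:

```c
#include <linux/types.h>

/* A shift by 32 or 40 on a 32-bit sector_t is undefined; widen to u64 first. */
static u8 lba_byte(sector_t sector, unsigned int shift)
{
	return ((u64)sector >> shift) & 0xff;
}
```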
-
Committed by Selvan Mani
A previous commit used the value 3 for the erase-mode mask. Change the mask to the correct value of 2.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Selvan Mani
Earlier the LBA address was assigned directly to lba_low and lba_low_ex, which would result in a different number (bytes reversed) on big-endian systems. Now the LBA address is assigned byte-by-byte to the FIS.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
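Roughly what the byte-by-byte assignment looks like; the struct below is illustrative, not the driver's actual host-to-device FIS layout:

```c
#include <linux/types.h>

struct fis_lba {
	u8 lba_low, lba_mid, lba_hi;
	u8 lba_low_ex, lba_mid_ex, lba_hi_ex;
};

/* The same byte values end up in the FIS on little- and big-endian hosts. */
static void set_fis_lba(struct fis_lba *fis, u64 lba)
{
	fis->lba_low    = lba & 0xff;
	fis->lba_mid    = (lba >> 8) & 0xff;
	fis->lba_hi     = (lba >> 16) & 0xff;
	fis->lba_low_ex = (lba >> 24) & 0xff;
	fis->lba_mid_ex = (lba >> 32) & 0xff;
	fis->lba_hi_ex  = (lba >> 40) & 0xff;
}
```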
-
Committed by Selvan Mani
The mtip driver lifted this code from elsewhere and then added special handling for SEC_ERASE_UNIT. If the caller tries to do a security erase but passes no output data for the command, then outbuf is not allocated and the driver duly explodes.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jiri Kosina
We need to destroy the floppy_wq workqueue first, before cleaning up the queue. Otherwise we might race with work still pending in the workqueue while the block queue is already gone. This might lead to various oopses, such as:
CPU 0
Pid: 6, comm: kworker/u:0 Not tainted 3.7.0-rc4 #1 Bochs Bochs
RIP: 0010:[<ffffffff8134eef5>] [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0
RSP: 0000:ffff88000dc7dd88 EFLAGS: 00010092
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88000f602688 RSI: ffffffff81fd95d8 RDI: 6b6b6b6b6b6b6b6b
RBP: ffff88000dc7dd98 R08: ffffffff81fd95c8 R09: 0000000000000000
R10: ffffffff81fd9480 R11: 0000000000000001 R12: 6b6b6b6b6b6b6b6b
R13: ffff88000dc7dfd8 R14: ffff88000dc7dfd8 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001e11000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:0 (pid: 6, threadinfo ffff88000dc7c000, task ffff88000dc5ecc0)
Stack:
 0000000000000000 0000000000000000 ffff88000dc7ddb8 ffffffff8134efee
 ffff88000dc7ddb8 0000000000000000 ffff88000dc7dde8 ffffffff814aef3c
 ffffffff81e75d80 ffff88000dc0c640 ffff88000fbfb000 ffffffff814aed90
Call Trace:
 [<ffffffff8134efee>] blk_fetch_request+0xe/0x30
 [<ffffffff814aef3c>] redo_fd_request+0x1ac/0x400
 [<ffffffff814aed90>] ? start_motor+0x130/0x130
 [<ffffffff8106b526>] process_one_work+0x136/0x450
 [<ffffffff8106af65>] ? manage_workers+0x205/0x2e0
 [<ffffffff8106bb6d>] worker_thread+0x14d/0x420
 [<ffffffff8106ba20>] ? rescuer_thread+0x1a0/0x1a0
 [<ffffffff8107075a>] kthread+0xba/0xc0
 [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80
 [<ffffffff818b553a>] ret_from_fork+0x7a/0xb0
 [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80
Code: 0f 84 c0 00 00 00 83 f8 01 0f 85 e2 00 00 00 81 4b 40 00 00 80 00 48 89 df e8 58 f8 ff ff be fb ff ff ff fe ff ff <49> 8b 1c 24 49 39 dc 0f 85 2e ff ff ff 41 0f b6 84 24 28 04 00
RIP [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0
RSP <ffff88000dc7dd88>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
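The teardown ordering the patch enforces, as a sketch (the floppy driver's actual cleanup path does more than this):

```c
#include <linux/blkdev.h>
#include <linux/workqueue.h>

static void floppy_teardown(struct workqueue_struct *floppy_wq,
			    struct request_queue *queue)
{
	/* Flush and destroy the workqueue first, so no pending work item
	 * can still dereference the request queue ... */
	destroy_workqueue(floppy_wq);
	/* ... and only then tear down the block queue itself. */
	blk_cleanup_queue(queue);
}
```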
-
Committed by Akinobu Mita
Use check_signature() to find a signature in the mmio address range.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
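Usage is roughly as follows; the offset and signature string are made-up values for illustration, not the controller's real layout:

```c
#include <linux/io.h>
#include <linux/types.h>

static bool board_signature_ok(void __iomem *mmio)
{
	/* Replaces an open-coded readb() comparison loop. */
	return check_signature(mmio + 0x20, "CISS", 4);
}
```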
-
Committed by Akinobu Mita
- Remove unnecessary correction of bit and address
- Use the BITS_TO_LONGS macro to calculate the bitmap size
- Use bitmap_zero()
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
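As a sketch of the last two points, using an illustrative allocation helper rather than the cciss code itself:

```c
#include <linux/bitmap.h>
#include <linux/slab.h>

static unsigned long *alloc_cmd_bitmap(unsigned int nr_cmds)
{
	unsigned long *map;

	/* BITS_TO_LONGS() sizes the array; bitmap_zero() clears all nr_cmds bits. */
	map = kmalloc(BITS_TO_LONGS(nr_cmds) * sizeof(unsigned long), GFP_KERNEL);
	if (map)
		bitmap_zero(map, nr_cmds);
	return map;
}
```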
-
- 09 November 2012, 21 commits
-
-
Committed by Akinobu Mita
Use copy_highpage() to copy from one page to another.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
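For reference, the helper replaces an open-coded map/memcpy/unmap sequence; a minimal sketch:

```c
#include <linux/highmem.h>

static void copy_one_page(struct page *dst, struct page *src)
{
	/* Roughly equivalent to kmapping both pages, memcpy(PAGE_SIZE),
	 * and kunmapping again, but it handles highmem mappings for us. */
	copy_highpage(dst, src);
}
```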
-
Committed by Lars Ellenberg
The 8.3.12 commit "drbd: Bugfix for the connection behavior" fixes a "wasted established connection" if a former connection attempt failed during its early stages. However, it opened a window for a regression if a connection attempt fails during its last stages. The result was a terminated receiver thread that left behind the supposedly transient "C_UNCONNECTED" state. Any later requests to change the connection state fail, as they wait for the connection state to "stabilize". Fix: short circuit and keep retrying to re-establish a new connection if we don't reach C_WF_REPORT_PARAMS.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Jing Wang
Signed-off-by: Jing Wang <windsdaemon@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
If the disk has failed already, there is no point trying to change the bitmap. drbd_set_out_of_sync() already had this safeguard; time to add it to drbd_set_in_sync() as well. This also prevents some warning messages, like "FIXME asender in bm_change_bits_to, bitmap locked for 'detach' by worker", if our disk fails during resync while there are some resync acks queued up.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
The recent commit "drbd: always write bitmap on detach" introduced a bitmap writeout during detach, which obviously needs some meta data device to write to. Unfortunately, that same error path may be taken if we fail to attach, e.g. due to a UUID mismatch, after we changed state to D_ATTACHING but before the lower level device pointer is even assigned. We need to test for the presence of mdev->ldev.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
If we detach due to a local read error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block; the next resync will fetch it from the peer and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer; there is no "auto-healing" necessary.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
The intention of force-detach is to be able to deal with a completely unresponsive lower level IO stack, one which no longer delivers any completions at all, not even error completions. In all other cases, we must still wait for the meta data IO completion.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
This has not yet been observed, but conceivably, when using GFP_KERNEL allocations from drbd_md_sync(), drbd_flush_after_epoch() or receive_SyncParam(), we could trigger additional IO to our own device, or to another device in a criss-cross setup, and end up in a local deadlock, or potentially a distributed deadlock in a criss-cross setup involving the peer blocked in a similar way, waiting for us to make progress.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
The former comment arguing that GFP_KERNEL was good enough was wrong: it did not take resize into account at all, and assumed the only path leading here was the normal attach on a still-secondary device, so no deadlock would be possible. Both resize on a Primary, and attach on a diskless Primary, could potentially deadlock. drbd_bm_resize() is called while IO to the respective device is suspended, so we must use GFP_NOIO to avoid a potential deadlock.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
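The distinction, shown as a minimal sketch (the helper and allocation size are illustrative, not drbd's bitmap code):

```c
#include <linux/slab.h>

/*
 * GFP_KERNEL may recurse into reclaim and issue I/O to the very device
 * whose I/O is currently suspended here; GFP_NOIO forbids that recursion.
 */
static void *alloc_bitmap_storage(size_t bytes)
{
	return kzalloc(bytes, GFP_NOIO);
}
```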
-
Committed by Lars Ellenberg
Use list_move_tail() instead of list_del() + list_add_tail(). spatch with a semantic match was used to find this problem. (http://coccinelle.lip6.fr/)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
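The transformation boils down to this trivial sketch (the helper is illustrative):

```c
#include <linux/list.h>

static void requeue_at_tail(struct list_head *entry, struct list_head *queue)
{
	/* Before: list_del(entry); list_add_tail(entry, queue);
	 * After: one call that does exactly that. */
	list_move_tail(entry, queue);
}
```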
-
Committed by Philipp Reisner
"Aborting" requests, or force-detaching the disk, is intended for completely blocked/hung local backing devices which no longer complete requests at all, not even with error completions. In this situation, usually a hard-reset and failover is the only way out. By "aborting", basically faking a local error completion, we allow for a more graceful switchover by cleanly migrating services. Still the affected node has to be rebooted "soon". By completing these requests, we allow the upper layers to re-use the associated data pages. If later the local backing device "recovers", and now DMAs some data from disk into the original request pages, in the best case it will just put random data into unused pages; but typically it will corrupt meanwhile completely unrelated data, causing all sorts of damage. Which means delayed successful completion, especially for READ requests, is a reason to panic(). We assume that a delayed *error* completion is OK, though we still will complain noisily about it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
is_valid_transition() might return SS_NOTHING_TO_DO. The condition function _req_st_cond() returned SS_NOTHING_TO_DO, which caused the wait_event to abort too early. Therefore drbd_req_state() did not consume the next CL_ST_CHG_SUCCESS or SS_CW_FAILED_BY_PEER, causing severe disruption of the state machine logic... Detaching from a single volume was one way to trigger this bug.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Lars Ellenberg
We use the RQ_POSTPONED flag to mark a request for several reasons. It may be a conflicting request in a dual-primary setup, where conflict detection and resolution on the peer decided that this request needs to be re-submitted; it needs to re-enter drbd_make_request() to fix the data divergence caused by these conflicting, partially overlapping, quasi-simultaneous requests. In this case we need to mark the corresponding area as out-of-sync before we call drbd_al_complete_io(). We also use the RQ_POSTPONED flag to just "push back" a request, before even processing it, if IO is suspended for some reason. In this case, as this request was neither submitted nor sent yet, we must not touch the bitmap.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
A postponed request might have RQ_IN_ACT_LOG already set, but be POSTPONED before it gets anything in RQ_LOCAL_MASK set. Up to now this caused a left-over active extent. Fix that by only testing for the RQ_IN_ACT_LOG bit in drbd_req_destroy().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Without this, the meta-data only gets updated after 5 seconds by the md_sync_timer. Better to do it immediately after a state change. If the asender detects a network failure, it may take a bit until the worker processes the corresponding after-conn-state-change work item. The worker might be blocked in sending something, i.e. it takes until it runs into its timeout. That is 6 seconds by default, which is longer than the 5 seconds of the md_sync_timer.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
* Postponed requests should not set or clear out-of-sync marks.
* When a request gets postponed we need to drop its reference on mdev->local_cnt (put_ldev()).
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-
Committed by Philipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
-