Commits · bf709c8552bcbbbc66ecc11555a781e814a037d8 · openanolis / cloud-kernel

08 Nov, 2012 40 commits

drbd: cleanup, remove two unused global flags · bf709c85

Committed by Lars Ellenberg on Jul 30, 2012

The two unused "global flags" in 8.3 are "per volume" flags in 8.4.
Still, they are unused, so lose them.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

bf709c85

drbd: fix null pointer dereference with on-congestion policy when diskless · 3b9ef85e

Committed by Lars Ellenberg on Jul 30, 2012

We must not look at mdev->actlog, unless we have a get_ldev() reference.
It also does not make much sense to try to disconnect or pull-ahead of
the peer, if we don't have good local data.

Only even consider congestion policies, if our local disk is D_UP_TO_DATE.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

3b9ef85e

drbd: take error path in drbd_adm_down if interrupted by signal · 27012382

Committed by Lars Ellenberg on Jul 24, 2012

drbd_adm_down() does adm_detach(), which can fail with various error
codes, or be interrupted by a signal.

The interrupted by signal case was not properly handled,
leading to
	block drbd0: ASSERT( mdev->state.disk == D_DISKLESS &&
	                     mdev->state.conn == C_STANDALONE ) in drbd/drbd_worker.c
and further to destroying objects while still in use, and resulting crashes.

Detect the interruption, and take the error path out.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

27012382

drbd: allow read requests to be retried after force-detach · 9a278a79

Committed by Lars Ellenberg on Jul 24, 2012

Sometimes, a lower level block device turns into a tar-pit,
not completing requests at all, not even doing error completion.

We can force-detach from such a tar-pit block device,
either by disk-timeout, or by drbdadm detach --force.

Queueing for retry only from the request destruction path (kref hit 0)
makes it impossible to retry affected read requests from the peer,
until the local IO completion happened, as the locally submitted
bio holds a reference on the drbd request object.

If we can only complete READs when the local completion finally
happens, we would not need to force-detach in the first place.

Instead, queue for retry where we otherwise had done the error completion.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9a278a79

drbd: __req_mod: make DISCARD_WRITE and independend case · 934722a2

Committed by Lars Ellenberg on Jul 24, 2012

cherry-picked and adapted from drbd 9 devel branch

This looks cleaner to me,
and also gets rid of the other ugly if-inside-case-fall-through.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

934722a2

drbd: base completion and destruction of requests on ref counts · a0d856df

Committed by Lars Ellenberg on Jan 24, 2012

cherry-picked and adapted from drbd 9 devel branch

The logic for when to get or put a reference is in mod_rq_state().

To not get confused in the freeze/thaw respectively resend/restart
paths, or when cleaning up requests waiting for P_BARRIER_ACK, this
also introduces additional state flags:
RQ_COMPLETION_SUSP, and RQ_EXP_BARR_ACK.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

a0d856df

drbd: introduce completion_ref and kref to struct drbd_request · b406777e

Committed by Lars Ellenberg on Jan 24, 2012

cherry-picked and adapted from drbd 9 devel branch

completion_ref will count pending events necessary for completion.
kref is for destruction.

This only introduces these new members of struct drbd_request,
a followup patch will make actual use of them.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b406777e
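
The split between a counter for completion and a counter for destruction can be illustrated with a small userspace model. This is only a sketch under stated assumptions: plain integers stand in for the kernel's atomic counter and struct kref, and every `toy_` name is hypothetical rather than taken from the DRBD source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the real struct drbd_request. */
struct toy_request {
    int completion_ref; /* pending events that must finish before completion */
    int kref;           /* references that keep the object alive */
    bool completed;     /* the upper layer has been notified */
    bool destroyed;     /* the object would have been freed */
};

static void toy_init(struct toy_request *req, int pending_events)
{
    req->completion_ref = pending_events;
    req->kref = 1; /* the initial reference */
    req->completed = false;
    req->destroyed = false;
}

/* One pending event finished; complete once none remain. */
static void toy_put_completion(struct toy_request *req)
{
    assert(req->completion_ref > 0);
    if (--req->completion_ref == 0)
        req->completed = true;
}

/* Take an extra reference, e.g. on behalf of a submitted bio. */
static void toy_get(struct toy_request *req)
{
    assert(req->kref > 0);
    req->kref++;
}

/* Drop a reference; destroy when the last one is gone. */
static void toy_put(struct toy_request *req)
{
    assert(req->kref > 0);
    if (--req->kref == 0)
        req->destroyed = true;
}
```

In this model a request can be completed while it is still referenced, which is the property the later commits in this series rely on.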

drbd: __drbd_make_request() is now void · 5df69ece

Committed by Lars Ellenberg on Jan 24, 2012

The previous commit causes __drbd_make_request() to always return 0.
Change it to void.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

5df69ece

drbd: better separate WRITE and READ code paths in drbd_make_request · 5da9c836

Committed by Lars Ellenberg on Mar 29, 2012

cherry-picked and adapted from drbd 9 devel branch

READs will be interesting to at most one connection,
WRITEs should be interesting for all established connections.

Introduce some helper functions to hopefully make this easier to follow.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

5da9c836

drbd: remove struct drbd_tl_epoch objects (barrier works) · b6dd1a89

Committed by Lars Ellenberg on Nov 28, 2011

cherry-picked and adapted from drbd 9 devel branch

DRBD requests (struct drbd_request) are already on the per resource
transfer log list, and carry their epoch number. We do not need to
additionally link them on other ring lists in other structs.

The drbd sender thread can itself recognize when to send a P_BARRIER,
by tracking the currently processed epoch, and how many writes
have been processed for that epoch.

If the epoch of the request to be processed does not match the currently
processed epoch, and any writes have been processed in it, a P_BARRIER for
this last processed epoch is sent out first.
The new epoch then becomes the currently processed epoch.

To not get stuck in drbd_al_begin_io() waiting for P_BARRIER_ACK,
the sender thread also needs to handle the case when the current
epoch was closed already, but no new requests are queued yet,
and send out P_BARRIER as soon as possible.

This is done by comparing the per resource "current transfer log epoch"
(tconn->current_tle_nr) with the per connection "currently processed
epoch number" (tconn->send.current_epoch_nr), while waiting for
new requests to be processed in wait_for_work().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b6dd1a89
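
The sender-side rule described in this commit message (send a barrier for the previous epoch when a write from a different epoch shows up, but only if that previous epoch saw any writes) can be sketched as a small userspace model. The struct and function names are hypothetical, a counter stands in for actually sending P_BARRIER, and the idle case handled in wait_for_work() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-connection sender state. */
struct toy_sender {
    unsigned int current_epoch_nr;     /* epoch currently being processed */
    unsigned int current_epoch_writes; /* writes processed in that epoch */
    unsigned int barriers_sent;        /* stands in for sending P_BARRIER */
    unsigned int last_barrier_nr;      /* epoch number of the last barrier */
};

/*
 * Process one write that belongs to `epoch`.
 * Returns true if a barrier for the previous epoch was sent first.
 */
static bool toy_send_write(struct toy_sender *s, unsigned int epoch)
{
    bool barrier_sent = false;

    assert(s != NULL);
    if (epoch != s->current_epoch_nr) {
        if (s->current_epoch_writes > 0) {
            /* Close the epoch we were working on. */
            s->last_barrier_nr = s->current_epoch_nr;
            s->barriers_sent++;
            barrier_sent = true;
        }
        /* The new epoch becomes the currently processed epoch. */
        s->current_epoch_nr = epoch;
        s->current_epoch_writes = 0;
    }
    s->current_epoch_writes++;
    return barrier_sent;
}
```

Feeding this model two writes of epoch 0 followed by one write of epoch 1 sends exactly one barrier, numbered 0, before the third write.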

drbd: move the drbd_work_queue from drbd_socket to drbd_connection · d5b27b01

Committed by Lars Ellenberg on Nov 14, 2011

cherry-picked and adapted from drbd 9 devel branch
In 8.4, we don't distinguish between "resource work" and "connection
work" yet, we have one worker for both, as we still have only one connection.

We only ever used the "data.work",
no need to keep the "meta.work" around.

Move tconn->data.work to tconn->sender_work.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

d5b27b01

drbd: allow to dequeue batches of work at a time · 8c0785a5

Committed by Lars Ellenberg on Oct 19, 2011

cherry-picked and adapted from drbd 9 devel branch

In 8.4, we still use drbd_queue_work_front(),
so in normal operation, we can not dequeue batches,
but only single items.

Still, followup commits will wake the worker
without explicitly queueing a work item,
so up() is replaced by a simple wake_up().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

8c0785a5

drbd: transfer log epoch numbers are now per resource · b379c41e

Committed by Lars Ellenberg on Nov 17, 2011

cherry-picked from drbd 9 devel branch.

In preparation of multiple connections, the "barrier number" or
"epoch number" needs to be tracked per-resource, not per connection.
The sequence number space will not be reset anymore.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b379c41e

drbd: rename drbd_restart_write to drbd_restart_request · 9d05e7c4

Committed by Lars Ellenberg on Jul 17, 2012

Meanwhile, this is used to restart failed READ requests as well.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9d05e7c4

drbd: fix wrong assert in completion/retry path of failed local reads · 629663c9

Committed by Lars Ellenberg on Jun 08, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

629663c9

drbd: fix local read error hung forever · ab53b90e

Committed by Lars Ellenberg on Jun 08, 2012

The commit
    drbd: simplify retry path of failed READ requests
simplified it too much:
it just did not do anything for local read errors.

Add the missing req_may_be_completed_not_susp() to the
READ_COMPLETED_WITH_ERROR case.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

ab53b90e

drbd: fix access of unallocated pages and kernel panic · 1b6f1974

Committed by Lars Ellenberg on Jun 08, 2012

BUG: unable to handle kernel NULL pointer dereference at (null)
...
 [<d1e17561>] ? _drbd_bm_set_bits+0x151/0x240 [drbd]
 [<d1e236f8>] ? receive_bitmap+0x4f8/0xbc0 [drbd]

This fixes an off-by-one error in the receive_bitmap() path,
if run-length encoded bitmap transfer is enabled.

If the bitmap is an exact multiple of PAGE_SIZE, which means the visible
capacity of the drbd device is an exact multiple of 128 MiB (for 4k page
size), and bitmap compression (use-rle) is enabled (which became default
with 8.4), and the very last bit is dirty and reported in an rle
compressed bitmap packet, we ended up trying to kmap_atomic a page pointer
that does not exist (bitmap->bm_pages[last index + 1]).

bug introduced by:
    Date:   Fri Jul 24 15:33:24 2009 +0200
    set bits: optimize for complete last word, fix off-by-one-word corner case

made effective by:
    Date:   Thu Dec 16 00:32:38 2010 +0100
    drbd: get rid of unused debug code

    Long time ago, we had paranoia code in the bitmap that allocated one
    extra word, assigned a magic value, and checked on every occasion that
    the magic value was still unchanged.

    That debug code is unused, the extra long word complicates code a bit.
    Get rid of it.

No-one triggered this bug in the last few years, because a large subset
of our userbase is unaffected:
 * typically the last few blocks of a device are not modified
   frequently, and remain unset
 * use-rle was disabled by default in drbd < 8.4
 * those with slightly "odd" device sizes, or
 * drbd internal meta data (which will skew the device size slightly,
   thus makes it harder to have a bug relevant device size)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1b6f1974
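
The "exact multiple of 128 MiB" condition in this commit message follows from simple arithmetic, sketched below. The sketch assumes a 4 KiB page and one bitmap bit per 4 KiB of device capacity (which is what the 128 MiB figure implies); the `toy_` helpers are hypothetical and do not reproduce the actual code path in _drbd_bm_set_bits().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE     4096ULL             /* bytes per bitmap page */
#define TOY_BITS_PER_PAGE (TOY_PAGE_SIZE * 8) /* 32768 bits per page */
#define TOY_BYTES_PER_BIT 4096ULL             /* device bytes per bit (assumed) */

/* Number of bitmap bits needed for a device of the given size. */
static uint64_t toy_bits_for(uint64_t device_bytes)
{
    assert(device_bytes > 0);
    return (device_bytes + TOY_BYTES_PER_BIT - 1) / TOY_BYTES_PER_BIT;
}

/* Number of pages allocated to hold that many bits. */
static uint64_t toy_pages_for(uint64_t bits)
{
    return (bits + TOY_BITS_PER_PAGE - 1) / TOY_BITS_PER_PAGE;
}

/* Index of the page that holds a given bit number. */
static uint64_t toy_page_of_bit(uint64_t bit)
{
    return bit / TOY_BITS_PER_PAGE;
}
```

For a 128 MiB device the bitmap is 32768 bits in exactly one page, so an index one past the last bit maps to page 1, which was never allocated. For a 100 MiB device the same off-by-one index still lands in page 0, which is consistent with the note that "odd" device sizes hide the bug.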

drbd: Keep the listening socket open while trying to connect to the peer · 7a426fd8

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7a426fd8

drbd: pull prepare_listen_socket() out of drbd_wait_for_connect() · 1f3e509b

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1f3e509b

drbd: New disk option al-updates · 9a51ab1c

Committed by Philipp Reisner on Feb 20, 2012

By disabling al-updates one might increase performance. The price for
that is that in case a crashed primary (that had al-updates disabled)
is reintegrated, it will receive a full-resync instead of a bitmap
based resync.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9a51ab1c

drbd: Stop using NLA_PUT*(). · 26ec9287

Committed by Andreas Gruenbacher on Jul 11, 2012

These macros no longer exist in kernel version v3.5-rc1.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

26ec9287

drbd: Remove drbd_accept() and use kernel_accept() instead · 7e0f096b

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

7e0f096b

drbd: Move the call to listen() out of drbd_accept() · 2820fd39

Committed by Philipp Reisner on Jul 12, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2820fd39

drbd: use bitmap_parse instead of __bitmap_parse · c5b005ab

Committed by Philipp Reisner on Apr 30, 2012

The buffer 'sc.cpu_mask' is a kernel buffer.  If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

c5b005ab

drbd: grammar fix in log message · 1882e22d

Committed by Lars Ellenberg on May 07, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1882e22d

drbd: bm_page_async_io: properly initialize page->private · f66ee697

Committed by Lars Ellenberg on May 07, 2012

If bm_page_async_io is advised to use a new page for I/O
(BM_AIO_COPY_PAGES is set), it will get it from a mempool.
Once the mempool has to dip into its reserves the page is
not reinitialized, i.e. page->private contains garbage, which
will lead to various problems once the I/O completes (dereferences
of NULL pointers, the submitting thread getting stuck in D-state,
 ...).
Signed-off-by: Arne Redlich <arne.redlich@googlemail.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

f66ee697

drbd: allow bitmap to change during writeout from resync_finished · a220d291

Committed by Lars Ellenberg on May 07, 2012

Symptom: messages similar to
 "FIXME asender in bm_change_bits_to,
  bitmap locked for 'write from resync_finished' by worker"

If a resync or verify is finished (or aborted), a full bitmap writeout
is triggered.  If we have ongoing local IO, the bitmap may still change
during that writeout, pending and not yet processed acks may cause bits
to be cleared, while new writes may cause bits to be set.

To fix this, introduce the drbd_bm_write_copy_pages() variant.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

a220d291

drbd: fix race between drbdadm invalidate/verify and finishing resync · 5016b82a

Committed by Lars Ellenberg on May 07, 2012

When a resync or online verify is finished or aborted,
drbd does a bulk write-out of changed bitmap pages.

If *in that very moment* a new verify or resync is triggered,
this can race:
 ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c
 FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending?
and similar.

This can be observed with e.g. tight invalidate loops in test scripts,
and probably has no real-life implication.

Still, that race can be solved by first quiescing the device,
before starting a new resync or verify.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

5016b82a

drbd: fix resend/resubmit of frozen IO · 07be15b1

Committed by Lars Ellenberg on May 07, 2012

DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith),
or because we lost access to data (on-no-data-accessible suspend-io).

Resuming from there (re-connect, or re-attach, or explicit admin
intervention) should "just work".

Unfortunately, if the re-attach/re-connect did not happen within
the timeout, since the commit

  drbd: Implemented real timeout checking for request processing time

if so configured, the request_timer_fn() would timeout and
detach/disconnect virtually immediately.

This change tracks the most recent attach and connect, and does not
timeout within <configured timeout interval> after attach/connect.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

07be15b1
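
The grace period described in this commit message can be sketched as a pure function over timestamps. This is a simplified model under stated assumptions, not the logic of request_timer_fn(): times are plain monotonically increasing tick counts, a single timestamp covers both attach and connect, and the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a request should be treated as timed out.
 * `last_attach_or_connect` is the time of the most recent attach or
 * connect; no timeout is reported within `timeout` ticks after it.
 */
static bool toy_request_timed_out(uint64_t now, uint64_t req_start,
                                  uint64_t last_attach_or_connect,
                                  uint64_t timeout)
{
    assert(now >= req_start && now >= last_attach_or_connect);

    /* Still inside the grace period after re-attach or re-connect. */
    if (now - last_attach_or_connect < timeout)
        return false;

    /* Otherwise apply the configured timeout to the request itself. */
    return now - req_start >= timeout;
}
```

With a timeout of 10 ticks, a request that has been frozen for 100 ticks is not timed out 5 ticks after a re-connect, but is timed out once a full timeout interval has passed since that re-connect.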

drbd: fix spelling, remove boring development log message · 3ea35df8

Committed by Philipp Reisner on Apr 06, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

3ea35df8

drbd: Ensure that data_size is not 0 before using data_size-1 as index · e4bad1bc

Committed by Philipp Reisner on Apr 06, 2012

This could be exploited by a peer which runs modified code.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

e4bad1bc
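
The class of bug fixed here is an unsigned size of 0 being turned into an index with `data_size - 1`, which wraps around to a very large value. A minimal sketch of the guard, assuming the index is used to NUL-terminate a received buffer (the commit message does not say what the index is used for, and `toy_terminate` is a hypothetical helper):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Terminate a received buffer at its last byte.
 * Returns 0 on success, -1 if the peer announced an empty payload.
 */
static int toy_terminate(char *buf, size_t data_size)
{
    assert(buf != NULL);

    /* Without this check, data_size - 1 would wrap to SIZE_MAX. */
    if (data_size == 0)
        return -1;

    buf[data_size - 1] = '\0';
    return 0;
}
```

Rejecting the packet up front keeps a peer-controlled length from ever reaching the index expression.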

drbd: Delay/reject other state changes while establishing a connection · a1096a6e

Committed by Philipp Reisner on Apr 06, 2012

Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

a1096a6e

drbd: Fixed processing of disk-barrier, disk-flushes and disk-drain · 27eb13e9

Committed by Philipp Reisner on Mar 30, 2012

Since drbd_bump_write_ordering() is called in the attaching
process while the disk state is D_ATTACHING, it was not
considering these three flags during attach.

A call to this function was missing from drbd_adm_disk_opts().

Fixed both issues.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

27eb13e9

drbd: ignore volume number for drbd barrier packet exchange · 9ed57dcb

Committed by Lars Ellenberg on Mar 26, 2012

Transfer log epochs, and therefore P_BARRIER packets,
are per resource, not per volume.
We must not associate them with "some random volume".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

9ed57dcb

drbd: complete_conflicting_writes() should not care about connections · 648e46b5

Committed by Lars Ellenberg on Mar 26, 2012

complete_conflicting_writes() should not cause -EIO.
It should not timeout either, or care for connection states.

Connection timeout is detected elsewhere, and its cleanup path is
supposed to remove any pending requests or peer_requests from the
write_requests tree.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

648e46b5

drbd: simplify retry path of failed READ requests · 4439c400

Committed by Lars Ellenberg on Mar 26, 2012

If a local or remote READ request fails, just push it back to the retry
workqueue.  It will re-enter __drbd_make_request, and be re-assigned to
a suitable local or remote path, or failed, if we do not have access to
good data anymore.

This obsoletes w_read_retry_remote(),
and eliminates two goto...retry blocks in __req_mod()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4439c400

drbd: move put_ldev from __req_mod() to the endio callback · 2415308e

Committed by Lars Ellenberg on Mar 26, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

2415308e

drbd: factor out master_bio completion and drbd_request destruction paths · 6870ca6d

Committed by Lars Ellenberg on Mar 26, 2012

In preparation for multiple connections and reference counting,
separate the code paths for completion of the master bio
and destruction of the request object.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

6870ca6d

drbd: conflicting writes: make wake_up of waiting peer_requests explicit · 8d6cdd78

Committed by Lars Ellenberg on Mar 26, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

8d6cdd78

drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE · 0afd569a

Committed by Lars Ellenberg on Mar 26, 2012

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

0afd569a

openanolis / cloud-kernel synced successfully more than 1 year ago