提交 · ba3c6fb87d2df008eed8faaf01bb198e512fa72f · openeuler / Kernel

11 7月, 2014 3 次提交

drbd: close race when detaching from disk · ba3c6fb8

由 Lars Ellenberg 提交于 2月 11, 2014

BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: bd_release+0x21/0x70
Process drbd_w_t7146
Call Trace:
 close_bdev_exclusive
 drbd_free_ldev		[drbd]
 drbd_ldev_destroy	[drbd]
 w_after_state_ch	[drbd]

Race probably went like this:
  state.disk = D_FAILED

... first one to hit zero during D_FAILED:
   put_ldev() /* ----------------> 0 */
     i = atomic_dec_return()
     if (i == 0)
       if (state.disk == D_FAILED)
         schedule_work(go_diskless)
                                /* 1 <------ */ get_ldev_if_state()
   go_diskless()
      do_some_pre_cleanup()                     corresponding put_ldev():
      force_state(D_DISKLESS)   /* 0 <------ */ i = atomic_dec_return()
                                                if (i == 0)
        atomic_inc() /* ---------> 1 */
        state.disk = D_DISKLESS
        schedule_work(after_state_ch)           /* execution pre-empted by IRQ ? */

   after_state_ch()
     put_ldev()
       i = atomic_dec_return()  /* 0 */
       if (i == 0)
         if (state.disk == D_DISKLESS)            if (state.disk == D_DISKLESS)
           drbd_ldev_destroy()                      drbd_ldev_destroy();

Trying to fix this by checking the disk state *before* the
atomic_dec_return(), which implies memory barriers, and by inserting
extra memory barriers around the state assignment in __drbd_set_state().
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

ba3c6fb8

drbd: fix resync finished detection · 5ab7d2c0

由 Lars Ellenberg 提交于 1月 27, 2014

This fixes one recent regresion,
and one long existing bug.

The bug:
drbd_try_clear_on_disk_bm() assumed that all "count" bits have to be
accounted in the resync extent corresponding to the start sector.

Since we allow application requests to cross our "extent" boundaries,
this assumption is no longer true, resulting in possible misaccounting,
scary messages
("BAD! sector=12345s enr=6 rs_left=-7 rs_failed=0 count=58 cstate=..."),
and potentially, if the last bit to be cleared during resync would
reside in previously misaccounted resync extent, the resync would never
be recognized as finished, but would be "stalled" forever, even though
all blocks are in sync again and all bits have been cleared...

The regression was introduced by
    drbd: get rid of atomic update on disk bitmap works

For an "empty" resync (rs_total == 0), we must not "finish" the
resync on the SyncSource before the SyncTarget knows all relevant
information (sync uuid).  We need to wait for the full round-trip,
the SyncTarget will then explicitly notify us.

Also for normal, non-empty resyncs (rs_total > 0), the resync-finished
condition needs to be tested before the schedule() in wait_for_work, or
it is likely to be missed.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

5ab7d2c0

drbd: get rid of atomic update on disk bitmap works · c7a58db4

由 Lars Ellenberg 提交于 12月 20, 2013

Just trigger the occasional lazy bitmap write-out during resync
from the central wait_for_work() helper.

Previously, during resync, bitmap pages would be written out separately,
synchronously, one at a time, at least 8 times each (every 512 bytes
worth of bitmap cleared).

Now we trigger "merge friendly" bulk write out of all cleared pages
every two seconds during resync, and once the resync is finished.
Most pages will be written out only once.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

c7a58db4

10 7月, 2014 3 次提交

drbd: rename drbd_free_bc() to drbd_free_ldev() · 28995af5

由 Philipp Reisner 提交于 11月 22, 2013

Since the member of drbd_device is called ldev
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

28995af5

drbd: device->ldev is not guaranteed on an D_ATTACHING disk · 8fe39aac

由 Philipp Reisner 提交于 11月 22, 2013

Some parts of the code assumed that get_ldev_if_state(device, D_ATTACHING)
is sufficient to access the ldev member of the device object. That was
wrong. ldev may not be there or might be freed at any time if the device
has a disk state of D_ATTACHING.

bm_rw()
  Documented that drbd_bm_read() is only called from drbd_adm_attach.
  drbd_bm_write() is only called when a reference is held, and it is
  documented that a caller has to hold a reference before calling
  drbd_bm_write()

drbd_bm_write_page()
  Use get_ldev() instead of get_ldev_if_state(device, D_ATTACHING)

drbd_bmio_set_n_write()
  No longer use get_ldev_if_state(device, D_ATTACHING). All callers
  hold a reference to ldev now.

drbd_bmio_clear_n_write()
  All callers where holding a reference of ldev anyways. Remove the
  misleading get_ldev_if_state(device, D_ATTACHING)

drbd_reconsider_max_bio_size()
  Removed the get_ldev_if_state(device, D_ATTACHING). All callers
  now pass a struct drbd_backing_dev* when they have a proper
  reference, or a NULL pointer.
  Before this fix, the receiver could trigger a NULL pointer
  deref when in drbd_reconsider_max_bio_size()

drbd_bump_write_ordering()
  Used get_ldev_if_state(device, D_ATTACHING) with the wrong assumption.
  Remove it, and allow the caller to pass in a struct drbd_backing_dev*
  when the caller knows that accessing this bdev is safe.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

8fe39aac

drbd: Move write_ordering from connection to resource · e9526580

由 Philipp Reisner 提交于 11月 22, 2013

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>

e9526580

01 5月, 2014 8 次提交

drbd: use list_first_entry_or_null in first_peer_device/first_connection · ec4a3407

由 Lars Ellenberg 提交于 4月 28, 2014

If there are no peer_devices or connections, I'd rather have NULL
than some "arbitrary" address pretending to point to a struct.

Helps to avoid hard to debug symptoms, in case we ever try to use
and dereference a drbd_connection or drbd_peer_device
where we in fact don't have any connection at all.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

ec4a3407

drbd: add back some fairness to AL transactions · e4d7d6f4

由 Lars Ellenberg 提交于 4月 28, 2014

When batching more updates to the activity log into single transactions,
we lost the ability for new requests to force themselves into the active
set: all preparation steps became non-blocking, and if all currently
hot extents keep busy, they could starve out new incoming requests
to cold extents for quite a while.

This can only happen if your IO backend accepts more IO operations per
average DRBD replication round trip time than you have al-extents
configured.

If we have incoming requests to cold extents,
at least do one blocking update per transaction.

In an artificial worst-case workload on SSD with an asynchronous 600 ms
replication link, with al-extents = 7 (the minimum we allow), and
concurrent full resynch, without this patch, some write requests have
been observed to be starved for 40 seconds.
With this patch, application observed a worst case latency of twice the
replication round trip time.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e4d7d6f4

drbd: Enable QUEUE_FLAG_DISCARD only if the peer can recieve P_TRIM · 20c68fde

由 Lars Ellenberg 提交于 4月 28, 2014

Allow the user of REQ_DISCARD.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

20c68fde

drbd: prepare receiving side for REQ_DISCARD · a0fb3c47

由 Lars Ellenberg 提交于 4月 28, 2014

If the receiver needs to serve a discard request on a queue that does
not announce to be discard cabable, it falls back to do synchronous
blkdev_issue_zeroout().

We expect only "reasonably" large (up to one activity log extent?)
discard requests.

We do this to not to not block the receiver for too long in this
fallback code path, and to not set/clear too many bits inside one
spinlock_irq_save() in drbd_set_in_sync/drbd_set_out_of_sync,
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a0fb3c47

drbd: allow parallel promote/demote actions · 9e276872

由 Lars Ellenberg 提交于 4月 28, 2014

We plan to use genl_family->parallel_ops = true in the future,
but need to review all possible interactions first.

For now, only selectively drop genl_lock() in drbd_set_role(),
instead serializing on our own internal resource->conf_update mutex.

We now can be promoted/demoted on many resources in parallel,
which may significantly improve cluster failover times
when fencing is required.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

9e276872

drbd: perpare for genetlink parallel_ops · a910b123

由 Lars Ellenberg 提交于 4月 28, 2014

Because all administrative requests via genetlink have been globally
serialized via genl_lock(), we used to have one static struct
drbd_config_context "admin context".

Move this on-stack to the respective callback functions.

This will allow us to selectively drop the genl_lock()
(or use genl_family->parallel_ops) in the future.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

a910b123

drbd: don't let application IO pre-empt resync too often · e8299874

由 Lars Ellenberg 提交于 4月 28, 2014

Before, application IO could pre-empt resync activity
for up to hardcoded 20 seconds per resync request.
A very busy server could throttle the effective resync bandwidth
down to one request per 20 seconds.

Now, we only let application IO pre-empt resync traffic
while the current resync rate estimate is above c-min-rate.

If you disable the c-min-rate throttle feature (set c-min-rate = 0),
application IO will no longer pre-empt resync traffic at all.
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

e8299874

drbd: Remove drbd_wrappers.h · d40e5671

由 Philipp Reisner 提交于 4月 28, 2014

Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: NLars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

d40e5671

17 2月, 2014 26 次提交

drbd: Add drbd_thread->resource and make drbd_thread->connection optional · 2457b6d5

由 Andreas Gruenbacher 提交于 7月 21, 2011

In the drbd_thread "infrastructure" functions, only use the resource instead of
the connection. Make the connection field of drbd_thread optional. This will
allow to introduce threads which are not associated with a connection.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

2457b6d5

drbd: Make w_make_resync_request() static · 4d010392

由 Andreas Gruenbacher 提交于 8月 25, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

4d010392

A
drbd: struct drbd_peer_request: Use drbd_work instead of drbd_device_work · a8cd15ba
由 Andreas Gruenbacher 提交于 8月 25, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
a8cd15ba

drbd: Turn conn_flush_workqueue() into drbd_flush_workqueue() · b5043c5e

由 Andreas Gruenbacher 提交于 7月 28, 2011

The new function can flush any work queue, not just the work queue of the data
socket of a connection.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

b5043c5e

drbd: Create a dedicated struct drbd_device_work · 84b8c06b

由 Andreas Gruenbacher 提交于 7月 28, 2011

drbd_device_work is a work item that has a reference to a device,
while drbd_work is a more generic work item that does not carry
a reference to a device.

All callbacks get a pointer to a drbd_work instance, those callbacks
that expect a drbd_device_work use the container_of macro to get it.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

84b8c06b

drbd: Rename w_prev_work_done -> w_complete · 8682eae9

由 Andreas Gruenbacher 提交于 7月 25, 2011

Also move it to drbd_receiver.c and make it static.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

8682eae9

A
drbd: Move string function prototypes from linux/drbd.h to drbd_string.h · d9f65229
由 Andreas Gruenbacher 提交于 9月 01, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
d9f65229

drbd: Kill drbd_task_to_thread_name() · c60b0251

由 Andreas Gruenbacher 提交于 8月 10, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

c60b0251

drbd: Pass a peer device to a number of fuctions · 69a22773

由 Andreas Gruenbacher 提交于 8月 09, 2011

These functions actually operate on a peer device, or
need a peer device.

drbd_prepare_command(), drbd_send_command(), drbd_send_sync_param()
drbd_send_uuids(), drbd_gen_and_send_sync_uuid(), drbd_send_sizes()
drbd_send_state(), drbd_send_current_state(), and drbd_send_state_req()
drbd_send_sr_reply(), drbd_send_ack(), drbd_send_drequest(),
drbd_send_drequest_csum(), drbd_send_ov_request(), drbd_send_dblock()
drbd_send_block(), drbd_send_out_of_sync(), recv_dless_read()
drbd_drain_block(), receive_bitmap_plain(), recv_resync_read()
read_in_block(), read_for_csum(), drbd_alloc_pages(), drbd_alloc_peer_req()
need_peer_seq(), update_peer_seq(), wait_for_and_update_peer_seq()
drbd_sync_handshake(), drbd_asb_recover_{0,1,2}p(), drbd_connected()
drbd_disconnected(), decode_bitmap_c() and recv_bm_rle_bits()
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

69a22773

drbd: Replace vnr_to_mdev() with conn_peer_device() · 9f4fe9ad

由 Andreas Gruenbacher 提交于 8月 09, 2011

The new function returns a peer device, which allows us to eliminate a few
instances of first_peer_device().
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

9f4fe9ad

A
drbd: drbd_csum_bio(), drbd_csum_ee(): Remove unused device argument · 79a3c8d3
由 Andreas Gruenbacher 提交于 8月 09, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
79a3c8d3

drbd: Function prototype cleanups · 753c6191

由 Andreas Gruenbacher 提交于 7月 22, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

753c6191

drbd: Move cpu_mask from connection to resource · 625a6ba2

由 Andreas Gruenbacher 提交于 7月 22, 2011

Also fix drbd_calc_cpu_mask() to spread resources equally over all online cpus
independent of device minor numbers.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

625a6ba2

A
drbd: Move susp, susp_nod, susp_fen from connection to resource · 6bbf53ca
由 Andreas Gruenbacher 提交于 7月 08, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
6bbf53ca

drbd: Move conf_mutex from connection to resource · 0500813f

由 Andreas Gruenbacher 提交于 7月 07, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

0500813f

A
drbd: drbd_create_device(): Take a resource instead of a connection argument · 59515a2e
由 Andreas Gruenbacher 提交于 7月 06, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
59515a2e
A
drbd: Rename drbd_{create,delete}_minor -> drbd_{create,delete}_device · f82795d6
由 Andreas Gruenbacher 提交于 7月 03, 2011
```
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>
```
f82795d6

drbd: Add explicit device parameter to D_ASSERT · 0b0ba1ef

由 Andreas Gruenbacher 提交于 6月 27, 2011

The implicit dependency on a variable inside the macro is problematic.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

0b0ba1ef

drbd: Replace and remove the obsolete conn_() macros · 1ec861eb

由 Andreas Gruenbacher 提交于 7月 06, 2011

With the polymorphic drbd_() macros, we no longer need the connection
specific variants.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

1ec861eb

drbd: Turn drbd_printk() into a polymorphic macro · 3b52beff

由 Andreas Gruenbacher 提交于 7月 06, 2011

This allows drbd_alert(), drbd_err(), drbd_warn(), and drbd_info() to work for
a resource, device, or connection so that we don't have to introduce three
separate sets of macros for that.

The drbd_printk() macro itself is pretty ugly, but that problem is limited to
one place in the code. Using drbd_printk() on an object type which it doesn't
understand results in an undefined drbd_printk_with_wrong_object_type symbol.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

3b52beff

drbd: Remove the terrible DEV hack · d0180171

由 Andreas Gruenbacher 提交于 7月 03, 2011

DRBD was using dev_err() and similar all over the code; instead of having to
write dev_err(disk_to_dev(device->vdisk), ...) to convert a drbd_device into a
kernel device, a DEV macro was used which implicitly references the device
variable. This is terrible; introduce separate drbd_err() and similar macros
with an explicit device parameter instead.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

d0180171

drbd: Turn connection->volumes into connection->peer_devices · c06ece6b

由 Andreas Gruenbacher 提交于 6月 21, 2011

Let connection->peer_devices point to peer devices; connection->volumes was
pointing to devices.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

c06ece6b

drbd: Move resource options from connection to resource · eb6bea67

由 Andreas Gruenbacher 提交于 6月 21, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

eb6bea67

drbd: Replace conn_get_by_name() with drbd_find_resource() · 4bc76048

由 Andreas Gruenbacher 提交于 6月 13, 2011

So far, connections and resources always come in pairs, but in the future with
multiple connections per resource, the names will stick with the resources.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

4bc76048

drbd: Add struct drbd_resource->devices · 803ea134

由 Andreas Gruenbacher 提交于 6月 09, 2011

This allows to access the volumes of a resource by number.
Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

803ea134

drbd: Add struct drbd_device->resource · d8628a86

由 Andreas Gruenbacher 提交于 6月 09, 2011

Signed-off-by: NAndreas Gruenbacher <agruen@linbit.com>
Signed-off-by: NPhilipp Reisner <philipp.reisner@linbit.com>

d8628a86

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功