    drbd: fix potential deadlock when trying to detach during handshake · 33d32fa7
    Committed by Lars Ellenberg
    When requesting a detach, we first suspend IO, and also inhibit meta-data IO
    by means of drbd_md_get_buffer(), because we don't want to "fail" the disk
    while there is IO in-flight: the transition into D_FAILED for detach purposes
    may get misinterpreted as an actual IO error in a confused endio function.
    
    We wrap it all into wait_event(), to retry in case drbd_req_state()
    returns SS_IN_TRANSIENT_STATE, as it does for example during an ongoing
    connection handshake.
    
    In that example, the receiver thread may need to grab drbd_md_get_buffer()
    during the handshake to make progress.  To avoid potential deadlock with
    detach, detach needs to grab and release the meta data buffer inside of
    that wait_event retry loop. To avoid lock inversion between
    mutex_lock(&device->state_mutex) and drbd_md_get_buffer(device),
    introduce a new enum chg_state_flag CS_INHIBIT_MD_IO, and move the
    call to drbd_md_get_buffer() inside the state_mutex grabbed in
    drbd_req_state().
    Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
    Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
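
The locking change described in the commit message can be illustrated with a small, single-threaded user-space model. This is only a sketch under stated assumptions, not the DRBD implementation: `struct model_device`, `model_req_state()`, `model_detach()`, `receiver_step()`, and the `CS_NONE` value are hypothetical names invented for illustration, and the "receiver thread" is simulated by letting it take one step between detach retries. It contrasts holding the meta-data buffer across the whole retry loop with taking and releasing it inside each state request, after the state mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the enum names echo the commit message only. */
enum state_rv { SS_SUCCESS, SS_IN_TRANSIENT_STATE };
enum chg_state_flags { CS_NONE = 0, CS_INHIBIT_MD_IO = 1 };

struct model_device {
    bool state_mutex_held;     /* stands in for device->state_mutex */
    bool md_buffer_held;       /* stands in for the meta-data buffer */
    int  handshake_steps_left; /* > 0 means a handshake is ongoing */
    bool detached;
};

static void md_get_buffer(struct model_device *d)
{
    assert(!d->md_buffer_held);
    d->md_buffer_held = true;
}

static void md_put_buffer(struct model_device *d)
{
    assert(d->md_buffer_held);
    d->md_buffer_held = false;
}

/* One step of the simulated receiver. It needs the meta-data buffer
 * to make progress; returns false if it would have to block. */
static bool receiver_step(struct model_device *d)
{
    if (d->handshake_steps_left == 0)
        return true;
    if (d->md_buffer_held)
        return false;
    d->handshake_steps_left--;
    return true;
}

/* State request: with CS_INHIBIT_MD_IO the buffer is taken only after
 * the state mutex and released before the mutex is dropped. */
static enum state_rv model_req_state(struct model_device *d, int flags)
{
    enum state_rv rv;

    assert(!d->state_mutex_held);
    d->state_mutex_held = true;
    if (flags & CS_INHIBIT_MD_IO)
        md_get_buffer(d);

    if (d->handshake_steps_left > 0) {
        rv = SS_IN_TRANSIENT_STATE;
    } else {
        d->detached = true;
        rv = SS_SUCCESS;
    }

    if (flags & CS_INHIBIT_MD_IO)
        md_put_buffer(d);
    d->state_mutex_held = false;
    return rv;
}

/* Retry loop standing in for wait_event(). Returns the number of
 * retries needed, or -1 if detach never succeeds within max_rounds. */
static int model_detach(struct model_device *d, bool hold_across_retries,
                        int max_rounds)
{
    int flags = hold_across_retries ? CS_NONE : CS_INHIBIT_MD_IO;
    int result = -1;

    if (hold_across_retries)
        md_get_buffer(d);   /* buffer pinned for the whole loop */

    for (int round = 0; round < max_rounds; round++) {
        if (model_req_state(d, flags) == SS_SUCCESS) {
            result = round;
            break;
        }
        receiver_step(d);   /* the receiver runs between retries */
    }

    if (hold_across_retries)
        md_put_buffer(d);
    return result;
}
```

In this model, a detach that pins the buffer outside the loop never completes, because the simulated receiver cannot finish the handshake; a detach that passes `CS_INHIBIT_MD_IO` succeeds once the handshake steps are done.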
drbd_nl.c 146.0 KB