• D
    xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks · 4b95b43b
    Dave Chinner 提交于
    mainline inclusion
    from mainline-v5.16-rc3
    commit cd6f79d1
    category: bugfix
    bugzilla: 187526,https://gitee.com/openeuler/kernel/issues/I4KIAO
    
    Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cd6f79d1fb324968a3bae92f82eeb7d28ca1fd22
    
    --------------------------------
    
    Brian reported a null pointer dereference failure during unmount in
    xfs/006. He tracked the problem down to the AIL being torn down
    before a log shutdown had completed and removed all the items from
    the AIL. The failure occurred in this path while unmount was
    proceeding in another task:
    
     xfs_trans_ail_delete+0x102/0x130 [xfs]
     xfs_buf_item_done+0x22/0x30 [xfs]
     xfs_buf_ioend+0x73/0x4d0 [xfs]
     xfs_trans_committed_bulk+0x17e/0x2f0 [xfs]
     xlog_cil_committed+0x2a9/0x300 [xfs]
     xlog_cil_process_committed+0x69/0x80 [xfs]
     xlog_state_shutdown_callbacks+0xce/0xf0 [xfs]
     xlog_force_shutdown+0xdf/0x150 [xfs]
     xfs_do_force_shutdown+0x5f/0x150 [xfs]
     xlog_ioend_work+0x71/0x80 [xfs]
     process_one_work+0x1c5/0x390
     worker_thread+0x30/0x350
     kthread+0xd7/0x100
     ret_from_fork+0x1f/0x30
    
    This is processing an EIO error to a log write, and it's
    triggering a force shutdown. This causes the log to be shut down,
    and then it is running attached iclog callbacks from the shutdown
    context. That means the fs and log has already been marked as
    xfs_is_shutdown/xlog_is_shutdown and so high level code will abort
    (e.g. xfs_trans_commit(), xfs_log_force(), etc) with an error
    because of shutdown.
    
    The umount would have been blocked waiting for a log force
    completion inside xfs_log_cover() -> xfs_sync_sb(). The first thing
    for this situation to occur is for xfs_sync_sb() to exit without
    waiting for the iclog buffer to be comitted to disk. The
    above trace is the completion routine for the iclog buffer, and
    it is shutting down the filesystem.
    
    xlog_state_shutdown_callbacks() does this:
    
    {
            struct xlog_in_core     *iclog;
            LIST_HEAD(cb_list);
    
            spin_lock(&log->l_icloglock);
            iclog = log->l_iclog;
            do {
                    if (atomic_read(&iclog->ic_refcnt)) {
                            /* Reference holder will re-run iclog callbacks. */
                            continue;
                    }
                    list_splice_init(&iclog->ic_callbacks, &cb_list);
    >>>>>>           wake_up_all(&iclog->ic_write_wait);
    >>>>>>           wake_up_all(&iclog->ic_force_wait);
            } while ((iclog = iclog->ic_next) != log->l_iclog);
    
            wake_up_all(&log->l_flush_wait);
            spin_unlock(&log->l_icloglock);
    
    >>>>>>  xlog_cil_process_committed(&cb_list);
    }
    
    This wakes any thread waiting on IO completion of the iclog (in this
    case the umount log force) before shutdown processes all the pending
    callbacks.  That means the xfs_sync_sb() waiting on a sync
    transaction in xfs_log_force() on iclog->ic_force_wait will get
    woken before the callbacks attached to that iclog are run. This
    results in xfs_sync_sb() returning an error, and so unmount unblocks
    and continues to run whilst the log shutdown is still in progress.
    
    Normally this is just fine because the force waiter has nothing to
    do with AIL operations. But in the case of this unmount path, the
    log force waiter goes on to tear down the AIL because the log is now
    shut down and so nothing ever blocks it again from the wait point in
    xfs_log_cover().
    
    Hence it's a race to see who gets to the AIL first - the unmount
    code or xlog_cil_process_committed() killing the superblock buffer.
    
    To fix this, we just have to change the order of processing in
    xlog_state_shutdown_callbacks() to run the callbacks before it wakes
    any task waiting on completion of the iclog.
    Reported-by: NBrian Foster <bfoster@redhat.com>
    Fixes: aad7272a ("xfs: separate out log shutdown callback processing")
    Signed-off-by: NDave Chinner <dchinner@redhat.com>
    Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
    Signed-off-by: NDarrick J. Wong <djwong@kernel.org>
    Signed-off-by: NGuo Xuenan <guoxuenan@huawei.com>
    Reviewed-by: NZhang Yi <yi.zhang@huawei.com>
    Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
    4b95b43b
xfs_log.c 106.7 KB