    io_uring: fix shared sqpoll cancellation hangs · 734551df
    Committed by Pavel Begunkov
    [  736.982891] INFO: task iou-sqp-4294:4295 blocked for more than 122 seconds.
    [  736.982897] Call Trace:
    [  736.982901]  schedule+0x68/0xe0
    [  736.982903]  io_uring_cancel_sqpoll+0xdb/0x110
    [  736.982908]  io_sqpoll_cancel_cb+0x24/0x30
    [  736.982911]  io_run_task_work_head+0x28/0x50
    [  736.982913]  io_sq_thread+0x4e3/0x720
    
    We call io_uring_cancel_sqpoll() one by one for each ctx, either in
    sq_thread() itself or via task works, and it's intended to cancel all
    requests of a specified context. However, the function uses per-task
    counters to track the number of inflight requests, so it counts more
    requests than are available via the current io_uring ctx and goes to
    sleep waiting for them to complete (e.g. from IRQ), which will never
    happen.
    
    Cancel a bit more than before, i.e. all ctxs that share the sqpoll
    thread, and continue to use the shared counters. Note that we must not
    remove a ctx from the list before running the sqpoll-cancel task_work,
    otherwise the function won't be able to find the context and will
    hang.
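
The counting mismatch described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every `toy_*` name, the `MAX_CTX` limit, and the "would finish" return values are hypothetical and exist purely for illustration. The model keeps one inflight counter per task, shared by several contexts, and compares cancelling a single ctx with cancelling every ctx attached to the task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CTX 4 /* hypothetical limit, only for this sketch */

/* Toy stand-in for an io_uring ctx: tracks only its own pending requests. */
struct toy_ctx {
    int pending;
};

/* Toy stand-in for the SQPOLL task: one inflight counter shared by all ctxs. */
struct toy_sqpoll_task {
    struct toy_ctx *ctxs[MAX_CTX];
    size_t nr_ctx;
    int inflight; /* per-task, not per-ctx */
};

/* Attach a ctx with nr_reqs pending requests to the shared task. */
static void toy_attach(struct toy_sqpoll_task *t, struct toy_ctx *c, int nr_reqs)
{
    assert(t->nr_ctx < MAX_CTX);
    c->pending = nr_reqs;
    t->ctxs[t->nr_ctx++] = c;
    t->inflight += nr_reqs;
}

/* Cancel every request that belongs to one ctx. */
static void toy_cancel_ctx(struct toy_sqpoll_task *t, struct toy_ctx *c)
{
    t->inflight -= c->pending;
    c->pending = 0;
}

/*
 * Old behaviour in this model: cancel a single ctx, then wait for the
 * per-task counter to reach zero. Returns true if that wait would end.
 */
static bool toy_cancel_one_would_finish(struct toy_sqpoll_task *t,
                                        struct toy_ctx *c)
{
    toy_cancel_ctx(t, c);
    return t->inflight == 0;
}

/*
 * Fixed behaviour in this model: cancel all ctxs still on the task's list,
 * so the shared counter can actually drop to zero.
 */
static bool toy_cancel_all_would_finish(struct toy_sqpoll_task *t)
{
    for (size_t i = 0; i < t->nr_ctx; i++)
        toy_cancel_ctx(t, t->ctxs[i]);
    return t->inflight == 0;
}
```

With two contexts holding 2 and 3 requests, `toy_cancel_one_would_finish()` on the first ctx leaves `inflight` at 3, which in this model corresponds to sleeping forever. `toy_cancel_all_would_finish()` brings the counter to 0. The model also hints at the list-ordering caveat: a ctx dropped from `ctxs[]` before the walk would leave its requests counted.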
    Reported-by: Joakim Hassila <joj@mac.com>
    Reported-by: Jens Axboe <axboe@kernel.dk>
    Fixes: 37d1e2e3 ("io_uring: move SQPOLL thread io-wq forked worker")
    Cc: stable@vger.kernel.org
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/1bded7e6c6b32e0bae25fce36be2868e46b116a0.1618752958.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring.c 242.5 KB