1. 15 Mar, 2021 5 commits
    • P
      io_uring: fix concurrent parking · 9e138a48
      Pavel Begunkov committed
      If io_sq_thread_park() of one task gets rescheduled right after
      set_bit(), then before it gets back to mutex_lock() another task can
      do park()/unpark(), with SQPOLL taking the lock again and continuing
      to run without ever seeing the first set_bit(SHOULD_PARK), so it won't
      even try to release the mutex for parking.

      It will get parked eventually when SQPOLL drops the lock to reschedule,
      but this may be problematic and will get in the way of further fixes.

      Account the number of tasks waiting for parking with a new atomic
      variable, park_pending, and adjust SHOULD_PARK accordingly. This doesn't
      entirely replace the SHOULD_PARK bit with the atomic variable, because
      it's convenient to have it as a bit in the state and it will help with
      optimisations later.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
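The accounting described in this commit can be sketched in userspace C11 roughly as follows. This is only an illustration under assumptions: `struct sq_state`, `sq_init()`, `park_request()`, `park_release()` and `should_park()` are hypothetical names, C11 atomics stand in for the kernel's bit and atomic helpers, and the mutex and wakeup steps are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SHOULD_PARK 1u /* hypothetical flag bit in the state word */

/* Hypothetical stand-in for the SQPOLL data, not the kernel struct. */
struct sq_state {
    atomic_uint state;        /* flag bits, including SHOULD_PARK */
    atomic_int  park_pending; /* tasks currently asking for a park */
};

static void sq_init(struct sq_state *sq)
{
    atomic_init(&sq->state, 0u);
    atomic_init(&sq->park_pending, 0);
}

/* A task announces that it wants the poller parked (before locking). */
static void park_request(struct sq_state *sq)
{
    atomic_fetch_add(&sq->park_pending, 1);
    atomic_fetch_or(&sq->state, SHOULD_PARK);
}

/* The same task is done (it would still hold the lock at this point). */
static void park_release(struct sq_state *sq)
{
    /* Clear unconditionally, then restore the bit if others still wait. */
    atomic_fetch_and(&sq->state, ~SHOULD_PARK);
    int before = atomic_fetch_sub(&sq->park_pending, 1);
    assert(before > 0); /* every release must pair with a request */
    if (before > 1)
        atomic_fetch_or(&sq->state, SHOULD_PARK);
}

/* What the poller would check before deciding to drop the lock. */
static bool should_park(struct sq_state *sq)
{
    return (atomic_load(&sq->state) & SHOULD_PARK) != 0;
}
```

With two overlapping requests, the bit stays set after the first release and only clears after the second, so a parker that was rescheduled in between is not forgotten.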
    • P
      io_uring: halt SQO submission on ctx exit · f6d54255
      Pavel Begunkov committed
      io_sq_thread_finish() is called in io_ring_ctx_free(), so until then the
      SQPOLL task may still be running and submitting new requests. It's not a
      disaster, because a "try" variant of percpu_ref_get is used, but it is
      far from nice.

      Remove ctx from the sqd ctx list earlier, before the cancellation loop,
      so SQPOLL can't find it and won't submit new requests.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • P
      io_uring: replace sqd rw_semaphore with mutex · 09a6f4ef
      Pavel Begunkov committed
      The only user of read-locking of sqd->rw_lock is sq_thread itself, which
      is by definition alone, so we don't really need an rw_semaphore; a mutex
      will do. Replace it with a mutex, and kill read-to-write upgrading and
      the extra task_work handling in io_sq_thread().
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • P
      io_uring: fix complete_post use ctx after free · 180f829f
      Pavel Begunkov committed
      If io_req_complete_post() puts a reference that is not the final one, we
      can't rely on the request's ctx ref, and so ctx may be freed while
      complete_post() is in io_cqring_ev_posted() etc.

      In that case get an additional ctx reference and put it at the end, thus
      protecting the following io_cqring_ev_posted(). Also prolong the ctx
      lifetime until spin_unlock happens, as we do with mutexes, so the added
      percpu_ref_get() doesn't race with ctx free.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • P
      io_uring: fix ->flags races by linked timeouts · efe814a4
      Pavel Begunkov committed
      It's racy to modify req->flags from a non-owning context, e.g. a linked
      timeout calling req_set_fail_links() for the master request might race
      with that request setting/clearing flags while being executed
      concurrently. Just remove req_set_fail_links(prev) from
      io_link_timeout_fn(); io_async_find_and_cancel() and the functions down
      the line take care of setting the fail bit.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 14 Mar, 2021 1 commit
    • J
      io_uring: convert io_buffer_idr to XArray · 9e15c3a0
      Jens Axboe committed
      Like we did for the personality idr, convert the IO buffer idr to use
      XArray. This avoids a use-after-free on removal of entries, since idr
      doesn't like doing so from inside an iterator, and it nicely reduces
      the amount of code we need to support this feature.
      
      Fixes: 5a2e745d ("io_uring: buffer registration infrastructure")
      Cc: stable@vger.kernel.org
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: yangerkun <yangerkun@huawei.com>
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 13 Mar, 2021 4 commits
    • J
      io_uring: allow IO worker threads to be frozen · 16efa4fc
      Jens Axboe committed
      With the freezer using the proper signaling to notify us of when it's
      time to freeze a thread, we can re-enable normal freezer usage for the
      IO threads. Ensure that SQPOLL, io-wq, and the io-wq manager call
      try_to_freeze() appropriately, and remove the default setting of
      PF_NOFREEZE from create_io_thread().
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • P
      io_uring: fix OP_ASYNC_CANCEL across tasks · 58f99373
      Pavel Begunkov committed
      IORING_OP_ASYNC_CANCEL tries io-wq cancellation only for the current
      task. If that fails, go over tctx_list and try it for every single tctx.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
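The fallback order described in this commit (current task first, then every other task context) might look like the following userspace sketch. Everything here is an assumption made for illustration: `struct task_ctx`, `try_cancel()` and `cancel_across_tasks()` are hypothetical, a plain array replaces the kernel's tctx_list, and each context tracks at most one in-flight request.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task context: at most one in-flight request id. */
struct task_ctx {
    int  inflight_id;
    bool has_inflight;
};

/* Try to cancel request `id` in a single task context. */
static bool try_cancel(struct task_ctx *t, int id)
{
    if (t->has_inflight && t->inflight_id == id) {
        t->has_inflight = false;
        return true;
    }
    return false;
}

/* Current task first; on failure, walk every known task context. */
static int cancel_across_tasks(struct task_ctx *current,
                               struct task_ctx **tctx_list, size_t n, int id)
{
    if (try_cancel(current, id))
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (try_cancel(tctx_list[i], id))
            return 0;
    }
    return -ENOENT; /* nothing matched in any context */
}
```

Trying the current task first keeps the common case cheap; the full walk only runs when the request belongs to another task.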
    • P
      io_uring: cancel sqpoll via task_work · 521d6a73
      Pavel Begunkov committed
      1) The first problem is io_uring_cancel_sqpoll() ->
      io_uring_cancel_task_requests() basically doing park(); park(); and so
      hanging.

      2) Another one is more subtle: the master task is doing cancellations,
      but the SQPOLL task submits requests between the end of the cancellation
      and finish(). Those requests take a ref to the ctx and so lock it up
      forever.

      3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll()
      while the owner task does the same io_uring_cancel_sqpoll(); they race
      for tctx->wait events. And there are probably more of them.

      Instead, do SQPOLL cancellations from within the SQPOLL task context via
      task_work, see io_sqpoll_cancel_sync(). With that we don't need a
      temporary park()/unpark() during cancellation, which is ugly, subtle, and
      in any case doesn't allow running io_run_task_work() properly.

      io_uring_cancel_sqpoll() is called only from the SQPOLL task context and
      under sqd locking, so all parking is removed from there. As a result,
      io_sq_thread_[un]park() and io_sq_thread_stop() are no longer used by
      the SQPOLL task, which spares us some headache.

      Also remove ctx->sqd_list early to avoid 2). And kill tctx->sqpoll,
      which is not used anymore.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • P
      io_uring: prevent racy sqd->thread checks · 26984fbf
      Pavel Begunkov committed
      The SQPOLL thread to which we're trying to attach may be going away.
      That's not nice, but a more serious problem is if io_sq_offload_create()
      sees sqd->thread==NULL and tries to init it with a new thread. There are
      tons of ways it can be exploited or fail.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 12 Mar, 2021 4 commits
  5. 10 Mar, 2021 13 commits
  6. 08 Mar, 2021 9 commits
  7. 07 Mar, 2021 1 commit
  8. 06 Mar, 2021 1 commit
  9. 05 Mar, 2021 2 commits
    • J
      io_uring: make SQPOLL thread parking saner · 86e0d676
      Jens Axboe committed
      We have this weird true/false return from parking, and then some of the
      callers decide to look at that. It can lead to unbalanced parks and
      sqd locking. Have the callers check the thread status once it's parked.
      We know we have the lock at that point, so it's either valid or it's NULL.
      
      Fix race with parking on thread exit. We need to be careful here with
      ordering of the sqd->lock and the IO_SQ_THREAD_SHOULD_PARK bit.

      Rename sqd->completion to sqd->parked to reflect that this is the only
      thing this completion event does.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • J
      io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return · b5b0ecb7
      Jens Axboe committed
      The callback can only be armed if we get -EIOCBQUEUED returned. It's
      important that we clear the WAITQ bit for other cases, otherwise we can
      queue for async retry and filemap will assume that we're armed and
      return -EAGAIN instead of just blocking for the IO.
      
      Cc: stable@vger.kernel.org # 5.9+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
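The rule stated in this commit (keep the WAITQ bit only when the read returned -EIOCBQUEUED) can be illustrated with a small userspace sketch. The names `struct kiocb_sketch`, `finish_read_attempt()`, `waitq_armed()` and the `SKETCH_` constants are hypothetical stand-ins, and the flag value is a placeholder rather than the kernel's definition.

```c
#include <stdbool.h>

#define SKETCH_IOCB_WAITQ  (1u << 0) /* placeholder bit, not the kernel value */
#define SKETCH_EIOCBQUEUED 529       /* mirrors the kernel-internal errno */

/* Hypothetical stand-in for the flags field of a kiocb. */
struct kiocb_sketch {
    unsigned int flags;
};

/*
 * After a read attempt: the wait-queue callback is armed only if the
 * attempt returned -EIOCBQUEUED, so drop the bit for every other result.
 */
static long finish_read_attempt(struct kiocb_sketch *kiocb, long ret)
{
    if (ret != -SKETCH_EIOCBQUEUED)
        kiocb->flags &= ~SKETCH_IOCB_WAITQ;
    return ret;
}

/* True if a later retry may assume the callback is armed. */
static bool waitq_armed(const struct kiocb_sketch *kiocb)
{
    return (kiocb->flags & SKETCH_IOCB_WAITQ) != 0;
}
```

Leaving the bit set after any other return value is what, per the commit message, makes the async retry path believe a callback is armed and return -EAGAIN instead of blocking.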