1. 12 Mar, 2021 2 commits
    • J
      io_uring: perform IOPOLL reaping if canceler is thread itself · d052d1d6
      Committed by Jens Axboe
      We bypass IOPOLL completion polling (and reaping) for the SQPOLL thread,
      but if it's the thread itself invoking cancelations, then we still need
      to perform it or no one will.
      
      Fixes: 9936c7c2 ("io_uring: deduplicate core cancellations sequence")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      d052d1d6
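
The rule in this commit can be sketched as a small userspace model. This is only an illustration under assumptions: `struct task`, `struct ring_ctx`, `SETUP_SQPOLL` and `canceler_must_reap()` are hypothetical stand-ins, not the kernel's types or functions, and the real check also depends on other cancellation arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SETUP_SQPOLL (1u << 1) /* stand-in mirroring IORING_SETUP_SQPOLL */

struct task { int pid; };               /* hypothetical task handle */

struct ring_ctx {                       /* hypothetical, heavily reduced ring context */
    unsigned int flags;
    const struct task *sq_thread;       /* NULL when the ring has no SQPOLL thread */
};

/* Return true when the canceling task has to reap IOPOLL completions itself. */
static bool canceler_must_reap(const struct ring_ctx *ctx,
                               const struct task *current_task)
{
    assert(ctx != NULL && current_task != NULL);

    if (!(ctx->flags & SETUP_SQPOLL))
        return true;    /* no SQPOLL thread: the caller does the polling */

    /* SQPOLL ring: the SQPOLL thread normally polls, so other tasks skip it.
     * If the canceler is that thread, nobody else will reap for it. */
    return ctx->sq_thread != NULL && ctx->sq_thread == current_task;
}
```

In this model an application task canceling on an SQPOLL ring skips reaping, while the SQPOLL thread canceling on its own ring does not.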
    • J
      io_uring: force creation of separate context for ATTACH_WQ and non-threads · 5c2469e0
      Committed by Jens Axboe
      Earlier kernels had SQPOLL threads that could share across anything, as
      we grabbed the context we needed on a per-ring basis. This is no longer
      the case, so only allow attaching directly if we're in the same thread
      group. That is the common use case. For non-group tasks, just set up a
      new context and thread as we would've done if sharing wasn't set. This
      isn't 100% ideal in terms of CPU utilization for the fork-and-share
      case, but hopefully that isn't much of a concern. If it is, there are
      plans in motion for how to improve that. Most importantly, we want to
      avoid app side regressions where sharing worked before and now doesn't.
      With this patch, functionality is equivalent to previous kernels that
      supported IORING_SETUP_ATTACH_WQ with SQPOLL.
      Reported-by: Stefan Metzmacher <metze@samba.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5c2469e0
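
The attach policy described above might be modelled roughly as follows. The names here (`struct task`, `enum sqpoll_action`, `sqpoll_attach_policy()`) are invented for illustration and do not appear in the kernel; the sketch only captures the "same thread group attaches, everyone else gets a fresh context and thread" decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {           /* hypothetical: just the IDs the decision needs */
    int pid;
    int tgid;           /* thread group ID */
};

enum sqpoll_action {
    SQPOLL_ATTACH,      /* share the existing SQPOLL context */
    SQPOLL_CREATE_NEW   /* set up a separate context and thread */
};

/* Decide what a ring setup request does when it may ask for ATTACH_WQ. */
static enum sqpoll_action sqpoll_attach_policy(const struct task *owner,
                                               const struct task *requester,
                                               bool attach_wq)
{
    assert(owner != NULL && requester != NULL);

    /* Direct attach is only allowed inside the same thread group. */
    if (attach_wq && owner->tgid == requester->tgid)
        return SQPOLL_ATTACH;

    /* Otherwise behave as if sharing had not been requested at all. */
    return SQPOLL_CREATE_NEW;
}
```

A forked child asking to attach therefore still gets a working ring in this model, at the cost of an extra thread.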
  2. 10 Mar, 2021 13 commits
  3. 08 Mar, 2021 9 commits
  4. 07 Mar, 2021 1 commit
  5. 06 Mar, 2021 1 commit
  6. 05 Mar, 2021 6 commits
    • J
      io_uring: make SQPOLL thread parking saner · 86e0d676
      Committed by Jens Axboe
      We have this weird true/false return from parking, and then some of the
      callers decide to look at that. It can lead to unbalanced parks and
      sqd locking. Have the callers check the thread status once it's parked.
      We know we have the lock at that point, so it's either valid or it's NULL.
      
      Fix race with parking on thread exit. We need to be careful here with
      ordering of the sqd->lock and the IO_SQ_THREAD_SHOULD_PARK bit.
      
      Rename sqd->completion to sqd->parked to reflect that this is the only
      thing this completion event does.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      86e0d676
    • J
      io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return · b5b0ecb7
      Committed by Jens Axboe
      The callback can only be armed, if we get -EIOCBQUEUED returned. It's
      important that we clear the WAITQ bit for other cases, otherwise we can
      queue for async retry and filemap will assume that we're armed and
      return -EAGAIN instead of just blocking for the IO.
      
      Cc: stable@vger.kernel.org # 5.9+
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b5b0ecb7
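
As a rough illustration of the rule above, the following userspace sketch clears the wait-queue bit for every return value other than -EIOCBQUEUED. `struct kiocb_model` and `finish_read_attempt()` are hypothetical, and the two constants are local stand-ins defined here so the example compiles outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define EIOCBQUEUED 529          /* stand-in: kernel-internal code, not in userspace errno.h */
#define IOCB_WAITQ  (1u << 19)   /* stand-in for the kernel's IOCB_WAITQ flag bit */

struct kiocb_model {             /* hypothetical, reduced to the flags word */
    unsigned int ki_flags;
};

/* Post-process the result of a buffered read attempt that was issued with
 * IOCB_WAITQ set. Only -EIOCBQUEUED means the wait-queue callback is armed. */
static int finish_read_attempt(struct kiocb_model *kiocb, int ret)
{
    assert(kiocb != NULL);

    if (ret != -EIOCBQUEUED)
        kiocb->ki_flags &= ~IOCB_WAITQ;  /* not armed: a later retry must block normally */

    return ret;
}
```

With the bit cleared, a later async retry in this model is not mistaken for an armed request.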
    • J
      io_uring: don't keep looping for more events if we can't flush overflow · ca0a2651
      Committed by Jens Axboe
      It doesn't make sense to wait for more events to come in, if we can't
      even flush the overflow we already have to the ring. Return -EBUSY for
      that condition, just like we do for attempts to submit with overflow
      pending.
      
      Cc: stable@vger.kernel.org # 5.11
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ca0a2651
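
A simplified model of the behaviour described above: before waiting for more completions, try to flush the overflow backlog, and give up with -EBUSY if it does not fit. `struct cq_model`, `flush_overflow()` and `wait_for_events()` are illustrative names only, and the counters stand in for the real ring and overflow list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct cq_model {               /* hypothetical completion-queue state */
    unsigned int ring_free;     /* free slots in the CQ ring */
    unsigned int overflowed;    /* completions parked on the overflow list */
};

/* Move as many overflowed completions into the ring as will fit.
 * Returns true when the overflow list is empty afterwards. */
static bool flush_overflow(struct cq_model *cq)
{
    unsigned int n = cq->overflowed < cq->ring_free ? cq->overflowed
                                                    : cq->ring_free;

    cq->overflowed -= n;
    cq->ring_free -= n;
    return cq->overflowed == 0;
}

/* Entry point of a wait: refuse to wait while the backlog cannot be flushed. */
static int wait_for_events(struct cq_model *cq)
{
    assert(cq != NULL);

    if (!flush_overflow(cq))
        return -EBUSY;  /* same result as submitting with overflow pending */

    return 0;           /* a real implementation would sleep for events here */
}
```

The caller is expected to consume completions from the ring and then retry.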
    • J
      io_uring: move to using create_io_thread() · 46fe18b1
      Committed by Jens Axboe
      This allows us to do task creation and setup without needing to use
      completions to try and synchronize with the starting thread. Get rid of
      the old io_wq_fork_thread() wrapper, and the 'wq' and 'worker' startup
      completion events - we can now do setup before the task is running.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      46fe18b1
    • P
      io_uring: reliably cancel linked timeouts · dd59a3d5
      Committed by Pavel Begunkov
      Linked timeouts are fired asynchronously (i.e. soft-irq), and use
      generic cancellation paths to do their work, including poking into io-wq.
      The problem is that it's racy to access tctx->io_wq, as
      io_uring_task_cancel() and others may be happening at this exact moment.
      Mark linked timeouts with REQ_F_INFLIGHT for now, making sure there are
      no timeouts before io-wq destruction.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      dd59a3d5
    • P
      io_uring: cancel-match based on flags · b05a1bcd
      Committed by Pavel Begunkov
      Instead of going into request internals, like checking req->file->f_op,
      match them based on REQ_F_INFLIGHT, which is set only when we want a
      request to be reliably cancelled.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b05a1bcd
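
Flag-based matching of this kind might look roughly like the sketch below. The structure, the flag value and `cancel_match()` are assumptions made for illustration; the only idea taken from the commit is that a request chain matches when some request in it carries REQ_F_INFLIGHT, with no inspection of file operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_F_INFLIGHT (1u << 0)    /* stand-in bit, not the kernel's value */

struct req_model {                  /* hypothetical, reduced request */
    unsigned int flags;
    int task_id;                    /* owner of the request chain */
    struct req_model *link;         /* next linked request, or NULL */
};

/* Decide whether the chain starting at head should be cancelled.
 * When only_inflight is set, match purely on the REQ_F_INFLIGHT flag. */
static bool cancel_match(const struct req_model *head, int task_id,
                         bool only_inflight)
{
    assert(head != NULL);

    if (head->task_id != task_id)
        return false;
    if (!only_inflight)
        return true;

    for (const struct req_model *req = head; req != NULL; req = req->link) {
        if (req->flags & REQ_F_INFLIGHT)
            return true;
    }
    return false;
}
```

Because a linked timeout would carry the flag too, a chain containing one matches in this model even if its head does not.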
  7. 04 Mar, 2021 8 commits
    • J
      io_uring: ensure that threads freeze on suspend · e4b4a13f
      Committed by Jens Axboe
      Alex reports that his system fails to suspend using 5.12-rc1, with the
      following dump:
      
      [  240.650300] PM: suspend entry (deep)
      [  240.650748] Filesystems sync: 0.000 seconds
      [  240.725605] Freezing user space processes ...
      [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
      [  260.739497] task:iou-mgr-446     state:S stack:    0 pid:  516 ppid:   439 flags:0x00004224
      [  260.739504] Call Trace:
      [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
      [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
      [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
      [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
      [  260.739525]  ? __schedule+0x57/0x6d6
      [  260.739529]  ? del_timer_sync+0xb9/0x115
      [  260.739533]  ? schedule+0x63/0xd5
      [  260.739536]  ? schedule_timeout+0x219/0x356
      [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
      [  260.739544]  ? io_wq_manager+0x73/0xb1
      [  260.739549]  ? io_wq_create+0x262/0x262
      [  260.739553]  ? ret_from_fork+0x22/0x30
      [  260.739557] task:iou-mgr-517     state:S stack:    0 pid:  522 ppid:   439 flags:0x00004224
      [  260.739561] Call Trace:
      [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
      [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
      [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
      [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
      [  260.739574]  ? __schedule+0x5b7/0x6d6
      [  260.739578]  ? del_timer_sync+0x70/0x115
      [  260.739581]  ? schedule_timeout+0x211/0x356
      [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
      [  260.739588]  ? io_wq_check_workers+0x15/0x11f
      [  260.739592]  ? io_wq_manager+0x69/0xb1
      [  260.739596]  ? io_wq_create+0x262/0x262
      [  260.739600]  ? ret_from_fork+0x22/0x30
      [  260.739603] task:iou-wrk-517     state:S stack:    0 pid:  523 ppid:   439 flags:0x00004224
      [  260.739607] Call Trace:
      [  260.739609]  ? __schedule+0x5b7/0x6d6
      [  260.739614]  ? schedule+0x63/0xd5
      [  260.739617]  ? schedule_timeout+0x219/0x356
      [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
      [  260.739624]  ? task_thread.isra.0+0x148/0x3af
      [  260.739628]  ? task_thread_unbound+0xa/0xa
      [  260.739632]  ? task_thread_bound+0x7/0x7
      [  260.739636]  ? ret_from_fork+0x22/0x30
      [  260.739647] OOM killer enabled.
      [  260.739648] Restarting tasks ... done.
      [  260.740077] PM: suspend exit
      
      Play nice and ensure that any thread we create will call try_to_freeze()
      at an opportune time so that memory suspend can proceed. For the io-wq
      worker threads, mark them as PF_NOFREEZE. They could potentially be
      blocked for a long time.
      Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
      Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      e4b4a13f
    • P
      io_uring: remove extra in_idle wake up · b23fcf47
      Committed by Pavel Begunkov
      io_dismantle_req() is always followed by io_put_task(), which already does
      proper in_idle wake ups, so we can skip waking the owner task in
      io_dismantle_req(). The rules are simpler now: call io_put_task() shortly
      after ending a request, and it will be fine.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b23fcf47
    • P
      io_uring: inline __io_queue_async_work() · ebf93667
      Committed by Pavel Begunkov
      __io_queue_async_work() is only called from io_queue_async_work(),
      inline it.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ebf93667
    • P
      io_uring: inline io_req_clean_work() · f85c310a
      Committed by Pavel Begunkov
      Inline io_req_clean_work(): less code, and it is easier to analyse
      tctx dependencies and refs usage.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f85c310a
    • P
      io_uring: choose right tctx->io_wq for try cancel · 64c72123
      Committed by Pavel Begunkov
      When we cancel SQPOLL, @task in io_uring_try_cancel_requests() will
      differ from current. Use the right tctx from passed in @task, and don't
      forget that it can be NULL when the io_uring ctx exits.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      64c72123
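
The lookup this commit describes can be sketched as follows: take the per-task context from the task that was passed in rather than from the running task, and tolerate a NULL task. All types and `try_cancel_io_wq()` are hypothetical reductions; the counter merely records that a cancel was attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct io_wq_model { int cancel_calls; };           /* hypothetical worker pool */
struct tctx_model { struct io_wq_model *io_wq; };   /* hypothetical per-task context */
struct task_model { struct tctx_model *io_uring; }; /* hypothetical task */

/* Cancel work on the io-wq that belongs to the given task.
 * task may be NULL, for example when the ring context itself is exiting. */
static bool try_cancel_io_wq(struct task_model *task)
{
    /* Use the tctx of the task being cancelled, not of the caller. */
    struct tctx_model *tctx = task ? task->io_uring : NULL;

    if (tctx == NULL || tctx->io_wq == NULL)
        return false;   /* nothing to cancel through this path */

    tctx->io_wq->cancel_calls++;
    return true;
}
```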
    • J
      io_uring: fix -EAGAIN retry with IOPOLL · 3e6a0d3c
      Committed by Jens Axboe
      We no longer revert the iovec on -EIOCBQUEUED, see commit ab2125df,
      and this started causing issues for IOPOLL on devices that run out of
      request slots. It turns out that, apart from needing a revert for those, we
      also had a bug where we didn't properly set up retry inside the submission
      path. That could cause re-import of the iovec, if any, and that could lead
      to spurious results if the application had those allocated on the stack.
      
      Catch -EAGAIN retry and make the iovec stable for IOPOLL, just like we do
      for !IOPOLL retries.
      
      Cc: <stable@vger.kernel.org> # 5.9+
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Reported-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      3e6a0d3c
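
To illustrate what "making the iovec stable" means in general terms, the userspace sketch below copies an iovec array off the caller's stack before a retry, so the retry no longer depends on memory the caller may have reused. `iovec_make_stable()` is a hypothetical helper, not the kernel's mechanism, which keeps its own per-request state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Return a heap copy of the iovec array, or NULL if count is 0 or the
 * allocation fails. The caller owns the copy and must free() it. */
static struct iovec *iovec_make_stable(const struct iovec *src, size_t count)
{
    assert(src != NULL || count == 0);

    if (count == 0)
        return NULL;

    struct iovec *copy = malloc(count * sizeof(*copy));
    if (copy != NULL)
        memcpy(copy, src, count * sizeof(*copy));  /* copies descriptors, not the data buffers */

    return copy;
}
```

Only the array of descriptors is copied; the data buffers they point at still have to stay valid until the I/O completes.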
    • P
      io_uring: remove sqo_task · 16270893
      Committed by Pavel Begunkov
      sqo_task is now used only for a warning that is no longer interesting
      since sqo_dead is gone. Remove all of that, including ctx->sqo_task.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      16270893
    • P
      io_uring: kill sqo_dead and sqo submission halting · 70aacfe6
      Committed by Pavel Begunkov
      As the SQPOLL task doesn't poke into ->sqo_task anymore, there is no need
      to kill the sqo when the master task exits. Previously this was necessary
      to avoid racing accesses to sqo_task->files against their removal.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      [axboe: don't forget to enable SQPOLL before exit, if started disabled]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      70aacfe6