1. 11 Jan 2021, 1 commit
  2. 10 Jan 2021, 4 commits
    • io_uring: stop SQPOLL submit on creator's death · d9d05217
      Committed by Pavel Begunkov
      When the creator of an SQPOLL io_uring dies (i.e. sqo_task), we don't
      want its internals like ->files and ->mm to be poked by the SQPOLL
      task; that has never been safe and recently became racy. It can happen
      when the owner undergoes destruction while the SQPOLL task tries to
      submit new requests in parallel, and so calls io_sq_thread_acquire*().
      
      This patch halts SQPOLL submissions when sqo_task dies by introducing
      an sqo_dead flag. Once it is set, the SQPOLL task must not do any
      submission, which is synchronised by uring_lock as well as the new flag.
      
      The tricky part is to make sure that disabling always happens: either
      the ring is discovered by the creator's do_exit() -> cancel, or the
      final close() happens before that is done by the creator. The latter
      is guaranteed by the fact that for SQPOLL the creator task, and only
      it, holds exactly one file note, so either that note pins the ring up
      to do_exit() or it is removed by the creator on the final put in flush
      (see the comments in uring_flush() around file->f_count == 2).
      
      One more place that can trigger io_sq_thread_acquire_*() is
      __io_req_task_submit(). Shoot off requests on sqo_dead there, even
      though we don't strictly need to, because cancellation of sqo_task
      should wait for the request before going any further.
      
      note 1: io_disable_sqo_submit() does io_ring_set_wakeup_flag() so the
      caller would enter the ring to get an error, but it still doesn't
      guarantee that the flag won't be cleared.
      
      note 2: if final __userspace__ close happens not from the creator
      task, the file note will pin the ring until the task dies.
      
      Fixes: b1b6b5a3 ("kernel/io_uring: cancel io_uring before task works")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
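The lock-plus-flag scheme above can be sketched in a few lines. This is a minimal userspace sketch of the pattern, not the kernel code: the names `sqo_dead`, `uring_lock`, and the functions are illustrative, and a plain `-1` stands in for the real error code. The point is that the flag is only read and written under the lock, so once the creator sets it, the poller can never start another submission.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct ring {
    pthread_mutex_t uring_lock;
    bool sqo_dead;
};

void ring_init(struct ring *r)
{
    pthread_mutex_init(&r->uring_lock, NULL);
    r->sqo_dead = false;
}

/* Creator's exit path: halt all future SQPOLL submissions. */
void disable_sqo_submit(struct ring *r)
{
    pthread_mutex_lock(&r->uring_lock);
    r->sqo_dead = true;
    pthread_mutex_unlock(&r->uring_lock);
}

/* SQPOLL task: refuse to submit (and so to touch the dead creator's
 * ->mm/->files) once sqo_dead is observed under the lock. */
int sqpoll_submit(struct ring *r)
{
    int ret = 0;

    pthread_mutex_lock(&r->uring_lock);
    if (r->sqo_dead)
        ret = -1;   /* owner is gone: no more submissions */
    /* else: safe to borrow the creator's context and submit here */
    pthread_mutex_unlock(&r->uring_lock);
    return ret;
}
```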
    • io_uring: add warn_once for io_uring_flush() · 6b5733eb
      Committed by Pavel Begunkov
      files_cancel() should cancel all relevant requests and drop file notes,
      so we should never have file notes after that, including on-exit fput
      and flush. Add a WARN_ONCE to be sure.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
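A userspace approximation of the kernel's WARN_ONCE() pattern used here: flag a violated invariant the first time it happens, then stay quiet. The `warn_count` counter stands in for the kernel log, and `flush_check()` is an invented name for the invariant the commit describes (no file notes may remain after files_cancel()).

```c
#include <assert.h>
#include <stdbool.h>

static int warn_count;   /* stands in for the kernel log in this sketch */

#define WARN_ONCE(cond) ({                              \
    static bool warned;                                 \
    bool c = (cond);                                    \
    if (c && !warned) { warned = true; warn_count++; }  \
    c;                                                  \
})

/* files_cancel() should already have dropped every file note, so any
 * notes left at flush time are a bug worth exactly one loud warning. */
static bool flush_check(int file_notes_left)
{
    return WARN_ONCE(file_notes_left != 0);
}
```

The statement-expression form of the macro mirrors the kernel's, so the condition's value can still be used by the caller.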
    • io_uring: inline io_uring_attempt_task_drop() · 4f793dc4
      Committed by Pavel Begunkov
      A simple preparation change inlining io_uring_attempt_task_drop() into
      io_uring_flush().
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: io_rw_reissue lockdep annotations · 55e6ac1e
      Committed by Pavel Begunkov
      We expect io_rw_reissue() to take place only during submission with
      uring_lock held. Add a lockdep annotation to check that invariant.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
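The invariant being annotated ("uring_lock must be held here") can be sketched in userspace with an owner-tracking mutex. This is only an illustration of what a `lockdep_assert_held(&ctx->uring_lock)` check verifies; the kernel's lockdep does this with far more machinery, and all names below are invented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct checked_mutex {
    pthread_mutex_t m;
    pthread_t owner;
    bool locked;
};

void cm_init(struct checked_mutex *c)
{
    pthread_mutex_init(&c->m, NULL);
    c->locked = false;
}

void cm_lock(struct checked_mutex *c)
{
    pthread_mutex_lock(&c->m);
    c->owner = pthread_self();
    c->locked = true;
}

void cm_unlock(struct checked_mutex *c)
{
    c->locked = false;
    pthread_mutex_unlock(&c->m);
}

/* The invariant io_rw_reissue() now asserts: the *current* thread
 * holds the lock, not merely that somebody does. */
bool cm_assert_held(struct checked_mutex *c)
{
    return c->locked && pthread_equal(c->owner, pthread_self());
}
```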
  3. 07 Jan 2021, 4 commits
  4. 06 Jan 2021, 1 commit
  5. 05 Jan 2021, 4 commits
  6. 31 Dec 2020, 2 commits
  7. 30 Dec 2020, 1 commit
    • io_uring: don't assume mm is constant across submits · 77788775
      Committed by Jens Axboe
      If we COW the identity, we assume that ->mm never changes. But this
      isn't true if multiple processes end up sharing the ring. Hence treat
      id->mm like any other process component when it comes to the identity
      mapping. This is pretty trivial: just move the existing grab into
      io_grab_identity(), and include a check for the match.
      
      Cc: stable@vger.kernel.org # 5.10
      Fixes: 1e6fa521 ("io_uring: COW io_identity on mismatch")
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
      Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
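The idea of the fix, reduced to its core: the cached identity remembers which mm it was created for, and the grab path must compare it against the submitting task's mm instead of assuming they match. A minimal sketch under invented struct names (these are not the kernel's structures):

```c
#include <assert.h>
#include <stdbool.h>

struct identity { void *mm; };   /* what the ring cached */
struct task     { void *mm; };   /* whoever is submitting now */

/* Returns false on a mismatch, telling the caller it must COW a
 * fresh identity for this task rather than reuse the cached one. */
bool grab_identity(const struct identity *id, const struct task *submitter)
{
    return id->mm == submitter->mm;
}
```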
  8. 23 Dec 2020, 2 commits
  9. 22 Dec 2020, 1 commit
  10. 21 Dec 2020, 3 commits
  11. 19 Dec 2020, 1 commit
  12. 18 Dec 2020, 1 commit
    • io_uring: close a small race gap for files cancel · dfea9fce
      Committed by Pavel Begunkov
      The purpose of io_uring_cancel_files() is to wait for all requests
      matching ->files to go/be cancelled. We should first drop files of a
      request in io_req_drop_files() and only then make it undiscoverable for
      io_uring_cancel_files.
      
      First drop, then delete from the list. It's ok to leave req->id->files
      dangling, because it's not dereferenced by the cancellation code, only
      compared against. The waiter would potentially go to sleep and be
      awoken by the wake_up() that follows in io_req_drop_files().
      
      Fixes: 0f212204 ("io_uring: don't rely on weak ->files references")
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
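The ordering can be made concrete with a tiny sketch (all names invented for illustration): a request must stay discoverable on the inflight list until its files have been dropped, so a cancelling task matching ->files either still finds the request or sees its files already gone.

```c
#include <assert.h>
#include <stdbool.h>

struct req {
    bool on_list;               /* discoverable by cancellation */
    bool has_files;
    bool discoverable_at_drop;  /* recorded here just for the sketch's check */
};

static void io_req_drop_files(struct req *r)
{
    /* the request is still on the list at this point, so cancellation
     * can find it; this is also where the wake_up() would happen */
    r->discoverable_at_drop = r->on_list;
    r->has_files = false;
}

static void req_finish(struct req *r)
{
    io_req_drop_files(r);   /* first drop the files...            */
    r->on_list = false;     /* ...only then delete from the list  */
}
```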
  13. 17 Dec 2020, 7 commits
  14. 13 Dec 2020, 2 commits
  15. 11 Dec 2020, 1 commit
  16. 10 Dec 2020, 5 commits
    • io_uring: fix io_cqring_events()'s noflush · 59850d22
      Committed by Pavel Begunkov
      Checking !list_empty(&ctx->cq_overflow_list) around noflush in
      io_cqring_events() is racy, because if it fails but a request
      overflows just after that, io_cqring_overflow_flush() will still be
      called.
      
      Remove the second check; it shouldn't be a problem for performance,
      because there is a cq_check_overflow bit check just above.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix racy IOPOLL flush overflow · 634578f8
      Committed by Pavel Begunkov
      It's not safe to call io_cqring_overflow_flush() for IOPOLL mode
      without holding uring_lock, because it does synchronisation
      differently. Make sure we have it.
      
      As for io_ring_exit_work(), we don't even need it there, because
      io_ring_ctx_wait_and_kill() already sets the force flag, making all
      overflowed requests be dropped.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix racy IOPOLL completions · 31bff9a5
      Committed by Pavel Begunkov
      IOPOLL allows buffer remove/provide requests, but they don't
      synchronise by the rules of IOPOLL, namely they have to hold
      uring_lock.
      
      Cc: <stable@vger.kernel.org> # 5.7+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: always let io_iopoll_complete() complete polled io · dad1b124
      Committed by Xiaoguang Wang
      Abaci Fuzz reported a double-free or invalid-free BUG in io_commit_cqring():
      [   95.504842] BUG: KASAN: double-free or invalid-free in io_commit_cqring+0x3ec/0x8e0
      [   95.505921]
      [   95.506225] CPU: 0 PID: 4037 Comm: io_wqe_worker-0 Tainted: G    B W         5.10.0-rc5+ #1
      [   95.507434] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [   95.508248] Call Trace:
      [   95.508683]  dump_stack+0x107/0x163
      [   95.509323]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.509982]  print_address_description.constprop.0+0x3e/0x60
      [   95.510814]  ? vprintk_func+0x98/0x140
      [   95.511399]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.512036]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.512733]  kasan_report_invalid_free+0x51/0x80
      [   95.513431]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.514047]  __kasan_slab_free+0x141/0x160
      [   95.514699]  kfree+0xd1/0x390
      [   95.515182]  io_commit_cqring+0x3ec/0x8e0
      [   95.515799]  __io_req_complete.part.0+0x64/0x90
      [   95.516483]  io_wq_submit_work+0x1fa/0x260
      [   95.517117]  io_worker_handle_work+0xeac/0x1c00
      [   95.517828]  io_wqe_worker+0xc94/0x11a0
      [   95.518438]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.519151]  ? __kthread_parkme+0x11d/0x1d0
      [   95.519806]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.520512]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.521211]  kthread+0x396/0x470
      [   95.521727]  ? _raw_spin_unlock_irq+0x24/0x30
      [   95.522380]  ? kthread_mod_delayed_work+0x180/0x180
      [   95.523108]  ret_from_fork+0x22/0x30
      [   95.523684]
      [   95.523985] Allocated by task 4035:
      [   95.524543]  kasan_save_stack+0x1b/0x40
      [   95.525136]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   95.525882]  kmem_cache_alloc_trace+0x17b/0x310
      [   95.533930]  io_queue_sqe+0x225/0xcb0
      [   95.534505]  io_submit_sqes+0x1768/0x25f0
      [   95.535164]  __x64_sys_io_uring_enter+0x89e/0xd10
      [   95.535900]  do_syscall_64+0x33/0x40
      [   95.536465]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   95.537199]
      [   95.537505] Freed by task 4035:
      [   95.538003]  kasan_save_stack+0x1b/0x40
      [   95.538599]  kasan_set_track+0x1c/0x30
      [   95.539177]  kasan_set_free_info+0x1b/0x30
      [   95.539798]  __kasan_slab_free+0x112/0x160
      [   95.540427]  kfree+0xd1/0x390
      [   95.540910]  io_commit_cqring+0x3ec/0x8e0
      [   95.541516]  io_iopoll_complete+0x914/0x1390
      [   95.542150]  io_do_iopoll+0x580/0x700
      [   95.542724]  io_iopoll_try_reap_events.part.0+0x108/0x200
      [   95.543512]  io_ring_ctx_wait_and_kill+0x118/0x340
      [   95.544206]  io_uring_release+0x43/0x50
      [   95.544791]  __fput+0x28d/0x940
      [   95.545291]  task_work_run+0xea/0x1b0
      [   95.545873]  do_exit+0xb6a/0x2c60
      [   95.546400]  do_group_exit+0x12a/0x320
      [   95.546967]  __x64_sys_exit_group+0x3f/0x50
      [   95.547605]  do_syscall_64+0x33/0x40
      [   95.548155]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that once we get a non-EAGAIN error in
      io_wq_submit_work(), we'll complete the req by calling
      io_req_complete(), which will hold completion_lock to call
      io_commit_cqring(). But for polled io, io_iopoll_complete() won't hold
      completion_lock to call io_commit_cqring(), so there may be concurrent
      access to ctx->defer_list, and a double free may happen.
      
      To fix this bug, we always let io_iopoll_complete() complete polled io.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Reported-by: Abaci Fuzz <abaci@linux.alibaba.com>
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add timeout update · 9c8e11b3
      Committed by Pavel Begunkov
      Support timeout updates through IORING_OP_TIMEOUT_REMOVE with
      IORING_TIMEOUT_UPDATE passed in. Updates don't support offset timeout
      mode; the original timeout.off will be ignored as well.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      [axboe: remove now unused 'ret' variable]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
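From userspace, the new interface is driven by filling an SQE as described above. The sketch below mirrors only the fields this op uses in a stripped-down struct; the opcode and flag values are assumptions taken from the uapi header of that era, and the field placement (user_data in addr, new timespec pointer in off/addr2) is likewise an assumption — check <linux/io_uring.h> for your kernel before relying on any of it.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define IORING_OP_TIMEOUT_REMOVE 12         /* assumed opcode value */
#define IORING_TIMEOUT_UPDATE    (1U << 1)  /* assumed flag value   */

/* Just the fields a timeout update uses, not the full io_uring_sqe. */
struct sqe_subset {
    unsigned char      opcode;
    unsigned long long addr;          /* user_data of the timeout to update */
    unsigned long long off;           /* pointer to the new timespec */
    unsigned int       timeout_flags;
};

void prep_timeout_update(struct sqe_subset *sqe,
                         unsigned long long target_user_data,
                         const struct timespec *new_ts)
{
    sqe->opcode = IORING_OP_TIMEOUT_REMOVE;
    sqe->addr = target_user_data;
    sqe->off = (unsigned long long)(uintptr_t)new_ts;
    /* offset timeout mode is not supported: the original timeout.off
     * is ignored by an update */
    sqe->timeout_flags = IORING_TIMEOUT_UPDATE;
}
```

In real code, liburing's prep helpers do this work; the sketch only shows which pieces of state the operation consumes.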