1. 22 Feb, 2021 2 commits
    • io_uring: clear request count when freeing caches · 8e5c66c4
      Committed by Pavel Begunkov
      BUG: KASAN: double-free or invalid-free in io_req_caches_free.constprop.0+0x3ce/0x530 fs/io_uring.c:8709
      
      Workqueue: events_unbound io_ring_exit_work
      Call Trace:
       [...]
       __cache_free mm/slab.c:3424 [inline]
       kmem_cache_free_bulk+0x4b/0x1b0 mm/slab.c:3744
       io_req_caches_free.constprop.0+0x3ce/0x530 fs/io_uring.c:8709
       io_ring_ctx_free fs/io_uring.c:8764 [inline]
       io_ring_exit_work+0x518/0x6b0 fs/io_uring.c:8846
       process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
       worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
       kthread+0x3b1/0x4a0 kernel/kthread.c:292
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
      
      Freed by task 11900:
       [...]
       kmem_cache_free_bulk+0x4b/0x1b0 mm/slab.c:3744
       io_req_caches_free.constprop.0+0x3ce/0x530 fs/io_uring.c:8709
       io_uring_flush+0x483/0x6e0 fs/io_uring.c:9237
       filp_close+0xb4/0x170 fs/open.c:1286
       close_files fs/file.c:403 [inline]
       put_files_struct fs/file.c:418 [inline]
       put_files_struct+0x1d0/0x350 fs/file.c:415
       exit_files+0x7e/0xa0 fs/file.c:435
       do_exit+0xc27/0x2ae0 kernel/exit.c:820
       do_group_exit+0x125/0x310 kernel/exit.c:922
       [...]
      
      io_req_caches_free() doesn't zero submit_state->free_reqs, so io_uring
      considers just freed requests to be good and sound and will reuse or
      double free them. Zero the counter.
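The failure mode described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct req_cache`, `cache_fill()` and `cache_free_all()` are hypothetical stand-ins, not the kernel's data structures, and `malloc()`/`free()` replace the slab allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 8

/* Hypothetical stand-in for a per-ring cache of preallocated requests. */
struct req_cache {
	void *reqs[CACHE_SLOTS];   /* cached, currently unused requests */
	unsigned int free_reqs;    /* number of valid entries in reqs[] */
};

/* Top the cache up with dummy allocations. */
static void cache_fill(struct req_cache *c)
{
	assert(c->free_reqs <= CACHE_SLOTS);
	while (c->free_reqs < CACHE_SLOTS) {
		void *p = malloc(64);

		if (!p)
			break;
		c->reqs[c->free_reqs++] = p;
	}
}

/* Free every cached request and return how many were released. */
static unsigned int cache_free_all(struct req_cache *c)
{
	unsigned int i, n = c->free_reqs;

	for (i = 0; i < n; i++)
		free(c->reqs[i]);

	/*
	 * The counter is the only thing that marks reqs[] entries as valid.
	 * If it were left at n, a second call would free the same stale
	 * pointers again.
	 */
	c->free_reqs = 0;
	return n;
}
```

With the counter reset, calling `cache_free_all()` twice in a row is harmless: the second call sees zero valid entries and frees nothing.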
      
      Reported-by: syzbot+30b4936dcdb3aafa4fb4@syzkaller.appspotmail.com
      Fixes: 41be53e9 ("io_uring: kill cached requests from exiting task closing the ring")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: run task_work on io_uring_register() · b6c23dd5
      Committed by Pavel Begunkov
      Run task_work before io_uring_register(); that might make the first
      quiesce round much nicer. We generally do that for any syscall invocation
      to avoid spurious -EINTR/-ERESTARTSYS for task_work that we generate.
      This patch brings io_uring_register() in line with the two other io_uring
      syscalls.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 21 Feb, 2021 7 commits
    • io_uring: fix leaving invalid req->flags · ebf4a5db
      Committed by Pavel Begunkov
      sqe->flags are a subset of req flags, so if copied incorrectly they may
      spill into in-kernel flags and wreak havoc, e.g. by setting REQ_F_INFLIGHT.
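A minimal sketch of the idea, assuming a hypothetical flag layout: the names `USER_VALID_FLAGS`, `INTERNAL_INFLIGHT` and `init_req_flags()` are invented for illustration and do not reproduce the kernel's definitions. The point is that user-supplied bits are validated first, and a rejected request is left with no flags set rather than with whatever the user passed in.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical layout: bits 0-5 are user-settable, bit 6 is internal. */
#define USER_FLAG_LINK     (1u << 0)
#define USER_FLAG_ASYNC    (1u << 1)
#define USER_VALID_FLAGS   0x3fu
#define INTERNAL_INFLIGHT  (1u << 6)

struct request {
	uint32_t flags;   /* user bits plus internal bits */
};

/*
 * Copy user flags into the request only after checking them against the
 * valid mask. On rejection the request's flags are cleared, so no
 * user-chosen bit can masquerade as an internal flag later on.
 */
static int init_req_flags(struct request *req, uint8_t user_flags)
{
	if (user_flags & ~USER_VALID_FLAGS) {
		req->flags = 0;
		return -EINVAL;
	}
	req->flags = user_flags;
	return 0;
}
```

Passing `USER_FLAG_LINK | INTERNAL_INFLIGHT` to `init_req_flags()` returns `-EINVAL` and leaves `req->flags` at zero, so cleanup code that inspects internal bits sees none.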
      
      Fixes: 5be9ad1e ("io_uring: optimise io_init_req() flags setting")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: wait potential ->release() on resurrect · 88f171ab
      Committed by Pavel Begunkov
      There is a short window where percpu_refs have already dropped to zero, but
      we try to do resurrect(). Play nicer and wait for ->release() to happen
      in this case and proceed as if everything is ok. One downside for ctx refs
      is that we can ignore signal_pending() on a rare occasion, but someone
      else should check for it later if needed.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: keep generic rsrc infra generic · f2303b1f
      Committed by Pavel Begunkov
      io_rsrc_ref_quiesce() is a generic resource function, though until now it
      was wired to allocate and initialise ref nodes with file-specific
      callbacks etc. Keep it sane by passing in as parameters everything we
      need for initialisation, otherwise it will hurt us badly one day.
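The design point can be sketched as follows. All names here (`rsrc_quiesce()`, `node_init_fn`, `file_node_init()`) are hypothetical and only illustrate passing resource-specific initialisation into a generic helper; they are not the kernel's signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Generic reference node: knows nothing about files. */
struct ref_node {
	void (*put)(void *ctx);   /* resource-specific release callback */
	void *ctx;                /* opaque data handed to put() */
};

/* The caller supplies how a fresh node gets initialised. */
typedef void (*node_init_fn)(struct ref_node *node, void *ctx);

/* Generic helper: no file-specific knowledge is hard-wired here. */
static int rsrc_quiesce(struct ref_node *node, node_init_fn init, void *ctx)
{
	if (!node || !init)
		return -1;
	init(node, ctx);
	return 0;
}

/* File-specific pieces live with the file code, not in the helper. */
static void file_put(void *ctx)
{
	int *puts = ctx;

	(*puts)++;
}

static void file_node_init(struct ref_node *node, void *ctx)
{
	assert(node != NULL);
	node->put = file_put;
	node->ctx = ctx;
}
```

A different resource type would pass its own init function to `rsrc_quiesce()` without touching the helper itself.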
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: zero ref_node after killing it · e6cb007c
      Committed by Pavel Begunkov
      After a rsrc/files reference node's refs are killed, it must never be
      used. And that's how it works: the code either assigns a new node or
      kills the whole data table.

      Let's explicitly NULL it. That shouldn't be necessary, but if something
      goes wrong I'd rather catch a NULL dereference than use a dangling
      pointer.
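The defensive pattern reads roughly like this in plain C. It is a sketch with hypothetical types (`struct rsrc_data`, `data_kill_node()`, `data_install_node()`), using `free()` in place of killing percpu refs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ref_node {
	int refs;
};

struct rsrc_data {
	struct ref_node *node;   /* current node, or NULL once killed */
};

/* Install a fresh node; the slot must be empty at this point. */
static int data_install_node(struct rsrc_data *d)
{
	assert(d->node == NULL);
	d->node = calloc(1, sizeof(*d->node));
	if (!d->node)
		return -1;
	d->node->refs = 1;
	return 0;
}

/*
 * Kill the node and clear the pointer. Any later accidental use then
 * dereferences NULL and fails loudly, instead of silently touching
 * freed memory.
 */
static void data_kill_node(struct rsrc_data *d)
{
	free(d->node);
	d->node = NULL;
}
```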
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: make the !CONFIG_NET helpers a bit more robust · 99a10081
      Committed by Jens Axboe
      With the prep and prep async split, we now have potentially 3 helpers
      that need to be defined for !CONFIG_NET. Add some helpers to do just
      that.
      
      Fixes the following compile error on !CONFIG_NET:
      
      fs/io_uring.c:6171:10: error: implicit declaration of function
      'io_sendmsg_prep_async'; did you mean 'io_req_prep_async'?
      [-Werror=implicit-function-declaration]
         return io_sendmsg_prep_async(req);
                   ^~~~~~~~~~~~~~~~~~~~~
      	     io_req_prep_async
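One way to generate all three stubs per operation is a small macro. The sketch below is an assumption-laden illustration: `CONFIG_NET_SKETCH`, `NETOP_STUBS()` and `struct request` are hypothetical names, and the stubs simply return `-EOPNOTSUPP`.

```c
#include <assert.h>
#include <errno.h>

struct request {
	int opcode;
};

#ifdef CONFIG_NET_SKETCH
/* With networking built in, the real implementations live elsewhere. */
int sendmsg_prep(struct request *req);
int sendmsg_prep_async(struct request *req);
int sendmsg_issue(struct request *req);
int recvmsg_prep(struct request *req);
int recvmsg_prep_async(struct request *req);
int recvmsg_issue(struct request *req);
#else
/* Emit prep, prep_async and issue stubs for one operation in one go. */
#define NETOP_STUBS(op)                                         \
static inline int op##_prep(struct request *req)                \
{                                                               \
	(void)req;                                              \
	return -EOPNOTSUPP;                                     \
}                                                               \
static inline int op##_prep_async(struct request *req)          \
{                                                               \
	(void)req;                                              \
	return -EOPNOTSUPP;                                     \
}                                                               \
static inline int op##_issue(struct request *req)               \
{                                                               \
	(void)req;                                              \
	return -EOPNOTSUPP;                                     \
}

NETOP_STUBS(sendmsg)
NETOP_STUBS(recvmsg)
#endif

/* Callers compile the same way in both configurations. */
static inline int req_prep_async(struct request *req)
{
	return sendmsg_prep_async(req);
}
```

Because every helper for an operation comes from the same macro, adding a fourth helper later means touching one place instead of one stub per operation.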
      
      Fixes: 93642ef8 ("io_uring: split sqe-prep and async setup")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: don't hold uring_lock when calling io_run_task_work* · 8bad28d8
      Committed by Hao Xu
      Abaci reported the below issue:
      [  141.400455] hrtimer: interrupt took 205853 ns
      [  189.869316] process 'usr/local/ilogtail/ilogtail_0.16.26' started with executable stack
      [  250.188042]
      [  250.188327] ============================================
      [  250.189015] WARNING: possible recursive locking detected
      [  250.189732] 5.11.0-rc4 #1 Not tainted
      [  250.190267] --------------------------------------------
      [  250.190917] a.out/7363 is trying to acquire lock:
      [  250.191506] ffff888114dbcbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __io_req_task_submit+0x29/0xa0
      [  250.192599]
      [  250.192599] but task is already holding lock:
      [  250.193309] ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
      [  250.194426]
      [  250.194426] other info that might help us debug this:
      [  250.195238]  Possible unsafe locking scenario:
      [  250.195238]
      [  250.196019]        CPU0
      [  250.196411]        ----
      [  250.196803]   lock(&ctx->uring_lock);
      [  250.197420]   lock(&ctx->uring_lock);
      [  250.197966]
      [  250.197966]  *** DEADLOCK ***
      [  250.197966]
      [  250.198837]  May be due to missing lock nesting notation
      [  250.198837]
      [  250.199780] 1 lock held by a.out/7363:
      [  250.200373]  #0: ffff888114dbfbe8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_register+0xad/0x210
      [  250.201645]
      [  250.201645] stack backtrace:
      [  250.202298] CPU: 0 PID: 7363 Comm: a.out Not tainted 5.11.0-rc4 #1
      [  250.203144] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  250.203887] Call Trace:
      [  250.204302]  dump_stack+0xac/0xe3
      [  250.204804]  __lock_acquire+0xab6/0x13a0
      [  250.205392]  lock_acquire+0x2c3/0x390
      [  250.205928]  ? __io_req_task_submit+0x29/0xa0
      [  250.206541]  __mutex_lock+0xae/0x9f0
      [  250.207071]  ? __io_req_task_submit+0x29/0xa0
      [  250.207745]  ? 0xffffffffa0006083
      [  250.208248]  ? __io_req_task_submit+0x29/0xa0
      [  250.208845]  ? __io_req_task_submit+0x29/0xa0
      [  250.209452]  ? __io_req_task_submit+0x5/0xa0
      [  250.210083]  __io_req_task_submit+0x29/0xa0
      [  250.210687]  io_async_task_func+0x23d/0x4c0
      [  250.211278]  task_work_run+0x89/0xd0
      [  250.211884]  io_run_task_work_sig+0x50/0xc0
      [  250.212464]  io_sqe_files_unregister+0xb2/0x1f0
      [  250.213109]  __io_uring_register+0x115a/0x1750
      [  250.213718]  ? __x64_sys_io_uring_register+0xad/0x210
      [  250.214395]  ? __fget_files+0x15a/0x260
      [  250.214956]  __x64_sys_io_uring_register+0xbe/0x210
      [  250.215620]  ? trace_hardirqs_on+0x46/0x110
      [  250.216205]  do_syscall_64+0x2d/0x40
      [  250.216731]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  250.217455] RIP: 0033:0x7f0fa17e5239
      [  250.218034] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05  3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
      [  250.220343] RSP: 002b:00007f0fa1eeac48 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
      [  250.221360] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0fa17e5239
      [  250.222272] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000008
      [  250.223185] RBP: 00007f0fa1eeae20 R08: 0000000000000000 R09: 0000000000000000
      [  250.224091] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [  250.224999] R13: 0000000000021000 R14: 0000000000000000 R15: 00007f0fa1eeb700
      
      This is caused by calling io_run_task_work_sig() to run work that takes
      uring_lock while the caller, io_sqe_files_unregister(), already holds
      uring_lock.
      To fix this issue, briefly drop uring_lock when calling
      io_run_task_work_sig(). Three things need care:

      - hold uring_lock in io_ring_ctx_free() around io_sqe_files_unregister()
          this is for consistency of lock/unlock.
      - add new fixed rsrc ref node before dropping uring_lock
          it's not safe to do io_uring_enter-->percpu_ref_get() with a dying one.
      - check if rsrc_data->refs is dying to avoid parallel io_sqe_files_unregister
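The lock-drop pattern from the fix can be modelled in a few lines of userspace C. This is a sketch only: `struct ctx`, `ctx_lock()`, `run_task_work()` and `unregister_files()` are hypothetical, and a boolean with assertions stands in for a non-recursive mutex so that a recursive acquire trips an assertion instead of deadlocking.

```c
#include <assert.h>
#include <stdbool.h>

struct ctx {
	bool locked;        /* stand-in for a non-recursive mutex */
	int pending_work;   /* deferred work items waiting to run */
	int completed;      /* work items that have run */
};

static void ctx_lock(struct ctx *c)
{
	assert(!c->locked);   /* a recursive acquire would deadlock */
	c->locked = true;
}

static void ctx_unlock(struct ctx *c)
{
	assert(c->locked);
	c->locked = false;
}

/* Each deferred work item takes the lock on its own. */
static void run_task_work(struct ctx *c)
{
	while (c->pending_work > 0) {
		ctx_lock(c);
		c->pending_work--;
		c->completed++;
		ctx_unlock(c);
	}
}

/* Called with the lock held; returns with the lock held. */
static void unregister_files(struct ctx *c)
{
	assert(c->locked);
	ctx_unlock(c);      /* drop the lock so deferred work can take it */
	run_task_work(c);
	ctx_lock(c);        /* reacquire before touching shared state */
}
```

Anything the caller relied on before dropping the lock has to be revalidated after reacquiring it, which is why the fix also adds the ref node up front and checks for a concurrent unregister.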
      Reported-by: Abaci <abaci@linux.alibaba.com>
      Fixes: 1ffc5422 ("io_uring: fix io_sqe_files_unregister() hangs")
      Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
      [axboe: fixes from Pavel folded in]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fail io-wq submission from a task_work · a3df7698
      Committed by Pavel Begunkov
      In case of failure io_wq_submit_work() needs to post a CQE and so
      potentially take uring_lock. The safest way to deal with it is to do
      that from task_work, where we can safely take the lock.

      Also, as io_iopoll_check() holds the lock tight and releases it
      reluctantly, it will play nicer in the future with notifying an
      iopolling task about such new pending failed requests.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 19 Feb, 2021 12 commits
  4. 18 Feb, 2021 1 commit
  5. 17 Feb, 2021 1 commit
  6. 16 Feb, 2021 1 commit
  7. 14 Feb, 2021 3 commits
  8. 13 Feb, 2021 4 commits
  9. 12 Feb, 2021 9 commits