Commits · cf27f3b14961845d816c49abc99aae4863207c77 · openeuler / Kernel

12 Apr, 2021 21 commits

io_uring: optimise tctx node checks/alloc · cf27f3b1

Committed by Pavel Begunkov on Mar 19, 2021

First of all, we need to set tctx->sqpoll only when we add a new entry
into ->xa, so move it out of the hot path. Also extract a hot path for
io_uring_add_task_file() as an inline helper.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cf27f3b1

io_uring: optimise io_uring_enter() · 33f993da

Committed by Pavel Begunkov on Mar 19, 2021

Add unlikely annotations, because my compiler pretty much mispredicts
every first check, and apart from jumping around in the fast path, it also
generates extra instructions, like setting the ret value in advance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

33f993da
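
For readers unfamiliar with the annotation mentioned in the commit above, here is a minimal userspace sketch of the idea. It assumes GCC or Clang (`__builtin_expect` is a compiler builtin); `enter_sketch()`, its parameters, and the chosen error codes are hypothetical and are not taken from io_uring_enter().

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's likely()/unlikely() macros. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical entry point: the error checks are marked unlikely so the
 * compiler can lay out the success path as the straight-line fall-through.
 */
static int enter_sketch(unsigned int flags, unsigned int valid_mask,
			const void *ring)
{
	if (unlikely(flags & ~valid_mask))
		return -EINVAL;	/* unknown flag bits */
	if (unlikely(ring == NULL))
		return -EBADF;	/* no ring to operate on */
	return 0;
}
```

The annotation only changes code layout and static branch hints; the function's results are identical with or without it.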

io_uring: don't take ctx refs in task_work handler · 493f3b15

Committed by Pavel Begunkov on Mar 19, 2021

__tctx_task_work() guarantees that ctx won't be killed while running
task_works, so we can remove now unnecessary ctx pinning for internally
armed polling.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

493f3b15

io_uring: transform ret == 0 for poll cancelation completions · 45ab03b1

Committed by Jens Axboe on Feb 23, 2021

We can set canceled == true and complete out-of-line; ensure that we catch
that and correctly return -ECANCELED if the poll operation got canceled.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

45ab03b1

io_uring: correct comment on poll vs iopoll · b9b0e0d3

Committed by Jens Axboe on Feb 23, 2021

The correct function is io_iopoll_complete(), which deals with completions
of IOPOLL requests, not io_poll_complete().
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b9b0e0d3

io_uring: cache async and regular file state for fixed files · 7b29f92d

Committed by Jens Axboe on Mar 12, 2021

We have to dig quite deep to check whether or not a
file supports a fast-path nonblock attempt. For fixed files, we can do
this lookup once and cache the state instead.

This adds two new bits to track whether we support async read/write
attempt, and lines up the REQ_F_ISREG bit with those two. The file slot
re-uses the last 3 (or 2, for 32-bit) bits of the file pointer to cache that
state, and then we mask it in when we go and use a fixed file.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7b29f92d
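
The low-bit caching described in the commit above relies on pointer alignment: an object aligned to at least 4 bytes always has its two lowest address bits clear, so they can carry flags. The following is a minimal userspace sketch of that technique, not the kernel's code. The `SLOT_*` names, their bit positions, and the helper functions are hypothetical, and only two bits are used so the sketch also holds on 32-bit targets.

```c
#include <stdint.h>

/* Hypothetical flag bits; the kernel's names and values may differ. */
#define SLOT_ASYNC_READ  ((uintptr_t)1 << 0)
#define SLOT_ASYNC_WRITE ((uintptr_t)1 << 1)
#define SLOT_MASK        (SLOT_ASYNC_READ | SLOT_ASYNC_WRITE)

/* Store the flags in the (otherwise zero) low bits of an aligned pointer. */
static uintptr_t slot_pack(void *ptr, uintptr_t flags)
{
	return (uintptr_t)ptr | (flags & SLOT_MASK);
}

/* Recover the original pointer by masking the flag bits off. */
static void *slot_ptr(uintptr_t slot)
{
	return (void *)(slot & ~SLOT_MASK);
}

/* Recover the cached flags without touching the pointed-to object. */
static uintptr_t slot_flags(uintptr_t slot)
{
	return slot & SLOT_MASK;
}
```

The benefit is that reading the cached state needs no extra memory and no dereference of the pointed-to object.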

io_uring: don't check for io_uring_fops for fixed files · d44f554e

Committed by Jens Axboe on Mar 12, 2021

We don't allow them at registration time, so limit the check for needing
inflight tracking in io_file_get() to the non-fixed path.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d44f554e

io_uring: simplify io_sqd_update_thread_idle() · c9dca27d

Committed by Pavel Begunkov on Mar 10, 2021

Use a more comprehensible max() instead of hand coding it with ifs in
io_sqd_update_thread_idle().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c9dca27d

io_uring: switch to atomic_t for io_kiocb reference count · abc54d63

Committed by Jens Axboe on Feb 24, 2021

io_uring manipulates references twice for each request, and hence is very
sensitive to performance of the reference count. This commit borrows a
trick from:

commit f958d7b5
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Apr 11 10:06:20 2019 -0700

    mm: make page ref count overflow check tighter and more explicit

and switches to atomic_t for references, while still retaining overflow
and underflow checks.

This is good for a 2-3% increase in peak IOPS on a single core. Before:

IOPS=2970879, IOS/call=31/31, inflight=128 (128)
IOPS=2952597, IOS/call=31/31, inflight=128 (128)
IOPS=2943904, IOS/call=31/31, inflight=128 (128)
IOPS=2930006, IOS/call=31/31, inflight=96 (96)

and after:

IOPS=3054354, IOS/call=31/31, inflight=128 (128)
IOPS=3059038, IOS/call=31/31, inflight=128 (128)
IOPS=3060320, IOS/call=31/31, inflight=128 (128)
IOPS=3068256, IOS/call=31/31, inflight=96 (96)
Signed-off-by: Jens Axboe <axboe@kernel.dk>

abc54d63
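
To illustrate the kind of check the commit above refers to, here is a userspace sketch using C11 atomics. It is modelled on what the referenced mm commit is understood to do; the constant 127, the `struct ref` type, and all function names are assumptions made for illustration and are not copied from io_uring.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
	atomic_int refs;
};

/*
 * True when the count is zero or, viewed as unsigned, within 127 of
 * wrapping around (that is, a small negative value after an underflow).
 */
static bool ref_zero_or_close_to_overflow(struct ref *r)
{
	return (unsigned int)atomic_load(&r->refs) + 127u <= 127u;
}

/* Take a reference; refuse if the counter is already in a bad state. */
static bool ref_get(struct ref *r)
{
	if (ref_zero_or_close_to_overflow(r))
		return false;
	atomic_fetch_add(&r->refs, 1);
	return true;
}

/* Drop a reference; returns true only when the last one was dropped. */
static bool ref_put_and_test(struct ref *r)
{
	if (ref_zero_or_close_to_overflow(r))
		return false;	/* refuse to touch a bad counter */
	return atomic_fetch_sub(&r->refs, 1) == 1;
}
```

A single unsigned comparison covers both the zero case and the underflow case, which keeps the check cheap enough for a hot path.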

io_uring: wrap io_kiocb reference count manipulation in helpers · de9b4cca

Committed by Jens Axboe on Feb 24, 2021

No functional changes in this patch, just in preparation for handling the
references a bit more efficiently.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

de9b4cca

io_uring: simplify io_resubmit_prep() · 179ae0d1

Committed by Pavel Begunkov on Feb 28, 2021

If not for async_data NULL check, io_resubmit_prep() is already an rw
specific version of io_req_prep_async(), but slower because 1) it always
goes through io_import_iovec() even if following io_setup_async_rw() the
result 2) instead of initialising iovec/iter in-place it does it
on-stack and then copies with io_setup_async_rw().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

179ae0d1

io_uring: merge defer_prep() and prep_async() · b7e298d2

Committed by Pavel Begunkov on Feb 28, 2021

Merge the two functions and rename in favour of the second one, as it
conveys the meaning better.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b7e298d2

io_uring: rethink def->needs_async_data · 26f0505a

Committed by Pavel Begunkov on Feb 28, 2021

needs_async_data controls allocation of async_data, and is used in two
cases: 1) when async setup requires it (by io_req_prep_async() or
the handlers themselves), and 2) when an op always needs additional space to
operate, like timeouts do.

Opcode preps already don't bother about the second case and do the
allocation unconditionally, so restrict needs_async_data to the first case
only and rename it to needs_async_setup.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: update for IOPOLL fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

26f0505a

io_uring: untie alloc_async_data and needs_async_data · 6cb78689

Committed by Pavel Begunkov on Feb 28, 2021

All opcode handlers know pretty well whether they need async data or
not, and can skip testing for needs_async_data. The exceptions are rw
and the generic path, but those test the flag by hand anyway. So, check the
flag and make io_alloc_async_data() allocate unconditionally.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6cb78689

io_uring: refactor out send/recv async setup · 2e052d44

Committed by Pavel Begunkov on Feb 28, 2021

IORING_OP_[SEND,RECV] don't need async setup, nor will they get into
io_req_prep_async(). Remove them from io_req_prep_async() and remove
needs_async_data checks from the related setup functions.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2e052d44

io_uring: use better types for cflags · 8c3f9cd1

Committed by Pavel Begunkov on Feb 28, 2021

__io_cqring_fill_event() takes cflags as long to squeeze it into u32 in
a CQE, while all users pass int or unsigned. Replace it with unsigned
int and store it as u32 in struct io_completion to match the CQE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8c3f9cd1

io_uring: refactor provide/remove buffer locking · 9fb8cb49

Committed by Pavel Begunkov on Feb 28, 2021

Always complete the request while holding the mutex instead of doing that
strange dance with conditional ordering.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9fb8cb49

io_uring: add a helper failing not issued requests · f41db273

Committed by Pavel Begunkov on Feb 28, 2021

Add a simple helper doing CQE posting, marking request for link-failure,
and putting both submission and completion references.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f41db273

io_uring: further deduplicate file slot selection · dafecf19

Committed by Pavel Begunkov on Feb 28, 2021

io_fixed_file_slot() and io_file_from_index() behave pretty similarly,
DRY and call one from another.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

dafecf19

io_uring: reuse io_req_task_queue_fail() · 2c4b8eb6

Committed by Pavel Begunkov on Feb 28, 2021

Use io_req_task_queue_fail() on the fail path of io_req_task_queue().
It's unlikely to happen, so we don't care about the additional overhead, and
it allows keeping all the req->result invariants in a single function.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2c4b8eb6

io_uring: avoid taking ctx refs for task-cancel · e83acd7d

Committed by Pavel Begunkov on Feb 28, 2021

Don't bother to take a ctx->refs for io_req_task_cancel() because it
takes uring_lock before putting a request, and the context is promised to
stay alive until unlock happens.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e83acd7d

09 Apr, 2021 1 commit

io_uring: fix rw req completion · 97284637

Committed by Pavel Begunkov on Apr 08, 2021

WARNING: at fs/io_uring.c:8578 io_ring_exit_work.cold+0x0/0x18

As reissuing is now passed back by REQ_F_REISSUE and kiocb_done()
internally uses __io_complete_rw(), it may stop after setting the flag,
leaving a dangling request.

There are tricky edge cases, e.g. reading beyond the file boundary, so
the easiest way is to hand code reissue in kiocb_done() as
__io_complete_rw() was doing for us before.

Fixes: 230d50d4 ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f602250d292f8a84cca9a01d747744d1e797be26.1617842918.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

97284637

08 Apr, 2021 1 commit

io_uring: clear F_REISSUE right after getting it · 6ad7f233

Committed by Pavel Begunkov on Apr 08, 2021

There are lots of ways a r/w request may continue its path after getting
REQ_F_REISSUE; it's not necessarily io-wq and can be, e.g., apoll,
submitted via io_async_task_func() -> __io_req_task_submit().

Clear the flag right after getting it, so the next attempt is well
prepared regardless of how the request will be executed.

Fixes: 230d50d4 ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/11dcead939343f4e27cab0074d34afcab771bfa4.1617842918.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6ad7f233
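
The "clear the flag right after getting it" pattern from the commit above can be sketched as a single test-and-clear helper, so that every later execution path sees a clean request. This is an illustration only: `struct req_sketch`, `REQ_F_REISSUE_SKETCH`, its bit position, and `req_take_reissue()` are hypothetical names, and the sketch is not thread-safe.

```c
#include <stdbool.h>

#define REQ_F_REISSUE_SKETCH (1u << 0)	/* hypothetical bit position */

struct req_sketch {
	unsigned int flags;
};

/*
 * Report whether a reissue was requested and clear the flag in the same
 * step, so no caller can observe a stale flag on the next attempt.
 */
static bool req_take_reissue(struct req_sketch *req)
{
	bool reissue = (req->flags & REQ_F_REISSUE_SKETCH) != 0;

	req->flags &= ~REQ_F_REISSUE_SKETCH;
	return reissue;
}
```

Consuming the flag at the point of observation means the retry path does not have to know which mechanism will run the request next.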

03 Apr, 2021 1 commit

io_uring: fix !CONFIG_BLOCK compilation failure · e82ad485

Committed by Jens Axboe on Apr 02, 2021

kernel test robot correctly pinpoints a compilation failure if
CONFIG_BLOCK isn't set:

fs/io_uring.c: In function '__io_complete_rw':
>> fs/io_uring.c:2509:48: error: implicit declaration of function 'io_rw_should_reissue'; did you mean 'io_rw_reissue'? [-Werror=implicit-function-declaration]
    2509 |  if ((res == -EAGAIN || res == -EOPNOTSUPP) && io_rw_should_reissue(req)) {
         |                                                ^~~~~~~~~~~~~~~~~~~~
         |                                                io_rw_reissue
    cc1: some warnings being treated as errors

Ensure that we have a stub declaration of io_rw_should_reissue() for
!CONFIG_BLOCK.

Fixes: 230d50d4 ("io_uring: move reissue into regular IO path")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e82ad485
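
The stub approach in the commit above follows a common pattern: when a feature is compiled out, a trivial fallback with the same signature keeps callers building without `#ifdef`s at every call site. A minimal sketch, assuming a made-up `CONFIG_BLOCK_SKETCH` macro and hypothetical `struct io_req_stub`, `rw_should_reissue()`, and `needs_retry()` names:

```c
#include <errno.h>
#include <stdbool.h>

struct io_req_stub {
	bool block_backed;
};

#ifdef CONFIG_BLOCK_SKETCH	/* hypothetical stand-in for CONFIG_BLOCK */
static bool rw_should_reissue(const struct io_req_stub *req)
{
	return req->block_backed;
}
#else
/* Stub: without block support there is never anything to reissue. */
static bool rw_should_reissue(const struct io_req_stub *req)
{
	(void)req;
	return false;
}
#endif

/* The caller compiles in both configurations and needs no #ifdef. */
static bool needs_retry(const struct io_req_stub *req, int res)
{
	return (res == -EAGAIN || res == -EOPNOTSUPP) &&
	       rw_should_reissue(req);
}
```

With the macro undefined, as here, `needs_retry()` always returns false, which is the behaviour the stub is meant to provide.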

02 Apr, 2021 1 commit

io_uring: move reissue into regular IO path · 230d50d4

Committed by Jens Axboe on Apr 01, 2021

It's non-obvious how retry is done for block backed files, when it happens
off the kiocb done path. It also makes it tricky to deal with the iov_iter
handling.

Just mark the req as needing a reissue, and handle it from the
submission path instead. This makes it directly obvious that we're not
re-importing the iovec from userspace past the submit point, and it means
that we can just reuse our usual -EAGAIN retry path from the read/write
handling.

At some point in the future, we'll gain the ability to always reliably
return -EAGAIN through the stack. A previous attempt on the block side
didn't pan out and got reverted, hence the need to check for this
information out-of-band right now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

230d50d4

01 Apr, 2021 3 commits

io_uring: fix EIOCBQUEUED iter revert · 07204f21

Committed by Pavel Begunkov on Apr 01, 2021

iov_iter_revert() is done in completion handlers that happen before
read/write returns -EIOCBQUEUED, so there is no need to repeat reverting
afterwards. Moreover, even though it may appear to be just a no-op, it
actually races with 1) the user forging a new iovec of a different size and
2) reissue, which is done via io-wq and continues completely asynchronously.

Fixes: 3e6a0d3c ("io_uring: fix -EAGAIN retry with IOPOLL")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

07204f21

io_uring/io-wq: protect against sprintf overflow · 696ee88a

Committed by Pavel Begunkov on Apr 01, 2021

task_pid may be large enough to not fit into the space left in
TASK_COMM_LEN-sized buffers and overflow in sprintf. We don't care much
about uniqueness, so replace it with the safer snprintf().
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1702c6145d7e1c46fbc382f28334c02e1a3d3994.1617267273.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

696ee88a
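
The difference between the two calls in the commit above is easy to show in userspace: `snprintf()` never writes past the given size and truncates instead. A small sketch, assuming a 16-byte buffer to mirror TASK_COMM_LEN; the `format_worker_name()` helper and its name format are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN_SKETCH 16	/* assumed to mirror TASK_COMM_LEN */

/*
 * Format "<prefix>-<pid>" into buf. snprintf() writes at most len - 1
 * characters plus the terminating NUL, and returns the length the full
 * string would have had, so truncation can be detected by the caller.
 */
static int format_worker_name(char *buf, size_t len, const char *prefix,
			      int pid)
{
	return snprintf(buf, len, "%s-%d", prefix, pid);
}
```

For example, formatting the prefix `iou-sqp` with pid 1234567890 into a 16-byte buffer yields the truncated name `iou-sqp-1234567` and a return value of 18, whereas `sprintf()` would have written 19 bytes into the 16-byte buffer.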

io_uring: don't mark S_ISBLK async work as unbounded · 4b982bd0

Committed by Jens Axboe on Apr 01, 2021

S_ISBLK is marked as unbounded work for async preparation, because it
doesn't match S_ISREG. That is incorrect, as any read/write to a block
device is also a bounded operation. Fix it up and ensure that S_ISBLK
isn't marked unbounded.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4b982bd0

31 Mar, 2021 1 commit

io_uring: drop sqd lock before handling signals for SQPOLL · 82734c5b

Committed by Jens Axboe on Mar 29, 2021

Don't call into get_signal() with the sqd mutex held, it'll fail if we're
freezing the task and we'll get complaints on locks still being held:

====================================
WARNING: iou-sqp-8386/8387 still has locks held!
5.12.0-rc4-syzkaller #0 Not tainted
------------------------------------
1 lock held by iou-sqp-8386/8387:
 #0: ffff88801e1d2470 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread+0x24c/0x13a0 fs/io_uring.c:6731

 stack backtrace:
 CPU: 1 PID: 8387 Comm: iou-sqp-8386 Not tainted 5.12.0-rc4-syzkaller #0
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:79 [inline]
  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
  try_to_freeze include/linux/freezer.h:66 [inline]
  get_signal+0x171a/0x2150 kernel/signal.c:2576
  io_sq_thread+0x8d2/0x13a0 fs/io_uring.c:6748

Fold the get_signal() case in with the parking checks, as we need to drop
the lock in both cases, and since we need to be checking for parking when
juggling the lock anyway.

Reported-by: syzbot+796d767eb376810256f5@syzkaller.appspotmail.com
Fixes: dbe1bdbb ("io_uring: handle signals for IO threads like a normal thread")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

82734c5b

29 Mar, 2021 2 commits

io_uring: handle setup-failed ctx in kill_timeouts · 51520426

Committed by Pavel Begunkov on Mar 29, 2021

general protection fault, probably for non-canonical address
	0xdffffc0000000018: 0000 [#1] KASAN: null-ptr-deref
	in range [0x00000000000000c0-0x00000000000000c7]
RIP: 0010:io_commit_cqring+0x37f/0xc10 fs/io_uring.c:1318
Call Trace:
 io_kill_timeouts+0x2b5/0x320 fs/io_uring.c:8606
 io_ring_ctx_wait_and_kill+0x1da/0x400 fs/io_uring.c:8629
 io_uring_create fs/io_uring.c:9572 [inline]
 io_uring_setup+0x10da/0x2ae0 fs/io_uring.c:9599
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

It can get into wait_and_kill() before setting up ctx->rings, and hence
io_commit_cqring() fails. Mimic poll cancel and do it only when we
completed events, there can't be any requests if it failed before
initialising rings.

Fixes: 80c4cbdb ("io_uring: do post-completion chore on t-out cancel")
Reported-by: syzbot+0e905eb8228070c457a0@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/660261a48f0e7abf260c8e43c87edab3c16736fa.1617014345.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

51520426

io_uring: always go for cancellation spin on exec · 5a978dcf

Committed by Pavel Begunkov on Mar 27, 2021

Always try to do cancellation in __io_uring_task_cancel() at least once,
so it actually goes and cleans its sqpoll tasks (i.e. via
io_sqpoll_cancel_sync()), otherwise sqpoll task may submit new requests
after cancellation and it's racy for many reasons.

Fixes: 521d6a73 ("io_uring: cancel sqpoll via task_work")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0a21bd6d794bb1629bc906dd57a57b2c2985a8ac.1616839147.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5a978dcf

28 Mar, 2021 6 commits

io_uring: remove unsued assignment to pointer io · 2b8ed1c9

Committed by Colin Ian King on Mar 26, 2021

There is an assignment to io that is never read after the assignment,
the assignment is redundant and can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2b8ed1c9

io_uring: don't cancel extra on files match · 78d9d7c2

Committed by Pavel Begunkov on Mar 25, 2021

As tasks always wait and kill their io-wq on exec/exit, files are of no
more concern to us, so we don't need to specifically cancel them by hand
in those cases. Moreover we should not, because io_match_task() looks at
req->task->files now, which is always true and so leads to extra
cancellations, which wasn't the case before per-task io-wq.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0566c1de9b9dd417f5de345c817ca953580e0e2e.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

78d9d7c2

io_uring: don't cancel-track common timeouts · 2482b58f

Committed by Pavel Begunkov on Mar 25, 2021

Don't account usual timeouts (i.e. not linked) as REQ_F_INFLIGHT but
keep the behaviour prior to dd59a3d5 ("io_uring: reliably cancel linked
timeouts").
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/104441ef5d97e3932113d44501fda0df88656b83.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2482b58f

io_uring: do post-completion chore on t-out cancel · 80c4cbdb

Committed by Pavel Begunkov on Mar 25, 2021

Don't forget about io_commit_cqring() + io_cqring_ev_posted() after
exit/exec cancelling timeouts. Both functions are declared only after
io_kill_timeouts(), so to avoid tons of forward declarations move
it down.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/72ace588772c0f14834a6a4185d56c445a366fb4.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

80c4cbdb

io_uring: fix timeout cancel return code · 1ee4160c

Committed by Pavel Begunkov on Mar 25, 2021

When we cancel a timeout we should emit a sensible return code, like
-ECANCELED but not 0, otherwise it may trick users.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7b0ad1065e3bd1994722702bd0ba9e7bc9b0683b.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1ee4160c

io_uring: handle signals for IO threads like a normal thread · dbe1bdbb

Committed by Jens Axboe on Mar 25, 2021

We jump through various hoops to disallow signals for the IO threads, but
there's really no reason why we cannot just allow them. The IO threads
never return to userspace like a normal thread, and hence don't go through
normal signal processing. Instead, just check for a pending signal as part
of the work loop, and call get_signal() to handle it for us if anything
is pending.

With that, we can support receiving signals, including special ones like
SIGSTOP.
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

dbe1bdbb

26 Mar, 2021 1 commit

io_uring: maintain CQE order of a failed link · 90b87490

Committed by Pavel Begunkov on Mar 25, 2021

Arguably we want CQEs of linked requests to be in a strict order of
submission, as it always was. Now if init of a request fails, its CQE may
be posted before all prior linked requests, including the head of the
link. Fix it by failing it last.

Fixes: de59bc10 ("io_uring: fail links more in io_submit_sqe()")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b7a96b05832e7ab23ad55f84092a2548c4a888b0.1616699075.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

90b87490

24 Mar, 2021 1 commit

io_uring: do ctx sqd ejection in a clear context · a185f1db

Committed by Pavel Begunkov on Mar 23, 2021

WARNING: CPU: 1 PID: 27907 at fs/io_uring.c:7147 io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
CPU: 1 PID: 27907 Comm: iou-sqp-27905 Not tainted 5.12.0-rc4-syzkaller #0
RIP: 0010:io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
Call Trace:
 io_ring_ctx_wait_and_kill+0x214/0x700 fs/io_uring.c:8619
 io_uring_release+0x3e/0x50 fs/io_uring.c:8646
 __fput+0x288/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x1a0 kernel/task_work.c:140
 io_run_task_work fs/io_uring.c:2238 [inline]
 io_run_task_work fs/io_uring.c:2228 [inline]
 io_uring_try_cancel_requests+0x8ec/0xc60 fs/io_uring.c:8770
 io_uring_cancel_sqpoll+0x1cf/0x290 fs/io_uring.c:8974
 io_sqpoll_cancel_cb+0x87/0xb0 fs/io_uring.c:8907
 io_run_task_work_head+0x58/0xb0 fs/io_uring.c:1961
 io_sq_thread+0x3e2/0x18d0 fs/io_uring.c:6763
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

It may happen that the last ctx ref is killed in io_uring_cancel_sqpoll(), so
the fput callback (i.e. io_uring_release()) is enqueued through task_work,
and run by the same cancellation. As it's deeply nested we can't do parking
or take sqd->lock there, because its state is unclear. So avoid
ctx ejection from the sqd list in io_ring_ctx_wait_and_kill() and do it
in a clear context in io_ring_exit_work().

Fixes: f6d54255 ("io_uring: halt SQO submission on ctx exit")
Reported-by: syzbot+e3a3f84f5cecf61f0583@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e90df88b8ff2cabb14a7534601d35d62ab4cb8c7.1616496707.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a185f1db

22 Mar, 2021 1 commit

io_uring: fix provide_buffers sign extension · d81269fe

Committed by Pavel Begunkov on Mar 19, 2021

io_provide_buffers_prep()'s "p->len * p->nbufs" is prone to sign extension
problems. Not a huge problem as it's only used for access_ok() and
increases the checked length, but better to keep typing right.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: efe68c1c ("io_uring: validate the full range of provided buffers for access")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/562376a39509e260d8532186a06226e56eb1f594.1616149233.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d81269fe
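
The sign extension issue named in the commit above can be illustrated without touching kernel types. If a product is computed in a signed 32-bit type and comes out negative, widening it to 64 bits fills the upper half with ones, producing a far larger length than intended. The sketch below assumes hypothetical helper names and field widths chosen for illustration; it does not reproduce the kernel's struct layout or its actual fix.

```c
#include <stdint.h>

/*
 * What a negative 32-bit intermediate turns into once it is converted
 * to an unsigned 64-bit length: the high 32 bits become all ones.
 */
static uint64_t widen_signed(int32_t product)
{
	return (uint64_t)product;
}

/*
 * Widen one operand before multiplying, so the multiplication itself
 * is carried out in 64 bits and no sign bit is ever involved.
 */
static uint64_t range_size(uint32_t len, uint16_t nbufs)
{
	return (uint64_t)len * nbufs;
}
```

For instance, a length of 0x10000 with 0x8000 buffers gives 0x80000000 when computed in 64 bits, while the same bit pattern held in a signed 32-bit value widens to 0xFFFFFFFF80000000. That matches the commit's remark that the bug only ever increased the checked length.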

openeuler / Kernel synced successfully nearly 2 years ago
