1. 11 Jan 2021, 1 commit
  2. 10 Jan 2021, 4 commits
    • io_uring: stop SQPOLL submit on creator's death · d9d05217
      Committed by Pavel Begunkov
      When the creator of an SQPOLL io_uring dies (i.e. sqo_task), we don't
      want its internals like ->files and ->mm to be poked by the SQPOLL
      task; that has never been safe and recently became racy. It can happen
      when the owner undergoes destruction while the SQPOLL task tries to
      submit new requests in parallel, and so calls io_sq_thread_acquire*().
      
      This patch halts SQPOLL submissions when sqo_task dies by introducing
      an sqo_dead flag. Once it is set, the SQPOLL task must not do any
      submission, which is synchronised by uring_lock as well as the new flag.
      
      The tricky part is to make sure that disabling always happens: either
      the ring is discovered by the creator's do_exit() -> cancel, or the
      final close() happens before that is done by the creator. The latter
      is guaranteed by the fact that for SQPOLL the creator task, and only
      it, holds exactly one file note, so either that note pins the ring up
      to do_exit() or it is removed by the creator on the final put in flush
      (see the comments in uring_flush() around file->f_count == 2).
      
      One more place that can trigger io_sq_thread_acquire_*() is
      __io_req_task_submit(). Shoot off requests on sqo_dead there, even
      though we don't strictly need to, because cancellation of sqo_task
      should wait for the request before going any further.
      
      note 1: io_disable_sqo_submit() does io_ring_set_wakeup_flag() so the
      caller would enter the ring to get an error, but it still doesn't
      guarantee that the flag won't be cleared.
      
      note 2: if final __userspace__ close happens not from the creator
      task, the file note will pin the ring until the task dies.
      
      Fixes: b1b6b5a3 ("kernel/io_uring: cancel io_uring before task works")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
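The lock-plus-flag scheme above can be sketched in a few lines. This is a minimal userspace sketch of the pattern, not the kernel code: the names `sqo_dead`, `uring_lock`, and the functions are illustrative, and a plain `-1` stands in for the real error code. The point is that the flag is only read and written under the lock, so once the creator sets it, the poller can never start another submission.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct ring {
    pthread_mutex_t uring_lock;
    bool sqo_dead;
};

void ring_init(struct ring *r)
{
    pthread_mutex_init(&r->uring_lock, NULL);
    r->sqo_dead = false;
}

/* Creator's exit path: halt all future SQPOLL submissions. */
void disable_sqo_submit(struct ring *r)
{
    pthread_mutex_lock(&r->uring_lock);
    r->sqo_dead = true;
    pthread_mutex_unlock(&r->uring_lock);
}

/* SQPOLL task: refuse to submit (and so to touch the dead creator's
 * ->mm/->files) once sqo_dead is observed under the lock. */
int sqpoll_submit(struct ring *r)
{
    int ret = 0;

    pthread_mutex_lock(&r->uring_lock);
    if (r->sqo_dead)
        ret = -1;   /* owner is gone: no more submissions */
    /* else: safe to borrow the creator's context and submit here */
    pthread_mutex_unlock(&r->uring_lock);
    return ret;
}
```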
    • io_uring: add warn_once for io_uring_flush() · 6b5733eb
      Committed by Pavel Begunkov
      files_cancel() should cancel all relevant requests and drop file notes,
      so we should never have file notes after that, including on-exit fput
      and flush. Add a WARN_ONCE to be sure.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
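A userspace approximation of the kernel's WARN_ONCE() pattern used here: flag a violated invariant the first time it happens, then stay quiet. The `warn_count` counter stands in for the kernel log, and `flush_check()` is an invented name for the invariant the commit describes (no file notes may remain after files_cancel()).

```c
#include <assert.h>
#include <stdbool.h>

static int warn_count;   /* stands in for the kernel log in this sketch */

#define WARN_ONCE(cond) ({                              \
    static bool warned;                                 \
    bool c = (cond);                                    \
    if (c && !warned) { warned = true; warn_count++; }  \
    c;                                                  \
})

/* files_cancel() should already have dropped every file note, so any
 * notes left at flush time are a bug worth exactly one loud warning. */
static bool flush_check(int file_notes_left)
{
    return WARN_ONCE(file_notes_left != 0);
}
```

The statement-expression form of the macro mirrors the kernel's, so the condition's value can still be used by the caller.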
    • io_uring: inline io_uring_attempt_task_drop() · 4f793dc4
      Committed by Pavel Begunkov
      A simple preparation change inlining io_uring_attempt_task_drop() into
      io_uring_flush().
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: io_rw_reissue lockdep annotations · 55e6ac1e
      Committed by Pavel Begunkov
      We expect io_rw_reissue() to take place only during submission with
      uring_lock held. Add a lockdep annotation to check that invariant.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
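The invariant being annotated ("uring_lock must be held here") can be sketched in userspace with an owner-tracking mutex. This is only an illustration of what a `lockdep_assert_held(&ctx->uring_lock)` check verifies; the kernel's lockdep does this with far more machinery, and all names below are invented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct checked_mutex {
    pthread_mutex_t m;
    pthread_t owner;
    bool locked;
};

void cm_init(struct checked_mutex *c)
{
    pthread_mutex_init(&c->m, NULL);
    c->locked = false;
}

void cm_lock(struct checked_mutex *c)
{
    pthread_mutex_lock(&c->m);
    c->owner = pthread_self();
    c->locked = true;
}

void cm_unlock(struct checked_mutex *c)
{
    c->locked = false;
    pthread_mutex_unlock(&c->m);
}

/* The invariant io_rw_reissue() now asserts: the *current* thread
 * holds the lock, not merely that somebody does. */
bool cm_assert_held(struct checked_mutex *c)
{
    return c->locked && pthread_equal(c->owner, pthread_self());
}
```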
  3. 07 Jan 2021, 4 commits
  4. 06 Jan 2021, 1 commit
  5. 05 Jan 2021, 4 commits
  6. 31 Dec 2020, 2 commits
  7. 30 Dec 2020, 1 commit
    • io_uring: don't assume mm is constant across submits · 77788775
      Committed by Jens Axboe
      If we COW the identity, we assume that ->mm never changes. But this
      isn't true if multiple processes end up sharing the ring. Hence treat
      id->mm like any other process component when it comes to the identity
      mapping. This is pretty trivial: just move the existing grab into
      io_grab_identity(), and include a check for the match.
      
      Cc: stable@vger.kernel.org # 5.10
      Fixes: 1e6fa521 ("io_uring: COW io_identity on mismatch")
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
      Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
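The idea of the fix, reduced to its core: the cached identity remembers which mm it was created for, and the grab path must compare it against the submitting task's mm instead of assuming they match. A minimal sketch under invented struct names (these are not the kernel's structures):

```c
#include <assert.h>
#include <stdbool.h>

struct identity { void *mm; };   /* what the ring cached */
struct task     { void *mm; };   /* whoever is submitting now */

/* Returns false on a mismatch, telling the caller it must COW a
 * fresh identity for this task rather than reuse the cached one. */
bool grab_identity(const struct identity *id, const struct task *submitter)
{
    return id->mm == submitter->mm;
}
```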
  8. 23 Dec 2020, 2 commits
  9. 22 Dec 2020, 1 commit
  10. 21 Dec 2020, 3 commits
  11. 19 Dec 2020, 1 commit
  12. 18 Dec 2020, 1 commit
    • io_uring: close a small race gap for files cancel · dfea9fce
      Committed by Pavel Begunkov
      The purpose of io_uring_cancel_files() is to wait for all requests
      matching ->files to go/be cancelled. We should first drop files of a
      request in io_req_drop_files() and only then make it undiscoverable for
      io_uring_cancel_files.
      
      First drop, then delete from the list. It's ok to leave req->id->files
      dangling, because it's not dereferenced by the cancellation code, only
      compared against. The waiter would potentially go to sleep and be
      awoken by the wake_up() that follows in io_req_drop_files().
      
      Fixes: 0f212204 ("io_uring: don't rely on weak ->files references")
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
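The ordering can be made concrete with a tiny sketch (all names invented for illustration): a request must stay discoverable on the inflight list until its files have been dropped, so a cancelling task matching ->files either still finds the request or sees its files already gone.

```c
#include <assert.h>
#include <stdbool.h>

struct req {
    bool on_list;               /* discoverable by cancellation */
    bool has_files;
    bool discoverable_at_drop;  /* recorded here just for the sketch's check */
};

static void io_req_drop_files(struct req *r)
{
    /* the request is still on the list at this point, so cancellation
     * can find it; this is also where the wake_up() would happen */
    r->discoverable_at_drop = r->on_list;
    r->has_files = false;
}

static void req_finish(struct req *r)
{
    io_req_drop_files(r);   /* first drop the files...            */
    r->on_list = false;     /* ...only then delete from the list  */
}
```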
  13. 17 Dec 2020, 7 commits
  14. 13 Dec 2020, 2 commits
  15. 11 Dec 2020, 1 commit
  16. 10 Dec 2020, 5 commits
    • io_uring: fix io_cqring_events()'s noflush · 59850d22
      Committed by Pavel Begunkov
      Checking !list_empty(&ctx->cq_overflow_list) around noflush in
      io_cqring_events() is racy, because if it fails but a request
      overflows just after that, io_cqring_overflow_flush() will still be
      called.
      
      Remove the second check; it shouldn't be a problem for performance,
      because there is a cq_check_overflow bit check just above.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix racy IOPOLL flush overflow · 634578f8
      Committed by Pavel Begunkov
      It's not safe to call io_cqring_overflow_flush() for IOPOLL mode
      without holding uring_lock, because it does synchronisation
      differently. Make sure we have it.
      
      As for io_ring_exit_work(), we don't even need it there, because
      io_ring_ctx_wait_and_kill() already sets the force flag, making all
      overflowed requests be dropped.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix racy IOPOLL completions · 31bff9a5
      Committed by Pavel Begunkov
      IOPOLL allows buffer remove/provide requests, but they don't
      synchronise by the rules of IOPOLL, namely they have to hold
      uring_lock.
      
      Cc: <stable@vger.kernel.org> # 5.7+
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: always let io_iopoll_complete() complete polled io · dad1b124
      Committed by Xiaoguang Wang
      Abaci Fuzz reported a double-free or invalid-free BUG in io_commit_cqring():
      [   95.504842] BUG: KASAN: double-free or invalid-free in io_commit_cqring+0x3ec/0x8e0
      [   95.505921]
      [   95.506225] CPU: 0 PID: 4037 Comm: io_wqe_worker-0 Tainted: G    B W         5.10.0-rc5+ #1
      [   95.507434] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [   95.508248] Call Trace:
      [   95.508683]  dump_stack+0x107/0x163
      [   95.509323]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.509982]  print_address_description.constprop.0+0x3e/0x60
      [   95.510814]  ? vprintk_func+0x98/0x140
      [   95.511399]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.512036]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.512733]  kasan_report_invalid_free+0x51/0x80
      [   95.513431]  ? io_commit_cqring+0x3ec/0x8e0
      [   95.514047]  __kasan_slab_free+0x141/0x160
      [   95.514699]  kfree+0xd1/0x390
      [   95.515182]  io_commit_cqring+0x3ec/0x8e0
      [   95.515799]  __io_req_complete.part.0+0x64/0x90
      [   95.516483]  io_wq_submit_work+0x1fa/0x260
      [   95.517117]  io_worker_handle_work+0xeac/0x1c00
      [   95.517828]  io_wqe_worker+0xc94/0x11a0
      [   95.518438]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.519151]  ? __kthread_parkme+0x11d/0x1d0
      [   95.519806]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.520512]  ? io_worker_handle_work+0x1c00/0x1c00
      [   95.521211]  kthread+0x396/0x470
      [   95.521727]  ? _raw_spin_unlock_irq+0x24/0x30
      [   95.522380]  ? kthread_mod_delayed_work+0x180/0x180
      [   95.523108]  ret_from_fork+0x22/0x30
      [   95.523684]
      [   95.523985] Allocated by task 4035:
      [   95.524543]  kasan_save_stack+0x1b/0x40
      [   95.525136]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   95.525882]  kmem_cache_alloc_trace+0x17b/0x310
      [   95.533930]  io_queue_sqe+0x225/0xcb0
      [   95.534505]  io_submit_sqes+0x1768/0x25f0
      [   95.535164]  __x64_sys_io_uring_enter+0x89e/0xd10
      [   95.535900]  do_syscall_64+0x33/0x40
      [   95.536465]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   95.537199]
      [   95.537505] Freed by task 4035:
      [   95.538003]  kasan_save_stack+0x1b/0x40
      [   95.538599]  kasan_set_track+0x1c/0x30
      [   95.539177]  kasan_set_free_info+0x1b/0x30
      [   95.539798]  __kasan_slab_free+0x112/0x160
      [   95.540427]  kfree+0xd1/0x390
      [   95.540910]  io_commit_cqring+0x3ec/0x8e0
      [   95.541516]  io_iopoll_complete+0x914/0x1390
      [   95.542150]  io_do_iopoll+0x580/0x700
      [   95.542724]  io_iopoll_try_reap_events.part.0+0x108/0x200
      [   95.543512]  io_ring_ctx_wait_and_kill+0x118/0x340
      [   95.544206]  io_uring_release+0x43/0x50
      [   95.544791]  __fput+0x28d/0x940
      [   95.545291]  task_work_run+0xea/0x1b0
      [   95.545873]  do_exit+0xb6a/0x2c60
      [   95.546400]  do_group_exit+0x12a/0x320
      [   95.546967]  __x64_sys_exit_group+0x3f/0x50
      [   95.547605]  do_syscall_64+0x33/0x40
      [   95.548155]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that once we get a non-EAGAIN error in
      io_wq_submit_work(), we'll complete the req by calling
      io_req_complete(), which will hold completion_lock to call
      io_commit_cqring(). But for polled io, io_iopoll_complete() won't hold
      completion_lock to call io_commit_cqring(), so there may be concurrent
      access to ctx->defer_list, and a double free may happen.
      
      To fix this bug, we always let io_iopoll_complete() complete polled io.
      
      Cc: <stable@vger.kernel.org> # 5.5+
      Reported-by: Abaci Fuzz <abaci@linux.alibaba.com>
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add timeout update · 9c8e11b3
      Committed by Pavel Begunkov
      Support timeout updates through IORING_OP_TIMEOUT_REMOVE with
      IORING_TIMEOUT_UPDATE passed in. Updates don't support offset timeout
      mode; the original timeout.off will be ignored as well.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      [axboe: remove now unused 'ret' variable]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
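From userspace, the new interface is driven by filling an SQE as described above. The sketch below mirrors only the fields this op uses in a stripped-down struct; the opcode and flag values are assumptions taken from the uapi header of that era, and the field placement (user_data in addr, new timespec pointer in off/addr2) is likewise an assumption — check <linux/io_uring.h> for your kernel before relying on any of it.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define IORING_OP_TIMEOUT_REMOVE 12         /* assumed opcode value */
#define IORING_TIMEOUT_UPDATE    (1U << 1)  /* assumed flag value   */

/* Just the fields a timeout update uses, not the full io_uring_sqe. */
struct sqe_subset {
    unsigned char      opcode;
    unsigned long long addr;          /* user_data of the timeout to update */
    unsigned long long off;           /* pointer to the new timespec */
    unsigned int       timeout_flags;
};

void prep_timeout_update(struct sqe_subset *sqe,
                         unsigned long long target_user_data,
                         const struct timespec *new_ts)
{
    sqe->opcode = IORING_OP_TIMEOUT_REMOVE;
    sqe->addr = target_user_data;
    sqe->off = (unsigned long long)(uintptr_t)new_ts;
    /* offset timeout mode is not supported: the original timeout.off
     * is ignored by an update */
    sqe->timeout_flags = IORING_TIMEOUT_UPDATE;
}
```

In real code, liburing's prep helpers do this work; the sketch only shows which pieces of state the operation consumes.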