- 22 September 2022, 14 commits
-
-
Committed by Pavel Begunkov
We try to restrict CQ waiters when IORING_SETUP_DEFER_TASKRUN is set, but if nothing has been submitted yet it'll allow any waiter, which violates the contract.

Fixes: c0e0d6ba ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/b4f0d3f14236d7059d08c5abe2661ef0b78b5528.1662652536.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov
In case of DEFER_TASK_WORK we try to restrict waiters to only one task, which is also the only submitter; however, we don't do it reliably, which might be very confusing and backfire in the future. E.g. we currently allow multiple tasks in io_iopoll_check().

Fixes: c0e0d6ba ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/94c83c0a7fe468260ee2ec31bdb0095d6e874ba2.1662652536.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

There is one place where we forgot to replace hand-coded spin locking with io_cq_lock(); change it to be more consistent. Note, the unlock part already uses __io_cq_unlock_post().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/91699b9a00a07128f7ca66136bdbbfc67a64659e.1662639236.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Request referencing changed a while ago and there is no notion left of submission/completion references; kill an outdated comment.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/38902e7229d68cecd62702436d627d4858b0d9d4.1662639236.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Combine the two checks we have for task_work running and whether or not we need to shuffle the mutex into one, so we unify how task_work is run in the iopoll loop. This helps ensure that local task_work is run when needed, and also optimizes that path to avoid a mutex shuffle if it's not needed.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
We have a few spots that drop the mutex just to run local task_work, which immediately tries to grab it again. Add a helper that just passes in whether we're locked already.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Kanchan Joshi

Put this up in the same way as iopoll is done for regular read/write IO. Make a place for storing a cookie in struct io_uring_cmd on submission. Perform the completion using the ->uring_cmd_iopoll handler.

Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/r/20220823161443.49436-3-joshi.k@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken
Add tracing for io_run_local_task_work.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-8-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

Some workloads rely on a registered eventfd (via io_uring_register_eventfd(3)) in order to wake up and process the io_uring. In the case of a ring set up with IORING_SETUP_DEFER_TASKRUN, that eventfd also needs to be signalled when there are tasks to run. This changes an old behaviour which assumed 1 eventfd signal implied at least 1 CQE; however, it only applies when this new flag is set (so old users will not notice). This should be expected with the IORING_SETUP_DEFER_TASKRUN flag, as it is not guaranteed that every task will result in a CQE.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-7-dylany@fb.com
[axboe: fold in call_rcu() serialization fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
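To make the new behaviour concrete, here is a minimal userspace sketch, assuming liburing's io_uring_register_eventfd() and io_uring_get_events() helpers; the wake-up loop is illustrative and not taken from the patch itself:

#include <liburing.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <stdint.h>

/* Minimal sketch: with IORING_SETUP_DEFER_TASKRUN the eventfd may be
 * signalled because deferred task work is pending, not necessarily
 * because a CQE has been posted. */
static int wait_and_reap(struct io_uring *ring, int efd)
{
    uint64_t cnt;

    if (read(efd, &cnt, sizeof(cnt)) != sizeof(cnt))
        return -1;

    /* Enter the kernel with GETEVENTS so deferred task work runs and
     * any resulting CQEs are flushed; there may still be zero CQEs. */
    return io_uring_get_events(ring);
}

/* Registration, done once after ring setup (assumed helpers):
 *     int efd = eventfd(0, EFD_CLOEXEC);
 *     io_uring_register_eventfd(ring, efd);
 */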
-
Committed by Dylan Yudaken

Non-functional change: move this function above io_eventfd_signal so it can be used from there.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-6-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

Allow deferring async tasks until the user calls io_uring_enter(2) with the IORING_ENTER_GETEVENTS flag. Enable this mode with a flag at io_uring_setup time. This functionality requires that the later io_uring_enter will be called from the same submission task, and therefore this flag is restricted to work only when IORING_SETUP_SINGLE_ISSUER is also set.

Being able to hand-pick when tasks are run prevents the problem where there is current work to be done, yet task work runs anyway. For example, a common workload would obtain a batch of CQEs and process each one. Interrupting this to run additional task work would add latency but not gain anything. If instead task work is deferred to just before more CQEs are obtained, then no additional latency is added.

The way this is implemented is by trying to keep task work local to an io_ring_ctx, rather than to the submission task. This is required, as the application will want to wake up only a single io_ring_ctx at a time to process work, and so the lists of work have to be kept separate. This has some other benefits, like not having to check the task continually in handle_tw_list (and potentially unlocking/locking those), and reducing locks in the submit & process completions path.

There are networking cases where using this option can reduce request latency by 50%. For example, a contrived benchmark using [1], where the client sends 2k data and receives the same data back while doing some system calls (to trigger task work), shows this reduction. The reason ends up being that if sending responses is delayed by processing task work, then the client side sits idle. Whereas reordering the sends first means that the client runs its workload in parallel with the local task work.

[1]: https://github.com/DylanZA/netbench/tree/defer_run
Client:
./netbench --client_only 1 --control_port 10000 --host <host> --tx "epoll --threads 16 --per_thread 1 --size 2048 --resp 2048 --workload 1000"
Server:
./netbench --server_only 1 --control_port 10000 --rx "io_uring --defer_taskrun 0 --workload 100" --rx "io_uring --defer_taskrun 1 --workload 100"

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-5-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
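For orientation, a minimal setup sketch follows, assuming liburing exposes the two setup flags and the usual submit/wait helpers; it illustrates the intended usage pattern rather than being part of the patch:

/* Minimal sketch, assuming liburing with IORING_SETUP_SINGLE_ISSUER and
 * IORING_SETUP_DEFER_TASKRUN definitions; error handling trimmed. */
#include <liburing.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct io_uring ring;
    int ret;

    ret = io_uring_queue_init(64, &ring,
                              IORING_SETUP_SINGLE_ISSUER |
                              IORING_SETUP_DEFER_TASKRUN);
    if (ret < 0) {
        fprintf(stderr, "queue_init: %s\n", strerror(-ret));
        return 1;
    }

    /* ... prepare SQEs from this task only ... */

    /* Deferred task work runs here, just before CQEs are waited for. */
    io_uring_submit_and_wait(&ring, 1);

    io_uring_queue_exit(&ring);
    return 0;
}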
-
Committed by Dylan Yudaken

This is not needed, and it is normally better to wait for task work until after submissions. This will allow greater batching if either work arrives in the meanwhile, or if the submissions cause task work to be queued up. For SQPOLL this also no longer runs task work, but that is handled inside the SQPOLL loop anyway. For IOPOLL, io_iopoll_check will run task work anyway. And otherwise io_cqring_wait will run task work.

Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-4-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

This will be used later to know if the ring has outstanding work. Right now that just means there are overflow CQEs to copy to the main CQE ring, but later it will include deferred task work.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-3-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

'running' is set once and read once, so we can easily just remove it.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-2-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 08 September 2022, 1 commit
-
-
Committed by Pavel Begunkov

When we queue a request via tw for execution, it's not going to be executed immediately, so when io_queue_async() hits IO_APOLL_READY and queues a tw but doesn't try to recycle/consume the buffer, some other request may try to use the buffer.

Fixes: c7fb1942 ("io_uring: add support for ring mapped supplied buffers")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a19bc9e211e3184215a58e129b62f440180e9212.1662480490.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 01 September 2022, 2 commits
-
-
Committed by Pavel Begunkov

Following user feedback, this patch simplifies the zerocopy send API. One of the main complaints is that the current API is difficult to use with the userspace managing notification slots, and send retries with error handling make it even worse.

Instead of keeping notification slots, change it to the per-request notifications model, which posts both completion and notification CQEs for each request when any data has been sent, and only one CQE if it fails. All notification CQEs will have IORING_CQE_F_NOTIF set, and IORING_CQE_F_MORE in completion CQEs indicates whether to wait for a notification or not. IOSQE_CQE_SKIP_SUCCESS is disallowed with zerocopy sends for now.

This is less flexible, but greatly simplifies the user API and also the kernel implementation. We reuse notif helpers in this patch, but in the future there won't be a need for keeping two requests.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/95287640ab98fc9417370afb16e310677c63e6ce.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
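As a hedged illustration of the per-request model from userspace (assuming liburing's io_uring_prep_send_zc() helper; sockfd/buf/len are placeholders, error handling trimmed), each send may produce two CQEs: the completion, possibly flagged IORING_CQE_F_MORE, followed later by a notification flagged IORING_CQE_F_NOTIF once the kernel is done with the buffer:

#include <liburing.h>

static int send_zc_once(struct io_uring *ring, int sockfd,
                        const void *buf, size_t len)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    struct io_uring_cqe *cqe;
    int more;

    io_uring_prep_send_zc(sqe, sockfd, buf, len, 0, 0);
    io_uring_submit(ring);

    /* First CQE: the send result. */
    if (io_uring_wait_cqe(ring, &cqe))
        return -1;
    more = cqe->flags & IORING_CQE_F_MORE;
    io_uring_cqe_seen(ring, cqe);

    if (!more)
        return 0;   /* no notification will follow; buffer is free */

    /* Second CQE: the IORING_CQE_F_NOTIF notification; the buffer may
     * only be reused after this arrives. */
    if (io_uring_wait_cqe(ring, &cqe))
        return -1;
    io_uring_cqe_seen(ring, cqe);
    return 0;
}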
-
Committed by Pavel Begunkov

We're going to remove the userspace-exposed zerocopy notification API; remove notification registration.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6ff00b97be99869c386958a990593c9c31cf105b.1662027856.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 24 August 2022, 1 commit
-
-
Committed by Pavel Begunkov

There are opcodes that need ->async_data only in some cases, and allocating it unconditionally may hurt performance. Add an option to opdef to move the allocation part from the core io_uring code to opcode-specific code. Note, we can't just set opdef->async_size to zero because there are other helpers that rely on it, e.g. io_alloc_async_data().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9dc62be9e88dd0ed63c48365340e8922d2498293.1661342812.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 13 August 2022, 1 commit
-
-
Committed by Stefan Metzmacher
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Link: https://lore.kernel.org/r/ffcaf8dc4778db4af673822df60dbda6efdd3065.1660201408.git.metze@samba.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 27 July 2022, 2 commits
-
-
Committed by Pavel Begunkov

We want to use all the optimisations that we have for io_uring requests, like completion batching, memory caching and more, but for zc notifications. Fortunately, notifications perfectly fit the request model, so we can overlay them onto struct io_kiocb and use all the infrastructure.

Most of the fields of struct io_notif natively fit into io_kiocb, so we replace struct io_notif with struct io_kiocb carrying struct io_notif_data in the cmd cache line. Then we adapt io_alloc_notif() to use io_alloc_req()/io_alloc_req_refill(), and kill leftovers of hand-coded caching. __io_notif_complete_tw() is converted to use io_uring's tw infra.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9e010125175e80baf51f0ca63bdc7cc6a4a9fa56.1658913593.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

We want to do request allocation outside of the core io_uring code; make the allocation functions public for other io_uring parts.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0314fedd3a02a514210ba42d4720332538c65956.1658913593.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 July 2022, 19 commits
-
-
Committed by Pavel Begunkov

Allow flushing notifiers as part of a sendzc request by setting the IORING_SENDZC_FLUSH flag. When the sendzc request succeeds, it will flush the used [active] notifier.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e0b4d9a6797e2fd6092824fe42953db7a519bbc8.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Let userspace register and unregister notification slots.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a0aa8161fe3ebb2a4cc6e5dbd0cffb96e6881cf5.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

kmalloc'ing struct io_notif is too expensive when done frequently; cache them like many other resources in io_uring. Keep two lists: the first one, from which we get notifiers, is protected by ->uring_lock; the second one, to which we queue released notifiers, is protected by ->completion_lock. Then we splice one list into the other when needed.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9dec18f7fcbab9f4bd40b96e5ae158b119945230.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Add the internal part of send zerocopy notifications. There are two main structures. The first one is struct io_notif, which carries struct ubuf_info inside and maps 1:1 to it. io_uring will be binding a number of zerocopy send requests to it and asking to complete (aka flush) it. When flushed and all attached requests and skbs complete, it'll generate one and only one CQE. They are intended to be passed into the network layer as struct msghdr::msg_ubuf.

The second concept is notification slots. The userspace will be able to register an array of slots and subsequently address them by index in the array. Slots are independent of each other. Each slot can have only one notifier at a time (called the active notifier) but many notifiers during its lifetime. When active, a notifier is not going to post any completions, but the userspace can attach requests to it by specifying the corresponding slot while issuing send zc requests. Eventually, the userspace will want to "flush" the notifier, losing any way to attach new requests to it; however, it can use the next automatically added notifier of this slot or of any other slot.

When the network layer is done with all enqueued skbs attached to a notifier and doesn't need the user data specified in them, the flushed notifier will post a CQE.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3ecf54c31a85762bf679b0a432c9f43ecf7e61cc.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Make io_put_task() available to non-core parts of io_uring; we'll need it for the notification infrastructure.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3686807d4c03b72e389947b0e8692d4d44334ef0.1657643355.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
If we're offloading requests directly to io-wq because IOSQE_ASYNC was set in the sqe, we can miss hashing writes appropriately because we haven't set REQ_F_ISREG yet. This can cause a performance regression with buffered writes, as io-wq then no longer correctly serializes writes to that file. Ensure that we set the flags in io_prep_async_work(), which will cause the io-wq work item to be hashed appropriately.

Fixes: 584b0180 ("io_uring: move read/write file prep state into actual opcode handler")
Link: https://lore.kernel.org/io-uring/20220608080054.GB22428@xsang-OptiPlex-9020/
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Michal Koutný

The commit 8bb649ee ("io_uring: remove ring quiesce for io_uring_register") removed the workflow relying on reinit/resurrection of the percpu_ref, hence initialization with that requested is a relic.

This is based on code review; it causes no real bug (and theoretically can't). Technically it's a revert of commit 21482896 ("io_uring: initialize percpu refcounters using PERCU_REF_ALLOW_REINIT"), but since the flag omission is now justified, I'm not making this a revert.

Fixes: 8bb649ee ("io_uring: remove ring quiesce for io_uring_register")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
For recvmsg/sendmsg, if they don't complete inline, we currently need to allocate a struct io_async_msghdr for each request. This is a somewhat large struct. Hook up sendmsg/recvmsg to use the io_alloc_cache. This reduces the alloc + free overhead considerably, yielding 4-5% of extra performance running netbench.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe

Caches like this tend to grow to the peak size, and then never get any smaller. Impose a max limit on the size, to prevent it from growing too big. A somewhat randomly chosen 512 is the max size we'll allow the cache to get. If a batch of frees comes in and would bring it over that, we simply start kfree'ing the surplus.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
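To illustrate the bounded-cache idea in isolation (a generic, hedged sketch in userspace C, not the kernel's actual io_alloc_cache code; names and the malloc/free fallback are assumptions):

#include <stdlib.h>

#define CACHE_MAX 512   /* cap chosen to mirror the commit's limit */

struct cache_entry {
    struct cache_entry *next;
};

struct obj_cache {
    struct cache_entry *head;
    unsigned int nr;
};

static void cache_put(struct obj_cache *c, void *obj)
{
    struct cache_entry *e = obj;

    if (c->nr >= CACHE_MAX) {   /* cache is full: free the surplus */
        free(obj);
        return;
    }
    e->next = c->head;
    c->head = e;
    c->nr++;
}

static void *cache_get(struct obj_cache *c, size_t size)
{
    struct cache_entry *e = c->head;

    if (!e)
        return malloc(size);    /* cache empty: fall back to the allocator */
    c->head = e->next;
    c->nr--;
    return e;
}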
-
Committed by Jens Axboe
In preparation for adding limits, and one more user, abstract out the core bits of the allocation+free cache.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
This is where it's used, move the flush handler in there.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

In overflow we see a duplicate line in the trace, and in some cases 3 lines (if the initial io_post_aux_cqe fails). Instead just trace once for each CQE.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220630091231.1456789-13-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken

Some use cases of io_post_aux_cqe would not want to overflow as is, but might want to change the flags/result. For example, multishot receive requires in-order CQEs, and so if there is an overflow it would need to stop receiving until the overflow is taken care of.

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220630091231.1456789-8-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

io_uring recently started providing an option to allocate a file index for operations registering fixed files. However, it's utterly unusable with mixed approaches, when for a part of the files the userspace knows better where to place them, as it may race, and users don't have any sane way to pick a slot while hoping it will not be taken.

Let the userspace register a range of fixed file slots in which the auto-allocation happens. The use case is splitting the fixed table in two parts, where one of them is used for auto-allocation and the other for slot-specified operations.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/66ab0394e436f38437cf7c44676e1920d09687ad.1656154403.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
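A hedged sketch of the split-table use case, assuming liburing's io_uring_register_files_sparse() and io_uring_register_file_alloc_range() helpers; the table size and slot split are arbitrary choices for illustration:

#include <liburing.h>

static int setup_file_table(struct io_uring *ring)
{
    int ret;

    /* Reserve 64 fixed-file slots, all initially empty. */
    ret = io_uring_register_files_sparse(ring, 64);
    if (ret < 0)
        return ret;

    /* Slots 0..31 stay under explicit userspace control; slots 32..63
     * are handed to the kernel for auto-allocation (e.g. direct-open
     * or accept requests using IORING_FILE_INDEX_ALLOC). */
    return io_uring_register_file_alloc_range(ring, 32, 32);
}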
-
Committed by Pavel Begunkov

io_uring_enter() takes ctx->refs, which previously prevented racing with register quiesce. However, as register now doesn't touch the refs, we can freely kill the extra ctx pinning and rely on the fact that we're holding a file reference preventing the ring from being destroyed.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a11c57ad33a1be53541fce90669c1b79cf4d8940.1656153286.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Registered rings are by definition io_uring files, so we don't need to additionally verify them.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/425cd64fd885b8e329a46c205ee811987691baaf.1656153286.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

io_run_task_work() accounts for TIF_NOTIFY_SIGNAL, so there is no need to have a second check in io_run_task_work_sig().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/52ce41a592ad904511697f432141e5690fd4b968.1656153285.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov

Now that both normal and fallback paths use llist, just keep one node head in struct io_task_work and kill off ->fallback_node.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d04ebde409f7b162fe247b361b4486b193293e46.1656153285.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe

The io_uring cancelation API is async, like any other API that we expose there. For the case of finding a request to cancel, or not finding one, it is fully sync in that when submission returns, the CQE for both the cancelation request and the targeted request have been posted to the CQ ring. However, if the targeted work is being executed by io-wq, the API can only start the act of canceling it. This makes it difficult to use in some circumstances, as the caller then has to wait for the CQEs to come in and match on the same cancelation data there.

Provide an IORING_REGISTER_SYNC_CANCEL command for io_uring_register() that does sync cancelations, always. For the io-wq case, it'll wait for the cancelation to come in before returning. The only expected returns from this API are:

0        Request found and canceled fine.
> 0      Requests found and canceled. Only happens if asked to cancel multiple requests, and if the work wasn't in progress.
-ENOENT  Request not found.
-ETIME   A timeout on the operation was requested, but the timeout expired before we could cancel.

and we won't get -EALREADY via this API.

If the timeout value passed in is -1 (tv_sec and tv_nsec), then that means that no timeout is requested. Otherwise, the timespec passed in is the amount of time the sync cancel will wait for a successful cancelation.

Link: https://github.com/axboe/liburing/discussions/608
Signed-off-by: Jens Axboe <axboe@kernel.dk>
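A hedged userspace sketch of the synchronous cancel interface, assuming liburing's io_uring_register_sync_cancel() wrapper and struct io_uring_sync_cancel_reg; user_data is whatever was attached to the request being cancelled:

#include <liburing.h>
#include <string.h>

static int cancel_sync(struct io_uring *ring, __u64 user_data)
{
    struct io_uring_sync_cancel_reg reg;

    memset(&reg, 0, sizeof(reg));
    reg.addr = user_data;           /* match by user_data */
    reg.timeout.tv_sec = -1;        /* -1/-1 means "no timeout", as above */
    reg.timeout.tv_nsec = -1;

    /* 0 or > 0: found and canceled; -ENOENT: not found; -ETIME: timed out. */
    return io_uring_register_sync_cancel(ring, &reg);
}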
-