- 25 July 2022, 14 commits
-
Committed by Pavel Begunkov
Allow flushing notifiers as part of a sendzc request by setting the IORING_SENDZC_FLUSH flag. When the sendzc request succeeds, it will flush the used [active] notifier. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e0b4d9a6797e2fd6092824fe42953db7a519bbc8.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov
Allow zerocopy sends to use fixed buffers. There is an optimisation for this case: the network layer doesn't need to reference the pages (see SKBFL_MANAGED_FRAG_REFS), so io_uring has to ensure the validity of fixed buffers until the notifier is released. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e1d8bd1b5934e541d90c1824eb4020ae3f5f43f3.1657643355.git.asml.silence@gmail.com [axboe: fold in 32-bit pointer cast warning fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov
Allow specifying an address for zerocopy sends, making them more like sendto(2). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/70417a8f7c5b51ab454690bae08adc0c187f89e8.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov
Add a new io_uring opcode, IORING_OP_SENDZC. The main distinction from IORING_OP_SEND is that the user should specify a notification slot index in sqe::notification_idx, and the buffers are safe to reuse only once the used notification has been flushed and completes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a80387c6a68ce9cf99b3b6ef6f71068468761fb7.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
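A minimal sketch of issuing such a zerocopy send with raw SQE fields, exactly as described in this series (IORING_OP_SENDZC, sqe->notification_idx and the ioprio-based zc flags are taken from these commit messages; the slot-based interface was reworked upstream before the final release, so treat this as historical illustration rather than the current API):

    #include <liburing.h>
    #include <string.h>

    /* Sketch: queue a zerocopy send using notification slot 0, and ask for
     * the notifier to be flushed on success (IORING_SENDZC_FLUSH) so the
     * "buffers reusable" CQE is posted without a separate flush step.
     * Assumes 'ring' is initialized and a notification slot is registered. */
    static int queue_sendzc(struct io_uring *ring, int sock,
                            const void *buf, unsigned len)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
            return -1;
        memset(sqe, 0, sizeof(*sqe));
        sqe->opcode = IORING_OP_SENDZC;      /* new opcode from this series */
        sqe->fd = sock;
        sqe->addr = (unsigned long)buf;
        sqe->len = len;
        sqe->user_data = 1;
        sqe->notification_idx = 0;           /* which registered slot to use */
        sqe->ioprio = IORING_SENDZC_FLUSH;   /* zc send flags live in ioprio */
        return io_uring_submit(ring);
    }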
-
Committed by Pavel Begunkov
Let userspace register and unregister notification slots. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a0aa8161fe3ebb2a4cc6e5dbd0cffb96e6881cf5.1657643355.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Dylan Yudaken
Similar to multishot recv, this requires provided buffers to be used. However, recvmsg is much more complex than recv, as it has multiple outputs: specifically flags, name, and control messages. Support this by introducing a new struct io_uring_recvmsg_out with four fields: namelen, controllen and flags match the similar out fields in msghdr from standard recvmsg(2), and payloadlen is the length of the payload following the header. This struct is placed at the start of the returned buffer. Based on what the user specifies in struct msghdr, the next bytes of the buffer will be the name (the next msg_namelen bytes) and then the control data (the next msg_controllen bytes). The payload comes at the end. The return value in the CQE is the total used size of the provided buffer. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220714110258.1336200-4-dylany@fb.com [axboe: style fixups, see link] Signed-off-by: Jens Axboe <axboe@kernel.dk>
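The buffer layout described above can be walked like this; a sketch with a local copy of the four-field header for illustration (the real definition lives in the io_uring uapi header), assuming the request was issued with the struct msghdr named msg and that buf points at the selected provided buffer:

    #include <linux/types.h>
    #include <sys/socket.h>

    /* Local copy of the header placed at the start of the provided buffer. */
    struct io_uring_recvmsg_out {
        __u32 namelen;
        __u32 controllen;
        __u32 payloadlen;
        __u32 flags;
    };

    /* Sketch: region offsets use the lengths the caller asked for in the
     * msghdr; the header holds the lengths the kernel actually wrote. */
    static void parse_recvmsg_buf(void *buf, const struct msghdr *msg)
    {
        struct io_uring_recvmsg_out *out = buf;
        char *name    = (char *)buf + sizeof(*out);
        char *control = name + msg->msg_namelen;
        char *payload = control + msg->msg_controllen;

        /* out->namelen/controllen/payloadlen/flags describe how much of
         * each region is valid; cqe->res is the total buffer usage. */
        (void)out; (void)name; (void)control; (void)payload;
    }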
-
Committed by Dylan Yudaken
Support multishot receive for io_uring. Typical server applications run a loop where, for each recv CQE, they requeue another recv/recvmsg. This can be simplified by using the existing multishot functionality combined with io_uring's provided buffers. The API is to add the IORING_RECV_MULTISHOT flag to the SQE. CQEs will then be posted (with the IORING_CQE_F_MORE flag set) when data is available and has been read. Once an error occurs or the socket ends, the multishot will be removed and a completion without IORING_CQE_F_MORE will be posted. The benefit of this is that the recv is much more performant:
* Subsequent receives are queued up straight away, without requiring the application to finish a processing loop.
* If there is more data in the socket (say the provided buffer size is smaller than the socket buffer) then the data is immediately returned, improving batching.
* Poll is only armed once and reused, saving CPU cycles.
Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220630091231.1456789-11-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
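A sketch of arming a multishot recv with buffer selection and draining its CQEs; assumes liburing, a provided-buffer group gid already registered, and a connected socket (the raw ioprio flag is set by hand here, since this predates a dedicated liburing helper):

    #include <liburing.h>

    /* Sketch: one SQE keeps posting CQEs until error or EOF. */
    static void recv_multishot(struct io_uring *ring, int sock, int gid)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        struct io_uring_cqe *cqe;

        io_uring_prep_recv(sqe, sock, NULL, 0, 0); /* buffer picked from group */
        sqe->ioprio |= IORING_RECV_MULTISHOT;      /* recv flags live in ioprio */
        sqe->flags |= IOSQE_BUFFER_SELECT;
        sqe->buf_group = gid;
        io_uring_submit(ring);

        for (;;) {
            if (io_uring_wait_cqe(ring, &cqe))
                break;
            /* cqe->res: bytes read (or -errno); buffer id in cqe->flags */
            int more = cqe->flags & IORING_CQE_F_MORE;
            io_uring_cqe_seen(ring, cqe);
            if (!more)
                break;  /* multishot terminated; re-arm if still needed */
        }
    }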
-
Committed by Pavel Begunkov
Since recently, io_uring provides an option to allocate a file index for operations registering fixed files. However, it's unusable with mixed approaches, where for some of the files userspace knows better where to place them: the allocation may race, and users don't have any sane way to pick a free slot and hope it will not be taken. Let userspace register a range of fixed file slots in which the auto-allocation happens. The use case is splitting the fixed table in two parts, where one of them is used for auto-allocation and the other for slot-specified operations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/66ab0394e436f38437cf7c44676e1920d09687ad.1656154403.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
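A sketch of carving out such a range, assuming a 64-slot sparse fixed file table where the upper half becomes the kernel's auto-allocation pool (io_uring_register_file_alloc_range is the liburing wrapper for this registration; availability depends on your liburing version):

    #include <liburing.h>

    /* Sketch: slots 0..31 stay under manual control, slots 32..63 serve
     * IORING_FILE_INDEX_ALLOC requests. */
    static int setup_split_table(struct io_uring *ring)
    {
        int ret;

        ret = io_uring_register_files_sparse(ring, 64);
        if (ret)
            return ret;
        /* Confine auto-allocation to [32, 32 + 32). */
        return io_uring_register_file_alloc_range(ring, 32, 32);
    }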
-
Committed by Jens Axboe
With IORING_OP_MSG_RING, one ring can send a message to another ring. Extend that support to also allow sending a fixed file descriptor to that ring, enabling one ring to pass a registered descriptor to another one. The arguments are extended to pass in:
sqe->addr3        fixed file slot in the source ring
sqe->file_index   fixed file slot in the destination ring
IORING_OP_MSG_RING is extended to take a command argument in sqe->addr. If set to zero (or IORING_MSG_DATA), it sends just a message like before. If set to IORING_MSG_SEND_FD, a fixed file descriptor is sent according to the above arguments. Two common use cases for this are: 1) A server needs to be shut down or restarted; pass its file descriptors to another one. 2) The backend is split, and one part accepts connections, while the others then get the fd passed and handle the actual connection. Both of those are classic SCM_RIGHTS use cases, and it's not possible to support them with direct descriptors today. By default, this will post a CQE to the target ring, similarly to how IORING_MSG_DATA does it. If IORING_MSG_RING_CQE_SKIP is set, no message is posted to the target ring; the issuer is expected to notify the receiver side separately. Signed-off-by: Jens Axboe <axboe@kernel.dk>
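A raw-SQE sketch of moving a fixed descriptor between rings per the argument layout above; assumes both rings have fixed file tables registered, and that the destination slot is encoded off-by-one in the sqe the way other file_index users encode it (newer liburing hides these details behind a prep helper):

    #include <liburing.h>
    #include <string.h>

    /* Sketch: send the registered file in our slot 3 to the target ring,
     * installing it into the target's fixed slot 7. */
    static int pass_fixed_fd(struct io_uring *ring, int target_ring_fd)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
            return -1;
        memset(sqe, 0, sizeof(*sqe));
        sqe->opcode = IORING_OP_MSG_RING;
        sqe->fd = target_ring_fd;         /* fd of the receiving ring */
        sqe->addr = IORING_MSG_SEND_FD;   /* command: pass an fd, not data */
        sqe->addr3 = 3;                   /* fixed file slot in source ring */
        sqe->file_index = 7 + 1;          /* destination slot, off-by-one */
        sqe->user_data = 2;
        return io_uring_submit(ring);
    }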
-
Committed by Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare a dynamically sized set of trailing elements in a structure. Kernel code should always use "flexible array members"[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
The io_uring cancelation API is async, like any other API that we expose there. For the case of finding a request to cancel, or not finding one, it is fully sync in that when submission returns, the CQEs for both the cancelation request and the targeted request have been posted to the CQ ring. However, if the targeted work is being executed by io-wq, the API can only start the act of canceling it. This makes it difficult to use in some circumstances, as the caller then has to wait for the CQEs to come in and match on the same cancelation data there. Provide an IORING_REGISTER_SYNC_CANCEL command for io_uring_register() that does sync cancelations, always. For the io-wq case, it'll wait for the cancelation to come in before returning. The only expected returns from this API are:
0        Request found and canceled fine.
> 0      Requests found and canceled. Only happens if asked to cancel multiple requests, and if the work wasn't in progress.
-ENOENT  Request not found.
-ETIME   A timeout on the operation was requested, but the timeout expired before we could cancel.
We won't get -EALREADY via this API. If the timeout value passed in is -1 (tv_sec and tv_nsec), then that means no timeout is requested. Otherwise, the timespec passed in is the amount of time the sync cancel will wait for a successful cancelation. Link: https://github.com/axboe/liburing/discussions/608 Signed-off-by: Jens Axboe <axboe@kernel.dk>
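A sketch of the registration argument and call, using liburing's io_uring_register_sync_cancel() wrapper and the struct io_uring_sync_cancel_reg layout that accompanies this command:

    #include <liburing.h>
    #include <string.h>

    /* Sketch: synchronously cancel the request tagged user_data == 0xdead,
     * waiting at most one second for io-wq to give it up. */
    static int cancel_sync(struct io_uring *ring)
    {
        struct io_uring_sync_cancel_reg reg;

        memset(&reg, 0, sizeof(reg));
        reg.addr = 0xdead;        /* match key: user_data by default */
        reg.fd = -1;              /* unused unless IORING_ASYNC_CANCEL_FD */
        reg.timeout.tv_sec = 1;   /* -1 in tv_sec and tv_nsec: no timeout */
        reg.timeout.tv_nsec = 0;
        return io_uring_register_sync_cancel(ring, &reg);
    }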
-
Committed by Jens Axboe
In preparation for not having a request to pass in that carries this state, add a separate cancelation flag that lets the caller ask for a fixed file to be used for cancelation. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Pavel Begunkov
Add a new IORING_SETUP_SINGLE_ISSUER flag and the userspace-visible part of it, i.e. the limitations it puts on submitters. Also, don't allow it together with IOPOLL, as we're not going to put it to good use. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4bcc41ee467fdf04c8aab8baf6ce3ba21858c3d4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
By default, the POLL_ADD command does edge-triggered poll: if we get a non-zero mask on the initial poll attempt, we complete the request successfully. Support level-triggered mode by always waiting for a notification, regardless of whether or not the initial mask matches the file state. Signed-off-by: Jens Axboe <axboe@kernel.dk>
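A sketch of requesting level-triggered behavior, assuming the flag is IORING_POLL_ADD_LEVEL and rides in sqe->len alongside the other poll flags (the convention IORING_POLL_ADD_MULTI already uses):

    #include <liburing.h>
    #include <poll.h>

    /* Sketch: poll a socket for readability, level-triggered, so completion
     * always comes via the waitqueue rather than the initial poll attempt. */
    static int poll_level(struct io_uring *ring, int fd)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
            return -1;
        io_uring_prep_poll_add(sqe, fd, POLLIN);
        sqe->len |= IORING_POLL_ADD_LEVEL;  /* poll flags share sqe->len */
        return io_uring_submit(ring);
    }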
-
- 08 July 2022, 1 commit
-
Committed by Pavel Begunkov
The 32-bit sqe->cmd_op is in a union with 64-bit values. It's always a good idea to do the padding explicitly. Also zero-check it in prep, so it can be used in the future if needed without compatibility concerns. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e6b95a05e970af79000435166185e85b196b2ba2.1657202417.git.asml.silence@gmail.com [axboe: turn bitwise OR into logical variant] Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 June 2022, 1 commit
-
Committed by Pavel Begunkov
We waste a u64 SQE field on flags, even though we don't need that many bits and it could be used for something more useful later. Store io_uring-specific send/recv flags in sqe->ioprio instead of ->addr2. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Fixes: 0455d4cc ("io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg") [axboe: change comment in io_uring.h as well] Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 15 June 2022, 1 commit
-
Committed by Pavel Begunkov
This partially reverts a7c41b46. Even though IORING_CLOSE_FD_AND_FILE_SLOT might save cycles for some users, it tries to do two things at a time, and it's not clear how to handle errors and what to return in a single result field when one part fails and the other completes well. Kill it for now. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/837c745019b3795941eee4fcfd7de697886d645b.1655224415.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 31 May 2022, 1 commit
-
Committed by Xiaoguang Wang
One big issue with the file registration feature is that it needs userspace apps to maintain free-slot info about io_uring's fixed file table, which really is a burden for development. io_uring now supports choosing a free file slot for userspace apps by using the IORING_FILE_INDEX_ALLOC flag in accept, open, and socket operations, but that requires the app to use direct accept or direct open, which not all apps are prepared to adopt yet. To support apps that still need real fds, make the registration feature easier to use: let IORING_OP_FILES_UPDATE support choosing fixed file slots, storing the picked slots in the fd array and letting the cqe return the number of slots allocated. Suggested-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> [axboe: move flag to uapi io_uring header, change goto to break, init] Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 18 May 2022, 1 commit
-
Committed by Jens Axboe
Provided buffers allow an application to supply io_uring with buffers that can then be grabbed for a read/receive request, when the data source is ready to deliver data. The existing scheme relies on using IORING_OP_PROVIDE_BUFFERS to do that, but it can be difficult to use in real-world applications. It's pretty efficient if the application is able to supply back batches of provided buffers when they have been consumed and the application is ready to recycle them, but if fragmentation occurs in the buffer space, it can become difficult to supply enough buffers at a time. This hurts efficiency. Add a register op, IORING_REGISTER_PBUF_RING, which allows an application to set up a shared queue for each buffer group of provided buffers. The application can then supply buffers simply by adding them to this ring, and the kernel can consume them just as easily. The ring shares the head with the application; the tail remains private to the kernel. Provided buffers set up with IORING_REGISTER_PBUF_RING cannot use IORING_OP_{PROVIDE,REMOVE}_BUFFERS for adding or removing entries to the ring; they must use the mapped ring. Mapped provided buffer rings can co-exist with normal provided buffers, just not within the same group ID. To gauge the overhead of the existing scheme and evaluate the mapped ring approach, a simple NOP benchmark was written. It uses a ring of 128 entries, and submits/completes 32 at a time. 'Replenish' is how many buffers are provided back at a time after they have been consumed:

Test                  Replenish   NOPs/sec
================================================================
No provided buffers   NA          ~30M
Provided buffers      32          ~16M
Provided buffers      1           ~10M
Ring buffers          32          ~27M
Ring buffers          1           ~27M

The ring-mapped buffers perform almost as well as not using provided buffers at all, and they don't care whether you provide 1 or more back at the same time. This means applications can just replenish as they go, rather than needing to batch and compact, further reducing overhead in the application. The NOP benchmark above doesn't need to do any compaction, so that overhead isn't even reflected in the test. Co-developed-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
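A sketch of registering and filling such a mapped buffer ring with liburing's helpers (io_uring_register_buf_ring and the io_uring_buf_ring_* accessors landed in liburing alongside this kernel change; check your version):

    #include <liburing.h>
    #include <stdlib.h>
    #include <string.h>

    #define BGID    1
    #define ENTRIES 128   /* must be a power of two */

    /* Sketch: register a 128-entry buffer ring for group BGID and hand the
     * kernel one buffer per slot; 'bufs' is ENTRIES * buf_len bytes. */
    static struct io_uring_buf_ring *setup_buf_ring(struct io_uring *ring,
                                                    char *bufs, size_t buf_len)
    {
        struct io_uring_buf_ring *br;
        struct io_uring_buf_reg reg;
        int i;

        if (posix_memalign((void **)&br, 4096,
                           ENTRIES * sizeof(struct io_uring_buf)))
            return NULL;
        memset(&reg, 0, sizeof(reg));
        reg.ring_addr = (unsigned long)br;
        reg.ring_entries = ENTRIES;
        reg.bgid = BGID;
        if (io_uring_register_buf_ring(ring, &reg, 0)) {
            free(br);
            return NULL;
        }

        io_uring_buf_ring_init(br);
        for (i = 0; i < ENTRIES; i++)
            io_uring_buf_ring_add(br, bufs + i * buf_len, buf_len, i,
                                  io_uring_buf_ring_mask(ENTRIES), i);
        io_uring_buf_ring_advance(br, ENTRIES);  /* publish the new tail */
        return br;
    }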
-
- 14 May 2022, 1 commit
-
Committed by Hao Xu
Add an accept flag, IORING_ACCEPT_MULTISHOT, to support multishot accept. Signed-off-by: Hao Xu <howeyxu@tencent.com> Link: https://lore.kernel.org/r/20220514142046.58072-2-haoxu.linux@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 13 May 2022, 2 commits
-
Committed by Jens Axboe
Currently, to set up a fully sparse descriptor space upfront, the app needs to allocate an array of the full size, memset it to -1, and then pass that in. Make this a bit easier by allowing a flag that simply does this internally rather than needing to copy each slot separately. This works with IORING_REGISTER_FILES2, as the flag is set in struct io_uring_rsrc_register, and is only allowed when the type is IORING_RSRC_FILE, as this doesn't make sense for registered buffers. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
If the application passes in IORING_FILE_INDEX_ALLOC as the file_slot, then that's a hint to allocate a fixed file descriptor rather than have one passed in directly. This can be useful for letting io_uring manage the direct descriptor space. Normal direct open requests will complete with 0 for success, and < 0 in case of error. If io_uring is asked to allocate the direct descriptor, then the direct descriptor is returned in case of success. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
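A sketch of a direct open that lets the kernel pick the slot; assumes a sparse fixed file table is already registered on the ring (io_uring_prep_openat_direct takes the slot as its last argument, and IORING_FILE_INDEX_ALLOC requests auto-allocation):

    #include <liburing.h>
    #include <fcntl.h>

    /* Sketch: open a file straight into a kernel-chosen fixed slot; the
     * allocated slot index comes back in cqe->res on success. */
    static int open_direct_alloc(struct io_uring *ring, const char *path)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        struct io_uring_cqe *cqe;
        int slot;

        io_uring_prep_openat_direct(sqe, AT_FDCWD, path, O_RDONLY, 0,
                                    IORING_FILE_INDEX_ALLOC);
        io_uring_submit(ring);
        if (io_uring_wait_cqe(ring, &cqe))
            return -1;
        slot = cqe->res;            /* >= 0: the fixed slot we were given */
        io_uring_cqe_seen(ring, cqe);
        return slot;
    }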
-
- 11 May 2022, 1 commit
-
Committed by Jens Axboe
file_operations->uring_cmd is a file-private handler. This is somewhat similar to an ioctl, but hopefully a lot saner and more useful, as it can be used to enable many io_uring capabilities for the underlying operation. IORING_OP_URING_CMD is a file-private kind of request. io_uring doesn't know what is in this command type; it's for the provider of ->uring_cmd() to deal with. Co-developed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220511054750.20432-2-joshi.k@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 09 May 2022, 2 commits
-
Committed by Stefan Roesch
This adds the big_cqe array to struct io_uring_cqe to support large CQEs. Co-developed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stefan Roesch <shr@fb.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20220426182134.136504-2-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Normal SQEs are 64 bytes in length, which is fine for all the commands we support. However, in preparation for supporting passthrough IO, provide an option for setting up a ring with 128-byte SQEs. We continue to use the same type for io_uring_sqe; it's marked and commented with a zero-sized array pad at the end. This provides up to 80 bytes of data for a passthrough command: 64 bytes of extra added data, plus 16 bytes available at the end of the existing SQE. Signed-off-by: Jens Axboe <axboe@kernel.dk>
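Setting this up is just a matter of setup flags; a sketch pairing the 128-byte SQEs with the 32-byte CQEs from the previous commit, the combination passthrough users typically want:

    #include <liburing.h>
    #include <string.h>

    /* Sketch: a ring suitable for passthrough commands, with big SQEs
     * (an extra 64 bytes of command payload) and big CQEs. */
    static int setup_passthrough_ring(struct io_uring *ring)
    {
        struct io_uring_params p;

        memset(&p, 0, sizeof(p));
        p.flags = IORING_SETUP_SQE128 | IORING_SETUP_CQE32;
        return io_uring_queue_init_params(64, ring, &p);
    }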
-
- 06 May 2022, 1 commit
-
Committed by Jens Axboe
If IORING_RECVSEND_POLL_FIRST is set for recv/recvmsg or send/sendmsg, then we arm poll first rather than attempting a receive or send upfront. This can be useful if we expect there to be no data (or space) available for the request, as we can then avoid wasting time on the initial issue attempt. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
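A sketch of opting into poll-first on a receive; the flag ended up in sqe->ioprio with the other io_uring-specific send/recv flags (see the 30 June 2022 entry above that moved them there):

    #include <liburing.h>

    /* Sketch: skip the speculative first recv attempt and go straight to
     * poll; the receive runs once the socket signals readability. */
    static int recv_poll_first(struct io_uring *ring, int sock,
                               void *buf, unsigned len)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

        if (!sqe)
            return -1;
        io_uring_prep_recv(sqe, sock, buf, len, 0);
        sqe->ioprio |= IORING_RECVSEND_POLL_FIRST;
        return io_uring_submit(ring);
    }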
-
- 30 April 2022, 3 commits
-
Committed by Jens Axboe
If IORING_SETUP_COOP_TASKRUN is set to use cooperative scheduling for running task_work, then IORING_SETUP_TASKRUN_FLAG can be set so the application can tell whether task_work is pending in the kernel for this ring. This allows use cases like io_uring_peek_cqe() to still function appropriately, and lets the task know when it would be useful to call io_uring_wait_cqe() to run pending events. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-7-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
If this is set, io_uring will never use an IPI to deliver a task_work notification. This can be used in the common case where a single task or thread communicates with the ring and doesn't rely on io_uring_cqe_peek(). This provides a noticeable win in performance, both from eliminating the IPI itself and from avoiding interrupting the submitting task unnecessarily. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-6-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
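The two task_work flags above are typically set together; a sketch (the IORING_SQ_TASKRUN bit referenced in the comment is what the kernel raises in the SQ ring flags, and newer liburing checks it internally when peeking):

    #include <liburing.h>
    #include <string.h>

    /* Sketch: a single-submitter style ring that avoids IPIs for task_work
     * and advertises pending work via IORING_SQ_TASKRUN in the SQ ring
     * flags, so peek-style loops still notice kernel-side completions. */
    static int setup_coop_ring(struct io_uring *ring)
    {
        struct io_uring_params p;

        memset(&p, 0, sizeof(p));
        p.flags = IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG;
        return io_uring_queue_init_params(256, ring, &p);
    }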
-
Committed by Jens Axboe
For now, just use a CQE flag for this; with big CQE support, we could return the actual number of bytes left. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 26 April 2022, 1 commit
-
Committed by Dylan Yudaken
It is useful to have a type enum for opcodes, to allow the compiler to assert that every value is handled in a switch statement. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220426082907.3600028-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 April 2022, 6 commits
-
Committed by Jens Axboe
Supports both regular socket(2), where a normal file descriptor is instantiated when called, and direct descriptors. Link: https://lore.kernel.org/r/20220412202240.234207-3-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Stefan Roesch
This adds support to io_uring for the fgetxattr and getxattr API. Signed-off-by: Stefan Roesch <shr@fb.com> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20220323154420.3301504-5-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Stefan Roesch
This adds support to io_uring for the fsetxattr and setxattr API. Signed-off-by: Stefan Roesch <shr@fb.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20220323154420.3301504-4-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Rather than matching on a specific key, be it user_data or file, allow canceling any request that we can look up. Works like IORING_ASYNC_CANCEL_ALL in that it cancels multiple requests, but it doesn't key off user_data or the file. Can't be set together with IORING_ASYNC_CANCEL_FD, as that's a key selector; only one may be used at a time. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-6-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
Currently, sqe->addr must contain the user_data of the request being canceled. Introduce the IORING_ASYNC_CANCEL_FD flag, which tells the kernel that we're keying off the file fd instead for cancelation. This allows canceling any request that a) uses a file, and b) was assigned the file based on the value being passed in. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-5-axboe@kernel.dk
-
Committed by Jens Axboe
The current cancelation will look up and cancel the first request it finds based on the key passed in. Add a flag that allows canceling any request that matches the key. It completes with the number of requests found and canceled, or res < 0 if an error occurred. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-4-axboe@kernel.dk
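The cancel flags from the three entries above compose naturally; a sketch using liburing's io_uring_prep_cancel_fd (which sets IORING_ASYNC_CANCEL_FD itself) to cancel everything pending on one fd:

    #include <liburing.h>

    /* Sketch: cancel every request matching the given fd; the number of
     * requests found and canceled comes back in cqe->res. */
    static int cancel_all_on_fd(struct io_uring *ring, int fd)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        struct io_uring_cqe *cqe;
        int nr;

        io_uring_prep_cancel_fd(sqe, fd, IORING_ASYNC_CANCEL_ALL);
        io_uring_submit(ring);
        if (io_uring_wait_cqe(ring, &cqe))
            return -1;
        nr = cqe->res;      /* count of canceled requests, or -errno */
        io_uring_cqe_seen(ring, cqe);
        return nr;
    }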
-
- 11 April 2022, 1 commit
-
Committed by Jens Axboe
Give applications a way to tell whether the kernel supports sane linked files, as in files being assigned at the right time to reliably do <open file direct into slot X><read file from slot X> while using IOSQE_IO_LINK to order them. Not really a bug fix, but flag it as such so that it gets pulled in with backports of the deferred file assignment. Fixes: 6bf9c47a ("io_uring: defer file assignment") Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 24 March 2022, 1 commit
-
Committed by Jens Axboe
This was introduced with the message ring opcode, but isn't strictly required for the request itself. The sender can encode what is needed in user_data, which is passed to the receiver. It's unclear whether having a separate flag that essentially says "this CQE did not originate from an SQE on this ring" provides any real utility to applications. While we can always re-introduce a flag to provide this information, we cannot take it away at a later point in time. Remove the flag while we still can, before it's in a released kernel. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 11 March 2022, 2 commits
-
Committed by Jens Axboe
By default, io_uring will stop submitting a batch of requests if we run into an error submitting one of them. This isn't strictly necessary, as the error result is passed out-of-band via a CQE anyway, and it can be a bit confusing for some applications. Provide a way to set up a ring that will continue submitting on error, once the error CQE has been posted. There's still one case that will break out of submission: if we fail allocating a request, we'll still return -ENOMEM. We could in theory post a CQE for that condition too, even if we never got a request; leave that for a potential followup. Reported-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by Jens Axboe
This adds support for IORING_OP_MSG_RING, which allows an SQE to signal another ring. That allows either waking up someone waiting on the ring, or even passing a 64-bit value via the user_data field in the CQE. sqe->fd must contain the fd of the ring that should receive the CQE. sqe->off will be propagated to cqe->user_data on the target ring, and sqe->len will be propagated to cqe->res. The resulting CQE will have IORING_CQE_F_MSG set in its flags, to indicate that this CQE was generated from a messaging request rather than an SQE issued locally on that ring. This effectively allows passing a 64-bit and a 32-bit quantity between the two rings. This request type has the following request-specific error cases:
-EBADFD     Set if sqe->fd doesn't point to a file descriptor of the io_uring type.
-EOVERFLOW  Set if we were not able to deliver a request to the target ring.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
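A sketch of signaling another ring via liburing's io_uring_prep_msg_ring helper, whose arguments map onto the sqe fields described above:

    #include <liburing.h>

    /* Sketch: post a CQE on the target ring with user_data 0xabcd and
     * res 100, e.g. to wake a sibling thread parked in wait_cqe(). */
    static int signal_ring(struct io_uring *src, int target_ring_fd)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(src);

        if (!sqe)
            return -1;
        /* len -> target cqe->res, data -> target cqe->user_data */
        io_uring_prep_msg_ring(sqe, target_ring_fd, 100, 0xabcd, 0);
        return io_uring_submit(src);
    }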
-