- 13 May 2022, 7 commits
-
-
Submitted by Pavel Begunkov
If an opcode handler semi-reliably returns -EAGAIN, io_wq_submit_work() might continue to busily hammer the same handler over and over again, which is not ideal. The -EAGAIN handling in question was put there only for IOPOLL, so restrict it to IOPOLL mode, where there is no recourse other than to retry as we cannot wait. Fixes: def596e9 ("io_uring: support for IO polling") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f168b4f24181942f3614dd8ff648221736f572e6.1652433740.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
Currently, to set up a fully sparse descriptor space upfront, the app needs to allocate an array of the full size, memset it to -1, and then pass that in. Make this a bit easier by allowing a flag that simply does this internally rather than needing to copy each slot separately. This works with IORING_REGISTER_FILES2, as the flag is set in struct io_uring_rsrc_register, and is only allowed when the type is IORING_RSRC_FILE, as this doesn't make sense for registered buffers. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
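For illustration, a minimal userspace sketch, assuming liburing's io_uring_register_files_sparse() helper (which sets the sparse registration flag internally) rather than driving IORING_REGISTER_FILES2 by hand:

```c
#include <liburing.h>
#include <stdio.h>

int main(void)
{
	struct io_uring ring;
	int ret;

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret < 0)
		return 1;

	/* Reserve 1024 empty fixed-file slots; no -1 array is built
	 * or copied in userspace. */
	ret = io_uring_register_files_sparse(&ring, 1024);
	if (ret < 0)
		fprintf(stderr, "sparse register failed: %d\n", ret);

	io_uring_queue_exit(&ring);
	return ret < 0;
}
```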
-
Submitted by Jens Axboe
We currently limit these to 32K, but since we're now backing the table space with vmalloc when needed, there's no reason why we can't make it bigger. The total space is limited by RLIMIT_NOFILE as well. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
If the application passes in IORING_FILE_INDEX_ALLOC as the file_slot, then that's a hint to allocate a fixed file descriptor rather than have one be passed in directly. This can be useful for having io_uring manage the direct descriptor space, and also allows multi-shot support to work with fixed files. Normal accept direct requests will complete with 0 for success, and < 0 in case of error. If io_uring is asked to allocate the direct descriptor, then the direct descriptor is returned in case of success. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
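A hedged sketch of the allocated-slot flow, assuming liburing's io_uring_prep_accept_direct() helper and the UAPI IORING_FILE_INDEX_ALLOC constant; error handling is elided:

```c
#include <liburing.h>

/* listen_fd is assumed to be a listening socket; ring is an initialized
 * io_uring with a registered (possibly sparse) fixed-file table. */
static int accept_into_allocated_slot(struct io_uring *ring, int listen_fd)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;
	int slot;

	io_uring_prep_accept_direct(sqe, listen_fd, NULL, NULL, 0,
				    IORING_FILE_INDEX_ALLOC);
	io_uring_submit(ring);
	io_uring_wait_cqe(ring, &cqe);

	/* With allocation requested, a successful CQE carries the chosen
	 * fixed-file index rather than 0. */
	slot = cqe->res;
	io_uring_cqe_seen(ring, cqe);
	return slot;
}
```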
-
Submitted by Jens Axboe
If the application passes in IORING_FILE_INDEX_ALLOC as the file_slot, then that's a hint to allocate a fixed file descriptor rather than have one be passed in directly. This can be useful for having io_uring manage the direct descriptor space. Normal open direct requests will complete with 0 for success, and < 0 in case of error. If io_uring is asked to allocate the direct descriptor, then the direct descriptor is returned in case of success. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
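The same idea for opens, again assuming liburing's io_uring_prep_openat_direct() helper; a sketch, not the patch itself:

```c
#include <fcntl.h>
#include <liburing.h>

/* Open a file straight into a kernel-chosen fixed-file slot. */
static int open_into_allocated_slot(struct io_uring *ring, const char *path)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;
	int slot;

	io_uring_prep_openat_direct(sqe, AT_FDCWD, path, O_RDONLY, 0,
				    IORING_FILE_INDEX_ALLOC);
	io_uring_submit(ring);
	io_uring_wait_cqe(ring, &cqe);
	slot = cqe->res;	/* allocated index on success, -errno on failure */
	io_uring_cqe_seen(ring, cqe);
	return slot;
}
```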
-
Submitted by Jens Axboe
Applications currently always pick where they want fixed files to go. In preparation for allowing these types of commands with multishot support, add a basic allocator in the fixed file table. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
In preparation for adding a basic allocator for direct descriptors, add helpers that set/clear whether a file slot is used. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
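A rough sketch of what such set/clear helpers can look like; the struct layout and names here are illustrative assumptions built on the kernel bitmap API, not code from the patch:

```c
#include <linux/bitops.h>

/* Illustrative only: a fixed-file table with a usage bitmap. */
struct file_table {
	unsigned long *bitmap;		/* one bit per slot */
	unsigned int alloc_hint;	/* where the allocator resumes scanning */
};

static inline void file_slot_set(struct file_table *table, unsigned int slot)
{
	__set_bit(slot, table->bitmap);
	table->alloc_hint = slot + 1;	/* next search starts past this slot */
}

static inline void file_slot_clear(struct file_table *table, unsigned int slot)
{
	__clear_bit(slot, table->bitmap);
	table->alloc_hint = slot;	/* freed slot is a good next candidate */
}
```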
-
- 09 May 2022, 11 commits
-
-
Submitted by Jens Axboe
It's not needed, as the REQ_F_BUFFER_SELECTED flag tracks the state of whether or not kbuf is valid, so just drop it. Suggested-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
We have io_kiocb->buf_index, which is used either for fixed buffers or for provided buffers. For the latter, it's used to hold the buffer group ID for buffer selection. Post selection, req->kbuf->bid is used to get the buffer ID. Store the buffer ID, when selected, in req->buf_index. If we do end up recycling the buffer, reset it back to the buffer group ID. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
The timeout and other items that follow are less hot, so let's move the provided buffer state above that. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
These are mutually exclusive: if you use provided buffers, then you cannot use fixed buffers, and vice versa. Move them into the same spot in the io_kiocb, which is also advantageous for provided buffers, as they get near the submit-side hot cacheline. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
In preparation for providing another way to select a buffer, move the existing logic into a helper. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
Callers already have room to store the addr and length information; clean it up by having the caller just assign the previously provided data. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
Use a plain array for any group ID that's less than 64, and punt anything beyond that to an xarray. 64 fits in a page even for 4KB page sizes and with the planned additions. This makes the expected group usage faster by avoiding a hash and lookup to find our list, and it uses less memory upfront by not allocating any memory for provided buffers unless they're actually being used. Suggested-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
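A hedged sketch of the hybrid lookup this describes; struct shapes and field names are assumptions for illustration:

```c
#include <linux/xarray.h>

#define BGID_ARRAY	64	/* directly indexed groups; fits in a page */

/* Illustrative shapes only; the real layouts differ. */
struct buf_group_list { unsigned int bgid; /* ... buffer list ... */ };

struct bufring_state {
	struct buf_group_list *bl;	/* lazily allocated array, hot path */
	struct xarray bl_xa;		/* cold path for sparse/large IDs */
};

static struct buf_group_list *buf_get_list(struct bufring_state *s,
					   unsigned int bgid)
{
	if (bgid < BGID_ARRAY)
		return &s->bl[bgid];	/* plain index: no hash, no lookup */
	return xa_load(&s->bl_xa, bgid);
}
```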
-
Submitted by Jens Axboe
The read/write opcodes use it already, but recv/recvmsg do not. If we switch them over and read and validate this at init time, while we're checking whether the opcode supports it anyway, then we can do it in one spot and we don't have to pass in a separate group ID for io_buffer_select(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
There's no point in validity checking buf_index if the request doesn't have REQ_F_BUFFER_SELECT set, as we will never use it in that case. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
After the recent changes, this is a direct call to io_buffer_select() anyway. With this change, there are no wrappers left for provided buffer selection. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
There's no point in having callers provide a kbuf; we're just returning the address anyway. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 06 May 2022, 4 commits
-
-
Submitted by Jens Axboe
It's just a thin wrapper around io_buffer_select(); get rid of it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
For all of send/sendmsg and recv/recvmsg we have the local 'sr' variable, yet some cases still use req->sr_msg, which sr points to. Use 'sr' consistently. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
If IORING_RECVSEND_POLL_FIRST is set for recv/recvmsg or send/sendmsg, then we arm poll first rather than attempting a receive or send upfront. This can be useful if we expect there to be no data (or space) available for the request, as we can then avoid wasting time on the initial issue attempt. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
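A minimal sketch of opting in from userspace; it assumes the flag is carried in sqe->ioprio for the send/recv opcodes, per the UAPI of this series:

```c
#include <liburing.h>

/* Ask io_uring to arm poll before attempting the receive at all. */
static void queue_poll_first_recv(struct io_uring *ring, int sockfd,
				  void *buf, size_t len)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	io_uring_prep_recv(sqe, sockfd, buf, len, 0);
	/* For send/recv opcodes, these flags travel in sqe->ioprio. */
	sqe->ioprio |= IORING_RECVSEND_POLL_FIRST;
}
```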
-
Submitted by Jens Axboe
Don't punt this check to the op prep handlers; add the support to io_op_defs and we can check them while setting up the request. This reduces the text size by 500 bytes on aarch64, and makes this less fragile by having the check in one spot and needing opcodes to opt in to IOPOLL or ioprio support. Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 30 April 2022, 5 commits
-
-
Submitted by Almog Khaikin
The IORING_SQ_NEED_WAKEUP flag is now set using atomic_or(), which implies a full barrier on some architectures but is not required to do so. Use the more appropriate smp_mb__after_atomic(), which avoids the extra barrier on those architectures. Signed-off-by: Almog Khaikin <almogkh@gmail.com> Link: https://lore.kernel.org/r/20220426163403.112692-1-almogkh@gmail.com Fixes: 8018823e6987 ("io_uring: serialize ctx->rings->sq_flags with atomic_or/and") Signed-off-by: Jens Axboe <axboe@kernel.dk>
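Roughly the pattern in question, as a kernel-style sketch rather than the verbatim patch:

```c
/* Pattern sketch; ctx->rings->sq_flags is the shared flags word. */
static void mark_need_wakeup(struct io_ring_ctx *ctx)
{
	atomic_or(IORING_SQ_NEED_WAKEUP, &ctx->rings->sq_flags);
	/*
	 * Pair the RMW with smp_mb__after_atomic(): a no-op on
	 * architectures whose atomic RMW ops already imply a full
	 * barrier, and a real barrier only where one is needed. An
	 * unconditional smp_mb() would pay for ordering twice on
	 * the former.
	 */
	smp_mb__after_atomic();
}
```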
-
Submitted by Jens Axboe
If IORING_SETUP_COOP_TASKRUN is set to use cooperative scheduling for running task_work, then IORING_SETUP_TASKRUN_FLAG can be set so the application can tell if task_work is pending in the kernel for this ring. This allows use cases like io_uring_peek_cqe() to still function appropriately, or for the task to know when it would be useful to call io_uring_wait_cqe() to run pending events. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-7-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
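A hedged setup sketch; it assumes liburing's io_uring_queue_init_params() plus the IORING_SQ_TASKRUN bit this series exposes in the SQ ring flags, and checks the flag by hand for illustration:

```c
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_params p = {
		.flags = IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG,
	};

	if (io_uring_queue_init_params(64, &ring, &p) < 0)
		return 1;

	/* Before trusting a userspace-only CQ peek: if the kernel has
	 * pending task_work for this ring, it advertises it here. */
	if (IO_URING_READ_ONCE(*ring.sq.kflags) & IORING_SQ_TASKRUN) {
		/* Enter the kernel (any io_uring_enter-based call) so the
		 * pending work runs and its completions get posted. */
	}

	io_uring_queue_exit(&ring);
	return 0;
}
```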
-
Submitted by Jens Axboe
If this is set, io_uring will never use an IPI to deliver a task_work notification. This can be used in the common case where a single task or thread communicates with the ring, and doesn't rely on io_uring_cqe_peek(). This provides a noticeable win in performance, both from eliminating the IPI itself and from avoiding interrupting the submitting task unnecessarily. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-6-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
While doing so, switch SQPOLL to TWA_SIGNAL_NO_IPI as well, as that just does a task wakeup; then we can remove the special wakeup we have in task_work_add. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-5-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
Rather than requiring ctx->completion_lock to ensure that we don't clobber the flags, use the atomic bitop helpers instead. This removes the need to grab the completion_lock, in preparation for needing to set or clear sq_flags when we don't know the status of this lock. Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-3-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 26 April 2022, 1 commit
-
-
Submitted by Jens Axboe
If IO_URING_SCM_ALL isn't set, as it would not be on 32-bit builds, then we trigger a warning:

    fs/io_uring.c: In function '__io_sqe_files_unregister':
    fs/io_uring.c:8992:13: warning: unused variable 'i' [-Wunused-variable]
     8992 |         int i;
          |             ^

Move the ifdef up to include the 'i' variable declaration. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 5e45690a ("io_uring: store SCM state in io_fixed_file->file_ptr") Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 April 2022, 12 commits
-
-
Submitted by Dylan Yudaken
Right now io_uring will not actively inform userspace if a CQE is dropped. This is extremely rare, requiring both a CQ ring overflow and a GFP_ATOMIC kmalloc failure. However, the consequences could, for example, cause applications to go into an undefined state, possibly waiting for a CQE that never arrives. Return an error code (EBADR) in these cases. Since this is expected to be incredibly rare, try to avoid affecting the hot code paths as much as possible, so the error is only returned lazily and when there are no other available CQEs. Once the error is returned, reset the error condition, assuming the user is either OK with it or will clean up appropriately. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-6-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
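A sketch of what handling this can look like in an application; io_uring_wait_cqe() surfacing -EBADR is the behavior described above:

```c
#include <errno.h>
#include <liburing.h>
#include <stdio.h>

/* Sketch: surfacing dropped completions to the application. */
static int wait_one(struct io_uring *ring, struct io_uring_cqe **cqe)
{
	int ret = io_uring_wait_cqe(ring, cqe);

	if (ret == -EBADR) {
		/* At least one CQE was dropped (CQ overflow plus a failed
		 * GFP_ATOMIC allocation). The condition is reset once
		 * reported; decide whether a lost completion is fatal. */
		fprintf(stderr, "completion(s) lost\n");
		ret = io_uring_wait_cqe(ring, cqe);
	}
	return ret;
}
```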
-
Submitted by Dylan Yudaken
Prepare to use this bitfield for more flags by using constants instead of the magic value 0. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-5-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Dylan Yudaken
io_uring_enter returns the submitted count in preference to an error code. In some code paths this check is not required, so reorganise the code so that the check is only done as needed. This is also prep for returning error codes only in waiting scenarios. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-4-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Dylan Yudaken
Trace CQE overflows in io_uring. Print ocqe before the check, so that a NULL value indicates the CQE has been dropped. Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-3-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
We currently check REQ_F_POLLED before arming async poll for a notification to retry. If it's set, then we don't allow poll and will punt to io-wq instead. This is done to prevent a situation where a buggy driver repeatedly reports that there's space/data available, yet we get -EAGAIN. However, if we already transferred data, then it should be safe to rely on poll again. Gate the check on whether or not REQ_F_PARTIAL_IO is also set. Signed-off-by: Jens Axboe <axboe@kernel.dk>
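A simplified sketch of the resulting gate (illustrative, not the verbatim patch):

```c
/* Refuse to re-arm poll only if the earlier poll-triggered retry
 * made no progress at all. */
if ((req->flags & REQ_F_POLLED) && !(req->flags & REQ_F_PARTIAL_IO))
	return IO_APOLL_ABORTED;	/* suspect driver; fall back to io-wq */
```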
-
Submitted by Jens Axboe
Like commit 7ba89d2a for recv/recvmsg, support MSG_WAITALL for the send side. If this flag is set and we do a short send, retry for a stream or seqpacket socket. Signed-off-by: Jens Axboe <axboe@kernel.dk>
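A minimal usage sketch with liburing; MSG_WAITALL is the plain socket flag passed through in the prep call:

```c
#include <liburing.h>
#include <sys/socket.h>

/* Ask the kernel to retry short sends internally on a stream socket. */
static void queue_send_all(struct io_uring *ring, int sockfd,
			   const void *buf, size_t len)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	io_uring_prep_send(sqe, sockfd, buf, len, MSG_WAITALL);
}
```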
-
Submitted by Jens Axboe
Rather than matching on a specific key, be it user_data or file, allow canceling any request that we can look up. Works like IORING_ASYNC_CANCEL_ALL in that it cancels multiple requests, but it doesn't key off user_data or the file. Can't be set together with IORING_ASYNC_CANCEL_FD, as that's a key selector; only one may be used at a time. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-6-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk>
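A hedged sketch of the three matching modes side by side, assuming liburing's io_uring_prep_cancel()/io_uring_prep_cancel_fd() helpers (the FD and ALL flags are described in the two entries below):

```c
#include <liburing.h>

/* Illustrative: three matching modes for async cancel. */
static void queue_cancels(struct io_uring *ring, void *key, int sockfd)
{
	struct io_uring_sqe *sqe;

	/* Cancel every request matching a user_data key, not just the first. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_cancel(sqe, key, IORING_ASYNC_CANCEL_ALL);

	/* Cancel every request that was issued against sockfd. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_cancel_fd(sqe, sockfd, IORING_ASYNC_CANCEL_ALL);

	/* Cancel any cancelable request at all; no key. With ALL/ANY the
	 * CQE res reports the number of requests found and canceled. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_cancel(sqe, NULL, IORING_ASYNC_CANCEL_ANY);
}
```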
-
Submitted by Jens Axboe
Currently, sqe->addr must contain the user_data of the request being canceled. Introduce the IORING_ASYNC_CANCEL_FD flag, which tells the kernel that we're keying off the file fd instead for cancelation. This allows canceling any request that a) uses a file, and b) was assigned the file based on the value being passed in. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-5-axboe@kernel.dk
-
Submitted by Jens Axboe
The current cancelation will look up and cancel the first request it finds based on the key passed in. Add a flag that allows canceling any request that matches the key. It completes with the number of requests found and canceled, or res < 0 if an error occurred. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-4-axboe@kernel.dk
-
Submitted by Jens Axboe
In preparation for being able to key cancelation off more than just the user_data, pass the io_cancel_data struct into the various functions that deal with request cancelation. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-3-axboe@kernel.dk
-
Submitted by Jens Axboe
It's only called from one location, and it always passes in 'false'. Kill the argument, and just pass in 'false' to io_poll_find(). Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-2-axboe@kernel.dk
-
Submitted by Pavel Begunkov
Split timeout handling into removal + failing, so we can reduce spinlocking time and remove another instance of triple nested locking. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0f00d115f9d4c5749028f19623708ad3695512d6.1650458197.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-