- 30 4月, 2022 3 次提交
-
-
由 Jens Axboe 提交于
The only difference between set_notify_signal() and __set_notify_signal() is that the former checks if it needs to deliver an IPI to force a reschedule. As the io-wq workers never leave the kernel, and IPI is never needed, they simply need a wakeup. Reviewed-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-4-axboe@kernel.dkSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Rather than require ctx->completion_lock for ensuring that we don't clobber the flags, use the atomic bitop helpers instead. This removes the need to grab the completion_lock, in preparation for needing to set or clear sq_flags when we don't know the status of this lock. Reviewed-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220426014904.60384-3-axboe@kernel.dkSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Some use cases don't always need an IPI when sending a TWA_SIGNAL notification. Add TWA_SIGNAL_NO_IPI, which is just like TWA_SIGNAL, except it doesn't send an IPI to the target task. It merely sets TIF_NOTIFY_SIGNAL and wakes up the task. This can be useful in avoiding a forceful transition to the kernel if the task is running in userspace. Depending on the task_work in question, it may be quite fine waiting for the next reschedule or kernel enter anyway, or the use case may even have other mechanisms for hinting to the task that a transition may be useful. This can drive more cooperative scheduling of task_work. Reviewed-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/821f42b6-7d91-8074-8212-d34998097de4@kernel.dkSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 4月, 2022 1 次提交
-
-
由 Jens Axboe 提交于
If IO_URING_SCM_ALL isn't set, as it would not be on 32-bit builds, then we trigger a warning: fs/io_uring.c: In function '__io_sqe_files_unregister': fs/io_uring.c:8992:13: warning: unused variable 'i' [-Wunused-variable] 8992 | int i; | ^ Move the ifdef up to include the 'i' variable declaration. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Fixes: 5e45690a ("io_uring: store SCM state in io_fixed_file->file_ptr") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 4月, 2022 36 次提交
-
-
由 Dylan Yudaken 提交于
Right now io_uring will not actively inform userspace if a CQE is dropped. This is extremely rare, requiring a CQ ring overflow, as well as a GFP_ATOMIC kmalloc failure. However the consequences could cause for example applications to go into an undefined state, possibly waiting for a CQE that never arrives. Return an error code (EBADR) in these cases. Since this is expected to be incredibly rare, try and avoid as much as possible affecting the hot code paths, and so it only is returned lazily and when there is no other available CQEs. Once the error is returned, reset the error condition assuming the user is either ok with it or will clean up appropriately. Signed-off-by: NDylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-6-dylany@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dylan Yudaken 提交于
Prepare to use this bitfield for more flags by using constants instead of magic value 0 Signed-off-by: NDylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-5-dylany@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dylan Yudaken 提交于
io_uring_enter returns the count submitted preferrably over an error code. In some code paths this check is not required, so reorganise the code so that the check is only done as needed. This is also a prep for returning error codes only in waiting scenarios. Signed-off-by: NDylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-4-dylany@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dylan Yudaken 提交于
Trace cqe overflows in io_uring. Print ocqe before the check, so if it is NULL it indicates that it has been dropped. Signed-off-by: NDylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-3-dylany@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Dylan Yudaken 提交于
Add trace function for overflowing CQ ring. Signed-off-by: NDylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220421091345.2115755-2-dylany@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
We currently check REQ_F_POLLED before arming async poll for a notification to retry. If it's set, then we don't allow poll and will punt to io-wq instead. This is done to prevent a situation where a buggy driver will repeatedly return that there's space/data available yet we get -EAGAIN. However, if we already transferred data, then it should be safe to rely on poll again. Gate the check on whether or not REQ_F_PARTIAL_IO is also set. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Like commit 7ba89d2a for recv/recvmsg, support MSG_WAITALL for the send side. If this flag is set and we do a short send, retry for a stream of seqpacket socket. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Rather than match on a specific key, be it user_data or file, allow canceling any request that we can lookup. Works like IORING_ASYNC_CANCEL_ALL in that it cancels multiple requests, but it doesn't key off user_data or the file. Can't be set with IORING_ASYNC_CANCEL_FD, as that's a key selector. Only one may be used at the time. Signed-off-by: NJens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-6-axboe@kernel.dkSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
Currently sqe->addr must contain the user_data of the request being canceled. Introduce the IORING_ASYNC_CANCEL_FD flag, which tells the kernel that we're keying off the file fd instead for cancelation. This allows canceling any request that a) uses a file, and b) was assigned the file based on the value being passed in. Signed-off-by: NJens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-5-axboe@kernel.dk
-
由 Jens Axboe 提交于
The current cancelation will lookup and cancel the first request it finds based on the key passed in. Add a flag that allows to cancel any request that matches they key. It completes with the number of requests found and canceled, or res < 0 if an error occured. Signed-off-by: NJens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-4-axboe@kernel.dk
-
由 Jens Axboe 提交于
In preparation for being able to not only key cancel off the user_data, pass in the io_cancel_data struct for the various functions that deal with request cancelation. Signed-off-by: NJens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-3-axboe@kernel.dk
-
由 Jens Axboe 提交于
It's only called from one location, and it always passes in 'false'. Kill the argument, and just pass in 'false' to io_poll_find(). Signed-off-by: NJens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220418164402.75259-2-axboe@kernel.dk
-
由 Pavel Begunkov 提交于
Split timeout handling into removal + failing, so we can reduce spinlocking time and remove another instance of triple nested locking. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0f00d115f9d4c5749028f19623708ad3695512d6.1650458197.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Move ->timeout_lock grabbing inside of io_timeout_cancel(), so we can do io_req_task_queue_fail() outside of the lock. It's much nicer than relying on triple nested locking. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cde758c2897930d31e205ed8f476d4ec879a8849.1650458197.git.asml.silence@gmail.com [axboe: drop now wrong timeout_lock annotation] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jens Axboe 提交于
A previous commit removed SCM accounting for non-unix sockets, as those are the only ones that can cause a fixed file reference. While that is true, it also means we're now dereferencing the file as part of the workqueue driven __io_sqe_files_unregister() after the process has exited. This isn't safe for SCM files, as unix gc may have already reaped them when the process exited. KASAN complains about this: [ 12.307040] Freed by task 0: [ 12.307592] kasan_save_stack+0x28/0x4c [ 12.308318] kasan_set_track+0x28/0x38 [ 12.309049] kasan_set_free_info+0x24/0x44 [ 12.309890] ____kasan_slab_free+0x108/0x11c [ 12.310739] __kasan_slab_free+0x14/0x1c [ 12.311482] slab_free_freelist_hook+0xd4/0x164 [ 12.312382] kmem_cache_free+0x100/0x1dc [ 12.313178] file_free_rcu+0x58/0x74 [ 12.313864] rcu_core+0x59c/0x7c0 [ 12.314675] rcu_core_si+0xc/0x14 [ 12.315496] _stext+0x30c/0x414 [ 12.316287] [ 12.316687] Last potentially related work creation: [ 12.317885] kasan_save_stack+0x28/0x4c [ 12.318845] __kasan_record_aux_stack+0x9c/0xb0 [ 12.319976] kasan_record_aux_stack_noalloc+0x10/0x18 [ 12.321268] call_rcu+0x50/0x35c [ 12.322082] __fput+0x2fc/0x324 [ 12.322873] ____fput+0xc/0x14 [ 12.323644] task_work_run+0xac/0x10c [ 12.324561] do_notify_resume+0x37c/0xe74 [ 12.325420] el0_svc+0x5c/0x68 [ 12.326050] el0t_64_sync_handler+0xb0/0x12c [ 12.326918] el0t_64_sync+0x164/0x168 [ 12.327657] [ 12.327976] Second to last potentially related work creation: [ 12.329134] kasan_save_stack+0x28/0x4c [ 12.329864] __kasan_record_aux_stack+0x9c/0xb0 [ 12.330735] kasan_record_aux_stack+0x10/0x18 [ 12.331576] task_work_add+0x34/0xf0 [ 12.332284] fput_many+0x11c/0x134 [ 12.332960] fput+0x10/0x94 [ 12.333524] __scm_destroy+0x80/0x84 [ 12.334213] unix_destruct_scm+0xc4/0x144 [ 12.334948] skb_release_head_state+0x5c/0x6c [ 12.335696] skb_release_all+0x14/0x38 [ 12.336339] __kfree_skb+0x14/0x28 [ 12.336928] kfree_skb_reason+0xf4/0x108 [ 12.337604] unix_gc+0x1e8/0x42c [ 12.338154] unix_release_sock+0x25c/0x2dc [ 12.338895] unix_release+0x58/0x78 [ 12.339531] __sock_release+0x68/0xec [ 12.340170] sock_close+0x14/0x20 [ 12.340729] __fput+0x18c/0x324 [ 12.341254] ____fput+0xc/0x14 [ 12.341763] task_work_run+0xac/0x10c [ 12.342367] do_notify_resume+0x37c/0xe74 [ 12.343086] el0_svc+0x5c/0x68 [ 12.343510] el0t_64_sync_handler+0xb0/0x12c [ 12.344086] el0t_64_sync+0x164/0x168 We have an extra bit we can use in file_ptr on 64-bit, use that to store whether this file is SCM'ed or not, avoiding the need to look at the file contents itself. This does mean that 32-bit will be stuck with SCM for all registered files, just like 64-bit did before the referenced commit. Fixes: 1f59bc0f ("io_uring: don't scm-account for non af_unix sockets") Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
The ctx argument of io_req_put_rsrc() is not used, kill it. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bb51bf3ff02775b03e6ea21bc79c25d7870d1644.1650311386.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Add a simple helper to encapsulating dropping rsrc nodes references, it's cleaner and will help if we'd change rsrc refcounting or play with percpu_ref_put() [no]inlining. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/63fdd953ac75898734cd50e8f69e95e6664f46fe.1650311386.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
req->fixed_rsrc_refs keeps a pointer to rsrc node pcpu references, but it's more natural just to store rsrc node directly. There were some reasons for that in the past but not anymore. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cee1c86ec9023f3e4f6ce8940d58c017ef8782f4.1650311386.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
All io_assign_file() callers do error handling themselves, req_set_fail() in the io_assign_file()'s fail path needlessly bloats the kernel and is not the best abstraction to have. Simplify the error path. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/eff77fb1eac2b6a90cca5223813e6a396ffedec0.1650311386.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
We have io_ring_submit_[un]lock() functions helping us with conditional ->uring_lock locking, use them in io_file_get_fixed() instead of hand coding. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c9c9ff1e046f6eb68da0a251962a697f8a2275fa.1650311386.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
We have several racy reads, mark them with data_race() to demonstrate this fact. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e56e750d294c70b2a56938bd733386f19f0eb53.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Inline io_req_complete_fail_submit(), there is only one caller and the name doesn't tell us much. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fe5851af01dcd39fc84b71b8539c7cbe4658fb6d.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Remove one extra if for non-linked path of io_submit_sqe(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/03183199d1bf494b4a72eca16d792c8a5945acb4.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Remove the lazy link fail logic from io_submit_sqe() and hide it into a helper. It simplifies the code and will be needed in next patches. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6a68aca9cf4492132da1d7c8a09068b74aba3c65.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Add a macro for all link request flags to avoid duplication. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/df38b883e31e7e0ca4e364d25a0743862961b180.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
io_queue_sqe() is a part of the submission path and we try hard to keep it inlined, so shed some extra bytes from it by moving the error checking part into io_queue_sqe_arm_apoll() and renaming it accordingly. note: io_queue_sqe_arm_apoll() is not inlined, thus the patch doesn't change the number of function calls for the apoll path. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9b79edd246336decfaca79b949a15ac69123490d.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Rename io_queue_async_work(). The name is pretty old but now doesn't reflect well what the function is doing. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5d4b25c54cccf084f9f2fd63bd4e4fa4515e998e.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Inline io_queue_sqe() as there is only one caller left, and rename __io_queue_sqe(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d5742683b7a7caceb1c054e91e5b9135b0f3b858.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
We try to aggresively inline the submission path, so it's a good idea to not pollute it with colder code. One of them is linked timeout preparation + queue, which can be extracted into a function. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ecf74df7ac77389b6d9211211ec4954e91de98ba.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Inline io_free_req() into its only user and remove an underscore prefix from __io_free_req(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ed114edef5c256a644f4839bb372df70d8df8e3f.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
We have several spots where a call to io_fill_cqe_req() is immediately followed by io_put_req_deferred(). Replace them with __io_req_complete_post() and get rid of io_put_req_deferred() and io_fill_cqe_req(). > size ./fs/io_uring.o text data bss dec hex filename 86942 13734 8 100684 1894c ./fs/io_uring.o > size ./fs/io_uring.o text data bss dec hex filename 86438 13654 8 100100 18704 ./fs/io_uring.o Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/10672a538774ac8986bee6468d960527af59169d.1650056133.git.asml.silence@gmail.com [axboe: fold in followup fix] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Get rid of some useless local variables Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7798327b684b7015f7e4300420142ddfcd317297.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
When we meet PF_EXITING in io_poll_check_events(), don't overcomplicate the code with io_poll_mark_cancelled() but just return -ECANCELED and the callers will deal with the rest. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f0cc981af82a5b193658f8f44397eeb3bf838b7b.1650056133.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
io_get_cqe() is expensive because of a bunch of loads, masking, etc. However, most of the time we should have enough of entries in the CQ, so we can cache two pointers representing a range of contiguous CQE memory we can use. When the range is exhausted we'll go through a slower path to set up a new range. When there are no CQEs avaliable, pointers will naturally point to the same address. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/487eeef00f3146537b3d9c1a9cef2fc0b9a86f81.1649771823.git.asml.silence@gmail.com [axboe: santinel -> sentinel] Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Considering all inlining io_submit_sqe() is huge and usually ends up calling some other functions. We decrement @left in io_submit_sqes() just before calling io_submit_sqe() and use it later after the call. Considering how huge io_submit_sqe() is, there is not much hope @left will be treated gracefully by compilers. Decrement it after the call, not only it's easier on register spilling and probably saves stack write/read, but also at least for x64 uses CPU flags set by the dec instead of doing (read/write and tests). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/807f9a276b54ee8ff4e42e2b78721484f1c71743.1649771823.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Pavel Begunkov 提交于
Instead of keeping @Submitted in io_submit_sqes(), which for each iteration requires comparison with the initial number of SQEs, store the number of SQEs left to submit. We'll need nr only for when we're done with SQE handling. note: if we can't allocate a req for the first SQE we always has been returning -EAGAIN to the userspace, save this behaviour by looking into the cache in a slow path. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c3b3df9aeae4c2f7a53fd8386385742e4e261e77.1649771823.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-