Commits · da521626ac620d8719d674a48b8ec3620eefd42a · openeuler / Kernel

24 Aug, 2021 · 40 commits

bio: optimize initialization of a bio · da521626

Authored by Jens Axboe on Aug 11, 2021

The memset() used is measurably slower in targeted benchmarks, wasting
about 1% of the total runtime, or 50% of the (later) hot path cached
bio alloc. Get rid of it and fill in the bio manually.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

da521626
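
Illustration (not part of the commit): a minimal userspace sketch of the idea described above, swapping a blanket memset() for one explicit store per field. `struct demo_bio` and both function names are hypothetical stand-ins and do not mirror the kernel's `struct bio` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bio-like object; NOT the kernel's struct bio. */
struct demo_bio {
    struct demo_bio *next;
    unsigned int opf;
    unsigned short flags;
    unsigned short vcnt;
    unsigned short max_vecs;
    void *io_vec;
};

/* Baseline: clear the whole object, then set the fields that are non-zero. */
static void demo_bio_init_memset(struct demo_bio *bio, void *table,
                                 unsigned short max_vecs)
{
    assert(bio != NULL);
    memset(bio, 0, sizeof(*bio));
    bio->io_vec = table;
    bio->max_vecs = max_vecs;
}

/* Alternative: write every field exactly once, with no memset() call. */
static void demo_bio_init_fields(struct demo_bio *bio, void *table,
                                 unsigned short max_vecs)
{
    assert(bio != NULL);
    bio->next = NULL;
    bio->opf = 0;
    bio->flags = 0;
    bio->vcnt = 0;
    bio->max_vecs = max_vecs;
    bio->io_vec = table;
}
```

Both initialisers leave the named fields in the same state; the manual version has to be kept in sync whenever a field is added, which is the maintenance cost of this approach.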

io_uring: optimise hot path of ltimeout prep · fd08e530

Authored by Pavel Begunkov on Aug 11, 2021

io_prep_linked_timeout() grew too heavy and the compiler now refuses to
inline the function. Help it by splitting it in two and annotating with
inline.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/560636717a32e9513724f09b9ecaace942dde4d4.1628705069.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fd08e530

io_uring: skip request refcounting · 20e60a38

Authored by Pavel Begunkov on Aug 11, 2021

As submission references are gone, there is only one initial reference
left. Instead of actually doing atomic refcounting, add a flag
indicating whether we're going to take more refs or do any other sync
magic. The flag should be set before the request may get used in
parallel.

Together with the previous patch it saves 2 refcount atomics per request
for IOPOLL and IRQ completions, and 1 atomic per req for inline
completions, with some exceptions. In particular, there are currently
three cases in which refcounting has to be enabled:
- Polling, including apoll, because double poll entries take a ref.
  Might get relaxed in the near future.
- Link timeouts, enabled for both the timeout and the request it's
  bound to, because they work in parallel and we need to synchronise
  to cancel one of them on completion.
- When a request gets into io-wq, because it doesn't hold uring_lock and
  we need guarantees of submission references.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8b204b6c5f6643062270a1913d6d3a7f8f795fd9.1628705069.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

20e60a38
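
Illustration (not part of the commit): a minimal sketch of flag-gated refcounting as the message describes it, written with C11 atomics in userspace. The names `demo_req`, `DEMO_REQ_F_REFCOUNT` and the helper functions are assumptions made for this example, not the identifiers used in io_uring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_REQ_F_REFCOUNT 0x1u /* hypothetical flag name */

struct demo_req {
    unsigned int flags;
    atomic_int refs;
};

static void demo_req_init(struct demo_req *req)
{
    req->flags = 0;
    atomic_init(&req->refs, 0);
}

/* Enable real refcounting. Must run before the request can be used in
 * parallel, because the flag itself is a plain (non-atomic) field. */
static void demo_req_set_refcount(struct demo_req *req)
{
    if (!(req->flags & DEMO_REQ_F_REFCOUNT)) {
        req->flags |= DEMO_REQ_F_REFCOUNT;
        atomic_store(&req->refs, 1); /* the single initial reference */
    }
}

/* Take an extra reference; only valid once refcounting is enabled. */
static void demo_req_get(struct demo_req *req)
{
    assert(req->flags & DEMO_REQ_F_REFCOUNT);
    atomic_fetch_add(&req->refs, 1);
}

/* Drop a reference; returns true when the caller held the last one. */
static bool demo_req_put_and_test(struct demo_req *req)
{
    if (!(req->flags & DEMO_REQ_F_REFCOUNT))
        return true; /* sole owner: no atomic operation needed */
    return atomic_fetch_sub(&req->refs, 1) == 1;
}
```

In this sketch a request that never sets the flag is freed on its first put without touching the atomic counter, which corresponds to the saving the message describes for the common completion paths.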

io_uring: remove submission references · 5d5901a3

Authored by Pavel Begunkov on Aug 11, 2021

Requests are by default given two references, submission and
completion. Completion references are straightforward: they represent
request ownership and are put when a request is completed or so.
Submission references are a bit trickier. They're needed when
io_issue_sqe() has gone deep into the submission stack (e.g. in fs,
block, drivers, etc.), the request may have been given away for
concurrent execution or already completed, and the code unwinding back
to io_issue_sqe() may be accessing some pieces of our requests, e.g.
file or iov.

Now, we prevent such async/in-depth completions by pushing requests
through task_work. Punting to io-wq is also done through task_works,
apart from a couple of cases with a pretty well known context. So,
there are two cases:
1) io_issue_sqe() from the task context and protected by ->uring_lock.
Requests are either returned to io_uring or handed to task_work, which
won't be executed because we're currently controlling that task. So,
we can be sure that requests stay alive all the time and we don't
need submission references to pin them.

2) io_issue_sqe() from io-wq, which doesn't hold the mutex. The role of
the submission reference is played by the io-wq reference, which is put
by io_wq_submit_work(). Hence, it should be fine.

Considering that, we can carefully kill the submission reference.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6b68f1c763229a590f2a27148aee77767a8d7750.1628705069.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5d5901a3

io_uring: remove req_ref_sub_and_test() · 91c2f697

Authored by Pavel Begunkov on Aug 11, 2021

Soon, we won't need to put several references at once, so remove
req_ref_sub_and_test() and the @nr argument from io_put_req_deferred(),
and put the rest of the references by hand.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1868c7554108bff9194fb5757e77be23fadf7fc0.1628705069.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

91c2f697

io_uring: move req_ref_get() and friends · 21c843d5

Authored by Pavel Begunkov on Aug 11, 2021

Move all request refcount helpers to avoid forward declarations in the
future.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/89fd36f6f3fe5b733dfe4546c24725eee40df605.1628705069.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

21c843d5

io_uring: remove IRQ aspect of io_ring_ctx completion lock · 79ebeaee

Authored by Jens Axboe on Aug 10, 2021

We have no hard/soft IRQ users of this lock left, so remove any IRQ
disabling/saving and restoring when grabbing this lock.

This is straightforward with no users entering with IRQs disabled
anymore; the only thing to look out for is the waitqueue poll head
lock, which nests inside the completion lock. That needs IRQs disabled,
and hence we have to do that now instead of relying on the outer lock
doing so.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

79ebeaee

io_uring: run regular file completions from task_work · 8ef12efe

Authored by Jens Axboe on Aug 10, 2021

This is in preparation for making the completion lock work outside of
hard/soft IRQ context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8ef12efe

io_uring: run linked timeouts from task_work · 89b263f6

Authored by Jens Axboe on Aug 10, 2021

This is in preparation for making the completion lock work outside of
hard/soft IRQ context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

89b263f6

io_uring: run timeouts from task_work · 89850fce

Authored by Jens Axboe on Aug 10, 2021

This is in preparation for making the completion lock work outside of
hard/soft IRQ context.

Add a timeout_lock to handle the ordering of timeout completions or
cancellations with the timeouts actually triggering.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

89850fce

io_uring: remove file batch-get optimisation · 62906e89

Authored by Pavel Begunkov on Aug 10, 2021

For requests with non-fixed files, instead of grabbing just one
reference, we grab as many as there are requests left, so the following
requests using the same file can take it without atomics.

However, it's not all wins. If there is one request in the middle
not using files or having a fixed file, we'll need to put back the
remaining references. Even worse, if an application submits requests
dealing with different files, it will do a put for each new request,
doubling the number of atomics needed. Also, even if not used, it still
takes some cycles in the submission path.

If a file is used many times, it makes more sense to pre-register it;
if not, we may fall into the described pitfall. So, this optimisation is
a matter of use case. Go with the simplest code-wise way and remove it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

62906e89

io_uring: clean up tctx_task_work() · 6294f368

Authored by Pavel Begunkov on Aug 10, 2021

After recent fixes, tctx_task_work() always does proper spinlocking
before looking into ->task_list, so now we don't need atomics for
->task_state; replace it with a non-atomic task_running protected by the
critical section.

Tidy it up, combine the two separate blocks with spinlocking, and always
try to splice in there, so we do less locking when new requests are
arriving during the function execution.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: fix missing ->task_running reset on task_work_add() failure]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6294f368

io_uring: inline io_poll_remove_waitqs · 5d709043

Authored by Pavel Begunkov on Aug 09, 2021

Inline io_poll_remove_waitqs() into its only user and clean it up.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2f1a91a19ffcd591531dc4c61e2f11c64a2d6a6d.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5d709043

io_uring: remove extra argument for overflow flush · 90f67366

Authored by Pavel Begunkov on Aug 09, 2021

Unlike __io_cqring_overflow_flush(), nobody does forced flushing with
io_cqring_overflow_flush(), so remove the argument from it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7594f869ca41b7cfb5a35a3c7c2d402242834e9e.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

90f67366

io_uring: inline struct io_comp_state · cd0ca2e0

Authored by Pavel Begunkov on Aug 09, 2021

Inline struct io_comp_state into struct io_submit_state. They are
already tightly coupled, and with their mixed responsibilities, keeping
them separate only brings confusion.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e55bba77426b399e3a2e54e3c6c267c6a0fc4b57.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cd0ca2e0

io_uring: use inflight_entry instead of compl.list · bb943b82

Authored by Pavel Begunkov on Aug 09, 2021

req->compl.list is used to cache freed requests, and so can't overlap in
time with req->inflight_entry. So, use inflight_entry to link requests
and remove compl.list.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e430e79d22d70a190d718831bda7bfed1daf8976.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bb943b82

io_uring: remove redundant args from cache_free · 7255834e

Authored by Pavel Begunkov on Aug 09, 2021

We don't use the @tsk argument of io_req_cache_free(), so remove it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6a28b4a58ee0aaf0db98e2179b9c9f06f9b0cca1.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7255834e

io_uring: cache __io_free_req()'d requests · c34b025f

Authored by Pavel Begunkov on Aug 09, 2021

Don't kfree requests in __io_free_req() but put them back into the
internal request cache. That makes allocations more sustainable and will
be used for refcounting optimisations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9f4950fbe7771c8d41799366d0a3a08ac3040236.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c34b025f
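
Illustration (not part of the commit): a minimal userspace sketch of recycling freed objects through a per-context free list instead of returning them to the allocator. `demo_creq`, `demo_cache` and the helpers are hypothetical names; malloc()/free() stand in for the kernel allocator, and the sketch assumes single-threaded use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_creq {
    struct demo_creq *next_free; /* link used only while cached */
    int payload;
};

struct demo_cache {
    struct demo_creq *free_list;
    unsigned int nr_cached;
};

/* Prefer a cached request; fall back to the allocator when empty. */
static struct demo_creq *demo_creq_alloc(struct demo_cache *c)
{
    struct demo_creq *req;

    assert(c != NULL);
    req = c->free_list;
    if (req) {
        c->free_list = req->next_free;
        c->nr_cached--;
        return req;
    }
    return malloc(sizeof(*req));
}

/* "Free" by pushing the request back onto the cache. */
static void demo_creq_free(struct demo_cache *c, struct demo_creq *req)
{
    assert(c != NULL && req != NULL);
    req->next_free = c->free_list;
    c->free_list = req;
    c->nr_cached++;
}

/* Release everything still cached, e.g. on context teardown. */
static void demo_cache_destroy(struct demo_cache *c)
{
    while (c->free_list) {
        struct demo_creq *req = c->free_list;

        c->free_list = req->next_free;
        free(req);
    }
    c->nr_cached = 0;
}
```

A real cache would normally also cap `nr_cached` so that a burst of frees does not pin memory indefinitely; that limit is left out here to keep the sketch short.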

io_uring: move io_fallback_req_func() · f56165e6

Authored by Pavel Begunkov on Aug 09, 2021

Move io_fallback_req_func() to kill yet another forward declaration.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d0a8f9d9a0057ed761d6237167d51c9378798d2d.1628536684.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f56165e6

io_uring: optimise putting task struct · e9dbe221

Authored by Pavel Begunkov on Aug 09, 2021

We cache all the references to task + tctx, so if io_put_task() is
called by the corresponding task itself, we can save on atomics and
return the refs right back into the cache.

It's beneficial for all inline completions, and also iopolling, when
polling and submissions are done by the same task, including
SQPOLL|IOPOLL.

Note: io_uring_cancel_generic() can return refs to the cache as well,
so those should be flushed in the loop for tctx_inflight() to work
right.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6fe9646b3cb70e46aca1f58426776e368c8926b3.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e9dbe221

io_uring: drop exec checks from io_req_task_submit · af066f31

Authored by Pavel Begunkov on Aug 09, 2021

In case of on-exec io_uring cancellations, tasks already wait for all
submitted requests to get completed/cancelled, so we don't need to check
for ->in_execve separately.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/be8707049f10df9d20ca03dc4ca3316239b5e8e0.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

af066f31

io_uring: kill unused IO_IOPOLL_BATCH · bbbca094

Authored by Pavel Begunkov on Aug 09, 2021

IO_IOPOLL_BATCH is not used, delete it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b2bdf19dbee2c9fc8865bbab9412135a14e24a64.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bbbca094

io_uring: improve ctx hang handling · 58d3be2c

Authored by Pavel Begunkov on Aug 09, 2021

If io_ring_exit_work() can't get it done in 5 minutes, something is
going very wrong; don't keep spinning at the HZ / 20 rate, as it doesn't
help and may take a lot of CPU time if many workers are stuck like
that.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9e2d1ca81d569f6bc628af1a42ff6663bff7ce9c.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

58d3be2c
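
Illustration (not part of the commit): a minimal sketch of choosing a retry interval that backs off once a deadline has passed. Only the 5-minute threshold and the HZ / 20 fast rate come from the message; the function name and the `slow_interval` parameter are assumptions, since the message does not say what the slower rate is.

```c
#include <assert.h>

/*
 * Return how many ticks to wait before the next retry.
 * elapsed:       ticks since the exit work started
 * hz:            ticks per second
 * slow_interval: hypothetical back-off interval used after the deadline
 */
static unsigned long demo_exit_poll_interval(unsigned long elapsed,
                                             unsigned long hz,
                                             unsigned long slow_interval)
{
    assert(hz >= 20); /* keeps the fast interval non-zero */

    if (elapsed < 5UL * 60UL * hz)
        return hz / 20;   /* fast polling during the first 5 minutes */
    return slow_interval; /* something is stuck: poll rarely */
}
```

With `hz = 100`, for example, the function returns 5 ticks until 30000 ticks have elapsed and the caller-supplied slow interval afterwards.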

io_uring: deduplicate open iopoll check · d3fddf6d

Authored by Pavel Begunkov on Aug 09, 2021

Move the IORING_SETUP_IOPOLL check into __io_openat_prep(), so both
openat and openat2 reuse it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9a73ce83e4ee60d011180ef177eecef8e87ff2a2.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d3fddf6d

io_uring: inline io_free_req_deferred · 543af3a1

Authored by Pavel Begunkov on Aug 09, 2021

Inline io_free_req_deferred(), there is no reason to keep it separated.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ce04b7180d4eac0d69dd00677b227eefe80c2cc5.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

543af3a1

io_uring: move io_rsrc_node_alloc() definition · b9bd2bea

Authored by Pavel Begunkov on Aug 09, 2021

Move the function together with io_rsrc_node_ref_zero() in the source
file as it is to get rid of forward declarations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4d81f6f833e7d017860b24463a9a68b14a8a5ed2.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b9bd2bea

io_uring: move io_put_task() definition · 6a290a14

Authored by Pavel Begunkov on Aug 09, 2021

Move the function in the source file as it is to get rid of forward
declarations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/33d917d69e4206557c75a5b98fe22bcdf77ce47d.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6a290a14

io_uring: extract a helper for ctx quiesce · e73c5c7c

Authored by Pavel Begunkov on Aug 09, 2021

Refactor __io_uring_register() by extracting a helper responsible for
ctx quiesce. Looks better and will make it easier to add more
optimisations.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0339e0027504176be09237eefa7945bf9a6f153d.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e73c5c7c

io_uring: optimise io_cqring_wait() hot path · 90291099

Authored by Pavel Begunkov on Aug 09, 2021

Turns out we always init struct io_wait_queue in io_cqring_wait(), even
if it's not used afterwards, i.e. there are already enough CQEs. And
often that's exactly what happens: for instance, requests may have been
completed inline, or in case of io_uring_enter(submit=N, wait=1).

It shows up in my profiler, so optimise it by delaying the struct init.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6f1b81c60b947d165583dc333947869c3d85d037.1628471125.git.asml.silence@gmail.com
[axboe: fixed up for new cqring wait]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

90291099

io_uring: add more locking annotations for submit · 282cdc86

Authored by Pavel Begunkov on Aug 09, 2021

Add more annotations for submission path functions holding ->uring_lock.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/128ec4185e26fbd661dd3a424aa66108ee8ff951.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

282cdc86

io_uring: don't halt iopoll too early · a2416e1e

Authored by Pavel Begunkov on Aug 09, 2021

IOPOLL users should care more about getting completions for requests
they submitted than about "the device did/completed something".
Currently, io_do_iopoll() may return a positive number, which will
instruct io_iopoll_check() to break the loop and end the syscall, even
if there are not enough CQEs or none at all.

Don't return positive numbers, so io_iopoll_check() exits only when it
gets an actual error, needs to reschedule, or has got enough CQEs.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/641a88f751623b6758303b3171f0a4141f06726e.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a2416e1e

io_uring: refactor io_alloc_req · 864ea921

Authored by Pavel Begunkov on Aug 09, 2021

Replace the main if of io_flush_cached_reqs() with inverted condition +
goto, so all the cases are handled in the same way. And also extract
io_preinit_req() to make it cleaner and easier to refer to.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1abcba1f7b55dc53bf1dbe95036e345ffb1d5b01.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

864ea921

io-wq: improve wq_list_add_tail() · 8724dd8c

Authored by Pavel Begunkov on Aug 09, 2021

Prepare nodes that we're going to add before actually linking them; it's
always safer and costs us nothing.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f7e53f0c84c02ed6748c488ed0789b98f8cc6185.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8724dd8c
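
Illustration (not part of the commit): a minimal sketch of a singly linked tail insert that finishes initialising the new node before making it reachable from the list. `demo_node`, `demo_list` and `demo_list_add_tail()` are hypothetical names modelled loosely on the description above, not the io-wq definitions.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
};

struct demo_list {
    struct demo_node *first;
    struct demo_node *last;
};

static void demo_list_add_tail(struct demo_node *node, struct demo_list *list)
{
    assert(node != NULL && list != NULL);

    /* Prepare the node first: it must terminate the list before any
     * list pointer refers to it. */
    node->next = NULL;

    if (!list->first) {
        list->first = node;
        list->last = node;
    } else {
        list->last->next = node;
        list->last = node;
    }
}
```

Ordering the stores this way means a reader walking the list never follows a stale `next` pointer left over from the node's previous use.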

io_uring: remove unnecessary PF_EXITING check · 2215bed9

Authored by Pavel Begunkov on Aug 09, 2021

We prefer normal task_works even if it would fail requests inside. Kill
a PF_EXITING check in io_req_task_work_add(); task_work_add() handles
dying tasks well, i.e. it returns an error when it can't enqueue due to
late stages of do_exit().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fc14297e8441cd8f5d1743a2488cf0df09bf48ac.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2215bed9

io_uring: clean io-wq callbacks · ebc11b6c

Authored by Pavel Begunkov on Aug 09, 2021

Move io-wq callbacks closer to each other, so it's easier to work with
them, and rename io_free_work() into io_wq_free_work() for consistency.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/851bbc7f0f86f206d8c1333efee8bcb9c26e419f.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ebc11b6c

io_uring: avoid touching inode in rw prep · c97d8a0f

Authored by Pavel Begunkov on Aug 09, 2021

If we use fixed files, we can be sure (almost) that REQ_F_ISREG is set.
However, for non-reg files io_prep_rw() will still look into the inode
to double check, and that's expensive and can be avoided.

The only caveat is that it currently only works with 64+ bit
architectures, see FFS_ISREG, so we should consider that.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0a62780c491ca2522cd52db4ae3f16e03aafed0f.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c97d8a0f

io_uring: rename io_file_supports_async() · b191e2df

Authored by Pavel Begunkov on Aug 09, 2021

io_file_supports_async() checks whether a file supports nowait
operations, so "async" in the name is misleading. Rename it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/33d55b5ce43aa1884c637c1957f1e30d30dc3bec.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b191e2df

io_uring: inline fixed part of io_file_get() · ac177053

Authored by Pavel Begunkov on Aug 09, 2021

Optimise io_file_get() with registered files, which is in a hot path,
by inlining parts of the function. Saves a function call and the
inefficiencies of passing arguments, e.g. evaluating
(sqe_flags & IOSQE_FIXED_FILE).

It couldn't have been done before as compilers were refusing to inline
it because of the function size.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/52115cd6ce28f33bd0923149c0e6cb611084a0b1.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ac177053

io_uring: use kvmalloc for fixed files · 042b0d85

Authored by Pavel Begunkov on Aug 09, 2021

Instead of hand-coded two-level tables for registered files, allocate
them with kvmalloc(). In many cases the tables are small enough to be
kmalloc()'ed, removing an extra memory load and a bunch of bit
logic instructions from the hot path. If the table is larger, we trade
off all the pros with a TLB-assisted memory lookup.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/280421d3b48775dabab773006bb5588c7b2dabc0.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

042b0d85
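
Illustration (not part of the commit): a minimal sketch contrasting a two-level table lookup with a flat array lookup, to show which work disappears from the hot path. The chunk size, macro names and function names are assumptions chosen for this example and do not reflect the io_uring definitions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TABLE_SHIFT 3u /* hypothetical: 8 slots per chunk */
#define DEMO_TABLE_SIZE  (1u << DEMO_TABLE_SHIFT)
#define DEMO_TABLE_MASK  (DEMO_TABLE_SIZE - 1u)

/* Two-level lookup: a shift, a mask and two dependent loads. */
static void *demo_lookup_two_level(void ***chunks, unsigned int idx,
                                   unsigned int nr)
{
    void **chunk;

    assert(idx < nr);
    chunk = chunks[idx >> DEMO_TABLE_SHIFT]; /* first load: pick chunk */
    return chunk[idx & DEMO_TABLE_MASK];     /* second load: pick slot */
}

/* Flat lookup: one indexed load from a single contiguous array. */
static void *demo_lookup_flat(void **slots, unsigned int idx,
                              unsigned int nr)
{
    assert(idx < nr);
    return slots[idx];
}
```

In the kernel the contiguous array would come from kvmalloc(), which, as the message notes, may fall back to a vmalloc-style mapping for large tables; the sketch only covers the lookup side.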

io_uring: be smarter about waking multiple CQ ring waiters · 5fd46178

Authored by Jens Axboe on Aug 06, 2021

Currently we only wake the first waiter, even if we have enough entries
posted to satisfy multiple waiters. Improve that situation so that
every waiter knows how much the CQ tail has to advance before they can
be safely woken up.

With this change, if we have N waiters each asking for 1 event and we get
4 completions, then we wake up 4 waiters. If we have N waiters asking
for 2 completions and we get 4 completions, then we wake up the first
two. Previously, only the first waiter would've been woken up.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5fd46178
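
Illustration (not part of the commit): a minimal sketch of a wake-up pass in which every waiter records the CQ tail value it needs and the waker stops at the first waiter that is not yet satisfied. The struct, the function names and the way targets are assigned are assumptions for this example; the message does not spell out how the kernel derives each waiter's target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_waiter {
    unsigned int target_tail; /* CQ tail value this waiter needs */
    bool woken;
};

/* Wrap-tolerant "tail has reached the target" test on 32-bit counters
 * (relies on the usual two's-complement conversion to int). */
static bool demo_should_wake(unsigned int tail, const struct demo_waiter *w)
{
    return (int)(tail - w->target_tail) >= 0;
}

/* Walk the queue in FIFO order; wake each satisfied waiter and stop at
 * the first one whose target has not been reached. Returns the number
 * of waiters woken by this call. */
static size_t demo_wake_waiters(struct demo_waiter *q, size_t n,
                                unsigned int tail)
{
    size_t woken = 0;

    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (q[i].woken)
            continue;
        if (!demo_should_wake(tail, &q[i]))
            break;
        q[i].woken = true;
        woken++;
    }
    return woken;
}
```

If the caller assigns targets 1, 2, 3, 4, 5 and the tail reaches 4, four waiters are woken; with targets 2, 4, 6 and a tail of 4, the first two are woken, matching the two scenarios in the message under that assignment.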

openeuler / Kernel: synced successfully about 1 year ago
