Commits · b3659a65be70eb68d9fc9802c4ce81e0f943abfd · openeuler / Kernel

25 Jul, 2022 · 40 commits

io_uring: change ->cqe_cached invariant for CQE32 · b3659a65

Committed by Pavel Begunkov on Jun 17, 2022

With IORING_SETUP_CQE32, ->cqe_cached doesn't store a real address but
rather an implicit offset into cqes. Store the real cqe pointer and
increment it accordingly if CQE32.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b3659a65
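
The pointer stepping described in this commit can be sketched as follows. This is a simplified illustration, not the kernel code: `struct cqe`, `struct ring`, `ring_init()` and `ring_get_cqe()` are hypothetical stand-ins, and the sketch assumes a CQE32 completion occupies two consecutive 16-byte slots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a 16-byte completion slot. */
struct cqe {
	uint64_t user_data;
	int32_t  res;
	uint32_t flags;
};

struct ring {
	struct cqe *cqe_cached;   /* real address of the next free slot */
	struct cqe *cqe_sentinel; /* one past the last usable slot */
	bool cqe32;               /* true if a completion takes two slots */
};

static void ring_init(struct ring *r, struct cqe *cqes, size_t nr_slots,
		      bool cqe32)
{
	r->cqe_cached = cqes;
	r->cqe_sentinel = cqes + nr_slots;
	r->cqe32 = cqe32;
}

/* Return the next free CQE, or NULL once the cached range is used up. */
static struct cqe *ring_get_cqe(struct ring *r)
{
	struct cqe *cqe = r->cqe_cached;

	if (cqe >= r->cqe_sentinel)
		return NULL;
	/* The cached value is a real pointer, so CQE32 simply steps by 2. */
	r->cqe_cached += r->cqe32 ? 2 : 1;
	return cqe;
}
```

Because the cached value is a plain pointer, the caller can write through the returned address directly, with no offset-to-address conversion on the hot path.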

io_uring: deduplicate io_get_cqe() calls · e8c328c3

Committed by Pavel Begunkov on Jun 17, 2022

Deduplicate calls to io_get_cqe() from __io_fill_cqe_req().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4fa077986cc3abab7c59ff4e7c390c783885465f.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e8c328c3

io_uring: deduplicate __io_fill_cqe_req tracing · ae5735c6

Committed by Pavel Begunkov on Jun 17, 2022

Deduplicate two trace_io_uring_complete() calls in __io_fill_cqe_req().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/277ed85dba5189ab7d932164b314013a0f0b0fdc.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ae5735c6

io_uring: introduce io_req_cqe_overflow() · 68494a65

Committed by Pavel Begunkov on Jun 17, 2022

__io_fill_cqe_req() is hot and inlined, so we want it to be as small as
possible. Add io_req_cqe_overflow(), which accepts only a request and does
all overflow accounting, and use it to replace two calls to the six-argument
io_cqring_event_overflow().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

68494a65
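
A minimal sketch of the idea, assuming the wide function takes a context plus five completion fields: a one-argument helper unpacks the request so that each inlined call site passes a single pointer. All names and fields here are hypothetical; only the one-argument versus six-argument shape comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical context that only counts overflowed completions. */
struct ctx {
	unsigned overflow_count;
	uint64_t last_user_data;
};

/* Hypothetical request carrying everything the slow path needs. */
struct req {
	struct ctx *ctx;
	uint64_t user_data;
	int32_t  res;
	uint32_t cflags;
	uint64_t extra1;
	uint64_t extra2;
};

/* Slow path with a wide, six-argument signature. */
static bool event_overflow(struct ctx *ctx, uint64_t user_data, int32_t res,
			   uint32_t cflags, uint64_t extra1, uint64_t extra2)
{
	(void)res; (void)cflags; (void)extra1; (void)extra2;
	ctx->overflow_count++;
	ctx->last_user_data = user_data;
	return true;
}

/* Thin helper: hot callers only have to pass the request pointer. */
static bool req_cqe_overflow(struct req *req)
{
	return event_overflow(req->ctx, req->user_data, req->res,
			      req->cflags, req->extra1, req->extra2);
}
```

The argument marshalling then lives in one out-of-line place instead of being duplicated at every inlined call site.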

io_uring: don't inline __io_get_cqe() · faf88dde

Committed by Pavel Begunkov on Jun 17, 2022

__io_get_cqe() is not as hot as io_get_cqe(), so there is no need to inline
it. Not inlining it sheds ~500B from the binary.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

faf88dde

io_uring: don't expose io_fill_cqe_aux() · d245bca6

Committed by Pavel Begunkov on Jun 17, 2022

Deduplicate some code and add a helper for filling an aux CQE, locking
and notification.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d245bca6

io_uring: kbuf: add comments for some tricky code · f09c8643

Committed by Hao Xu on Jun 17, 2022

Add comments to explain why the uring lock is always held when
incrementing head in __io_kbuf_recycle(). Also correct one comment about
kbuf consumption in the iowq case.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
Link: https://lore.kernel.org/r/20220617050429.94293-1-hao.xu@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f09c8643

io_uring: mutex locked poll hashing · 9ca9fb24

Committed by Pavel Begunkov on Jun 16, 2022

Currently we do two extra spin lock/unlock pairs to add a poll/apoll
request to the cancellation hash table and remove it from there.

On the submission side we often already hold ->uring_lock and tw
completion is likely to hold it as well. Add a second cancellation hash
table protected by ->uring_lock. Because of latency concerns from needing
the mutex locked on the completion side, use the new table only in the
following cases:

1) IORING_SETUP_SINGLE_ISSUER: only one task grabs uring_lock, so there
is little to no contention and the main tw handler will almost
always end up grabbing it before calling callbacks.

2) IORING_SETUP_SQPOLL: same as with single issuer, only one task is
a major user of ->uring_lock.

3) apoll: we normally grab the lock on the completion side anyway to
execute the request, so it's free.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1bbad9c78c454b7b92f100bbf46730a37df7194f.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9ca9fb24
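
The three cases listed in this commit amount to a small decision function. The sketch below is an illustration only: the enum, the flag macros (with made-up bit values, deliberately not named after the real IORING_SETUP_* constants) and `pick_poll_table()` are hypothetical.

```c
#include <stdbool.h>

/* Which cancellation hash table a poll request is placed in. */
enum poll_table {
	POLL_TABLE_SPINLOCKED,   /* original table, per-bucket spinlocks */
	POLL_TABLE_MUTEX_LOCKED, /* second table, protected by the ring mutex */
};

/* Hypothetical setup flag bits used only by this sketch. */
#define SKETCH_SETUP_SQPOLL        (1u << 0)
#define SKETCH_SETUP_SINGLE_ISSUER (1u << 1)

static enum poll_table pick_poll_table(unsigned setup_flags, bool is_apoll)
{
	/* Case 3: apoll takes the mutex on completion anyway. */
	if (is_apoll)
		return POLL_TABLE_MUTEX_LOCKED;
	/* Cases 1 and 2: a single task is the main user of the mutex. */
	if (setup_flags & (SKETCH_SETUP_SQPOLL | SKETCH_SETUP_SINGLE_ISSUER))
		return POLL_TABLE_MUTEX_LOCKED;
	/* Otherwise keep the spinlocked table to avoid mutex latency. */
	return POLL_TABLE_SPINLOCKED;
}
```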

io_uring: propagate locking state to poll cancel · 5d7943d9

Committed by Pavel Begunkov on Jun 16, 2022

Poll cancellation will soon need to grab ->uring_lock inside, so pass
the locking state, i.e. issue_flags, into the cancellation functions.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b86781d047727c07163443b57551a3fa57c7c5e1.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5d7943d9

io_uring: introduce a struct for hash table · e6f89be6

Committed by Pavel Begunkov on Jun 16, 2022

Instead of passing around a pointer to hash buckets, add a bit of type
safety and wrap it into a structure.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d65bc3faba537ec2aca9eabf334394936d44bd28.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e6f89be6

io_uring: pass hash table into poll_find · a2cdd519

Committed by Pavel Begunkov on Jun 16, 2022

In preparation for having multiple cancellation hash tables, pass a
table pointer into io_poll_find() and other poll cancel functions.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a31c88502463dce09254240fa037352927d7ecc3.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a2cdd519

io_uring: add IORING_SETUP_SINGLE_ISSUER · 97bbdc06

Committed by Pavel Begunkov on Jun 16, 2022

Add a new IORING_SETUP_SINGLE_ISSUER flag and the userspace visible part
of it, i.e. put limitations on submitters. Also, don't allow it together
with IOPOLL as we're not going to put it to good use.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4bcc41ee467fdf04c8aab8baf6ce3ba21858c3d4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

97bbdc06

io_uring: use state completion infra for poll reqs · 0ec6dca2

Committed by Pavel Begunkov on Jun 16, 2022

Use io_req_task_complete() for poll request completions, so it can
utilise state completions and save lots of unnecessary locking.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ced94cb5a728d8e386c640d052fd3da3f5d6891a.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0ec6dca2

io_uring: clean up io_ring_ctx_alloc · 8b1dfd34

Committed by Pavel Begunkov on Jun 16, 2022

Add a variable for the number of hash buckets in io_ring_ctx_alloc(),
which makes it more readable.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/993926ed0d614ba9a76b2a85bebae2babcb13983.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8b1dfd34

io_uring: limit the number of cancellation buckets · 4a07723f

Committed by Pavel Begunkov on Jun 16, 2022

Don't allocate too many hash/cancellation buckets; clamp the count to
8 bits, i.e. 256 * 64B = 16KB. We don't usually have too many requests,
and 256 buckets should be enough, especially since we do hash search
only in the cancellation path.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b9620c8072ba61a2d50eba894b89bd93a94a9abd.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4a07723f
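
The arithmetic in this commit message can be checked with a small sketch. The 8-bit cap and the 64-byte bucket size are taken from the message; the helper names and the lower bound of 1 bit are assumptions made for illustration.

```c
#include <stddef.h>

#define BUCKET_BYTES  64u /* per-bucket size quoted in the commit message */
#define MAX_HASH_BITS 8u  /* cap quoted in the commit message */
#define MIN_HASH_BITS 1u  /* assumed lower bound, not from the message */

/* Clamp a requested hash width to [MIN_HASH_BITS, MAX_HASH_BITS]. */
static unsigned clamp_hash_bits(unsigned requested_bits)
{
	if (requested_bits < MIN_HASH_BITS)
		return MIN_HASH_BITS;
	if (requested_bits > MAX_HASH_BITS)
		return MAX_HASH_BITS;
	return requested_bits;
}

/* Memory used by a table with 2^hash_bits buckets. */
static size_t table_bytes(unsigned hash_bits)
{
	return ((size_t)1 << hash_bits) * BUCKET_BYTES;
}
```

With the cap applied, any larger request ends up at 2^8 = 256 buckets, i.e. 256 * 64 = 16384 bytes.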

io_uring: clean up io_try_cancel · 4dfab8ab

Committed by Pavel Begunkov on Jun 16, 2022

Get rid of an unnecessary extra goto in io_try_cancel() and simplify the
function.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/48cf5417b43a8386c6c364dba1ad9b4c7382d158.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4dfab8ab

io_uring: pass poll_find lock back · 1ab1edb0

Committed by Pavel Begunkov on Jun 16, 2022

Instead of using implicit knowledge of what is locked or not after
io_poll_find() and co return, pass back a pointer to the locked
bucket, if any. If it is set, the caller must unlock the spinlock.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dae1dc5749aa34367812ecf62f82fd3f053aae44.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1ab1edb0
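
The calling convention described here, where a lookup hands the locked bucket back to its caller, can be sketched in userspace C11. Everything below is hypothetical: the types, the four-bucket table, and the `atomic_flag` spinlock stand in for the kernel's hash buckets and spinlocks.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BUCKETS 4u

struct node {
	uint64_t key;
	struct node *next;
};

/* Each bucket carries its own lock. */
struct bucket {
	atomic_flag lock;
	struct node *head;
};

struct table {
	struct bucket buckets[NR_BUCKETS];
};

static void table_init(struct table *t)
{
	for (unsigned i = 0; i < NR_BUCKETS; i++) {
		atomic_flag_clear(&t->buckets[i].lock);
		t->buckets[i].head = NULL;
	}
}

static void bucket_lock(struct bucket *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void bucket_unlock(struct bucket *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

static void table_insert(struct table *t, struct node *n)
{
	struct bucket *b = &t->buckets[n->key % NR_BUCKETS];

	bucket_lock(b);
	n->next = b->head;
	b->head = n;
	bucket_unlock(b);
}

/*
 * On a hit, return the node with its bucket still locked and report that
 * bucket through *out_bucket; the caller must unlock it. On a miss, the
 * bucket is unlocked here and *out_bucket is set to NULL.
 */
static struct node *table_find(struct table *t, uint64_t key,
			       struct bucket **out_bucket)
{
	struct bucket *b = &t->buckets[key % NR_BUCKETS];

	*out_bucket = NULL;
	bucket_lock(b);
	for (struct node *n = b->head; n; n = n->next) {
		if (n->key == key) {
			*out_bucket = b;
			return n;
		}
	}
	bucket_unlock(b);
	return NULL;
}
```

A caller checks the out-parameter rather than relying on convention: if `*out_bucket` is non-NULL after the call, it calls `bucket_unlock()` on it once it has finished with the node.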

io_uring: switch cancel_hash to use per entry spinlock · 38513c46

Committed by Hao Xu on Jun 16, 2022

Add a new io_hash_bucket structure so that each bucket in cancel_hash
has a separate spinlock. Use a per-entry lock for cancel_hash; this removes
some completion lock invocations and removes contention between different
cancel_hash entries.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/05d1e135b0c8bce9d1441e6346776589e5783e26.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

38513c46

io_uring: poll: remove unnecessary req->ref set · 3654ab0c

Committed by Hao Xu on Jun 16, 2022

We now don't need to set req->refcount for poll requests since the
reworked poll code ensures no request release race.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ec6fee45705890bdb968b0c175519242753c0215.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3654ab0c

io_uring: don't inline io_put_kbuf · 53ccf69b

Committed by Pavel Begunkov on Jun 16, 2022

io_put_kbuf() is huge, don't bloat the kernel with inlining.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2e21ccf0be471ffa654032914b9430813cae53f8.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

53ccf69b

io_uring: refactor io_req_task_complete() · 7012c815

Committed by Pavel Begunkov on Jun 16, 2022

Clean up io_req_task_complete() and deduplicate io_put_kbuf() calls.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ae3148ac7eb5cce3e06895cde306e9e959d6f6ae.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7012c815

io_uring: kill REQ_F_COMPLETE_INLINE · 75d7b3ae

Committed by Pavel Begunkov on Jun 16, 2022

REQ_F_COMPLETE_INLINE is only needed to delay queueing into the
completion list to io_queue_sqe() as __io_req_complete() is inlined and
we don't want to bloat the kernel.

As we now complete in a more centralised fashion in io_issue_sqe(), we
can get rid of the flag and queue to the list directly.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

75d7b3ae

io_uring: rw: delegate sync completions to core io_uring · df9830d8

Committed by Pavel Begunkov on Jun 16, 2022

io_issue_sqe() from the io_uring core knows how to complete requests
based on the returned error code, so we can delegate io_read()/io_write()
completion to it. Make kiocb_done() return the right completion
code and propagate it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/32ef005b45d23bf6b5e6837740dc0331bb051bd4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <howeyxu@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

df9830d8

io_uring: remove unused IO_REQ_CACHE_SIZE defined · bb8f8700

Committed by Jens Axboe on Jun 15, 2022

Signed-off-by: Jens Axboe <axboe@kernel.dk>

bb8f8700

io_uring: don't set REQ_F_COMPLETE_INLINE in tw · c65f5279

Committed by Pavel Begunkov on Jun 15, 2022

io_req_task_complete() enqueues requests for state completion itself, so
there is no need for REQ_F_COMPLETE_INLINE, which only serves the purpose
of not bloating the kernel.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/aca80f71464ad02c06f1311d998a2d6ee0b31573.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c65f5279

io_uring: remove check_cq checking from hot paths · 3a08576b

Committed by Pavel Begunkov on Jun 15, 2022

All ctx->check_cq events are slow path; don't test every single flag one
by one in the hot path, but add a common guarding if.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dff026585cea7ff3a172a7c83894a3b0111bbf6a.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3a08576b
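
The "common guarding if" pattern can be sketched as below: the hot path pays for one test of the whole word, and the individual bits are examined only when that word is non-zero. The flag names, bit values, struct and return codes are hypothetical.

```c
/* Hypothetical slow-path event bits kept in a single word. */
#define CHECK_CQ_OVERFLOW (1u << 0)
#define CHECK_CQ_DROPPED  (1u << 1)

struct ctx {
	unsigned check_cq;         /* zero in the common case */
	unsigned overflow_flushes; /* counts handled overflow events */
};

/* Returns 0 normally, -1 if a dropped-completion event is pending. */
static int check_cq_events(struct ctx *ctx)
{
	/* One branch guards all of the rarely taken checks. */
	if (ctx->check_cq) {
		if (ctx->check_cq & CHECK_CQ_DROPPED)
			return -1;
		if (ctx->check_cq & CHECK_CQ_OVERFLOW) {
			ctx->overflow_flushes++;
			ctx->check_cq &= ~CHECK_CQ_OVERFLOW;
		}
	}
	return 0;
}
```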

io_uring: never defer-complete multi-apoll · aeaa72c6

Committed by Pavel Begunkov on Jun 15, 2022

Luckily, nobody completes multi-apoll requests outside the polling
functions, but don't set IO_URING_F_COMPLETE_DEFER in any case, as
there is nobody catching REQ_F_COMPLETE_INLINE, and so requests would
leak if it were used.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a65ed3f5effd9321ee06e6edea294a03be3e15a0.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

aeaa72c6

io_uring: inline ->registered_rings · 6a02e4be

Committed by Pavel Begunkov on Jun 15, 2022

There can be only 16 registered rings, so there is no need to allocate an
array for them separately; store it in tctx instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/495f0b953c87994dd9e13de2134019054fa5830d.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6a02e4be
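
Embedding a small fixed-size array in the owning structure, rather than allocating it separately, might look like the sketch below. The limit of 16 comes from the commit message; `struct task_ctx`, `register_ring()` and the use of `void *` for ring handles are hypothetical simplifications.

```c
#include <stddef.h>

#define MAX_REGISTERED_RINGS 16 /* limit quoted in the commit message */

/* Hypothetical per-task context with the array stored inline. */
struct task_ctx {
	void *registered_rings[MAX_REGISTERED_RINGS];
};

/* Store ring in the first free slot; return its index, or -1 if full. */
static int register_ring(struct task_ctx *tctx, void *ring)
{
	for (int i = 0; i < MAX_REGISTERED_RINGS; i++) {
		if (!tctx->registered_rings[i]) {
			tctx->registered_rings[i] = ring;
			return i;
		}
	}
	return -1;
}
```

With the array inline there is no separate allocation to fail or to free, at the cost of 16 pointers in every context.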

io_uring: explain io_wq_work::cancel_seq placement · 48c13d89

Committed by Pavel Begunkov on Jun 15, 2022

Add a comment on why we keep ->cancel_seq in struct io_wq_work instead
of struct io_kiocb even though it is needed only by io_uring and not by
io-wq.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/988e87eec9dc700b5dae933df3aefef303502f6c.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

48c13d89

io_uring: move small helpers to headers · aa1e90f6

Committed by Pavel Begunkov on Jun 15, 2022

There are a bunch of inline helpers that will be useful not only to the
core of io_uring; move them to headers.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

aa1e90f6

io_uring: refactor ctx slow data placement · 22eb2a3f

Committed by Pavel Begunkov on Jun 15, 2022

Shove all slow path data at the end of ctx and get rid of extra
indentation.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bcaf200298dd469af20787650550efc66d89bef2.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

22eb2a3f

io_uring: better caching for ctx timeout fields · aff5b2df

Committed by Pavel Begunkov on Jun 15, 2022

Following the access patterns of the timeout fields, move all of them into
a separate cache line inside ctx, so they don't interfere with normal
completion caching, especially since timeout removals and completion
are separated and the latter is done via tw.

It also sheds some bytes from io_ring_ctx, 1216B -> 1152B.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4b163793072840de53b3cb66e0c2995e7226ff78.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

aff5b2df
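
Grouping related fields on their own cache line can be sketched with C11 `alignas`. The 64-byte line size, the struct and its field names are assumptions for illustration and do not reproduce the real io_ring_ctx layout. Under that 64-byte assumption, the size change quoted above (1216B -> 1152B) is exactly one cache line.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical context: each group of fields starts on its own line. */
struct ctx_sketch {
	/* Fields touched on every completion. */
	alignas(CACHE_LINE) struct {
		unsigned cached_cq_tail;
		unsigned cq_entries;
	} completion;

	/* Timeout bookkeeping, kept away from the completion line. */
	alignas(CACHE_LINE) struct {
		unsigned nr_timeouts;
		unsigned last_flush;
		void *timeout_list;
	} timeout;
};

/* Number of cache lines needed to hold a structure of the given size. */
static size_t cache_lines(size_t bytes)
{
	return (bytes + CACHE_LINE - 1) / CACHE_LINE;
}
```

Here `cache_lines(1216)` is 19 and `cache_lines(1152)` is 18, which matches the one-line saving under the 64-byte assumption.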

io_uring: move defer_list to slow data · b2543603

Committed by Pavel Begunkov on Jun 15, 2022

Draining is slow path; move defer_list to the end, where slow data lives
inside the context.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e16379391ca72b490afdd24e8944baab849b4a7b.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b2543603

io_uring: make reg buf init consistent · 5ff4fdff

Committed by Pavel Begunkov on Jun 15, 2022

The default (i.e. empty) state of a registered buffer is dummy_ubuf, so set
it to dummy on init instead of NULL.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c5456aecf03d9627fbd6e65e100e2b5293a6151e.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5ff4fdff

io_uring: deprecate epoll_ctl support · 61a2732a

Committed by Jens Axboe on Jun 01, 2022

As far as we know, nobody ever adopted the epoll_ctl management via
io_uring. Deprecate it now with a warning, and plan on removing it in
a later kernel version. When we do remove it, we can revert the following
commits as well:

39220e8d ("eventpoll: support non-blocking do_epoll_ctl() calls")
58e41a44 ("eventpoll: abstract out epoll_ctl() handler")

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/io-uring/CAHk-=wiTyisXBgKnVHAGYCNvkmjk=50agS2Uk6nr+n3ssLZg2w@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>

61a2732a

io_uring: add support for level triggered poll · b9ba8a44

Committed by Jens Axboe on May 27, 2022

By default, the POLL_ADD command does edge triggered poll - if we get
a non-zero mask on the initial poll attempt, we complete the request
successfully.

Support level triggered poll by always waiting for a notification,
regardless of whether or not the initial mask matches the file state.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

b9ba8a44

io_uring: move opcode table to opdef.c · d9b57aa3

Committed by Jens Axboe on Jun 15, 2022

We already have the declarations in opdef.h, move the rest into its own
file rather than in the main io_uring.c file.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

d9b57aa3

io_uring: move read/write related opcodes to its own file · f3b44f92

Committed by Jens Axboe on Jun 13, 2022

Signed-off-by: Jens Axboe <axboe@kernel.dk>

f3b44f92

io_uring: move remaining file table manipulation to filetable.c · c98817e6

Committed by Jens Axboe on May 26, 2022

Signed-off-by: Jens Axboe <axboe@kernel.dk>

c98817e6

io_uring: move rsrc related data, core, and commands · 73572984

Committed by Jens Axboe on Jun 13, 2022

Signed-off-by: Jens Axboe <axboe@kernel.dk>

73572984

openeuler / Kernel · last synced successfully more than 1 year ago