Commits · 9e895cd9649abe4392c59d14e31b0f5667d082d2 · openeuler / Kernel

02 May, 2021 1 commit

xprtrdma: Fix a NULL dereference in frwr_unmap_sync() · 9e895cd9

Committed by Chuck Lever on May 01, 2021

The normal mechanism that invalidates and unmaps MRs is
frwr_unmap_async(). frwr_unmap_sync() is used only when an RPC
Reply bearing Write or Reply chunks has been lost (i.e., almost
never).

Coverity found that after commit 9a301caf ("xprtrdma: Move
fr_linv_done field to struct rpcrdma_mr"), the while() loop in
frwr_unmap_sync() exits only once @mr is NULL, unconditionally
causing subsequent dereferences of @mr to Oops.

I've tested this fix by creating a client that skips invoking
frwr_unmap_async() when RPC Replies complete. That forces all
invalidation tasks to fall upon frwr_unmap_sync(). Simple workloads
with this fix applied to the adulterated client work as designed.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1504556 ("Null pointer dereferences")
Fixes: 9a301caf ("xprtrdma: Move fr_linv_done field to struct rpcrdma_mr")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
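
The failure mode described above can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct demo_mr`, `demo_mr_pop()` and `demo_unmap_sync()` are hypothetical stand-ins, and the choice to remember the last MR visited is one plausible way to avoid the NULL dereference, not necessarily what the actual patch does.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rpcrdma_mr: only what the sketch needs. */
struct demo_mr {
	struct demo_mr *next;	/* next MR registered for the same request */
	int linv_done;		/* stand-in for the fr_linv_done completion */
};

/*
 * Pop the head of a singly linked list, or return NULL when it is empty.
 * Plays the role that rpcrdma_mr_pop() plays in the real transport.
 */
static struct demo_mr *demo_mr_pop(struct demo_mr **head)
{
	struct demo_mr *mr = *head;

	if (mr)
		*head = mr->next;
	return mr;
}

/*
 * Walk every MR, but remember the last one visited. When the loop ends,
 * the cursor "mr" is NULL by construction, so only "last" is safe to use.
 * Returns the final MR, or NULL if the list was empty.
 */
static struct demo_mr *demo_unmap_sync(struct demo_mr **head)
{
	struct demo_mr *mr, *last = NULL;

	while ((mr = demo_mr_pop(head)) != NULL) {
		mr->linv_done = 0;	/* per-MR work happens here */
		last = mr;
	}

	/* Buggy variant: "mr->linv_done = 1;" here would dereference NULL. */
	if (last)
		last->linv_done = 1;	/* act on the final MR only */
	return last;
}
```

The point of the sketch is only that a loop whose exit condition is "cursor is NULL" leaves nothing usable in the cursor afterwards, so any state needed after the loop has to be captured inside it.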

9e895cd9

26 Apr, 2021 13 commits

xprtrdma: Move fr_mr field to struct rpcrdma_mr · 13bcf7e3

Committed by Chuck Lever on Apr 19, 2021

Clean up: The last remaining field in struct rpcrdma_frwr has been
removed, so the struct can be eliminated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

13bcf7e3

xprtrdma: Move the Work Request union to struct rpcrdma_mr · dcff9ed2

Committed by Chuck Lever on Apr 19, 2021

Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

dcff9ed2

xprtrdma: Move fr_linv_done field to struct rpcrdma_mr · 9a301caf

Committed by Chuck Lever on Apr 19, 2021

Clean up: Move more of struct rpcrdma_frwr into its parent.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9a301caf

xprtrdma: Move cqe to struct rpcrdma_mr · e10fa96d

Committed by Chuck Lever on Apr 19, 2021

Clean up.

- Simplify variable initialization in the completion handlers.

- Move another field out of struct rpcrdma_frwr.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

e10fa96d

xprtrdma: Move fr_cid to struct rpcrdma_mr · 0a26d10e

Committed by Chuck Lever on Apr 19, 2021

Clean up (for several purposes):

- The MR's cid is initialized sooner so that tracepoints can show
  something reasonable even if the MR is never posted.
- The MR's res.id doesn't change so the cid won't change either.
  Initializing the cid once is sufficient.
- struct rpcrdma_frwr is going away soon.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

0a26d10e

xprtrdma: Add tracepoints showing FastReg WRs and remote invalidation · 4ddd0fc3

Committed by Chuck Lever on Apr 19, 2021

The Send signaling logic is a little subtle, so add some
observability around it. For every xprtrdma_mr_fastreg event, there
should be an xprtrdma_mr_localinv or xprtrdma_mr_reminv event.

When these tracepoints are enabled, we can see exactly when an MR is
DMA-mapped, registered, invalidated (either locally or remotely) and
then DMA-unmapped.

kworker/u25:2-190 [000] 787.979512: xprtrdma_mr_map: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE)
kworker/u25:2-190 [000] 787.979515: xprtrdma_chunk_read: task:351@5 pos=148 5608@0x8679e0c8f6f56000:0x00000503 (last)
kworker/u25:2-190 [000] 787.979519: xprtrdma_marshal: task:351@5 xid=0x8679e0c8: hdr=52 xdr=148/5608/0 read list/inline
kworker/u25:2-190 [000] 787.979525: xprtrdma_mr_fastreg: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE)
kworker/u25:2-190 [000] 787.979526: xprtrdma_post_send: task:351@5 cq.id=0 cid=73 (2 SGEs)

...

kworker/5:1H-219 [005] 787.980567: xprtrdma_wc_receive: cq.id=1 cid=161 status=SUCCESS (0/0x0) received=164
kworker/5:1H-219 [005] 787.980571: xprtrdma_post_recvs: peer=[192.168.100.55]:20049 r_xprt=0xffff8884974d4000: 0 new recvs, 70 active (rc 0)
kworker/5:1H-219 [005] 787.980573: xprtrdma_reply: task:351@5 xid=0x8679e0c8 credits=64
kworker/5:1H-219 [005] 787.980576: xprtrdma_mr_reminv: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE)
kworker/5:1H-219 [005] 787.980577: xprtrdma_mr_unmap: mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE)

Note that I've moved the xprtrdma_post_send tracepoint so that event
always appears after the xprtrdma_mr_fastreg tracepoint. Otherwise
the event log looks counterintuitive (FastReg is always supposed to
happen before Send).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

4ddd0fc3

xprtrdma: Avoid Send Queue wrapping · b3ce7a25

Committed by Chuck Lever on Apr 19, 2021

Send WRs can be signaled or unsignaled. A signaled Send WR
always has a matching Send completion, while an unsignaled Send
has a completion only if the Send WR fails.

xprtrdma has a Send accounting mechanism that is designed to reduce
the number of signaled Send WRs. This in turn mitigates the
interrupt rate of the underlying device.

RDMA consumers can't leave all Sends unsignaled, however, because
providers rely on Send completions to maintain their Send Queue head
and tail pointers. xprtrdma counts the number of unsignaled Send WRs
that have been posted to ensure that Sends are signaled often
enough to prevent the Send Queue from wrapping.

This mechanism neglected to account for FastReg WRs, which are
posted on the Send Queue but never signaled. As a result, the
Send Queue wrapped on occasion, resulting in duplicate completions
of FastReg and LocalInv WRs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
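
The accounting idea in this commit message can be sketched roughly as follows. This is a simplified userspace model, not the xprtrdma implementation: `struct demo_sendq`, its fields, and both helper functions are hypothetical names, and the batch size is an arbitrary assumption. The only point it illustrates is that FastReg WRs consume Send Queue slots and therefore have to be charged against the same budget as unsignaled Sends.

```c
#include <stdbool.h>

/* Hypothetical accounting state; field names are illustrative only. */
struct demo_sendq {
	int batch;	/* Send Queue slots allowed between signaled Sends */
	int budget;	/* slots that may still be used before signaling */
};

/* Start with a full budget of "batch" slots. */
static void demo_sendq_init(struct demo_sendq *sq, int batch)
{
	sq->batch = batch;
	sq->budget = batch;
}

/*
 * Account for one Send WR that is posted together with "nr_fastreg"
 * FastReg WRs. Every WR occupies a Send Queue slot, so the FastReg WRs
 * are charged against the budget along with the Send itself.
 *
 * Returns true when the budget is exhausted and this Send must be
 * signaled; the budget is then reset for the next batch.
 */
static bool demo_send_must_signal(struct demo_sendq *sq, int nr_fastreg)
{
	sq->budget -= nr_fastreg + 1;
	if (sq->budget > 0)
		return false;
	sq->budget = sq->batch;
	return true;
}
```

With a batch of 4, two plain Sends leave a budget of 2; a third Send that carries one FastReg WR uses the remaining two slots and is therefore signaled. Ignoring the FastReg WR in that step would leave the budget at 1 and let the queue fill faster than the counter suggests, which is the wrapping scenario the commit describes.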

b3ce7a25

xprtrdma: Do not wake RPC consumer on a failed LocalInv · 8a053433

Committed by Chuck Lever on Apr 19, 2021

Throw away any reply where the LocalInv flushes or could not be
posted. The registered memory region is in an unknown state until
the disconnect completes.

rpcrdma_xprt_disconnect() will find and release the MR. No need to
put it back on the MR free list in this case.

The client retransmits pending RPC requests once it reestablishes a
fresh connection, so a replacement reply should be forthcoming on
the next connection instance.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

8a053433

xprtrdma: Do not recycle MR after FastReg/LocalInv flushes · e4b52ca0

Committed by Chuck Lever on Apr 19, 2021

Better not to touch MRs involved in a flush or post error until the
Send and Receive Queues are drained and the transport is fully
quiescent. Simply don't insert such MRs back onto the free list.
They remain on mr_all and will be released when the connection is
torn down.

I had thought that recycling would prevent hardware resources from
being tied up for a long time. However, since v5.7, a transport
disconnect destroys the QP and other hardware-owned resources. The
MRs get cleaned up nicely at that point.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

e4b52ca0

xprtrdma: Clarify use of barrier in frwr_wc_localinv_done() · 44438ad9

Committed by Chuck Lever on Apr 19, 2021

Clean up: The comment and the placement of the memory barrier are
confusing. Humans want to read the function statements from head
to tail.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

44438ad9

xprtrdma: Rename frwr_release_mr() · f912af77

Committed by Chuck Lever on Apr 19, 2021

Clean up: To be consistent with other functions in this source file,
follow the naming convention of putting the object being acted upon
before the action itself.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

f912af77

xprtrdma: rpcrdma_mr_pop() already does list_del_init() · 1363e638

Committed by Chuck Lever on Apr 19, 2021

The rpcrdma_mr_pop() earlier in the function has already cleared
out mr_list, so it must not be done again in the error path.

Fixes: 84756894 ("xprtrdma: Remove fr_state")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

1363e638

xprtrdma: Avoid Receive Queue wrapping · 32e6b681

Committed by Chuck Lever on Apr 19, 2021

Commit e340c2d6 ("xprtrdma: Reduce the doorbell rate (Receive)")
increased the number of Receive WRs that are posted by the client,
but did not increase the size of the Receive Queue allocated during
transport set-up.

This is usually not an issue because RPCRDMA_BACKWARD_WRS is defined
as (32) when SUNRPC_BACKCHANNEL is defined. In cases where it isn't,
there is a real risk of Receive Queue wrapping.

Fixes: e340c2d6 ("xprtrdma: Reduce the doorbell rate (Receive)")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

32e6b681

06 Feb, 2021 2 commits

xprtrdma: Refactor invocations of offset_in_page() · 67b16625

Committed by Chuck Lever on Feb 04, 2021

Clean up so that offset_in_page() is invoked less often in the
most common case, which is mapping xdr->pages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

67b16625

xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() · 54e6aec5

Committed by Chuck Lever on Feb 04, 2021

Clean up.

Remove a conditional branch from the SGL set-up loop in frwr_map():
Instead of using either sg_set_page() or sg_set_buf(), initialize
the mr_page field properly when rpcrdma_convert_kvec() converts the
kvec to an SGL entry. frwr_map() can then invoke sg_set_page()
unconditionally.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
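
The refactoring described here can be approximated in userspace as follows. This is a sketch under stated assumptions, not kernel code: `struct demo_seg`, `demo_convert_kvec()`, `demo_offset_in_page()` and `demo_seg_addr()` are hypothetical, a 4096-byte page size is assumed, and plain address arithmetic stands in for the kernel's page lookup. It shows the general idea of splitting a buffer address into a page base and an in-page offset once, at conversion time, so that the consumer can treat every segment the same way.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical segment: page base plus offset, like an SGL entry. */
struct demo_seg {
	uintptr_t page;		/* page-aligned base, stand-in for mr_page */
	unsigned int offset;	/* byte offset of the data within that page */
	size_t len;		/* length of the data in bytes */
};

/* Userspace analogue of offset_in_page(): low bits of the address. */
static unsigned int demo_offset_in_page(const void *p)
{
	return (unsigned int)((uintptr_t)p & (DEMO_PAGE_SIZE - 1));
}

/*
 * Convert a kvec-style (base, len) buffer into a page/offset segment.
 * Doing the split here means the mapping loop needs no special case
 * for buffers that did not start out as pages.
 */
static void demo_convert_kvec(const void *base, size_t len,
			      struct demo_seg *seg)
{
	seg->offset = demo_offset_in_page(base);
	seg->page = (uintptr_t)base - seg->offset;
	seg->len = len;
}

/* Consumer side: one code path recovers the address of any segment. */
static uintptr_t demo_seg_addr(const struct demo_seg *seg)
{
	return seg->page + seg->offset;
}
```

Because the page base is filled in at conversion time, the consumer never has to ask whether a segment came from a page array or from a kvec, which mirrors the branch removal the commit message describes.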

54e6aec5

11 Nov, 2020 5 commits

xprtrdma: Micro-optimize MR DMA-unmapping · 7a03aeb6