Commits · d8099feda4833bab96b1bf312e9e6aad6b771570 · openeuler / Kernel

09 Jul, 2019 2 commits

xprtrdma: Reduce context switching due to Local Invalidation · d8099fed

Authored by Chuck Lever on Jun 19, 2019

Since commit ba69cd12 ("xprtrdma: Remove support for FMR memory
registration"), FRWR is the only supported memory registration mode.

We can take advantage of the asynchronous nature of FRWR's LOCAL_INV
Work Requests to get rid of the completion wait by having the
LOCAL_INV completion handler take care of DMA unmapping MRs and
waking the upper layer RPC waiter.

This eliminates two context switches when local invalidation is
necessary. As a side benefit, we will no longer need the per-xprt
deferred completion work queue.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Fix occasional transport deadlock · 05eb06d8

Authored by Chuck Lever on Jun 19, 2019

Under high I/O workloads, I've noticed that an RPC/RDMA transport
occasionally deadlocks (IOPS goes to zero, and doesn't recover).
Diagnosis shows that the sendctx queue is empty, but when sendctxs
are returned to the queue, the xprt_write_space wake-up never
occurs. The wake-up logic in rpcrdma_sendctx_put_locked is racy.

I noticed that both EMPTY_SCQ and XPRT_WRITE_SPACE are implemented
via an atomic bit. Just one of those is sufficient. Removing
EMPTY_SCQ in favor of the generic bit mechanism makes the deadlock
un-reproducible.

Without EMPTY_SCQ, rpcrdma_buffer::rb_flags is no longer used and
is therefore removed.

Unfortunately this patch does not apply cleanly to stable. If
needed, someone will have to port it and test it.

Fixes: 2fad6592 ("xprtrdma: Wait on empty sendctx queue")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
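
To illustrate the single-bit approach described above, here is a minimal user-space sketch using C11 atomics. It is not the kernel code: `toy_xprt`, `WRITE_SPACE_BIT`, and the helper functions are hypothetical names, and `atomic_fetch_and()` stands in for the kernel's `test_and_clear_bit()`. One atomic read-modify-write both tests and clears the flag, so returning a context triggers at most one wake-up per wait request.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the generic XPRT_WRITE_SPACE bit. */
#define WRITE_SPACE_BIT 0x1u

struct toy_xprt {
    atomic_uint state;     /* single flag word shared by both paths */
    unsigned int wakeups;  /* counts simulated wake-ups */
};

static void toy_xprt_init(struct toy_xprt *x)
{
    atomic_init(&x->state, 0);
    x->wakeups = 0;
}

/* Sender found the sendctx queue empty: record that it wants a wake-up. */
static void toy_wait_for_space(struct toy_xprt *x)
{
    atomic_fetch_or(&x->state, WRITE_SPACE_BIT);
}

/*
 * A sendctx was returned to the queue. Test and clear the flag in one
 * atomic step; only the caller that observed the bit set does the wake-up.
 */
static bool toy_write_space(struct toy_xprt *x)
{
    unsigned int old = atomic_fetch_and(&x->state, ~WRITE_SPACE_BIT);

    if (!(old & WRITE_SPACE_BIT))
        return false;
    x->wakeups++;
    return true;
}
```

Because only one flag exists, there is no window in which one bit says "waiting" while the other says "not waiting".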


03 Jul, 2019 1 commit

xprtrdma: Fix use-after-free in rpcrdma_post_recvs · 2d0abe36

Authored by Chuck Lever on Jun 19, 2019

Dereference wr->next /before/ the memory backing wr has been
released. This issue was found by code inspection. It is not
expected to be a significant problem because it is in an error
path that is almost never executed.

Fixes: 7c8d9e7c ("xprtrdma: Move Receive posting to ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
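
The ordering rule in this fix can be shown with a small user-space sketch. The types and function names below (`toy_wr`, `toy_build_chain`, `toy_release_chain`) are hypothetical and only model a singly linked chain of work requests; they are not the xprtrdma structures.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a chained work request. */
struct toy_wr {
    struct toy_wr *next;
};

/* Build a chain of up to n entries; returns the head (NULL if n == 0). */
static struct toy_wr *toy_build_chain(unsigned int n)
{
    struct toy_wr *head = NULL;

    while (n--) {
        struct toy_wr *wr = malloc(sizeof(*wr));

        if (!wr)
            break;
        wr->next = head;
        head = wr;
    }
    return head;
}

/* Release every entry on the chain and return how many were released. */
static unsigned int toy_release_chain(struct toy_wr *wr)
{
    unsigned int released = 0;

    while (wr) {
        /* Read wr->next BEFORE the memory backing wr is released. */
        struct toy_wr *next = wr->next;

        free(wr);
        wr = next;
        released++;
    }
    return released;
}
```

Writing the loop as `free(wr); wr = wr->next;` would read freed memory, which is the pattern the commit removes.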


28 May, 2019 1 commit

xprtrdma: Use struct_size() in kzalloc() · 66d4218f

Authored by Gustavo A. R. Silva on Jan 30, 2019

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    struct boo entry[];
};

instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
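
For readers outside the kernel tree, the size calculation can be approximated in user space as follows. This is a sketch, not the kernel macro: `toy_struct_size` and `toy_foo_alloc` are hypothetical names, and `struct boo` is given an arbitrary member so the example compiles. Like `struct_size()`, the helper saturates to `SIZE_MAX` on overflow so that the allocation fails instead of being undersized.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
    int value;              /* arbitrary payload for illustration */
};

struct foo {
    int stuff;
    struct boo entry[];     /* flexible array member */
};

/* header + elem * count, or SIZE_MAX if the arithmetic would overflow. */
static size_t toy_struct_size(size_t header, size_t elem, size_t count)
{
    if (elem != 0 && count > (SIZE_MAX - header) / elem)
        return SIZE_MAX;
    return header + elem * count;
}

/* Allocate a zeroed struct foo with room for count entries. */
static struct foo *toy_foo_alloc(size_t count)
{
    size_t bytes = toy_struct_size(sizeof(struct foo),
                                   sizeof(struct boo), count);

    if (bytes == SIZE_MAX)
        return NULL;        /* refuse an overflowed size */
    return calloc(1, bytes);
}
```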


26 Apr, 2019 13 commits

xprtrdma: Update comments that reference ib_drain_qp · b8fe677f

Authored by Chuck Lever on Apr 24, 2019

Commit e1ede312 ("xprtrdma: Fix helper that drains the
transport") replaced the ib_drain_qp() call, so update documenting
comments to reflect current operation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Remove pr_err() call sites from completion handlers · 5f2311f5

Authored by Chuck Lever on Apr 24, 2019

Clean up: rely on the trace points instead.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Eliminate struct rpcrdma_create_data_internal · 86c4ccd9

Authored by Chuck Lever on Apr 24, 2019

Clean up.

Move the remaining field in rpcrdma_create_data_internal so the
structure can be removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Aggregate the inline settings in struct rpcrdma_ep · 94087e97

Authored by Chuck Lever on Apr 24, 2019

Clean up.

The inline settings are actually a characteristic of the endpoint,
and not related to the device. They are also modified after the
transport instance is created, so they do not belong in the cdata
structure either.

Lastly, let's use names that are more natural to RDMA than to NFS:
inline_write -> inline_send and inline_read -> inline_recv. The
/proc files retain their names to avoid breaking user space.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Eliminate rpcrdma_ia::ri_device · f19bd0bb

Authored by Chuck Lever on Apr 24, 2019

Clean up.

Since commit 54cbd6b0 ("xprtrdma: Delay DMA mapping Send and
Receive buffers"), a pointer to the device is now saved in each
regbuf when it is DMA mapped.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: More Send completion batching · c209e49c

Authored by Chuck Lever on Apr 24, 2019

Instead of using a fixed number, allow the amount of Send completion
batching to vary based on the client's maximum credit limit.

- A larger default gives a small boost to IOPS throughput

- Reducing it based on max_requests gives a safe result when the
  max credit limit is cranked down (e.g., when the device has a small
  max_qp_wr).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
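
A rough model of credit-scaled batching is sketched below. The names (`toy_ep`, `toy_ep_init`, `toy_send_is_signaled`) are hypothetical, and the ratio of one eighth of `max_requests` is an assumption made for illustration; the commit message does not state the exact formula. The sketch only shows the mechanism: request a completion on every Nth Send, where N shrinks as the credit limit shrinks.

```c
#include <stdbool.h>

/* Hypothetical endpoint state for the sketch. */
struct toy_ep {
    unsigned int send_batch;  /* Sends per signaled completion */
    unsigned int send_count;  /* Sends left before the next signal */
};

static void toy_ep_init(struct toy_ep *ep, unsigned int max_requests)
{
    /* Assumption: batch size is one eighth of the credit limit. */
    ep->send_batch = max_requests >> 3;
    if (ep->send_batch == 0)
        ep->send_batch = 1;   /* tiny credit limit: signal every Send */
    ep->send_count = ep->send_batch;
}

/* Returns true when this Send should request a completion. */
static bool toy_send_is_signaled(struct toy_ep *ep)
{
    if (--ep->send_count > 0)
        return false;
    ep->send_count = ep->send_batch;
    return true;
}
```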


xprtrdma: Clean up sendctx functions · dbcc53a5

Authored by Chuck Lever on Apr 24, 2019

Minor clean-ups I've stumbled on since sendctx was merged last year.
In particular, making Send completion processing more efficient
appears to have a measurable impact on IOPS throughput.

Note: test_and_clear_bit() returns a value, thus an explicit memory
barrier is not necessary.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Clean up regbuf helpers · d2832af3

Authored by Chuck Lever on Apr 24, 2019

For code legibility, clean up the function names to be consistent
with the pattern: "rpcrdma" _ object-type _ action

Also rpcrdma_regbuf_alloc and rpcrdma_regbuf_free no longer have any
callers outside of verbs.c, and can thus be made static.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: De-duplicate "allocate new, free old regbuf" · 0f665ceb

Authored by Chuck Lever on Apr 24, 2019

Clean up by providing an API to do this common task.

At this point, the difference between rpcrdma_get_sendbuf and
rpcrdma_get_recvbuf has become tiny. These can be collapsed into a
single helper.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Allocate req's regbufs at xprt create time · bb93a1ae

Authored by Chuck Lever on Apr 24, 2019

Allocating an rpcrdma_req's regbufs at xprt create time enables
a pair of micro-optimizations:

First, if these regbufs are always there, we can eliminate two
conditional branches from the hot xprt_rdma_allocate path.

Second, allocating a 1KB buffer places a lower bound on the size of
these buffers without adding yet another conditional branch. The
lower bound reduces the number of hardway reallocations. In fact,
for some workloads it completely eliminates hardway allocations.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
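
The effect of a pre-allocated minimum-size buffer can be sketched in user space as below. All names (`toy_regbuf`, `toy_regbuf_init`, `toy_regbuf_ensure`) are hypothetical, and `malloc()` stands in for whatever allocator the transport uses. The buffer always exists after initialization, so the hot path needs only one size comparison, and requests that fit within the initial size never reach the slow "allocate new, free old" path.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered buffer. */
struct toy_regbuf {
    size_t len;
    void *data;
};

/* Done once at transport create time, e.g. with len = 1024. */
static bool toy_regbuf_init(struct toy_regbuf *rb, size_t len)
{
    rb->data = malloc(len);
    rb->len = rb->data ? len : 0;
    return rb->data != NULL;
}

/*
 * Hot path. Returns 0 if the existing buffer is large enough,
 * 1 if the slow path replaced it, and -1 if allocation failed.
 */
static int toy_regbuf_ensure(struct toy_regbuf *rb, size_t needed)
{
    void *data;

    if (needed <= rb->len)
        return 0;

    data = malloc(needed);      /* allocate new ... */
    if (!data)
        return -1;
    free(rb->data);             /* ... then free old */
    rb->data = data;
    rb->len = needed;
    return 1;
}
```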


xprtrdma: rpcrdma_regbuf alignment · 8cec3dba

Authored by Chuck Lever on Apr 24, 2019

Allocate the struct rpcrdma_regbuf separately from the I/O buffer
to better guarantee the alignment of the I/O buffer and eliminate
the wasted space between the rpcrdma_regbuf metadata and the buffer
itself.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Clean up rpcrdma_create_rep() and rpcrdma_destroy_rep() · 23146500

Authored by Chuck Lever on Apr 24, 2019

For code legibility, clean up the function names to be consistent
with the pattern: "rpcrdma" _ object-type _ action

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


xprtrdma: Clean up rpcrdma_create_req() · 1769e6a8

Authored by Chuck Lever on Apr 24, 2019

Eventually, I'd like to invoke rpcrdma_create_req() during the
call_reserve step. Memory allocation there probably needs to use
GFP_NOIO. Therefore a set of GFP flags needs to be passed in.

As an additional clean up, just return a pointer or NULL, because
the only error return code here is -ENOMEM.

Lastly, clean up the function names to be consistent with the
pattern: "rpcrdma" _ object-type _ action

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


12 Apr, 2019 1 commit

xprtrdma: Fix helper that drains the transport · e1ede312

Authored by Chuck Lever on Apr 09, 2019

We want to drain only the RQ first. Otherwise the transport can
deadlock on ->close if there are outstanding Send completions.

Fixes: 6d2d0ee2 ("xprtrdma: Replace rpcrdma_receive_wq ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


13 Feb, 2019 2 commits

xprtrdma: Reduce the doorbell rate (Receive) · e340c2d6

Authored by Chuck Lever on Feb 11, 2019

Post RECV WRs in batches to reduce the hardware doorbell rate per
transport. This helps the RPC-over-RDMA client scale better in
number of transports.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
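
The batching idea can be modeled without any RDMA hardware. In the sketch below, `toy_qp`, `toy_recv_wr`, and both functions are hypothetical; `toy_post_recv` simply counts one "doorbell" per call, on the assumption that each post operation rings the doorbell once regardless of chain length.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Receive work request. */
struct toy_recv_wr {
    struct toy_recv_wr *next;
};

/* Hypothetical queue pair that only keeps statistics. */
struct toy_qp {
    unsigned int doorbells;  /* number of post operations */
    unsigned int posted;     /* number of WRs handed over */
};

/* One call models one doorbell, however long the chain is. */
static void toy_post_recv(struct toy_qp *qp, struct toy_recv_wr *chain)
{
    qp->doorbells++;
    for (; chain; chain = chain->next)
        qp->posted++;
}

/* Link n WRs into a single chain and post them with one call. */
static void toy_post_recvs_batched(struct toy_qp *qp,
                                   struct toy_recv_wr *wrs, unsigned int n)
{
    unsigned int i;

    if (n == 0)
        return;
    for (i = 0; i + 1 < n; i++)
        wrs[i].next = &wrs[i + 1];
    wrs[n - 1].next = NULL;
    toy_post_recv(qp, &wrs[0]);
}
```

Posting eight WRs this way costs one doorbell rather than eight.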


xprtrdma: Make sure Send CQ is allocated on an existing compvec · a4cb5bdb

Authored by Nicolas Morey-Chaisemartin on Feb 05, 2019

Make sure the device has at least 2 completion vectors
before allocating the Send CQ on compvec #1.

Fixes: a4699f56 ("xprtrdma: Put Send CQ in IB_POLL_WORKQUEUE mode")
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
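
The guard amounts to a bounds check on the vector index. A minimal sketch, with the hypothetical helper `toy_send_compvec` taking the device's completion vector count as a plain integer:

```c
/*
 * Pick the completion vector for the Send CQ. Vector 1 is used only
 * when the device exposes at least two vectors; otherwise fall back
 * to vector 0, which every device has.
 */
static int toy_send_compvec(int num_comp_vectors)
{
    return num_comp_vectors > 1 ? 1 : 0;
}
```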


09 Jan, 2019 2 commits

xprtrdma: Double free in rpcrdma_sendctxs_create() · 6e17f58c

Authored by Dan Carpenter on Jan 05, 2019

The clean up is handled by the caller, rpcrdma_buffer_create(), so this
call to rpcrdma_sendctxs_destroy() leads to a double free.

Fixes: ae72950a ("xprtrdma: Add data structure to manage RDMA Send arguments")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
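
The ownership rule behind this fix is sketched below with hypothetical names (`toy_buf`, `toy_sendctxs_create`, `toy_sendctxs_destroy`, `toy_buf_create`); the `fail` parameter exists only to simulate an error. The callee reports failure without cleaning up, and the caller's error path performs the single cleanup.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_buf {
    void **queue;
    unsigned int destroy_calls;  /* counts cleanups for the example */
};

static void toy_sendctxs_destroy(struct toy_buf *buf)
{
    buf->destroy_calls++;
    free(buf->queue);
    buf->queue = NULL;
}

/* On failure, return the error and leave cleanup to the caller. */
static int toy_sendctxs_create(struct toy_buf *buf, int fail)
{
    buf->queue = calloc(4, sizeof(*buf->queue));
    if (!buf->queue || fail)
        return -ENOMEM;
    return 0;
}

/* The caller owns the error path, so cleanup happens exactly once. */
static int toy_buf_create(struct toy_buf *buf, int fail)
{
    int rc;

    buf->queue = NULL;
    buf->destroy_calls = 0;
    rc = toy_sendctxs_create(buf, fail);
    if (rc) {
        toy_sendctxs_destroy(buf);
        return rc;
    }
    return 0;
}
```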


xprtrdma: Fix error code in rpcrdma_buffer_create() · 4429b668

Authored by Dan Carpenter on Jan 05, 2019

This should return -ENOMEM if __alloc_workqueue_key() fails, but it
returns success.

Fixes: 6d2d0ee2 ("xprtrdma: Replace rpcrdma_receive_wq with a per-xprt workqueue")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
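
This class of bug usually comes from a shared `goto out` path that returns a status variable still holding 0 from an earlier successful step. A minimal user-space sketch follows; `toy_wq_buffer`, `toy_wq_buffer_create`, and `toy_wq_buffer_destroy` are hypothetical names, and `malloc()` stands in for the workqueue allocation.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_wq_buffer {
    void *wq;  /* stand-in for the per-xprt workqueue */
};

static void toy_wq_buffer_destroy(struct toy_wq_buffer *buf)
{
    free(buf->wq);
    buf->wq = NULL;
}

/* simulate_failure forces the allocation step to fail. */
static int toy_wq_buffer_create(struct toy_wq_buffer *buf,
                                int simulate_failure)
{
    int rc = 0;  /* earlier setup steps succeeded */

    buf->wq = simulate_failure ? NULL : malloc(16);
    if (!buf->wq) {
        rc = -ENOMEM;  /* without this line, the function returns 0 */
        goto out;
    }
    return 0;

out:
    toy_wq_buffer_destroy(buf);
    return rc;
}
```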


03 Jan, 2019 12 commits

xprtrdma: Add documenting comment for rpcrdma_buffer_destroy · af65ed40