Commits · f296bfd5cd04cbb49b8fc9585adc280ab2b58624 · openeuler / Kernel

09 Mar, 2021 1 commit

SUNRPC: Set memalloc_nofs_save() for sync tasks · f0940f4b

Authored by Benjamin Coddington on Mar 03, 2021

We could recurse into NFS doing memory reclaim while sending a sync task,
which might result in a deadlock.  Set memalloc_nofs_save for sync task
execution.

Fixes: a1231fda ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
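
The save/restore scope described in this commit can be modelled in user space. The sketch below is not kernel code: `model_task_flags`, `model_nofs_save()`, `model_nofs_restore()` and `model_run_sync_task()` are hypothetical stand-ins for `current->flags`, `memalloc_nofs_save()`, `memalloc_nofs_restore()` and the synchronous task path. It only illustrates that the save call returns the previous state, so nested scopes restore correctly.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the kernel's per-task NOFS flag. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

static unsigned int model_task_flags; /* plays the role of current->flags */

/* Set the NOFS bit and return its previous value. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Put the NOFS bit back to the value returned by the matching save. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Run a synchronous task with NOFS set and return the task's result. */
static int model_run_sync_task(int (*task)(void))
{
	unsigned int old;
	int ret;

	assert(task != 0);
	old = model_nofs_save();
	ret = task();
	model_nofs_restore(old);
	return ret;
}

/* Example task body: returns 1 if it observes the NOFS bit. */
static int model_task_sees_nofs(void)
{
	return (model_task_flags & MODEL_PF_MEMALLOC_NOFS) != 0;
}
```

Calling `model_run_sync_task(model_task_sees_nofs)` from a context that has not set the flag returns 1 and leaves `model_task_flags` at 0 afterwards; calling it from inside an outer save/restore pair leaves the outer scope's flag set.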

17 Feb, 2021 3 commits

SUNRPC: Further clean up svc_tcp_sendmsg() · 4d12b727

Authored by Chuck Lever on Feb 16, 2021

Clean up: The msghdr is no longer needed in the caller.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove redundant socket flags from svc_tcp_sendmsg() · 987c7b1d

Authored by Trond Myklebust on Feb 16, 2021

Now that the caller controls the TCP_CORK socket option, it is redundant
to set MSG_MORE and MSG_SENDPAGE_NOTLAST in the calls to
kernel_sendpage().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Use TCP_CORK to optimise send performance on the server · e0a912e8

Authored by Trond Myklebust on Feb 16, 2021

Use a counter to keep track of how many requests are queued behind the
xprt->xpt_mutex, and keep TCP_CORK set until the queue is empty.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Link: https://lore.kernel.org/linux-nfs/20210213202532.23146-1-trondmy@kernel.org/T/#u
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
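
The counter-based corking described above can be sketched as a small single-threaded model. Everything here is hypothetical: `struct model_sock` is not the kernel's socket structure, `corked` merely stands in for the TCP_CORK option, and a real implementation would need an atomic counter plus the transport mutex. The point is only that the cork is released by whichever sender finds the queue empty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a server socket; not the kernel's data structure. */
struct model_sock {
	int sendqlen;	/* senders that have queued but not yet finished */
	bool corked;	/* stands in for the TCP_CORK socket option */
	int flushes;	/* number of times the cork was released */
};

/* A sender announces itself before it waits for the transport mutex. */
static void model_send_enqueue(struct model_sock *sk)
{
	sk->sendqlen++;
}

/* Called with the (modelled) mutex held: cork, send, uncork if last. */
static void model_send_locked(struct model_sock *sk)
{
	assert(sk->sendqlen > 0);
	sk->corked = true;
	/* ... the reply would be written to the socket here ... */
	if (--sk->sendqlen == 0) {
		sk->corked = false;	/* queue drained: let TCP flush */
		sk->flushes++;
	}
}
```

With three senders queued, the first two calls to `model_send_locked()` leave the socket corked and only the third releases it, so the three replies can be coalesced into fewer segments.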

15 Feb, 2021 1 commit

svcrdma: Hold private mutex while invoking rdma_accept() · 0ac24c32

Authored by Chuck Lever on Feb 09, 2021

RDMA core mutex locking was restructured by commit d114c6fe
("RDMA/cma: Add missing locking to rdma_accept()") [Aug 2020]. When
lock debugging is enabled, the RPC/RDMA server trips over the new
lockdep assertion in rdma_accept() because it doesn't call
rdma_accept() from its CM event handler.

As a temporary fix, have svc_rdma_accept() take the handler_mutex
explicitly. In the meantime, let's consider how to restructure the
RPC/RDMA transport to invoke rdma_accept() from the proper context.

Calls to svc_rdma_accept() are serialized with calls to
svc_rdma_free() by the generic RPC server layer.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/linux-rdma/20210209154014.GO4247@nvidia.com/
Fixes: d114c6fe ("RDMA/cma: Add missing locking to rdma_accept()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

06 Feb, 2021 6 commits

xprtrdma: Clean up rpcrdma_prepare_readch() · 586a0787

Authored by Chuck Lever on Feb 05, 2021

Since commit 9ed5af26 ("SUNRPC: Clean up the handling of page
padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS client
passes payload data to the transport with the padding in xdr->pages
instead of in the send buffer's tail kvec. There's no need for the
extra logic to advance the base of the tail kvec because the upper
layer no longer places XDR padding there.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Pad optimization, revisited · 2324fbed

Authored by Chuck Lever on Feb 04, 2021

The NetApp Linux team discovered that with NFS/RDMA servers that do
not support RFC 8797, the Linux client is forming NFSv4.x WRITE
requests incorrectly.

In this case, the Linux NFS client disables implicit chunk round-up
for odd-length Read and Write chunks. The goal was to support old
servers that needed that padding to be sent explicitly by clients.

In that case the Linux NFS client included the tail kvec in the Read
chunk, since the tail contains any needed padding. That meant a separate
memory registration was needed for the tail kvec, adding to the cost
of forming such requests. To avoid that cost for a mere 3 bytes of
zeroes that are always ignored by receivers, we try to use implicit
round-up when possible.

For NFSv4.x, the tail kvec also sometimes contains a trailing
GETATTR operation. The Linux NFS client unintentionally includes
that GETATTR operation in the Read chunk as well as inline.

The fix is simply to /never/ include the tail kvec when forming a
data payload Read chunk. The padding is thus now always present.

Note that since commit 9ed5af26 ("SUNRPC: Clean up the handling
of page padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS
client passes payload data to the transport with the padding in
xdr->pages instead of in the send buffer's tail kvec. So now the
Linux NFS client appends XDR padding to all odd-sized Read chunks.
This shouldn't be a problem because:

 - RFC 8166-compliant servers are supposed to work with or without
   that XDR padding in Read chunks.

 - Since the padding is now in the same memory region as the data
   payload, a separate memory registration is not needed. In
   addition, the link layer extends data in RDMA Read responses to
   4-byte boundaries anyway. Thus there are now no savings when the
   padding is not included.

Because older kernels include the payload's XDR padding in the
tail kvec, a fix there will be more complicated. Thus backporting
this patch is not recommended.

Reported-by: Olga Kornievskaia <Olga.Kornievskaia@netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
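
The "mere 3 bytes of zeroes" figure follows from XDR's 4-byte alignment rule. As a minimal sketch of that arithmetic, assuming nothing about the actual transport code, the hypothetical helpers `model_xdr_pad_size()` and `model_padded_len()` below compute the padding for a payload and the resulting padded length.

```c
#include <assert.h>
#include <limits.h>

/* Bytes of zero padding needed to bring len up to a 4-byte XDR boundary. */
static unsigned int model_xdr_pad_size(unsigned int len)
{
	return (4u - (len & 3u)) & 3u;
}

/* Length of a payload once its XDR padding is included. */
static unsigned int model_padded_len(unsigned int len)
{
	assert(len <= UINT_MAX - 3u);	/* keep the sum from wrapping */
	return len + model_xdr_pad_size(len);
}
```

For example, a 1021-byte payload needs 3 pad bytes and occupies 1024 bytes, while a payload whose length is already a multiple of 4 needs none.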

rpcrdma: Fix comments about reverse-direction operation · 84dff5eb

Authored by Chuck Lever on Feb 04, 2021

During the final stages of publication of RFC 8167, reviewers
requested that we use the term "reverse direction" rather than
"backwards direction". Update comments to reflect this preference.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Refactor invocations of offset_in_page() · 67b16625

Authored by Chuck Lever on Feb 04, 2021

Clean up so that offset_in_page() is invoked less often in the
most common case, which is mapping xdr->pages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Simplify rpcrdma_convert_kvec() and frwr_map() · 54e6aec5

Authored by Chuck Lever on Feb 04, 2021

Clean up.

Remove a conditional branch from the SGL set-up loop in frwr_map():
Instead of using either sg_set_page() or sg_set_buf(), initialize
the mr_page field properly when rpcrdma_convert_kvec() converts the
kvec to an SGL entry. frwr_map() can then invoke sg_set_page()
unconditionally.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
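
The idea of recording a page and an in-page offset when the kvec is converted, so that the later mapping loop needs no special case, can be illustrated with plain address arithmetic. This is a user-space sketch under stated assumptions: `struct model_seg`, `model_convert_kvec()` and the 4096-byte `MODEL_PAGE_SIZE` are hypothetical, and the page is represented by its address rather than by a `struct page`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE ((uintptr_t)4096)

/* Hypothetical stand-in for one scatter/gather segment. */
struct model_seg {
	uintptr_t page;	/* address of the page that holds the buffer start */
	size_t offset;	/* offset of the buffer within that page */
	size_t len;	/* buffer length in bytes */
};

/* Describe a flat buffer as (page, offset, len), once, at conversion time. */
static void model_convert_kvec(const void *base, size_t len,
			       struct model_seg *seg)
{
	uintptr_t addr = (uintptr_t)base;

	assert(seg != NULL);
	seg->page = addr & ~(MODEL_PAGE_SIZE - 1);
	seg->offset = (size_t)(addr & (MODEL_PAGE_SIZE - 1));
	seg->len = len;
}
```

Because every segment then carries the same three fields regardless of whether it came from a page array or a kvec, a consumer can treat all segments uniformly.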

xprtrdma: Remove FMR support in rpcrdma_convert_iovs() · 9929f4ad

Authored by Chuck Lever on Feb 04, 2021

Support for FMR was removed by commit ba69cd12 ("xprtrdma:
Remove support for FMR memory registration") [Dec 2018]. That means
the buffer-splitting behavior of rpcrdma_convert_kvec(), added by
commit 821c791a ("xprtrdma: Segment head and tail XDR buffers
on page boundaries") [Mar 2016], is no longer necessary. FRWR
memory registration handles this case with aplomb.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

02 Feb, 2021 1 commit

SUNRPC: Fix fall-through warnings for Clang · 93f479d3

Authored by Gustavo A. R. Silva on Nov 20, 2020

In preparation for enabling -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding break statements instead of letting the
code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
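
The kind of change this commit describes looks roughly like the following. The enum and handler are hypothetical and not taken from the SUNRPC sources; the sketch only shows a case body that would otherwise run into the next label being ended with an explicit `break`.

```c
#include <assert.h>

enum model_event { MODEL_EV_DATA, MODEL_EV_CLOSE, MODEL_EV_OTHER };

/* Returns how many bytes the (hypothetical) handler consumed. */
static int model_handle_event(enum model_event ev, int pending)
{
	int consumed = 0;

	assert(pending >= 0);
	switch (ev) {
	case MODEL_EV_DATA:
		consumed = pending;
		break;
	case MODEL_EV_CLOSE:
		if (pending)
			consumed = 1;
		break;	/* explicit, instead of falling into default */
	default:
		break;
	}
	return consumed;
}
```

Behaviour is unchanged by the added `break`, since the following label only breaks as well; the statement exists to make the intent visible to the compiler and to readers.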

01 Feb, 2021 3 commits

net: sunrpc: xprtsock.c: Corrected few spellings ,in comments · 12b20ce3

Authored by Bhaskar Chowdhury on Nov 09, 2020

A few trivial and rudimentary spelling corrections.
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: correct error code comment in xs_tcp_setup_socket() · 8c71139d

Authored by Calum Mackay on Oct 24, 2020

This comment was introduced by commit 6ea44adc
("SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()").

I believe EIO was a typo at the time: it should have been EAGAIN.

Subsequently, commit 0445f92c ("SUNRPC: Fix disconnection races")
changed that to ENOTCONN.

Rather than trying to keep the comment here in sync with the code in
xprt_force_disconnect(), make the point in a non-specific way.

Fixes: 6ea44adc ("SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()")
Signed-off-by: Calum Mackay <calum.mackay@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Fix NFS READs that start at non-page-aligned offsets · bad4c6eb

Authored by Chuck Lever on Jan 31, 2021

Anj Duvnjak reports that the Kodi.tv NFS client is not able to read
video files from a v5.10.11 Linux NFS server.

The new sendpage-based TCP sendto logic was not attentive to non-
zero page_base values. nfsd_splice_read() sets that field when a
READ payload starts in the middle of a page.

The Linux NFS client rarely emits an NFS READ that is not page-
aligned. All of my testing so far has been with Linux clients, so I
missed this one.
Reported-by: A. Duvnjak <avian@extremenerds.net>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211471
Fixes: 4a85a6a3 ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: A. Duvnjak <avian@extremenerds.net>
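
To illustrate why a non-zero page_base matters, the sketch below splits a payload that starts part-way into its first page into per-page (offset, length) pieces. It is a user-space model, not the server's send path: `struct model_chunk`, `model_split_pages()` and the 4096-byte `MODEL_PAGE_SIZE` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u

struct model_chunk {
	size_t page_index;	/* which page of the array */
	size_t offset;		/* byte offset within that page */
	size_t len;		/* bytes to send from that page */
};

/*
 * Split a payload of page_len bytes that starts page_base bytes into the
 * page array. Returns the number of chunks written to out (at most max).
 */
static size_t model_split_pages(size_t page_base, size_t page_len,
				struct model_chunk *out, size_t max)
{
	size_t index = page_base / MODEL_PAGE_SIZE;
	size_t offset = page_base % MODEL_PAGE_SIZE;
	size_t n = 0;

	assert(out != NULL);
	while (page_len > 0 && n < max) {
		size_t len = MODEL_PAGE_SIZE - offset;

		if (len > page_len)
			len = page_len;
		out[n].page_index = index++;
		out[n].offset = offset;
		out[n].len = len;
		n++;
		page_len -= len;
		offset = 0;	/* only the first page starts mid-page */
	}
	return n;
}
```

A sender that ignored `page_base` would start every page at offset 0 and transmit the wrong bytes for the first page, which matches the symptom of a client that issues reads at unaligned offsets seeing corrupted data.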

26 Jan, 2021 2 commits

SUNRPC: Handle 0 length opaque XDR object data properly · e4a7d1f7

Authored by Dave Wysochanski on Jan 21, 2021

When handling an auth_gss downcall, it's possible to get a 0-length
opaque object for the acceptor.  In the case of a 0-length XDR
object, make sure simple_get_netobj() fills in dest->data = NULL,
and does not go on to call kmemdup(), which would set
dest->data = ZERO_SIZE_PTR for the acceptor.

The trace event code can handle NULL but not ZERO_SIZE_PTR for a
string, and so without this patch the rpcgss_context trace event
will crash the kernel as follows:

[  162.887992] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  162.898693] #PF: supervisor read access in kernel mode
[  162.900830] #PF: error_code(0x0000) - not-present page
[  162.902940] PGD 0 P4D 0
[  162.904027] Oops: 0000 [#1] SMP PTI
[  162.905493] CPU: 4 PID: 4321 Comm: rpc.gssd Kdump: loaded Not tainted 5.10.0 #133
[  162.908548] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  162.910978] RIP: 0010:strlen+0x0/0x20
[  162.912505] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[  162.920101] RSP: 0018:ffffaec900c77d90 EFLAGS: 00010202
[  162.922263] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000fffde697
[  162.925158] RDX: 000000000000002f RSI: 0000000000000080 RDI: 0000000000000010
[  162.928073] RBP: 0000000000000010 R08: 0000000000000e10 R09: 0000000000000000
[  162.930976] R10: ffff8e698a590cb8 R11: 0000000000000001 R12: 0000000000000e10
[  162.933883] R13: 00000000fffde697 R14: 000000010034d517 R15: 0000000000070028
[  162.936777] FS:  00007f1e1eb93700(0000) GS:ffff8e6ab7d00000(0000) knlGS:0000000000000000
[  162.940067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  162.942417] CR2: 0000000000000010 CR3: 0000000104eba000 CR4: 00000000000406e0
[  162.945300] Call Trace:
[  162.946428]  trace_event_raw_event_rpcgss_context+0x84/0x140 [auth_rpcgss]
[  162.949308]  ? __kmalloc_track_caller+0x35/0x5a0
[  162.951224]  ? gss_pipe_downcall+0x3a3/0x6a0 [auth_rpcgss]
[  162.953484]  gss_pipe_downcall+0x585/0x6a0 [auth_rpcgss]
[  162.955953]  rpc_pipe_write+0x58/0x70 [sunrpc]
[  162.957849]  vfs_write+0xcb/0x2c0
[  162.959264]  ksys_write+0x68/0xe0
[  162.960706]  do_syscall_64+0x33/0x40
[  162.962238]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  162.964346] RIP: 0033:0x7f1e1f1e57df
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
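
The zero-length handling described in this commit can be sketched in user space as follows. This is an analogue, not the kernel helper: `struct model_netobj` and `model_get_netobj()` are hypothetical, `malloc()` stands in for `kmemdup()`, errors are reported as NULL rather than as error pointers, and the length prefix is assumed to be a host-endian `unsigned int`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space mirror of a length-plus-data opaque object. */
struct model_netobj {
	unsigned int len;
	unsigned char *data;
};

/*
 * Parse a length-prefixed opaque object from [p, end). Returns the position
 * after the object, or NULL on a short buffer or allocation failure.
 * A zero-length object yields data == NULL and never calls the allocator.
 */
static const unsigned char *model_get_netobj(const unsigned char *p,
					     const unsigned char *end,
					     struct model_netobj *dest)
{
	unsigned int len;

	assert(p != NULL && end != NULL && dest != NULL && p <= end);
	if ((size_t)(end - p) < sizeof(len))
		return NULL;
	memcpy(&len, p, sizeof(len));
	p += sizeof(len);
	if ((size_t)(end - p) < len)
		return NULL;
	if (len) {
		dest->data = malloc(len);
		if (dest->data == NULL)
			return NULL;
		memcpy(dest->data, p, len);
	} else {
		dest->data = NULL;	/* explicit NULL for empty objects */
	}
	dest->len = len;
	return p + len;
}
```

Consumers that treat `data == NULL` as "no string" can then handle an empty acceptor name without dereferencing a sentinel pointer.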

SUNRPC: Move simple_get_bytes and simple_get_netobj into private header · ba6dfce4

Authored by Dave Wysochanski on Jan 21, 2021

Remove duplicated helper functions to parse opaque XDR objects
and place them inside the new file net/sunrpc/auth_gss/auth_gss_internal.h.
In the new file, carry the license and copyright from the source file
net/sunrpc/auth_gss/auth_gss.c.  Finally, update the comment inside
include/linux/sunrpc/xdr.h since lockd is not the only user of
struct xdr_netobj.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

25 Jan, 2021 8 commits

SUNRPC: Correct a comment · 4ff923ce

Authored by Chuck Lever on Jan 03, 2021

Clean up: The rq_argpages field was removed from struct svc_rqst in
the pre-git era.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom() · dd2d055b

Authored by Chuck Lever on Dec 30, 2020

The Receive completion handler doesn't look at the contents of the
Receive buffer. The DMA sync isn't terribly expensive but it's one
less thing that needs to be done by the Receive completion handler,
which is single-threaded (per svc_xprt). This helps scalability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

svcrdma: Reduce Receive doorbell rate · 43042b90

Authored by Chuck Lever on Dec 08, 2020

This is similar to commit e340c2d6 ("xprtrdma: Reduce the
doorbell rate (Receive)") which added Receive batching to the
client.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

svcrdma: Deprecate stat variables that are no longer used · c6226ff9