- 26 4月, 2021 19 次提交
-
-
由 Chuck Lever 提交于
Clean up: Move more of struct rpcrdma_frwr into its parent. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up. - Simplify variable initialization in the completion handlers. - Move another field out of struct rpcrdma_frwr. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up (for several purposes): - The MR's cid is initialized sooner so that tracepoints can show something reasonable even if the MR is never posted. - The MR's res.id doesn't change so the cid won't change either. Initializing the cid once is sufficient. - struct rpcrdma_frwr is going away soon. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up: The handler only recorded a trace event. If indeed no action is needed by the RPC/RDMA consumer, then the event can be ignored. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
The Send signaling logic is a little subtle, so add some observability around it. For every xprtrdma_mr_fastreg event, there should be an xprtrdma_mr_localinv or xprtrdma_mr_reminv event. When these tracepoints are enabled, we can see exactly when an MR is DMA-mapped, registered, invalidated (either locally or remotely) and then DMA-unmapped. kworker/u25:2-190 [000] 787.979512: xprtrdma_mr_map: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE) kworker/u25:2-190 [000] 787.979515: xprtrdma_chunk_read: task:351@5 pos=148 5608@0x8679e0c8f6f56000:0x00000503 (last) kworker/u25:2-190 [000] 787.979519: xprtrdma_marshal: task:351@5 xid=0x8679e0c8: hdr=52 xdr=148/5608/0 read list/inline kworker/u25:2-190 [000] 787.979525: xprtrdma_mr_fastreg: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE) kworker/u25:2-190 [000] 787.979526: xprtrdma_post_send: task:351@5 cq.id=0 cid=73 (2 SGEs) ... kworker/5:1H-219 [005] 787.980567: xprtrdma_wc_receive: cq.id=1 cid=161 status=SUCCESS (0/0x0) received=164 kworker/5:1H-219 [005] 787.980571: xprtrdma_post_recvs: peer=[192.168.100.55]:20049 r_xprt=0xffff8884974d4000: 0 new recvs, 70 active (rc 0) kworker/5:1H-219 [005] 787.980573: xprtrdma_reply: task:351@5 xid=0x8679e0c8 credits=64 kworker/5:1H-219 [005] 787.980576: xprtrdma_mr_reminv: task:351@5 mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE) kworker/5:1H-219 [005] 787.980577: xprtrdma_mr_unmap: mr.id=4 nents=2 5608@0x8679e0c8f6f56000:0x00000503 (TO_DEVICE) Note that I've moved the xprtrdma_post_send tracepoint so that event always appears after the xprtrdma_mr_fastreg tracepoint. Otherwise the event log looks counterintuitive (FastReg is always supposed to happen before Send). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Send WRs can be signalled or unsignalled. A signalled Send WR always has a matching Send completion, while a unsignalled Send has a completion only if the Send WR fails. xprtrdma has a Send account mechanism that is designed to reduce the number of signalled Send WRs. This in turn mitigates the interrupt rate of the underlying device. RDMA consumers can't leave all Sends unsignaled, however, because providers rely on Send completions to maintain their Send Queue head and tail pointers. xprtrdma counts the number of unsignaled Send WRs that have been posted to ensure that Sends are signalled often enough to prevent the Send Queue from wrapping. This mechanism neglected to account for FastReg WRs, which are posted on the Send Queue but never signalled. As a result, the Send Queue wrapped on occasion, resulting in duplication completions of FastReg and LocalInv WRs. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Throw away any reply where the LocalInv flushes or could not be posted. The registered memory region is in an unknown state until the disconnect completes. rpcrdma_xprt_disconnect() will find and release the MR. No need to put it back on the MR free list in this case. The client retransmits pending RPC requests once it reestablishes a fresh connection, so a replacement reply should be forthcoming on the next connection instance. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Better not to touch MRs involved in a flush or post error until the Send and Receive Queues are drained and the transport is fully quiescent. Simply don't insert such MRs back onto the free list. They remain on mr_all and will be released when the connection is torn down. I had thought that recycling would prevent hardware resources from being tied up for a long time. However, since v5.7, a transport disconnect destroys the QP and other hardware-owned resources. The MRs get cleaned up nicely at that point. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up: The comment and the placement of the memory barrier is confusing. Humans want to read the function statements from head to tail. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up: To be consistent with other functions in this source file, follow the naming convention of putting the object being acted upon before the action itself. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
The rpcrdma_mr_pop() earlier in the function has already cleared out mr_list, so it must not be done again in the error path. Fixes: 84756894 ("xprtrdma: Remove fr_state") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Clean up: The name recv_buffer_put() is a vestige of older code, and the function is just a wrapper for the newer rpcrdma_rep_put(). In most of the existing call sites, a pointer to the owning rpcrdma_buffer is already available. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
After a reconnect, the reply handler is opening the cwnd (and thus enabling more RPC Calls to be sent) /before/ rpcrdma_post_recvs() can post enough Receive WRs to receive their replies. This causes an RNR and the new connection is lost immediately. The race is most clearly exposed when KASAN and disconnect injection are enabled. This slows down rpcrdma_rep_create() enough to allow the send side to post a bunch of RPC Calls before the Receive completion handler can invoke ib_post_recv(). Fixes: 2ae50ad6 ("xprtrdma: Close window between waking RPC senders and posting Receives") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Defensive clean up: Protect the rb_all_reps list during rep creation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Currently rpcrdma_reps_destroy() assumes that, at transport tear-down, the content of the rb_free_reps list is the same as the content of the rb_all_reps list. Although that is usually true, using the rb_all_reps list should be more reliable because of the way it's managed. And, rpcrdma_reps_unmap() uses rb_all_reps; these two functions should both traverse the "all" list. Ensure that all rpcrdma_reps are always destroyed whether they are on the rep free list or not. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Defer destruction of an rpcrdma_rep until transport tear-down to preserve the rb_all_reps list while Receives flush. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NTom Talpey <tom@talpey.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Currently the Receive completion handler refreshes the Receive Queue whenever a successful Receive completion occurs. On disconnect, xprtrdma drains the Receive Queue. The first few Receive completions after a disconnect are typically successful, until the first flushed Receive. This means the Receive completion handler continues to post more Receive WRs after the drain sentinel has been posted. The late- posted Receives flush after the drain sentinel has completed, leading to a crash later in rpcrdma_xprt_disconnect(). To prevent this crash, xprtrdma has to ensure that the Receive handler stops posting Receives before ib_drain_rq() posts its drain sentinel. Suggested-by: NTom Talpey <tom@talpey.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
Commit e340c2d6 ("xprtrdma: Reduce the doorbell rate (Receive)") increased the number of Receive WRs that are posted by the client, but did not increase the size of the Receive Queue allocated during transport set-up. This is usually not an issue because RPCRDMA_BACKWARD_WRS is defined as (32) when SUNRPC_BACKCHANNEL is defined. In cases where it isn't, there is a real risk of Receive Queue wrapping. Fixes: e340c2d6 ("xprtrdma: Reduce the doorbell rate (Receive)") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NTom Talpey <tom@talpey.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 14 4月, 2021 4 次提交
-
-
由 Chris Dion 提交于
Currently if a major timeout value is reached, but the minor value has not been reached, an ETIMEOUT will not be sent back to the caller. This can occur if the v4 server is not responding to requests and retrans is configured larger than the default of two. For example, A TCP mount with a configured timeout value of 50 and a retransmission count of 3 to a v4 server which is not responding: 1. Initial value and increment set to 5s, maxval set to 20s, retries at 3 2. Major timeout is set to 20s, minor timeout set to 5s initially 3. xport_adjust_timeout() is called after 5s, retry with 10s timeout, minor timeout is bumped to 10s 4. And again after another 10s, 15s total time with minor timeout set to 15s 5. After 20s total time xport_adjust_timeout is called as major timeout is reached, but skipped because the minor timeout is not reached - After this time the cpu spins continually calling xport_adjust_timeout() and returning 0 for 10 seconds. As seen on perf sched: 39243.913182 [0005] mount.nfs[3794] 4607.938 0.017 9746.863 6. This continues until the 15s minor timeout condition is reached (in this case for 10 seconds). After which the ETIMEOUT is processed back to the caller, the cpu spinning stops, and normal operations continue Fixes: 7de62bc0 ("SUNRPC dont update timeout value on connection reset") Signed-off-by: NChris Dion <Christopher.Dion@dell.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
This tracepoint can crash when dereferencing snd_task because when some transports connect, they put a cookie in that field instead of a pointer to an rpc_task. BUG: KASAN: use-after-free in trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc] Read of size 2 at addr ffff8881a83bd3a0 by task git/331872 CPU: 11 PID: 331872 Comm: git Tainted: G S 5.12.0-rc2-00007-g3ab6e585a7f9 #1453 Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015 Call Trace: dump_stack+0x9c/0xcf print_address_description.constprop.0+0x18/0x239 kasan_report+0x174/0x1b0 trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc] xprt_prepare_transmit+0x8e/0xc1 [sunrpc] call_transmit+0x4d/0xc6 [sunrpc] Fixes: 9ce07ae5 ("SUNRPC: Replace dprintk() call site in xprt_prepare_transmit") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
A separate tracepoint can be left enabled all the time to capture rare but important retransmission events. So for example: kworker/u26:3-568 [009] 156.967933: xprt_retransmit: task:44093@5 xid=0xa25dbc79 nfsv3 WRITE ntrans=2 Or, for example, enable all nfs and nfs4 tracepoints, and set up a trigger to disable tracing when xprt_retransmit fires to capture everything that leads up to it. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Chuck Lever 提交于
I've hit some crashes that occur in the xprt_rdma_inject_disconnect path. It appears that, for some provides, rdma_disconnect() can take so long that the transport can disconnect and release its hardware resources while rdma_disconnect() is still running, resulting in a UAF in the provider. The transport's fault injection method may depend on the stability of transport data structures. That means it needs to be invoked only from contexts that hold the transport write lock. Fixes: 4a068258 ("SUNRPC: Transport fault injection") Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 05 4月, 2021 3 次提交
-
-
由 Benjamin Coddington 提交于
If the server sends CB_ calls on a connection that is not associated with the backchannel, refuse to process the call and shut down the connection. This avoids a NULL dereference crash in xprt_complete_bc_request(). There's not much more we can do in this situation unless we want to look into allowing all connections to be associated with the fore and back channel. Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Eryu Guan 提交于
Currently rpcbind client is created without setting rpc timeout (thus using the default value). But if the rpc_task already has a customized timeout in its tk_client field, it's also ignored. Let's use the same timeout setting in rpc_task->tk_client->cl_timeout for rpcbind connection. Signed-off-by: NEryu Guan <eguan@linux.alibaba.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
When we have multiple RPC requests queued up, it makes sense to set the TCP_CORK option while the transmit queue is non-empty. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 12 3月, 2021 1 次提交
-
-
由 Chuck Lever 提交于
I tested commit 43042b90 ("svcrdma: Reduce Receive doorbell rate") with mlx4 (IB) and software iWARP and didn't find any issues. However, I recently got my hardware iWARP setup back on line (FastLinQ) and it's crashing hard on this commit (confirmed via bisect). The failure mode is complex. - After a connection is established, the first Receive completes normally. - But the second and third Receives have garbage in their Receive buffers. The server responds with ERR_VERS as a result. - When the client tears down the connection to retry, a couple of posted Receives flush twice, and that corrupts the recv_ctxt free list. - __svc_rdma_free then faults or loops infinitely while destroying the xprt's recv_ctxts. Since 43042b90 ("svcrdma: Reduce Receive doorbell rate") does not fix a bug but is a scalability enhancement, it's safe and appropriate to revert it while working on a replacement. Fixes: 43042b90 ("svcrdma: Reduce Receive doorbell rate") Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 09 3月, 2021 1 次提交
-
-
由 Benjamin Coddington 提交于
We could recurse into NFS doing memory reclaim while sending a sync task, which might result in a deadlock. Set memalloc_nofs_save for sync task execution. Fixes: a1231fda ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs") Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 07 3月, 2021 4 次提交
-
-
由 J. Bruce Fields 提交于
I think this is unlikely but possible: svc_authenticate sets rq_authop and calls svcauth_gss_accept. The kmalloc(sizeof(*svcdata), GFP_KERNEL) fails, leaving rq_auth_data NULL, and returning SVC_DENIED. This causes svc_process_common to go to err_bad_auth, and eventually call svc_authorise. That calls ->release == svcauth_gss_release, which tries to dereference rq_auth_data. Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Link: https://lore.kernel.org/linux-nfs/3F1B347F-B809-478F-A1E9-0BE98E22B0F0@oracle.com/T/#tSigned-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Daniel Kobras 提交于
If an auth module's accept op returns SVC_CLOSE, svc_process_common() enters a call path that does not call svc_authorise() before leaving the function, and thus leaks a reference on the auth module's refcount. Hence, make sure calls to svc_authenticate() and svc_authorise() are paired for all call paths, to make sure rpc auth modules can be unloaded. Signed-off-by: NDaniel Kobras <kobras@puzzle-itc.de> Fixes: 4d712ef1 ("svcauth_gss: Close connection when dropping an incoming message") Link: https://lore.kernel.org/linux-nfs/3F1B347F-B809-478F-A1E9-0BE98E22B0F0@oracle.com/T/#tSigned-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Joe Korty 提交于
[ This problem is in mainline, but only rt has the chops to be able to detect it. ] Lockdep reports a circular lock dependency between serv->sv_lock and softirq_ctl.lock on system shutdown, when using a kernel built with CONFIG_PREEMPT_RT=y, and a nfs mount exists. This is due to the definition of spin_lock_bh on rt: local_bh_disable(); rt_spin_lock(lock); which forces a softirq_ctl.lock -> serv->sv_lock dependency. This is not a problem as long as _every_ lock of serv->sv_lock is a: spin_lock_bh(&serv->sv_lock); but there is one of the form: spin_lock(&serv->sv_lock); This is what is causing the circular dependency splat. The spin_lock() grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable. If later on in the critical region, someone does a local_bh_disable, we get a serv->sv_lock -> softirq_ctrl.lock dependency established. Deadlock. Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no exceptions. [ OK ] Stopped target NFS client services. Stopping Logout off all iSCSI sessions on shutdown... Stopping NFS server and services... [ 109.442380] [ 109.442385] ====================================================== [ 109.442386] WARNING: possible circular locking dependency detected [ 109.442387] 5.10.16-rt30 #1 Not tainted [ 109.442389] ------------------------------------------------------ [ 109.442390] nfsd/1032 is trying to acquire lock: [ 109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270 [ 109.442405] [ 109.442405] but task is already holding lock: [ 109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90 [ 109.442415] [ 109.442415] which lock already depends on the new lock. [ 109.442415] [ 109.442416] [ 109.442416] the existing dependency chain (in reverse order) is: [ 109.442417] [ 109.442417] -> #1 (&serv->sv_lock){+.+.}-{0:0}: [ 109.442421] rt_spin_lock+0x2b/0xc0 [ 109.442428] svc_add_new_perm_xprt+0x42/0xa0 [ 109.442430] svc_addsock+0x135/0x220 [ 109.442434] write_ports+0x4b3/0x620 [ 109.442438] nfsctl_transaction_write+0x45/0x80 [ 109.442440] vfs_write+0xff/0x420 [ 109.442444] ksys_write+0x4f/0xc0 [ 109.442446] do_syscall_64+0x33/0x40 [ 109.442450] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 109.442454] [ 109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}: [ 109.442457] __lock_acquire+0x1264/0x20b0 [ 109.442463] lock_acquire+0xc2/0x400 [ 109.442466] rt_spin_lock+0x2b/0xc0 [ 109.442469] __local_bh_disable_ip+0xd9/0x270 [ 109.442471] svc_xprt_do_enqueue+0xc0/0x4d0 [ 109.442474] svc_close_list+0x60/0x90 [ 109.442476] svc_close_net+0x49/0x1a0 [ 109.442478] svc_shutdown_net+0x12/0x40 [ 109.442480] nfsd_destroy+0xc5/0x180 [ 109.442482] nfsd+0x1bc/0x270 [ 109.442483] kthread+0x194/0x1b0 [ 109.442487] ret_from_fork+0x22/0x30 [ 109.442492] [ 109.442492] other info that might help us debug this: [ 109.442492] [ 109.442493] Possible unsafe locking scenario: [ 109.442493] [ 109.442493] CPU0 CPU1 [ 109.442494] ---- ---- [ 109.442495] lock(&serv->sv_lock); [ 109.442496] lock((softirq_ctrl.lock).lock); [ 109.442498] lock(&serv->sv_lock); [ 109.442499] lock((softirq_ctrl.lock).lock); [ 109.442501] [ 109.442501] *** DEADLOCK *** [ 109.442501] [ 109.442501] 3 locks held by nfsd/1032: [ 109.442503] #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270 [ 109.442508] #1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90 [ 109.442512] #2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0 [ 109.442518] [ 109.442518] stack backtrace: [ 109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 #1 [ 109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015 [ 109.442524] Call Trace: [ 109.442527] dump_stack+0x77/0x97 [ 109.442533] check_noncircular+0xdc/0xf0 [ 109.442546] __lock_acquire+0x1264/0x20b0 [ 109.442553] lock_acquire+0xc2/0x400 [ 109.442564] rt_spin_lock+0x2b/0xc0 [ 109.442570] __local_bh_disable_ip+0xd9/0x270 [ 109.442573] svc_xprt_do_enqueue+0xc0/0x4d0 [ 109.442577] svc_close_list+0x60/0x90 [ 109.442581] svc_close_net+0x49/0x1a0 [ 109.442585] svc_shutdown_net+0x12/0x40 [ 109.442588] nfsd_destroy+0xc5/0x180 [ 109.442590] nfsd+0x1bc/0x270 [ 109.442595] kthread+0x194/0x1b0 [ 109.442600] ret_from_fork+0x22/0x30 [ 109.518225] nfsd: last server has exited, flushing export cache [ OK ] Stopped NFSv4 ID-name mapping service. [ OK ] Stopped GSSAPI Proxy Daemon. [ OK ] Stopped NFS Mount Daemon. [ OK ] Stopped NFS status monitor for NFSv2/3 locking.. Fixes: 719f8bcc ("svcrpc: fix xpt_list traversal locking on shutdown") Signed-off-by: NJoe Korty <joe.korty@concurrent-rt.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Timo Rothenpieler 提交于
This brings it in line with the regular tcp backchannel, which also has all those timeouts disabled. Prevents the backchannel from timing out, getting some async operations like server side copying getting stuck indefinitely on the client side. Signed-off-by: NTimo Rothenpieler <timo@rothenpieler.org> Fixes: 5d252f90 ("svcrdma: Add class for RDMA backwards direction transport") Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 17 2月, 2021 3 次提交
-
-
由 Chuck Lever 提交于
Clean up: The msghdr is no longer needed in the caller. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Trond Myklebust 提交于
Now that the caller controls the TCP_CORK socket option, it is redundant to set MSG_MORE and MSG_SENDPAGE_NOTLAST in the calls to kernel_sendpage(). Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Trond Myklebust 提交于
Use a counter to keep track of how many requests are queued behind the xprt->xpt_mutex, and keep TCP_CORK set until the queue is empty. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Link: https://lore.kernel.org/linux-nfs/20210213202532.23146-1-trondmy@kernel.org/T/#uSigned-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 15 2月, 2021 1 次提交
-
-
由 Chuck Lever 提交于
RDMA core mutex locking was restructured by commit d114c6fe ("RDMA/cma: Add missing locking to rdma_accept()") [Aug 2020]. When lock debugging is enabled, the RPC/RDMA server trips over the new lockdep assertion in rdma_accept() because it doesn't call rdma_accept() from its CM event handler. As a temporary fix, have svc_rdma_accept() take the handler_mutex explicitly. In the meantime, let's consider how to restructure the RPC/RDMA transport to invoke rdma_accept() from the proper context. Calls to svc_rdma_accept() are serialized with calls to svc_rdma_free() by the generic RPC server layer. Suggested-by: NJason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/linux-rdma/20210209154014.GO4247@nvidia.com/ Fixes: d114c6fe ("RDMA/cma: Add missing locking to rdma_accept()") Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 06 2月, 2021 4 次提交
-
-
由 Chuck Lever 提交于
Since commit 9ed5af26 ("SUNRPC: Clean up the handling of page padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS client passes payload data to the transport with the padding in xdr->pages instead of in the send buffer's tail kvec. There's no need for the extra logic to advance the base of the tail kvec because the upper layer no longer places XDR padding there. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
The NetApp Linux team discovered that with NFS/RDMA servers that do not support RFC 8797, the Linux client is forming NFSv4.x WRITE requests incorrectly. In this case, the Linux NFS client disables implicit chunk round-up for odd-length Read and Write chunks. The goal was to support old servers that needed that padding to be sent explicitly by clients. In that case the Linux NFS included the tail kvec in the Read chunk, since the tail contains any needed padding. That meant a separate memory registration is needed for the tail kvec, adding to the cost of forming such requests. To avoid that cost for a mere 3 bytes of zeroes that are always ignored by receivers, we try to use implicit roundup when possible. For NFSv4.x, the tail kvec also sometimes contains a trailing GETATTR operation. The Linux NFS client unintentionally includes that GETATTR operation in the Read chunk as well as inline. The fix is simply to /never/ include the tail kvec when forming a data payload Read chunk. The padding is thus now always present. Note that since commit 9ed5af26 ("SUNRPC: Clean up the handling of page padding in rpc_prepare_reply_pages()") [Dec 2020] the NFS client passes payload data to the transport with the padding in xdr->pages instead of in the send buffer's tail kvec. So now the Linux NFS client appends XDR padding to all odd-sized Read chunks. This shouldn't be a problem because: - RFC 8166-compliant servers are supposed to work with or without that XDR padding in Read chunks. - Since the padding is now in the same memory region as the data payload, a separate memory registration is not needed. In addition, the link layer extends data in RDMA Read responses to 4-byte boundaries anyway. Thus there is now no savings when the padding is not included. Because older kernels include the payload's XDR padding in the tail kvec, a fix there will be more complicated. Thus backporting this patch is not recommended. Reported by: Olga Kornievskaia <Olga.Kornievskaia@netapp.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NTom Talpey <tom@talpey.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
During the final stages of publication of RFC 8167, reviewers requested that we use the term "reverse direction" rather than "backwards direction". Update comments to reflect this preference. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NTom Talpey <tom@talpey.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Clean up so that offset_in_page() is invoked less often in the most common case, which is mapping xdr->pages. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NTom Talpey <tom@talpey.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-