- 23 3月, 2022 2 次提交
-
-
由 Trond Myklebust 提交于
As for rpc_malloc(), we first try allocating from the slab, then fall back to a non-waiting allocation from the mempool. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> -
由 Trond Myklebust 提交于
When in a low memory situation, we do want rpciod to kick off direct reclaim in the case where that helps, however we don't want it looping forever in mempool_alloc(). So first try allocating from the slab using GFP_KERNEL | __GFP_NORETRY, and then fall back to a GFP_NOWAIT allocation from the mempool. Ditto for rpc_alloc_task() Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 14 3月, 2022 3 次提交
-
-
由 NeilBrown 提交于
rpc tasks can be marked as RPC_TASK_SWAPPER. This causes GFP_MEMALLOC to be used for some allocations. This is needed in some cases, but not in all where it is currently provided, and in some where it isn't provided. Currently *all* tasks associated with a rpc_client on which swap is enabled get the flag and hence some GFP_MEMALLOC support. GFP_MEMALLOC is provided for ->buf_alloc() but only swap-writes need it. However xdr_alloc_bvec does not get GFP_MEMALLOC - though it often does need it. xdr_alloc_bvec is called while the XPRT_LOCK is held. If this blocks, then it blocks all other queued tasks. So this allocation needs GFP_MEMALLOC for *all* requests, not just writes, when the xprt is used for any swap writes. Similarly, if the transport is not connected, that will block all requests including swap writes, so memory allocations should get GFP_MEMALLOC if swap writes are possible. So with this patch: 1/ we ONLY set RPC_TASK_SWAPPER for swap writes. 2/ __rpc_execute() sets PF_MEMALLOC while handling any task with RPC_TASK_SWAPPER set, or when handling any task that holds the XPRT_LOCKED lock on an xprt used for swap. This removes the need for the RPC_IS_SWAPPER() test in ->buf_alloc handlers. 3/ xprt_prepare_transmit() sets PF_MEMALLOC after locking any task to a swapper xprt. __rpc_execute() will clear it. 3/ PF_MEMALLOC is set for all the connect workers. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> (for xprtrdma parts) Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> -
由 NeilBrown 提交于
Currently, tasks marked as "swapper" tasks get put to the front of non-priority rpc_queues, and are sorted earlier than non-swapper tasks on the transport's ->xmit_queue. This is pointless as currently *all* tasks for a mount that has swap enabled on *any* file are marked as "swapper" tasks. So the net result is that the non-priority rpc_queues are reverse-ordered (LIFO). This scheduling boost is not necessary to avoid deadlocks, and hurts fairness, so remove it. If there were a need to expedite some requests, the tk_priority mechanism is a more appropriate tool. Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 NeilBrown 提交于
When memory is short, new worker threads cannot be created and we depend on the minimum one rpciod thread to be able to handle everything. So it must not block waiting for memory. mempools are particularly a problem as memory can only be released back to the mempool by an async rpc task running. If all available workqueue threads are waiting on the mempool, no thread is available to return anything. rpc_malloc() can block, and this might cause deadlocks. So check RPC_IS_ASYNC(), rather than RPC_IS_SWAPPER() to determine if blocking is acceptable. Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 26 2月, 2022 1 次提交
-
-
由 Trond Myklebust 提交于
The sections which should not re-enter the filesystem are already protected with memalloc_nofs_save/restore calls, so it is better to use GFP_KERNEL in these calls to allow better performance for synchronous RPC calls. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 21 10月, 2021 1 次提交
-
-
由 Chuck Lever 提交于
Introduce a single tracepoint that can replace simple dprintk call sites in upper layer "rpc_call_done" callbacks. Example: kworker/u24:2-1254 [001] 771.026677: rpc_stats_latency: task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555 kworker/u24:2-1254 [001] 771.026677: rpc_task_call_done: task:00000001@00000002 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpcb_getport_done kworker/u24:2-1254 [001] 771.026678: rpcb_setport: task:00000001@00000002 status=0 port=20048 Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 10 10月, 2021 1 次提交
-
-
由 Chuck Lever 提交于
The current range of RPC task PIDs is 0..65535. That's not adequate for distinguishing tasks across multiple rpc_clnts running high throughput workloads. To help relieve this situation and to reduce the bottleneck of having a single atomic for assigning all RPC task PIDs, assign task PIDs per rpc_clnt. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 04 10月, 2021 2 次提交
-
-
由 Trond Myklebust 提交于
Don't let xprtiod pre-empt softirq. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> -
由 Trond Myklebust 提交于
Allow tasks that need to pre-empt rpciod/xprtiod to do so when it is safe. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 28 6月, 2021 2 次提交
-
-
由 Zhang Xiaoxu 提交于
When find a task from wait queue to wake up, a non-privileged task may be found out, rather than the privileged. This maybe lead a deadlock same as commit dfe1fe75 ("NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()"): Privileged delegreturn task is queued to privileged list because all the slots are assigned. If there has no enough slot to wake up the non-privileged batch tasks(session less than 8 slot), then the privileged delegreturn task maybe lost waked up because the found out task can't get slot since the session is on draining. So we should treate the privileged task as the emergency task, and execute it as for as we can. Reported-by: NHulk Robot <hulkci@huawei.com> Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode") Cc: stable@vger.kernel.org Signed-off-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Zhang Xiaoxu 提交于
The 'queue->nr' will wraparound from 0 to 255 when only current priority queue has tasks. This maybe lead a deadlock same as commit dfe1fe75 ("NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()"): Privileged delegreturn task is queued to privileged list because all the slots are assigned. When non-privileged task complete and release the slot, a non-privileged maybe picked out. It maybe allocate slot failed when the session on draining. If the 'queue->nr' has wraparound to 255, and no enough slot to service it, then the privileged delegreturn will lost to wake up. So we should avoid the wraparound on 'queue->nr'. Reported-by: NHulk Robot <hulkci@huawei.com> Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode") Fixes: 1da177e4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: NZhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 09 3月, 2021 1 次提交
-
-
由 Benjamin Coddington 提交于
We could recurse into NFS doing memory reclaim while sending a sync task, which might result in a deadlock. Set memalloc_nofs_save for sync task execution. Fixes: a1231fda ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs") Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 03 12月, 2020 1 次提交
-
-
由 Trond Myklebust 提交于
Currently, we wake up the tasks by priority queue ordering, which means that we ignore the batching that is supposed to help with QoS issues. Fixes: c049f8ea ("SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 21 9月, 2020 5 次提交
-
-
由 Chuck Lever 提交于
Clean up. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Remove redundant call sites or call sites that are already covered by tracepoints. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
Remove several redundant dprintk call sites, and replace a couple of potentially useful ones with tracepoints. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
These instruments don't appear to add any substantial value. We already have this at the termination of each RPC: iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58 iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384 iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384 Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> -
由 Chuck Lever 提交于
Introduce a tracepoint in call_allocate that reports the exact sizes in the RPC buffer allocation request and the status of the result. This helps catch problems with XDR buffer provisioning, and replaces transport-specific debugging instrumentation. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 05 4月, 2020 1 次提交
-
-
由 Trond Myklebust 提交于
Move the test for whether a task is already queued to prevent corruption of the timer list in __rpc_sleep_on_priority_timeout(). Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 16 3月, 2020 1 次提交
-
-
由 Trond Myklebust 提交于
Add a flag to signal to the RPC layer that the credential is already pinned for the duration of the RPC call. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 15 1月, 2020 1 次提交
-
-
由 Chuck Lever 提交于
Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 23 11月, 2019 1 次提交
-
-
由 Chuck Lever 提交于
RPC tasks on the backchannel never invoke xprt_complete_rqst(), so there is no way to report their tk_status at completion. Also, any RPC task that exits via rpc_exit_task() before it is replied to will also disappear without a trace. Introduce a trace point that is symmetrical with rpc_task_begin that captures the termination status of each RPC task. Sample trace output for callback requests initiated on the server: kworker/u8:12-448 [003] 127.025240: rpc_task_end: task:50@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task kworker/u8:12-448 [002] 127.567310: rpc_task_end: task:51@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task kworker/u8:12-448 [001] 130.506817: rpc_task_end: task:52@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task Odd, though, that I never see trace_rpc_task_complete, either in the forward or backchannel. Should it be removed? Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 06 11月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
Jon Hunter: "I have been tracking down another suspend/NFS related issue where again I am seeing random delays exiting suspend. The delays can be up to a couple minutes in the worst case and this is causing a suspend test we have to fail." Change the use of a deferrable work to a standard delayed one. Reported-by: NJon Hunter <jonathanh@nvidia.com> Tested-by: NJon Hunter <jonathanh@nvidia.com> Fixes: 7e0a0e38 ("SUNRPC: Replace the queue timer with a delayed work function") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 18 9月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we set task->tk_rpc_status for all RPC level errors so that the caller can distinguish between those and server reply status errors. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 20 8月, 2019 1 次提交
-
-
由 Chuck Lever 提交于
Clean up: commit c544577d ("SUNRPC: Clean up transport write space handling") appears to have removed the last caller of rpc_wake_up_queued_task_on_wq(). Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 13 7月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we do the required accounting for the round robin queue when the caller to rpc_init_task() has passed in a transport to be used. Reported-by: NOlga Kornievskaia <aglo@umich.edu> Reported-by: NNeil Brown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 09 7月, 2019 1 次提交
-
-
由 Chuck Lever 提交于
Adapt and apply changes that were made to the TCP socket connect code. See the following commits for details on the purpose of these changes: Commit 7196dbb0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Commit 3851f1cd ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout") Commit 02910177 ("SUNRPC: Fix reconnection timeouts") Some common transport code is moved to xprt.c to satisfy the code duplication police. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 07 7月, 2019 3 次提交
-
-
由 Dave Wysochanski 提交于
For diagnostic purposes, it would be useful to have an rpc_iostats metric of RPCs completing with tk_status < 0. Unfortunately, tk_status is reset inside the rpc_call_done functions for each operation, and the call to tally the per-op metrics comes after rpc_call_done. Refactor the call to rpc_count_iostat earlier in rpc_exit_task so we can count these RPCs completing in error. Signed-off-by: NDave Wysochanski <dwysocha@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> -
由 Trond Myklebust 提交于
The queue timer function, which walks the RPC queue in order to locate candidates for waking up is one of the current constraints against removing the bh-safe queue spin locks. Replace it with a delayed work queue, so that we can do the actual rpc task wake ups from an ordinary process context. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 22 6月, 2019 1 次提交
-
-
由 Anna Schumaker 提交于
Jon Hunter reports: "I have been noticing intermittent failures with a system suspend test on some of our machines that have a NFS mounted root file-system. Bisecting this issue points to your commit 43123581 ("SUNRPC: Declare RPC timers as TIMER_DEFERRABLE") and reverting this on top of v5.2-rc3 does appear to resolve the problem. The cause of the suspend failure appears to be a long delay observed sometimes when resuming from suspend, and this is causing our test to timeout." This reverts commit 43123581. Reported-by: NJon Hunter <jonathanh@nvidia.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 21 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 4月, 2019 7 次提交
-
-
由 Trond Myklebust 提交于
Don't wake idle CPUs only for the purpose of servicing an RPC queue timeout. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Simplify the setting of queue timeouts by using the timer_reduce() function. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Add a helper to ensure that debugfs and friends print out the correct current task timeout value. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Clean up the RPC task sleep interfaces by replacing the task->tk_timeout 'hidden parameter' to rpc_sleep_on() with a new function that takes an absolute timeout. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
None of the callers set the 'action' argument, so let's just remove it. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
rpc_sleep_on() does not need to set the task->tk_callback under the queue lock, so move that out. Also refactor the check for whether the task is active. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
The RPC_TASK_KILLED flag should really not be set from another context because it can clobber data in the struct task when task->tk_flags is changed non-atomically. Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add a function to set that flag and safely wake up the task. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-