Commits · 910ad38697d95bd32f45ba70fd6952f6c2956f28 · openeuler / Kernel

23 Mar, 2022 · 2 commits

NFS: Fix memory allocation in rpc_alloc_task() · 910ad386

Committed by Trond Myklebust on Mar 21, 2022

As for rpc_malloc(), we first try allocating from the slab, then fall
back to a non-waiting allocation from the mempool.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

910ad386

NFS: Fix memory allocation in rpc_malloc() · 33e5c765

Committed by Trond Myklebust on Mar 14, 2022

When in a low memory situation, we do want rpciod to kick off direct
reclaim in the case where that helps, however we don't want it looping
forever in mempool_alloc().
So first try allocating from the slab using GFP_KERNEL | __GFP_NORETRY,
and then fall back to a GFP_NOWAIT allocation from the mempool.

Ditto for rpc_alloc_task()
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

33e5c765
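
The fallback order described in the two commits above can be modelled in user space. This is a minimal sketch, not the kernel code: `reserve_pool`, `slab_try_alloc()`, `pool_alloc_nowait()`, `model_alloc()`, `model_free()` and the `slab_under_pressure` switch are hypothetical names, with `malloc()` standing in for the slab and a fixed array standing in for the mempool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_ELEMS 2
#define ELEM_SIZE  256

/* Hypothetical reserve pool standing in for a kernel mempool. */
struct reserve_pool {
	unsigned char storage[POOL_ELEMS][ELEM_SIZE];
	bool in_use[POOL_ELEMS];
};

/* Test switch: when true, the "slab" path fails, as it might under memory pressure. */
static bool slab_under_pressure;

/* Stand-in for a slab allocation that gives up instead of retrying forever. */
static void *slab_try_alloc(size_t size)
{
	if (slab_under_pressure)
		return NULL;
	return malloc(size);
}

/* Non-waiting pool allocation: return a free element or NULL, never sleep. */
static void *pool_alloc_nowait(struct reserve_pool *pool)
{
	for (int i = 0; i < POOL_ELEMS; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return pool->storage[i];
		}
	}
	return NULL;
}

/* Index of the pool element that p points at, or -1 if p is not from the pool. */
static int pool_index_of(const struct reserve_pool *pool, const void *p)
{
	for (int i = 0; i < POOL_ELEMS; i++)
		if (p == (const void *)pool->storage[i])
			return i;
	return -1;
}

/* Try the general allocator first, then fall back to the reserve without waiting. */
void *model_alloc(struct reserve_pool *pool, size_t size)
{
	void *p = slab_try_alloc(size);

	if (p)
		return p;
	if (size > ELEM_SIZE)
		return NULL;
	return pool_alloc_nowait(pool);
}

/* Return memory to wherever it came from. */
void model_free(struct reserve_pool *pool, void *p)
{
	int i = pool_index_of(pool, p);

	if (i >= 0)
		pool->in_use[i] = false;
	else
		free(p);
}
```

In this model the caller sees `NULL` once both tiers are exhausted, rather than sleeping inside the pool, which is the behaviour the commit messages ask for.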

14 Mar, 2022 · 3 commits

SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC · 8db55a03

Committed by NeilBrown on Mar 07, 2022

rpc tasks can be marked as RPC_TASK_SWAPPER.  This causes GFP_MEMALLOC
to be used for some allocations.  This is needed in some cases, but not
in all where it is currently provided, and in some where it isn't
provided.

Currently *all* tasks associated with a rpc_client on which swap is
enabled get the flag and hence some GFP_MEMALLOC support.

GFP_MEMALLOC is provided for ->buf_alloc() but only swap-writes need it.
However xdr_alloc_bvec does not get GFP_MEMALLOC - though it often does
need it.

xdr_alloc_bvec is called while the XPRT_LOCK is held.  If this blocks,
then it blocks all other queued tasks.  So this allocation needs
GFP_MEMALLOC for *all* requests, not just writes, when the xprt is used
for any swap writes.

Similarly, if the transport is not connected, that will block all
requests including swap writes, so memory allocations should get
GFP_MEMALLOC if swap writes are possible.

So with this patch:
 1/ we ONLY set RPC_TASK_SWAPPER for swap writes.
 2/ __rpc_execute() sets PF_MEMALLOC while handling any task
    with RPC_TASK_SWAPPER set, or when handling any task that
    holds the XPRT_LOCKED lock on an xprt used for swap.
    This removes the need for the RPC_IS_SWAPPER() test
    in ->buf_alloc handlers.
 3/ xprt_prepare_transmit() sets PF_MEMALLOC after locking
    any task to a swapper xprt.  __rpc_execute() will clear it.
 4/ PF_MEMALLOC is set for all the connect workers.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com> (for xprtrdma parts)
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

8db55a03
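
The rule in point 2/ of the commit above (run with PF_MEMALLOC for swap writes, and for whichever task holds the lock on a swap-enabled transport) can be written as a small predicate. This is a hedged user-space sketch: `model_xprt`, `model_task` and `task_needs_memalloc()` are hypothetical simplifications, not the kernel's structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a transport. */
struct model_xprt {
	bool used_for_swap;      /* at least one swap file lives on this transport */
	const void *lock_owner;  /* task currently holding the transport lock, or NULL */
};

/* Hypothetical, simplified stand-in for an RPC task. */
struct model_task {
	bool swap_write;         /* models RPC_TASK_SWAPPER, set only for swap writes */
	struct model_xprt *xprt; /* transport the task is bound to, may be NULL */
};

/* Should the executor run this task with access to the emergency reserves? */
bool task_needs_memalloc(const struct model_task *task)
{
	/* Swap writes always qualify. */
	if (task->swap_write)
		return true;

	/* Any task blocking a swap-enabled transport qualifies while it holds the lock. */
	if (task->xprt != NULL &&
	    task->xprt->used_for_swap &&
	    task->xprt->lock_owner == (const void *)task)
		return true;

	return false;
}
```

An ordinary read on a swap-enabled transport gets no special treatment in this model until it takes the transport lock, at which point stalling it would stall the swap writes queued behind it.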

SUNRPC: remove scheduling boost for "SWAPPER" tasks. · a80a8461

Committed by NeilBrown on Mar 07, 2022

Currently, tasks marked as "swapper" tasks get put to the front of
non-priority rpc_queues, and are sorted earlier than non-swapper tasks on
the transport's ->xmit_queue.

This is pointless as currently *all* tasks for a mount that has swap
enabled on *any* file are marked as "swapper" tasks.  So the net result
is that the non-priority rpc_queues are reverse-ordered (LIFO).

This scheduling boost is not necessary to avoid deadlocks, and hurts
fairness, so remove it.  If there were a need to expedite some requests,
the tk_priority mechanism is a more appropriate tool.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

a80a8461
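
The LIFO effect described above is easy to reproduce: if every task is "boosted" to the front, the queue simply runs in reverse arrival order. The sketch below uses a hypothetical fixed-size array queue (`model_queue`, `enqueue_front()`, `enqueue_back()`), not the kernel's list-based rpc_wait_queue.

```c
#include <stddef.h>

#define QUEUE_CAP 8

/* Hypothetical queue of task ids; index 0 is the next task to run. */
struct model_queue {
	int ids[QUEUE_CAP];
	size_t len;
};

/* "Boosted" insertion: the new task jumps to the front. Returns 0 on success. */
int enqueue_front(struct model_queue *q, int id)
{
	if (q->len == QUEUE_CAP)
		return -1;
	for (size_t i = q->len; i > 0; i--)
		q->ids[i] = q->ids[i - 1];
	q->ids[0] = id;
	q->len++;
	return 0;
}

/* Plain FIFO insertion: the new task waits its turn at the back. */
int enqueue_back(struct model_queue *q, int id)
{
	if (q->len == QUEUE_CAP)
		return -1;
	q->ids[q->len++] = id;
	return 0;
}
```

Queuing tasks 1, 2, 3 with `enqueue_front()` leaves them in the order 3, 2, 1, so the oldest request is served last; `enqueue_back()` preserves arrival order.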

SUNRPC/call_alloc: async tasks mustn't block waiting for memory · c487216b

Committed by NeilBrown on Mar 07, 2022

When memory is short, new worker threads cannot be created and we depend
on the minimum one rpciod thread to be able to handle everything.
So it must not block waiting for memory.

mempools are particularly a problem as memory can only be released back
to the mempool by an async rpc task running.  If all available
workqueue threads are waiting on the mempool, no thread is available to
return anything.

rpc_malloc() can block, and this might cause deadlocks.
So check RPC_IS_ASYNC(), rather than RPC_IS_SWAPPER() to determine if
blocking is acceptable.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

c487216b

26 Feb, 2022 · 1 commit

SUNRPC: Convert GFP_NOFS to GFP_KERNEL · 0adc8794

Committed by Trond Myklebust on Jan 29, 2022

The sections which should not re-enter the filesystem are already
protected with memalloc_nofs_save/restore calls, so it is better to use
GFP_KERNEL in these calls to allow better performance for synchronous
RPC calls.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

0adc8794
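
The reasoning above relies on a scoped flag: inside a save/restore pair, allocations are treated as if the filesystem bit had been cleared, whatever mask the caller passes. The following is a hedged user-space model of that idea only. The `MODEL_GFP_*` values, `model_nofs_save()`, `model_nofs_restore()` and `model_effective_gfp()` are hypothetical, and a single static variable stands in for per-task state.

```c
#include <stdbool.h>

/* Hypothetical flag bits, loosely modelled on the kernel's GFP masks. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)

/* Models the per-task "do not recurse into the filesystem" state. */
static bool nofs_scope_active;

/* Enter a no-FS scope; returns the previous state so scopes can nest. */
bool model_nofs_save(void)
{
	bool old = nofs_scope_active;

	nofs_scope_active = true;
	return old;
}

/* Leave a no-FS scope by restoring the state returned by model_nofs_save(). */
void model_nofs_restore(bool old)
{
	nofs_scope_active = old;
}

/* The mask an allocator would actually honour for a caller-supplied mask. */
unsigned int model_effective_gfp(unsigned int gfp)
{
	if (nofs_scope_active)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

With the scope doing the masking, call sites can pass the permissive mask everywhere: protected sections still avoid filesystem re-entry, and unprotected synchronous callers keep the full mask.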

21 Oct, 2021 · 1 commit

SUNRPC: Trace calls to .rpc_call_done · b40887e1

Committed by Chuck Lever on Oct 16, 2021

Introduce a single tracepoint that can replace simple dprintk call
sites in upper layer "rpc_call_done" callbacks. Example:

kworker/u24:2-1254 [001] 771.026677: rpc_stats_latency: task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555
kworker/u24:2-1254 [001] 771.026677: rpc_task_call_done: task:00000001@00000002 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpcb_getport_done
kworker/u24:2-1254 [001] 771.026678: rpcb_setport: task:00000001@00000002 status=0 port=20048
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

b40887e1

10 Oct, 2021 · 1 commit

SUNRPC: Per-rpc_clnt task PIDs · 0392dd51

Committed by Chuck Lever on Oct 04, 2021

The current range of RPC task PIDs is 0..65535. That's not adequate
for distinguishing tasks across multiple rpc_clnts running high
throughput workloads.

To help relieve this situation and to reduce the bottleneck of
having a single atomic for assigning all RPC task PIDs, assign task
PIDs per rpc_clnt.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

0392dd51
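
Moving the counter from one global atomic to one atomic per client can be sketched with C11 atomics. This is an illustration under assumptions, not the kernel implementation: `model_clnt`, `model_clnt_init()` and `model_clnt_assign_pid()` are hypothetical names, and the real rpc_clnt carries many more fields.

```c
#include <stdatomic.h>

/* Hypothetical client object holding its own task-id counter. */
struct model_clnt {
	atomic_uint next_pid;   /* per-client, so clients do not contend on one counter */
};

void model_clnt_init(struct model_clnt *clnt)
{
	atomic_init(&clnt->next_pid, 0);
}

/* Hand out the next task id for this client only; ids start at 1. */
unsigned int model_clnt_assign_pid(struct model_clnt *clnt)
{
	return atomic_fetch_add_explicit(&clnt->next_pid, 1,
					 memory_order_relaxed) + 1;
}
```

Task ids are then unique only within a client, so a task has to be identified by the pair of task id and client id.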

04 Oct, 2021 · 2 commits

SUNRPC: Remove WQ_HIGHPRI from xprtiod · 6dbcbe3f

Committed by Trond Myklebust on Jul 12, 2021

Don't let xprtiod pre-empt softirq.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

6dbcbe3f

SUNRPC: Add cond_resched() at the appropriate point in __rpc_execute() · 47dd8796

Committed by Trond Myklebust on Jul 12, 2021

Allow tasks that need to pre-empt rpciod/xprtiod to do so when it is
safe.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

47dd8796

28 Jun, 2021 · 2 commits

SUNRPC: Should wake up the privileged task firstly. · 5483b904

Committed by Zhang Xiaoxu on Jun 26, 2021

When picking a task from the wait queue to wake up, a non-privileged
task may be found rather than a privileged one. This may lead to a
deadlock like the one in commit dfe1fe75 ("NFSv4: Fix deadlock between
nfs4_evict_inode() and nfs4_opendata_get_inode()"):

A privileged delegreturn task is queued to the privileged list because
all the slots are assigned. If there are not enough slots to wake up the
non-privileged batch tasks (a session with fewer than 8 slots), then the
privileged delegreturn task may miss its wake-up, because the task that
was found cannot get a slot while the session is draining.

So we should treat the privileged task as an emergency task, and
execute it as far as we can.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

5483b904

SUNRPC: Fix the batch tasks count wraparound. · fcb170a9

Committed by Zhang Xiaoxu on Jun 26, 2021

The 'queue->nr' counter will wrap around from 0 to 255 when only the
current priority queue has tasks. This may lead to a deadlock like the
one in commit dfe1fe75 ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):

A privileged delegreturn task is queued to the privileged list because
all the slots are assigned. When a non-privileged task completes and
releases its slot, a non-privileged task may be picked. It may then fail
to allocate a slot while the session is draining.

If 'queue->nr' has wrapped around to 255 and there are not enough slots
to service it, then the privileged delegreturn task will miss its wake-up.

So we should avoid the wraparound on 'queue->nr'.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode")
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

fcb170a9
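
The wraparound itself is ordinary unsigned arithmetic and can be shown in isolation. The sketch below assumes, based on the 0 to 255 range quoted above, that the counter is an 8-bit unsigned value; `model_wait_queue` and both helper functions are hypothetical and only model the counter, not the queue.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical cut-down queue: only the batch counter from the commit message. */
struct model_wait_queue {
	unsigned char nr;   /* tasks left in the current batch */
};

/* Unguarded decrement: 0 wraps to UCHAR_MAX (255 where char is 8 bits). */
void batch_consume_unguarded(struct model_wait_queue *q)
{
	q->nr--;
}

/* Guarded decrement: refuses to go below zero and reports whether it consumed. */
bool batch_consume_guarded(struct model_wait_queue *q)
{
	if (q->nr == 0)
		return false;
	q->nr--;
	return true;
}
```

After a wrap, the counter claims that hundreds of tasks remain in the current batch, which is how the queue can keep favouring the wrong priority level.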

09 Mar, 2021 · 1 commit

SUNRPC: Set memalloc_nofs_save() for sync tasks · f0940f4b

Committed by Benjamin Coddington on Mar 03, 2021

We could recurse into NFS doing memory reclaim while sending a sync task,
which might result in a deadlock.  Set memalloc_nofs_save for sync task
execution.

Fixes: a1231fda ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

f0940f4b

03 Dec, 2020 · 1 commit

SUNRPC: rpc_wake_up() should wake up tasks in the correct order · e4c72201

Committed by Trond Myklebust on Oct 22, 2020

Currently, we wake up the tasks by priority queue ordering, which means
that we ignore the batching that is supposed to help with QoS issues.

Fixes: c049f8ea ("SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

e4c72201

21 Sep, 2020 · 5 commits

SUNRPC: Remove remaining dprintks from sched.c · 5589cc47

Committed by Chuck Lever on Jul 08, 2020

Clean up.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5589cc47

SUNRPC: Remove dprintk call sites in RPC queuing functions · 721a1d38

Committed by Chuck Lever on Jul 08, 2020

Remove redundant call sites or call sites that are already covered
by tracepoints.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

721a1d38

SUNRPC: Clean up RPC scheduler tracepoints · 1466c221

Committed by Chuck Lever on Jul 08, 2020

Remove several redundant dprintk call sites, and replace a couple of
potentially useful ones with tracepoints.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

1466c221

SUNRPC: Remove debugging instrumentation from xprt_release · 78069487

Committed by Chuck Lever on Jul 08, 2020

These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384
iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

78069487

SUNRPC: Hoist trace_xprtrdma_op_allocate into generic code · 06e234c6

Committed by Chuck Lever on Jul 08, 2020

Introduce a tracepoint in call_allocate that reports the exact
sizes in the RPC buffer allocation request and the status of the
result. This helps catch problems with XDR buffer provisioning,
and replaces transport-specific debugging instrumentation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

06e234c6

05 Apr, 2020 · 1 commit

SUNRPC: Don't start a timer on an already queued rpc task · 1fab7dc4

Committed by Trond Myklebust on Apr 04, 2020

Move the test for whether a task is already queued to prevent
corruption of the timer list in __rpc_sleep_on_priority_timeout().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

1fab7dc4

16 Mar, 2020 · 1 commit

SUNRPC: Add a flag to avoid reference counts on credentials · 7eac5264

Committed by Trond Myklebust on Feb 07, 2020

Add a flag to signal to the RPC layer that the credential is already
pinned for the duration of the RPC call.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7eac5264

15 Jan, 2020 · 1 commit

SUNRPC: Capture signalled RPC tasks · abf8af78

Committed by Chuck Lever on Dec 23, 2019

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

abf8af78

23 Nov, 2019 · 1 commit

SUNRPC: Capture completion of all RPC tasks · a264abad

Committed by Chuck Lever on Nov 20, 2019

RPC tasks on the backchannel never invoke xprt_complete_rqst(), so
there is no way to report their tk_status at completion. Also, any
RPC task that exits via rpc_exit_task() before it is replied to will
also disappear without a trace.

Introduce a trace point that is symmetrical with rpc_task_begin that
captures the termination status of each RPC task.

Odd, though, that I never see trace_rpc_task_complete, either in the
forward or backchannel. Should it be removed?
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

a264abad

06 Nov, 2019 · 1 commit

SUNRPC: Avoid RPC delays when exiting suspend · 66eb3add

Committed by Trond Myklebust on Nov 05, 2019

Jon Hunter: "I have been tracking down another suspend/NFS related
issue where again I am seeing random delays exiting suspend. The delays
can be up to a couple minutes in the worst case and this is causing a
suspend test we have to fail."

Change the use of a deferrable work to a standard delayed one.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 7e0a0e38 ("SUNRPC: Replace the queue timer with a delayed work function")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

66eb3add

18 Sep, 2019 · 1 commit

SUNRPC: RPC level errors should always set task->tk_rpc_status · 714fbc73

Committed by Trond Myklebust on Sep 12, 2019

Ensure that we set task->tk_rpc_status for all RPC level errors so that
the caller can distinguish between those and server reply status errors.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

714fbc73

20 Aug, 2019 · 1 commit

SUNRPC: Remove rpc_wake_up_queued_task_on_wq() · 691b45dd

Committed by Chuck Lever on Aug 19, 2019

Clean up: commit c544577d ("SUNRPC: Clean up transport write
space handling") appears to have removed the last caller of
rpc_wake_up_queued_task_on_wq().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

691b45dd

13 Jul, 2019 · 1 commit

SUNRPC: Fix transport accounting when caller specifies an rpc_xprt · a101b043

Committed by Trond Myklebust on Jul 11, 2019

Ensure that we do the required accounting for the round robin queue
when the caller to rpc_init_task() has passed in a transport to be
used.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Reported-by: Neil Brown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

a101b043

09 Jul, 2019 · 1 commit

xprtrdma: Modernize ops->connect · 675dd90a

Committed by Chuck Lever on Jun 19, 2019

Adapt and apply changes that were made to the TCP socket connect
code. See the following commits for details on the purpose of
these changes:

Commit 7196dbb0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly")
Commit 3851f1cd ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout")
Commit 02910177 ("SUNRPC: Fix reconnection timeouts")

Some common transport code is moved to xprt.c to satisfy the code
duplication police.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

675dd90a

07 Jul, 2019 · 3 commits

SUNRPC: Move call to rpc_count_iostats before rpc_call_done · 9dfe52a9

Committed by Dave Wysochanski on May 23, 2019

For diagnostic purposes, it would be useful to have an rpc_iostats
metric of RPCs completing with tk_status < 0. Unfortunately,
tk_status is reset inside the rpc_call_done functions for each
operation, and the call to tally the per-op metrics comes after
rpc_call_done. Refactor the call to rpc_count_iostat earlier in
rpc_exit_task so we can count these RPCs completing in error.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9dfe52a9
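
The ordering problem described above (the completion callback resets the status before the statistics are tallied) can be reproduced in a few lines. This is a hedged sketch with hypothetical names (`model_stats`, `stat_task`, `reset_status_done()`, `model_exit_task()`); it models only the order of the two steps.

```c
#include <stddef.h>

/* Hypothetical per-operation counters. */
struct model_stats {
	unsigned long ops;      /* all completed calls */
	unsigned long errors;   /* calls that completed with status < 0 */
};

/* Hypothetical task with a status and an optional completion callback. */
struct stat_task {
	int status;
	void (*call_done)(struct stat_task *task);
};

/* A completion callback that, like many real ones, clears the status. */
void reset_status_done(struct stat_task *task)
{
	task->status = 0;
}

static void count_iostats(struct model_stats *stats, const struct stat_task *task)
{
	stats->ops++;
	if (task->status < 0)
		stats->errors++;
}

/* Tally first, then run the callback, so the error is still visible when counted. */
void model_exit_task(struct model_stats *stats, struct stat_task *task)
{
	count_iostats(stats, task);
	if (task->call_done != NULL)
		task->call_done(task);
}
```

Swapping the two steps in `model_exit_task()` would leave `errors` at zero for every task whose callback resets the status.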

SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock · c049f8ea

Committed by Trond Myklebust on May 02, 2019

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

c049f8ea

SUNRPC: Replace the queue timer with a delayed work function · 7e0a0e38

Committed by Trond Myklebust on May 01, 2019

The queue timer function, which walks the RPC queue in order to locate
candidates for waking up is one of the current constraints against
removing the bh-safe queue spin locks. Replace it with a delayed
work queue, so that we can do the actual rpc task wake ups from an
ordinary process context.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7e0a0e38

22 Jun, 2019 · 1 commit

Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE" · 502980e8

Committed by Anna Schumaker on Jun 18, 2019

Jon Hunter reports:
  "I have been noticing intermittent failures with a system suspend test on
   some of our machines that have a NFS mounted root file-system. Bisecting
   this issue points to your commit 43123581 ("SUNRPC: Declare RPC
   timers as TIMER_DEFERRABLE") and reverting this on top of v5.2-rc3 does
   appear to resolve the problem.

   The cause of the suspend failure appears to be a long delay observed
   sometimes when resuming from suspend, and this is causing our test to
   timeout."

This reverts commit 43123581.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

502980e8

21 May, 2019 · 1 commit

treewide: Add SPDX license identifier for missed files · 457c8996

Committed by Thomas Gleixner on May 19, 2019

Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

457c8996

26 Apr, 2019 · 7 commits

SUNRPC: Declare RPC timers as TIMER_DEFERRABLE · 43123581

Committed by Trond Myklebust on Apr 07, 2019

Don't wake idle CPUs only for the purpose of servicing an RPC
queue timeout.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

43123581

SUNRPC: Simplify queue timeouts using timer_reduce() · 24a9d9a2

Committed by Trond Myklebust on Apr 07, 2019

Simplify the setting of queue timeouts by using the timer_reduce()
function.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

24a9d9a2
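
The "only ever move the expiry earlier" behaviour that makes timer_reduce() convenient for a queue timeout can be modelled in user space. This is a sketch under assumptions: `model_timer`, `tick_before()` and `model_timer_reduce()` are hypothetical, and the wrap-safe comparison borrows the idea behind the kernel's time_before() rather than its code.

```c
#include <stdbool.h>

/* Hypothetical timer: an absolute expiry in ticks plus a pending flag. */
struct model_timer {
	unsigned long expires;
	bool pending;
};

/*
 * Wrap-safe "a is earlier than b". The unsigned difference is reinterpreted
 * as signed, which assumes the usual two's-complement conversion.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Arm an idle timer, or pull an armed timer forward; never push it later. */
void model_timer_reduce(struct model_timer *t, unsigned long expires)
{
	if (!t->pending || tick_before(expires, t->expires)) {
		t->expires = expires;
		t->pending = true;
	}
}
```

Each task that goes to sleep can call `model_timer_reduce()` with its own deadline, and the timer ends up armed for the earliest one without the caller comparing deadlines itself.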

SUNRPC: Fix up tracking of timeouts · 5efd1876

Committed by Trond Myklebust on Apr 07, 2019

Add a helper to ensure that debugfs and friends print out the
correct current task timeout value.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

5efd1876

SUNRPC: Add function rpc_sleep_on_timeout() · 6b2e6856

Committed by Trond Myklebust on Apr 07, 2019

Clean up the RPC task sleep interfaces by replacing the task->tk_timeout
'hidden parameter' to rpc_sleep_on() with a new function that takes an
absolute timeout.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

6b2e6856

SUNRPC: Remove unused argument 'action' from rpc_sleep_on_priority() · 8357a9b6

Committed by Trond Myklebust on Apr 07, 2019

None of the callers set the 'action' argument, so let's just remove it.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

8357a9b6

SUNRPC: Refactor rpc_sleep_on() · 87150aae

Committed by Trond Myklebust on Apr 07, 2019

rpc_sleep_on() does not need to set the task->tk_callback under the
queue lock, so move that out.
Also refactor the check for whether the task is active.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

87150aae

SUNRPC: Fix up task signalling · ae67bd38

Committed by Trond Myklebust on Apr 07, 2019

The RPC_TASK_KILLED flag should really not be set from another context
because it can clobber data in the struct task when task->tk_flags is
changed non-atomically.
Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add
a function to set that flag and safely wake up the task.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

ae67bd38
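
The fix above separates a field that only its owner may modify from a field that any context may update atomically. A minimal C11 sketch of that split follows; `sig_task`, `MODEL_RUNSTATE_SIGNALLED`, `model_signal_task()` and `model_task_signalled()` are hypothetical names, and waking the task is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_RUNSTATE_SIGNALLED (1u << 0)   /* hypothetical bit name */

/* Hypothetical task with an owner-only field and a shared atomic field. */
struct sig_task {
	unsigned int flags;     /* plain read-modify-write, owner context only */
	atomic_uint runstate;   /* atomic read-modify-write, safe from any context */
};

/* Safe from any thread; returns true if this call was the one that set the bit. */
bool model_signal_task(struct sig_task *task)
{
	unsigned int old = atomic_fetch_or(&task->runstate,
					   MODEL_RUNSTATE_SIGNALLED);

	return (old & MODEL_RUNSTATE_SIGNALLED) == 0;
}

/* Has the task been signalled? */
bool model_task_signalled(struct sig_task *task)
{
	return (atomic_load(&task->runstate) & MODEL_RUNSTATE_SIGNALLED) != 0;
}
```

Because the signalling bit no longer lives in `flags`, a concurrent non-atomic update of `flags` by the owner cannot lose it, and the signaller cannot overwrite the owner's bits.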

openeuler / Kernel · synced successfully about 2 years ago