Commits · 09252177d5f924f404551b4b4eded5daa7f04a3a · openeuler / Kernel

14 Apr, 2021 4 commits

SUNRPC: Handle major timeout in xprt_adjust_timeout() · 09252177

Committed by Chris Dion on Apr 04, 2021

Currently, if a major timeout value is reached but the minor value has
not been reached, an ETIMEDOUT will not be sent back to the caller.
This can occur if the v4 server is not responding to requests and
retrans is configured larger than the default of two.

For example, a TCP mount with a configured timeout value of 50 and a
retransmission count of 3 to a v4 server which is not responding:

1. Initial value and increment set to 5s, maxval set to 20s, retries at 3
2. Major timeout is set to 20s, minor timeout set to 5s initially
3. xprt_adjust_timeout() is called after 5s, retry with 10s timeout,
   minor timeout is bumped to 10s
4. And again after another 10s, 15s total time with minor timeout set
   to 15s
5. After 20s total time xprt_adjust_timeout() is called as major timeout is
   reached, but skipped because the minor timeout is not reached
       - After this time the cpu spins continually calling
         xprt_adjust_timeout() and returning 0 for 10 seconds.
         As seen on perf sched:
         39243.913182 [0005]  mount.nfs[3794] 4607.938      0.017   9746.863
6. This continues until the 15s minor timeout condition is reached (in
   this case for 10 seconds). After which the ETIMEDOUT is processed
   back to the caller, the cpu spinning stops, and normal operations
   continue

Fixes: 7de62bc0 ("SUNRPC dont update timeout value on connection reset")
Signed-off-by: Chris Dion <Christopher.Dion@dell.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

09252177
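
The six steps in the commit message above can be replayed with a small model. This is a minimal sketch under stated assumptions, not the kernel code: `adjust_timeout`, `simulate`, and the `check_major_first` switch are hypothetical names, time is counted in whole seconds rather than jiffies, the backoff is linear, and the request is assumed to wake at whichever deadline comes first.

```python
ETIMEDOUT = 110  # Linux errno value for a timed-out operation


def adjust_timeout(now, st, check_major_first):
    """Toy model of the adjust step: return -ETIMEDOUT or 0, updating st."""
    if check_major_first and now >= st["major"]:
        return -ETIMEDOUT          # fixed ordering: major timeout wins
    if now < st["minor"]:
        return 0                   # minor timeout not reached: nothing changes
    if now >= st["major"]:
        return -ETIMEDOUT
    # Linear backoff, capped at maxval (assumed behaviour for this example).
    st["timeout"] = min(st["timeout"] + st["increment"], st["maxval"])
    st["minor"] = now + st["timeout"]
    return 0


def simulate(initval=5, increment=5, retries=3, check_major_first=True):
    """Return (second at which ETIMEDOUT is reported, seconds spent spinning)."""
    maxval = initval + increment * retries
    st = {"timeout": initval, "increment": increment, "maxval": maxval,
          "minor": initval, "major": maxval}
    now, spin = 0, 0
    while True:
        wake = min(st["minor"], st["major"])  # never sleep past the major timeout
        if wake > now:
            now = wake
        else:
            # Woken again at the same instant: the caller busy-loops
            # until the minor deadline finally passes.
            spin += st["minor"] - now
            now = st["minor"]
        if adjust_timeout(now, st, check_major_first) == -ETIMEDOUT:
            return now, spin


print(simulate(check_major_first=False))  # minor check first, as in the report
print(simulate(check_major_first=True))   # major check first
```

With the minor check first, the model reports the timeout at 30s after a 10s busy loop, matching steps 5 and 6; with the major check first it reports at 20s with no spinning.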

SUNRPC: Remove trace_xprt_transmit_queued · 6cf23783

Committed by Chuck Lever on Mar 31, 2021

This tracepoint can crash when dereferencing snd_task because
when some transports connect, they put a cookie in that field
instead of a pointer to an rpc_task.

BUG: KASAN: use-after-free in trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
Read of size 2 at addr ffff8881a83bd3a0 by task git/331872

CPU: 11 PID: 331872 Comm: git Tainted: G S                5.12.0-rc2-00007-g3ab6e585a7f9 #1453
Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Call Trace:
 dump_stack+0x9c/0xcf
 print_address_description.constprop.0+0x18/0x239
 kasan_report+0x174/0x1b0
 trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
 xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
 call_transmit+0x4d/0xc6 [sunrpc]

Fixes: 9ce07ae5 ("SUNRPC: Replace dprintk() call site in xprt_prepare_transmit")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

6cf23783

SUNRPC: Add tracepoint that fires when an RPC is retransmitted · e936a597

Committed by Chuck Lever on Mar 31, 2021

A separate tracepoint can be left enabled all the time to capture
rare but important retransmission events. So for example:

kworker/u26:3-568 [009] 156.967933: xprt_retransmit: task:44093@5 xid=0xa25dbc79 nfsv3 WRITE ntrans=2

Or, for example, enable all nfs and nfs4 tracepoints, and set up a
trigger to disable tracing when xprt_retransmit fires to capture
everything that leads up to it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

e936a597
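
The second use case in the commit message above (enable the nfs and nfs4 events, then stop tracing when xprt_retransmit fires) might be set up roughly as follows. This is a sketch that assumes tracefs is mounted at /sys/kernel/tracing, that the running kernel exposes the nfs, nfs4, and sunrpc event groups, and that the script runs as root. The helper names `retransmit_capture_plan` and `apply_plan` are hypothetical.

```python
from pathlib import Path

# Assumed tracefs mount point; some systems expose it at /sys/kernel/debug/tracing.
TRACEFS = Path("/sys/kernel/tracing")


def retransmit_capture_plan(root=TRACEFS):
    """Return the ordered (file, value) writes that arm the capture."""
    events = root / "events"
    return [
        # Stop the ring buffer as soon as a retransmission is recorded.
        (events / "sunrpc" / "xprt_retransmit" / "trigger", "traceoff"),
        # Record everything the NFS client does leading up to that point.
        (events / "nfs" / "enable", "1"),
        (events / "nfs4" / "enable", "1"),
        (root / "tracing_on", "1"),
    ]


def apply_plan(plan):
    """Perform the writes; needs root and a mounted tracefs."""
    for path, value in plan:
        path.write_text(value)


if __name__ == "__main__":
    for path, value in retransmit_capture_plan():
        print(f"{value} > {path}")
```

Keeping the plan separate from `apply_plan` lets the writes be reviewed or logged before anything touches the tracing files.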

SUNRPC: Move fault injection call sites · 7638e0bf

Committed by Chuck Lever on Mar 31, 2021

I've hit some crashes that occur in the xprt_rdma_inject_disconnect
path. It appears that, for some providers, rdma_disconnect() can
take so long that the transport can disconnect and release its
hardware resources while rdma_disconnect() is still running,
resulting in a UAF in the provider.

The transport's fault injection method may depend on the stability
of transport data structures. That means it needs to be invoked
only from contexts that hold the transport write lock.

Fixes: 4a068258 ("SUNRPC: Transport fault injection")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7638e0bf

05 Apr, 2021 1 commit

SUNRPC: Set TCP_CORK until the transmit queue is empty · d737e5d4

Committed by Trond Myklebust on Feb 09, 2021

When we have multiple RPC requests queued up, it makes sense to set the
TCP_CORK option while the transmit queue is non-empty.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

d737e5d4

03 Dec, 2020 4 commits

SUNRPC: Remove unused function xprt_load_transport() · c87b056e

Committed by Trond Myklebust on Nov 10, 2020

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

c87b056e

SUNRPC: Add a helper to return the transport identifier given a netid · 1fc5f131

Committed by Trond Myklebust on Nov 10, 2020

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

1fc5f131

SUNRPC: Close a race with transport setup and module put · 9bccd264

Committed by Trond Myklebust on Nov 10, 2020

After we've looked up the transport module, we need to ensure it can't
go away until we've finished running the transport setup code.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9bccd264

SUNRPC: xprt_load_transport() needs to support the netid "rdma6" · d5aa6b22

Committed by Trond Myklebust on Nov 06, 2020

According to RFC5666, the correct netid for an IPv6 addressed RDMA
transport is "rdma6", which we've supported as a mount option since
Linux-4.7. The problem is that when we try to load the module
"xprtrdma6", the load fails because there is no module alias of that name.

Fixes: 181342c5 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

d5aa6b22
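
The failure described in the commit message above comes from deriving a module name directly from the netid string. The sketch below illustrates the idea only; it is not how the kernel resolves modules. `KNOWN_ALIASES` and `NETID_TO_TRANSPORT` are hypothetical tables, and the assumption that only "xprtrdma" is registered as an alias is inferred from the message.

```python
# Assumed for illustration: only the IPv4-style name is registered as an alias.
KNOWN_ALIASES = {"xprtrdma"}

# Hypothetical lookup table mapping each netid to the transport that serves it.
NETID_TO_TRANSPORT = {"rdma": "rdma", "rdma6": "rdma"}


def naive_module(netid):
    """Build the module name straight from the netid; fails for 'rdma6'."""
    name = "xprt" + netid
    return name if name in KNOWN_ALIASES else None


def mapped_module(netid):
    """Resolve the netid to its transport first, then build the module name."""
    transport = NETID_TO_TRANSPORT.get(netid)
    if transport is None:
        return None
    name = "xprt" + transport
    return name if name in KNOWN_ALIASES else None


print(naive_module("rdma6"))   # no alias called "xprtrdma6"
print(mapped_module("rdma6"))  # resolves to the shared RDMA transport module
```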

21 Sep, 2020 6 commits

SUNRPC: Mitigate cond_resched() in xprt_transmit() · 6f9f1728

Committed by Chuck Lever on Jul 08, 2020

The original purpose of this expensive call is to prevent a long
queue of requests from blocking other work.

The cond_resched() call is unnecessary after just a single send
operation.

For longer queues, instead of invoking the kernel scheduler, simply
release the transport send lock and return to the RPC scheduler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

6f9f1728

SUNRPC: Replace connect dprintk call sites with a tracepoint · db0a86c4

Committed by Chuck Lever on Jul 08, 2020

This trace event can be used to audit transport connections from the
client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

db0a86c4

SUNRPC: Replace dprintk() call site in xprt_prepare_transmit · 9ce07ae5

Committed by Chuck Lever on Jul 08, 2020

Generate a trace event when an RPC request is queued without being
sent immediately.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

9ce07ae5

SUNRPC: Update debugging instrumentation in xprt_do_reserve() · 09d2ba0c

Committed by Chuck Lever on Jul 08, 2020

Replace a dprintk() with a tracepoint. The tracepoint marks the
point where an RPC request is assigned an XID.

Additional clean up: Remove trace_xprt_enq_xmit, which reports much
the same thing. That tracepoint was added for debugging commit
918f3c1f ("SUNRPC: Improve latency for interactive tasks").
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

09d2ba0c

SUNRPC: Remove debugging instrumentation from xprt_release · 78069487

Committed by Chuck Lever on Jul 08, 2020

These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384
iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

78069487

SUNRPC: Remove trace_xprt_complete_rqst() · e4378a0f

Committed by Chuck Lever on Jul 08, 2020

Request completion is already recorded by an "rpc_task_wakeup
queue=xprt_pending" trace record. A subsequent rpc_xdr_recvfrom
trace record shows the number of bytes received.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

e4378a0f

24 Aug, 2020 1 commit

treewide: Use fallthrough pseudo-keyword · df561f66

Committed by Gustavo A. R. Silva on Aug 23, 2020

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

df561f66

05 Aug, 2020 1 commit

SUNRPC dont update timeout value on connection reset · 7de62bc0

Committed by Olga Kornievskaia on Jul 15, 2020

Current behaviour: every time a v3 operation is re-sent to the server
we update (double) the timeout. There is no distinction between whether
or not the previous timer had expired before the re-send happened.

Here's the scenario:
1. Client sends a v3 operation
2. Server RST-s the connection (prior to the timeout) (e.g., connection
   is immediately reset)
3. Client re-sends a v3 operation but the timeout is now 120sec.

As a result, an application sees a 2-minute pause before a retry in case
the server again does not reply.

Instead, this patch proposes to keep track of when the minor timeout
should happen and if it didn't, then don't update the new timeout.
Value is updated based on the previous value to make timeouts
predictable.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7de62bc0
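
The rule proposed in the commit message above can be illustrated with a few lines. This is a sketch only: `next_timeout` and its `track_minor` switch are hypothetical names, times are in seconds, and the 60s starting value is an assumption inferred from the 120sec figure in the scenario.

```python
def next_timeout(now, minor_deadline, timeout, track_minor=True):
    """Return the timeout to use for a re-send issued at time `now`."""
    if track_minor and now < minor_deadline:
        # The previous timer never expired (e.g. the connection was reset),
        # so the timeout is left at its current value.
        return timeout
    return timeout * 2  # the timer really expired: back off by doubling


# Connection reset 1s after the first send, with an assumed 60s timeout.
print(next_timeout(now=1, minor_deadline=60, timeout=60, track_minor=False))
print(next_timeout(now=1, minor_deadline=60, timeout=60, track_minor=True))
```

Without tracking the minor deadline the re-send waits 120s; with it, the re-send keeps the 60s value, and doubling only happens once the deadline has actually passed.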

12 Jun, 2020 2 commits

SUNRPC: Trace transport lifetime events · 911813d7

Committed by Chuck Lever on May 12, 2020

Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:

- Enable tracing of xprt lifetime events for the socket transport
  types

- Expose the different types of disconnect to help run down
  issues with lingering connections
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

911813d7

SUNRPC: Split the xdr_buf event class · c509f15a

Committed by Chuck Lever on May 12, 2020

To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

c509f15a

17 Mar, 2020 1 commit

svcrdma: Create a generic tracing class for displaying xdr_buf layout · b20dfc3f

Committed by Chuck Lever on Mar 02, 2020

This class can be used to create trace points in either the RPC
client or RPC server paths. It simply displays the length of each
part of an xdr_buf, which is useful to determine that the transport
and XDR codecs are operating correctly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

b20dfc3f

31 Oct, 2019 1 commit

SUNRPC: Destroy the back channel when we destroy the host transport · 669996ad

Committed by Trond Myklebust on Oct 17, 2019

When we're destroying the host transport mechanism, we should ensure
that we do not leak memory by failing to release any back channel
slots that might still exist.
Reported-by: Neil Brown <neilb@suse.de>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

669996ad

24 Oct, 2019 1 commit

SUNRPC: Add trace points to observe transport congestion control · bf7ca707

Committed by Chuck Lever on Oct 09, 2019

To help debug problems with RPC/RDMA credit management, replace
dprintk() call sites in the transport send lock paths with trace
events.

Similar trace points are defined for the non-congestion paths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

bf7ca707

21 Sep, 2019 1 commit

SUNRPC: Fix congestion window race with disconnect · 8593e010

Committed by Chuck Lever on Sep 13, 2019

If the congestion window closes just as the transport disconnects,
a reconnect is never driven because:

1. The XPRT_CONG_WAIT flag prevents tasks from taking the write lock
2. There's no wake-up of the first task on the xprt->sending queue

To address this, clear the congestion wait flag as part of
completing a disconnect.

Fixes: 75891f50 ("SUNRPC: Support for congestion control ... ")
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

8593e010

18 Sep, 2019 1 commit

SUNRPC: Dequeue the request from the receive queue while we're re-encoding · cc204d01

Committed by Trond Myklebust on Sep 10, 2019

Ensure that we dequeue the request from the transport receive queue
while we're re-encoding to prevent issues like use-after-free when
we release the bvec.

Fixes: 75369089 ("SUNRPC: Ensure the bvecs are reset when we re-encode...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

cc204d01

27 Aug, 2019 1 commit

Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" · d5711920

Committed by Trond Myklebust on Aug 16, 2019

This reverts commit a79f194a.
The mechanism for aborting I/O is racy, since we are not guaranteed that
the request is asleep while we're changing both task->tk_status and
task->tk_action.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1

d5711920

19 Jul, 2019 1 commit

SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request · 75369089

Committed by Trond Myklebust on Jul 17, 2019

The bvec tracks the list of pages, so if the number of pages changes
due to a re-encode, we need to reset the bvec as well.

Fixes: 277e4ab7 ("SUNRPC: Simplify TCP receive code by switching...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+

75369089

09 Jul, 2019 1 commit

xprtrdma: Modernize ops->connect · 675dd90a

Committed by Chuck Lever on Jun 19, 2019

Adapt and apply changes that were made to the TCP socket connect
code. See the following commits for details on the purpose of
these changes:

Commit 7196dbb0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly")
Commit 3851f1cd ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout")
Commit 02910177 ("SUNRPC: Fix reconnection timeouts")

Some common transport code is moved to xprt.c to satisfy the code
duplication police.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

675dd90a

07 Jul, 2019 3 commits

SUNRPC: Fix possible autodisconnect during connect due to old last_used · 80d3c45f

Committed by Dave Wysochanski on Jun 26, 2019

Ensure last_used is updated before calling mod_timer inside
xprt_schedule_autodisconnect. This avoids a possible xprt_autoclose
firing immediately after a successful connect when xprt_unlock_connect
calls xprt_schedule_autodisconnect with an old value of last_used.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

80d3c45f

SUNRPC: Move call to rpc_count_iostats before rpc_call_done · 9dfe52a9

Committed by Dave Wysochanski on May 23, 2019

For diagnostic purposes, it would be useful to have an rpc_iostats
metric of RPCs completing with tk_status < 0. Unfortunately,
tk_status is reset inside the rpc_call_done functions for each
operation, and the call to tally the per-op metrics comes after
rpc_call_done. Move the call to rpc_count_iostats earlier in
rpc_exit_task so we can count these RPCs completing in error.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

9dfe52a9

SUNRPC: Remove the bh-safe lock requirement on xprt->transport_lock · b5e92419

Committed by Trond Myklebust on May 02, 2019

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

b5e92419

22 Jun, 2019 1 commit

Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE" · 502980e8

Committed by Anna Schumaker on Jun 18, 2019

Jon Hunter reports:
  "I have been noticing intermittent failures with a system suspend test on
   some of our machines that have a NFS mounted root file-system. Bisecting
   this issue points to your commit 43123581 ("SUNRPC: Declare RPC
   timers as TIMER_DEFERRABLE") and reverting this on top of v5.2-rc3 does
   appear to resolve the problem.

   The cause of the suspend failure appears to be a long delay observed
   sometimes when resuming from suspend, and this is causing our test to
   timeout."

This reverts commit 43123581.
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

502980e8

21 May, 2019 1 commit

treewide: Add SPDX license identifier for missed files · 457c8996

Committed by Thomas Gleixner on May 19, 2019

Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

457c8996

26 Apr, 2019 7 commits

SUNRPC: Update comments based on recent changes · 1f7d1c73

Committed by Chuck Lever on Apr 24, 2019

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

1f7d1c73

SUNRPC: Start the first major timeout calculation at task creation · da953063

Committed by Trond Myklebust on Apr 07, 2019

When calculating the major timeout for a new task, when we know that the
connection has been broken, use the task->tk_start to ensure that we also
take into account the time spent waiting for a slot or session slot. This
ensures that we fail over soft requests relatively quickly once the
connection has actually been broken, and the first requests have
started to fail.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

da953063

SUNRPC: Ensure that the transport layer respect major timeouts · 9e910bff

Committed by Trond Myklebust on Apr 07, 2019

Ensure that when in the transport layer, we don't sleep past
a major timeout.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

9e910bff

SUNRPC: Declare RPC timers as TIMER_DEFERRABLE · 43123581

Committed by Trond Myklebust on Apr 07, 2019

Don't wake idle CPUs only for the purpose of servicing an RPC
queue timeout.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

43123581

SUNRPC: Add function rpc_sleep_on_timeout() · 6b2e6856

Committed by Trond Myklebust on Apr 07, 2019

Clean up the RPC task sleep interfaces by replacing the task->tk_timeout
'hidden parameter' to rpc_sleep_on() with a new function that takes an
absolute timeout.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

6b2e6856

SUNRPC: Refactor xprt_request_wait_receive() · 8ba6a92d

Committed by Trond Myklebust on Apr 07, 2019

Convert the transport callback to actually put the request to sleep
instead of just setting a timeout. This is in preparation for
rpc_sleep_on_timeout().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

8ba6a92d

SUNRPC: Fix up task signalling · ae67bd38

Committed by Trond Myklebust on Apr 07, 2019

The RPC_TASK_KILLED flag should really not be set from another context
because it can clobber data in the struct task when task->tk_flags is
changed non-atomically.
Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add
a function to set that flag and safely wake up the task.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

ae67bd38

16 Mar, 2019 1 commit

SUNRPC: Use the ENOTCONN error on socket disconnect · 27adc785

Committed by Trond Myklebust on Mar 15, 2019

When the socket is closed, we currently send an EAGAIN error to all
pending requests in order to ask them to retransmit. Use ENOTCONN
instead, to ensure that they try to reconnect before attempting to
transmit.
This also helps SOFTCONN tasks to behave correctly in this
situation.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

27adc785

openeuler / Kernel synced successfully almost 2 years ago