Commits · 3be232f11a3cc9b0ef0795e39fa11bdb8e422a06 · openeuler / Kernel

05 Nov, 2021 · 2 commits

SUNRPC: Prevent immediate close+reconnect · 3be232f1

Committed by Trond Myklebust on Oct 26, 2021

If we have already set up the socket and are waiting for it to connect,
then don't immediately close and retry.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix races when closing the socket · d896ba83

Committed by Trond Myklebust on Oct 29, 2021

Ensure that we bump the xprt->connect_cookie when we set the
XPRT_CLOSE_WAIT flag so that another call to
xprt_conditional_disconnect() won't race with the reconnection.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
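
A minimal userspace sketch of the cookie idea in this commit message, not the kernel code. The names `toy_xprt`, `toy_mark_close_wait` and `toy_conditional_disconnect` are hypothetical. It assumes a caller records the cookie it last saw, and that a conditional disconnect only acts while that cookie is still current, so bumping the cookie together with the close flag turns a second, stale request into a no-op.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the transport; not the kernel's struct rpc_xprt. */
struct toy_xprt {
    unsigned int connect_cookie; /* changes with every connection generation */
    bool close_wait;             /* models the XPRT_CLOSE_WAIT flag */
};

/* Set the close flag and bump the cookie in the same step. */
static void toy_mark_close_wait(struct toy_xprt *x)
{
    x->close_wait = true;
    x->connect_cookie++;
}

/*
 * Request a disconnect only if the caller's cookie is still current.
 * Returns true if this call marked the transport for closing.
 */
static bool toy_conditional_disconnect(struct toy_xprt *x, unsigned int cookie)
{
    if (cookie != x->connect_cookie)
        return false; /* stale cookie: this generation was already handled */
    toy_mark_close_wait(x);
    return true;
}
```

With this shape, two callers holding the same old cookie cannot both trigger a close: the first one wins and the second sees a mismatch.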

04 Oct, 2021 · 2 commits

SUNRPC: xprt_clear_locked() only needs release memory semantics · 33c3214b

Committed by Trond Myklebust on Jul 12, 2021

The clearing of the XPRT_LOCKED bit has to happen after we clear
xprt->snd_task, but we don't require any extra memory barriers after
that.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
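
The ordering described above can be sketched in userspace with C11 atomics. This is an analogy under stated assumptions, not the kernel implementation: `toy_xprt`, `TOY_LOCKED`, `toy_try_lock` and `toy_clear_locked` are hypothetical names, and the kernel uses its own bit operations rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stddef.h>

#define TOY_LOCKED 0x1u /* hypothetical bit, standing in for XPRT_LOCKED */

struct toy_xprt {
    atomic_uint state;
    void *snd_task; /* current owner of the send lock, NULL when unowned */
};

/* Try to take the lock bit; acquire ordering pairs with the release below. */
static int toy_try_lock(struct toy_xprt *x, void *task)
{
    unsigned int old = atomic_fetch_or_explicit(&x->state, TOY_LOCKED,
                                                memory_order_acquire);
    if (old & TOY_LOCKED)
        return 0; /* somebody else holds it */
    x->snd_task = task;
    return 1;
}

/* Clear the owner first, then drop the bit with release ordering only. */
static void toy_clear_locked(struct toy_xprt *x)
{
    x->snd_task = NULL;
    atomic_fetch_and_explicit(&x->state, ~TOY_LOCKED, memory_order_release);
}
```

The release store guarantees that the write to `snd_task` is visible before the bit appears clear to the next acquirer; nothing stronger is requested after it.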

SUNRPC: Partial revert of commit · ea7a1019

Committed by Trond Myklebust on Jul 12, 2021

The premise of commit 6f9f1728 ("SUNRPC: Mitigate cond_resched() in
xprt_transmit()") was that cond_resched() is expensive and unnecessary
when there has been just a single send.
The point of cond_resched() is to ensure that tasks that should pre-empt
this one get a chance to do so when it is safe to do so. The code prior
to commit 6f9f1728 failed to take into account that it was keeping a
rpc_task pinned for longer than it needed to, and so rather than doing a
full revert, let's just move the cond_resched.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

21 Aug, 2021 · 1 commit

SUNRPC: Move client-side disconnect injection · a4ae3081

Committed by Chuck Lever on Aug 05, 2021

Disconnect injection stress-tests the ability for both client and
server implementations to behave resiliently in the face of network
instability.

Convert the existing client-side disconnect injection infrastructure
to use the kernel's generic error injection facility. The generic
facility has a richer set of injection criteria.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

10 Aug, 2021 · 3 commits

SUNRPC/xprtrdma: Fix reconnection locking · f99fa508

Committed by Trond Myklebust on Jul 26, 2021

The xprtrdma client code currently relies on the task that initiated the
connect to hold the XPRT_LOCK for the duration of the connection
attempt. If the task is woken early, due to some other event, then that
lock could get released early.
Avoid races by using the same mechanism that the socket code uses of
transferring lock ownership to the RDMA connect worker itself. That
frees us to call rpcrdma_xprt_disconnect() directly since we're now
guaranteed exclusion w.r.t. other callers.

Fixes: 4cf44be6 ("xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Clean up scheduling of autoclose · e26d9972

Committed by Trond Myklebust on Jul 26, 2021

Consolidate duplicated code in xprt_force_disconnect() and
xprt_conditional_disconnect().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Fix potential memory corruption · c2dc3e5f

Committed by Trond Myklebust on Jul 26, 2021

We really should not call rpc_wake_up_queued_task_set_status() with
xprt->snd_task as an argument unless we are certain that is actually an
rpc_task.

Fixes: 0445f92c ("SUNRPC: Fix disconnection races")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

09 Jul, 2021 · 2 commits

sunrpc: add dst_attr attributes to the sysfs xprt directory · 587bc725

Committed by Olga Kornievskaia on Jun 08, 2021

Allow querying and setting the destination address of a transport.
Setting the destination address is allowed only for TCP- or RDMA-based
connections.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

sunrpc: add xprt id · 572caba4

Committed by Olga Kornievskaia on Jun 08, 2021

This adds a unique identifier for a sunrpc transport in sysfs, which is
managed similarly to the unique IDs of clients.
Signed-off-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

26 May, 2021 · 1 commit

SUNRPC: More fixes for backlog congestion · e86be3a0

Committed by Trond Myklebust on May 25, 2021

Ensure that we fix the XPRT_CONGESTED starvation issue for RDMA as well
as socket based transports.
Ensure we always initialise the request after waking up from the backlog
list.

Fixes: e877a88d ("SUNRPC in case of backlog, hand free slots directly to waiting task")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

21 May, 2021 · 1 commit

SUNRPC in case of backlog, hand free slots directly to waiting task · e877a88d

Committed by NeilBrown on May 17, 2021

If sunrpc.tcp_max_slot_table_entries is small and there are tasks
on the backlog queue, then when a request completes it is freed and the
first task on the queue is woken. The expectation is that it will wake
and claim that request. However if it was a sync task and the waiting
process was killed at just that moment, it will wake and NOT claim the
request.

As long as XPRT_CONGESTED remains set, requests can only be claimed by
tasks woken from the backlog, and they are woken only as requests are
freed, so when a task doesn't claim a request, no other task can ever
get that request until XPRT_CONGESTED is cleared. Each time this
happens the number of available requests is decreased by one.

With a sufficiently high workload and sufficiently low setting of
max_slot (16 in the case where this was seen), XPRT_CONGESTED can remain
set for an extended period, and the above scenario (of a process being
killed just as its task was woken) can repeat until no requests can be
allocated. Then traffic stops.

This patch addresses the problem by introducing a positive handover of a
request from a completing task to a backlog task - the request is never
freed when there is a backlog.

When a task is woken it might not already have a request attached in
which case it is *not* freed (as with current code) but is initialised
(if needed) and used. If it isn't used it will eventually be freed by
rpc_exit_task(). xprt_release() is enhanced to be able to correctly
release an uninitialised request.

Fixes: ba60eb25 ("SUNRPC: Fix a livelock problem in the xprt->backlog queue")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
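
The positive handover described in this message can be sketched as a small userspace model. Everything here is hypothetical (`toy_xprt`, `toy_waiter`, the fixed-size FIFO); it only illustrates the rule that a released slot goes straight to the first backlog waiter, and that a waiter which never uses its slot gives it back through the same path instead of losing it.

```c
#include <stddef.h>

#define TOY_MAX_WAITERS 8

/* Hypothetical waiter: has_slot is set when a slot is handed to it. */
struct toy_waiter {
    int has_slot;
};

/* Hypothetical transport: a free-slot counter plus a FIFO backlog. */
struct toy_xprt {
    int free_slots;
    struct toy_waiter *backlog[TOY_MAX_WAITERS];
    size_t head, tail; /* head == tail means the backlog is empty */
};

/* Queue a waiter on the backlog; returns -1 if the FIFO is full. */
static int toy_backlog_add(struct toy_xprt *x, struct toy_waiter *w)
{
    if (x->tail - x->head == TOY_MAX_WAITERS)
        return -1;
    x->backlog[x->tail++ % TOY_MAX_WAITERS] = w;
    return 0;
}

/* Release a slot: hand it to the first waiter, or free it if nobody waits. */
static void toy_release_slot(struct toy_xprt *x)
{
    if (x->head != x->tail) {
        struct toy_waiter *w = x->backlog[x->head++ % TOY_MAX_WAITERS];
        w->has_slot = 1; /* handed over, never returned to the free pool */
        return;
    }
    x->free_slots++;
}

/* A waiter that exits without using its slot passes it on. */
static void toy_waiter_exit(struct toy_xprt *x, struct toy_waiter *w)
{
    if (w->has_slot) {
        w->has_slot = 0;
        toy_release_slot(x);
    }
}
```

In this model a killed waiter can no longer strand a slot: its exit path either feeds the next waiter or returns the slot to the pool.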

14 Apr, 2021 · 4 commits

SUNRPC: Handle major timeout in xprt_adjust_timeout() · 09252177

Committed by Chris Dion on Apr 04, 2021

Currently if a major timeout value is reached, but the minor value has
not been reached, an ETIMEDOUT will not be sent back to the caller.
This can occur if the v4 server is not responding to requests and
retrans is configured larger than the default of two.

For example, a TCP mount with a configured timeout value of 50 and a
retransmission count of 3 to a v4 server which is not responding:

1. Initial value and increment set to 5s, maxval set to 20s, retries at 3
2. Major timeout is set to 20s, minor timeout set to 5s initially
3. xprt_adjust_timeout() is called after 5s, retry with 10s timeout,
   minor timeout is bumped to 10s
4. And again after another 10s, 15s total time with minor timeout set
   to 15s
5. After 20s total time xprt_adjust_timeout() is called as major timeout is
   reached, but skipped because the minor timeout is not reached
       - After this time the cpu spins continually calling
         xprt_adjust_timeout() and returning 0 for 10 seconds.
         As seen on perf sched:
         39243.913182 [0005]  mount.nfs[3794] 4607.938      0.017   9746.863
6. This continues until the 15s minor timeout condition is reached (in
   this case for 10 seconds). After which the ETIMEDOUT is processed
   back to the caller, the cpu spinning stops, and normal operations
   continue

Fixes: 7de62bc0 ("SUNRPC dont update timeout value on connection reset")
Signed-off-by: Chris Dion <Christopher.Dion@dell.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
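
The 10-second window in the example above follows from simple arithmetic on the quoted numbers. The sketch below reproduces only that arithmetic; it is not how xprt_adjust_timeout() is implemented, and `toy_timeout` and `toy_gap_after_major` are hypothetical names. It assumes the minor timeout starts at the initial value, grows by the increment after each expiry, and is capped at maxval.

```c
/* Hypothetical parameters, in seconds, mirroring the numbers in the example. */
struct toy_timeout {
    unsigned int initval;   /* first minor timeout */
    unsigned int increment; /* added to the minor timeout after each expiry */
    unsigned int maxval;    /* cap on the minor timeout */
};

/*
 * Walk the minor timeouts and return how many seconds lie between the
 * major deadline and the first minor expiry at or after it.
 */
static unsigned int toy_gap_after_major(const struct toy_timeout *to,
                                        unsigned int major)
{
    unsigned int now = 0;
    unsigned int minor = to->initval;

    if (minor == 0 || to->maxval == 0)
        return 0; /* degenerate input: time would never advance */

    for (;;) {
        now += minor; /* the current minor timeout expires */
        if (now >= major)
            return now - major;
        minor += to->increment; /* linear backoff, as in the example */
        if (minor > to->maxval)
            minor = to->maxval;
    }
}
```

With initval 5, increment 5, maxval 20 and a major timeout of 20, the minor timeouts expire at 5s, 15s and 30s, so the gap after the major deadline is 10 seconds, matching steps 5 and 6.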

SUNRPC: Remove trace_xprt_transmit_queued · 6cf23783

Committed by Chuck Lever on Mar 31, 2021

This tracepoint can crash when dereferencing snd_task because
when some transports connect, they put a cookie in that field
instead of a pointer to an rpc_task.

BUG: KASAN: use-after-free in trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
Read of size 2 at addr ffff8881a83bd3a0 by task git/331872

CPU: 11 PID: 331872 Comm: git Tainted: G S                5.12.0-rc2-00007-g3ab6e585a7f9 #1453
Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Call Trace:
 dump_stack+0x9c/0xcf
 print_address_description.constprop.0+0x18/0x239
 kasan_report+0x174/0x1b0
 trace_event_raw_event_xprt_writelock_event+0x141/0x18e [sunrpc]
 xprt_prepare_transmit+0x8e/0xc1 [sunrpc]
 call_transmit+0x4d/0xc6 [sunrpc]

Fixes: 9ce07ae5 ("SUNRPC: Replace dprintk() call site in xprt_prepare_transmit")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add tracepoint that fires when an RPC is retransmitted · e936a597

Committed by Chuck Lever on Mar 31, 2021

A separate tracepoint can be left enabled all the time to capture
rare but important retransmission events. So for example:

kworker/u26:3-568 [009] 156.967933: xprt_retransmit: task:44093@5 xid=0xa25dbc79 nfsv3 WRITE ntrans=2

Or, for example, enable all nfs and nfs4 tracepoints, and set up a
trigger to disable tracing when xprt_retransmit fires to capture
everything that leads up to it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Move fault injection call sites · 7638e0bf

Committed by Chuck Lever on Mar 31, 2021

I've hit some crashes that occur in the xprt_rdma_inject_disconnect
path. It appears that, for some providers, rdma_disconnect() can
take so long that the transport can disconnect and release its
hardware resources while rdma_disconnect() is still running,
resulting in a UAF in the provider.

The transport's fault injection method may depend on the stability
of transport data structures. That means it needs to be invoked
only from contexts that hold the transport write lock.

Fixes: 4a068258 ("SUNRPC: Transport fault injection")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

05 Apr, 2021 · 1 commit

SUNRPC: Set TCP_CORK until the transmit queue is empty · d737e5d4

Committed by Trond Myklebust on Feb 09, 2021

When we have multiple RPC requests queued up, it makes sense to set the
TCP_CORK option while the transmit queue is non-empty.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

03 Dec, 2020 · 4 commits

SUNRPC: Remove unused function xprt_load_transport() · c87b056e

Committed by Trond Myklebust on Nov 10, 2020

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add a helper to return the transport identifier given a netid · 1fc5f131

Committed by Trond Myklebust on Nov 10, 2020

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Close a race with transport setup and module put · 9bccd264

Committed by Trond Myklebust on Nov 10, 2020

After we've looked up the transport module, we need to ensure it can't
go away until we've finished running the transport setup code.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: xprt_load_transport() needs to support the netid "rdma6" · d5aa6b22

Committed by Trond Myklebust on Nov 06, 2020

According to RFC5666, the correct netid for an IPv6 addressed RDMA
transport is "rdma6", which we've supported as a mount option since
Linux-4.7. The problem is when we try to load the module "xprtrdma6",
that will fail, since there is no module alias of that name.

Fixes: 181342c5 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
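
One way to picture the problem is a lookup that resolves a netid to a module name instead of pasting the netid into the name. The sketch below is an illustration only: the table, the `toy_` names, and the choice of "xprtrdma" as the name both netids resolve to are assumptions, not the mechanism used by the patch.

```c
#include <string.h>

/* Hypothetical mapping from a netid to the module that serves it. */
struct toy_netid_map {
    const char *netid;
    const char *module;
};

/* Assumed table for illustration; entries are not taken from the patch. */
static const struct toy_netid_map toy_map[] = {
    { "rdma",  "xprtrdma" },
    { "rdma6", "xprtrdma" }, /* IPv6 netid served by the same module */
};

/* Return the module name for a netid, or NULL if the netid is unknown. */
static const char *toy_module_for_netid(const char *netid)
{
    size_t i;

    for (i = 0; i < sizeof(toy_map) / sizeof(toy_map[0]); i++)
        if (strcmp(toy_map[i].netid, netid) == 0)
            return toy_map[i].module;
    return NULL;
}
```

A naive "xprt" + netid concatenation would produce "xprtrdma6" for the second entry, which is the name the commit message says cannot be loaded.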

21 Sep, 2020 · 6 commits

SUNRPC: Mitigate cond_resched() in xprt_transmit() · 6f9f1728

Committed by Chuck Lever on Jul 08, 2020

The original purpose of this expensive call is to prevent a long
queue of requests from blocking other work.

The cond_resched() call is unnecessary after just a single send
operation.

For longer queues, instead of invoking the kernel scheduler, simply
release the transport send lock and return to the RPC scheduler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Replace connect dprintk call sites with a tracepoint · db0a86c4

Committed by Chuck Lever on Jul 08, 2020

This trace event can be used to audit transport connections from the
client.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Replace dprintk() call site in xprt_prepare_transmit · 9ce07ae5

Committed by Chuck Lever on Jul 08, 2020

Generate a trace event when an RPC request is queued without being
sent immediately.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Update debugging instrumentation in xprt_do_reserve() · 09d2ba0c

Committed by Chuck Lever on Jul 08, 2020

Replace a dprintk() with a tracepoint. The tracepoint marks the
point where an RPC request is assigned an XID.

Additional clean up: Remove trace_xprt_enq_xmit, which reports much
the same thing. That tracepoint was added for debugging commit
918f3c1f ("SUNRPC: Improve latency for interactive tasks").
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Remove debugging instrumentation from xprt_release · 78069487

Committed by Chuck Lever on Jul 08, 2020

These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384
iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Remove trace_xprt_complete_rqst() · e4378a0f

Committed by Chuck Lever on Jul 08, 2020

Request completion is already recorded by an "rpc_task_wakeup
queue=xprt_pending" trace record. A subsequent rpc_xdr_recvfrom
trace record shows the number of bytes received.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

24 Aug, 2020 · 1 commit

treewide: Use fallthrough pseudo-keyword · df561f66

Committed by Gustavo A. R. Silva on Aug 23, 2020

Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
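
A small userspace sketch of what the pseudo-keyword looks like in use. The macro definition below approximates the kernel's approach for compilers that support the attribute and falls back to an empty statement otherwise; `toy_classify` is a hypothetical function made up for this illustration.

```c
/* Userspace approximation of the pseudo-keyword; assumes GCC or Clang. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: higher levels accumulate the flags of lower ones. */
static int toy_classify(int level)
{
    int flags = 0;

    switch (level) {
    case 2:
        flags |= 4;
        fallthrough; /* previously written as a fall through comment */
    case 1:
        flags |= 2;
        fallthrough;
    case 0:
        flags |= 1;
        break;
    default:
        flags = -1;
        break;
    }
    return flags;
}
```

Unlike a comment, the statement form lets the compiler check that every intentional fall-through is marked.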

05 Aug, 2020 · 1 commit

SUNRPC dont update timeout value on connection reset · 7de62bc0

Committed by Olga Kornievskaia on Jul 15, 2020

Current behaviour: every time a v3 operation is re-sent to the server
we update (double) the timeout. There is no distinction between whether
or not the previous timer had expired before the re-send happened.

Here's the scenario:
1. Client sends a v3 operation
2. Server RST-s the connection (prior to the timeout) (e.g., the connection
is immediately reset)
3. Client re-sends a v3 operation but the timeout is now 120sec.

As a result, an application sees a 2-minute pause before a retry in case
the server again does not reply.

Instead, this patch proposes to keep track of when the minor timeout
should happen and if it didn't, then don't update the new timeout.
Value is updated based on the previous value to make timeouts
predictable.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
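
The proposed rule can be sketched as follows. This is a hedged userspace model, not the patch: `toy_req`, `toy_on_resend` and the field names are hypothetical, and times are plain integers in seconds. It assumes the request remembers when its minor timeout is due and only doubles the timeout if that moment has actually passed.

```c
/* Hypothetical per-request timeout state, in seconds. */
struct toy_req {
    unsigned long timeout;        /* current minor timeout */
    unsigned long minor_deadline; /* when the current minor timeout is due */
};

/*
 * Called when the request is re-sent at time 'now'.
 * The timeout is doubled (capped at maxval) only if the minor deadline
 * was reached; an early re-send, e.g. after a reset, leaves it unchanged.
 */
static void toy_on_resend(struct toy_req *r, unsigned long now,
                          unsigned long maxval)
{
    if (now < r->minor_deadline)
        return; /* timer had not expired: keep the timeout as it is */

    r->timeout *= 2;
    if (r->timeout > maxval)
        r->timeout = maxval;
    r->minor_deadline = now + r->timeout;
}
```

In the scenario above, a connection reset a few seconds after the first send would leave the 60-second timeout in place rather than jumping to 120 seconds.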

12 Jun, 2020 · 2 commits

SUNRPC: Trace transport lifetime events · 911813d7

Committed by Chuck Lever on May 12, 2020

Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:

- Enable tracing of xprt lifetime events for the socket transport
  types

- Expose the different types of disconnect to help run down
  issues with lingering connections
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Split the xdr_buf event class · c509f15a

Committed by Chuck Lever on May 12, 2020

To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

17 Mar, 2020 · 1 commit

svcrdma: Create a generic tracing class for displaying xdr_buf layout · b20dfc3f

Committed by Chuck Lever on Mar 02, 2020

This class can be used to create trace points in either the RPC
client or RPC server paths. It simply displays the length of each
part of an xdr_buf, which is useful to determine that the transport
and XDR codecs are operating correctly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

31 Oct, 2019 · 1 commit

SUNRPC: Destroy the back channel when we destroy the host transport · 669996ad

Committed by Trond Myklebust on Oct 17, 2019

When we're destroying the host transport mechanism, we should ensure
that we do not leak memory by failing to release any back channel
slots that might still exist.
Reported-by: Neil Brown <neilb@suse.de>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

24 Oct, 2019 · 1 commit

SUNRPC: Add trace points to observe transport congestion control · bf7ca707

Committed by Chuck Lever on Oct 09, 2019

To help debug problems with RPC/RDMA credit management, replace
dprintk() call sites in the transport send lock paths with trace
events.

Similar trace points are defined for the non-congestion paths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

21 Sep, 2019 · 1 commit

SUNRPC: Fix congestion window race with disconnect · 8593e010

Committed by Chuck Lever on Sep 13, 2019

If the congestion window closes just as the transport disconnects,
a reconnect is never driven because:

1. The XPRT_CONG_WAIT flag prevents tasks from taking the write lock
2. There's no wake-up of the first task on the xprt->sending queue

To address this, clear the congestion wait flag as part of
completing a disconnect.

Fixes: 75891f50 ("SUNRPC: Support for congestion control ... ")
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

18 Sep, 2019 · 1 commit

SUNRPC: Dequeue the request from the receive queue while we're re-encoding · cc204d01

Committed by Trond Myklebust on Sep 10, 2019

Ensure that we dequeue the request from the transport receive queue
while we're re-encoding to prevent issues like use-after-free when
we release the bvec.

Fixes: 75369089 ("SUNRPC: Ensure the bvecs are reset when we re-encode...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

27 Aug, 2019 · 1 commit

Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" · d5711920

Committed by Trond Myklebust on Aug 16, 2019

This reverts commit a79f194a.
The mechanism for aborting I/O is racy, since we are not guaranteed that
the request is asleep while we're changing both task->tk_status and
task->tk_action.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1

19 Jul, 2019 · 1 commit

SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request · 75369089

Committed by Trond Myklebust on Jul 17, 2019

The bvec tracks the list of pages, so if the number of pages changes
due to a re-encode, we need to reset the bvec as well.

Fixes: 277e4ab7 ("SUNRPC: Simplify TCP receive code by switching...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+

09 Jul, 2019 · 1 commit

xprtrdma: Modernize ops->connect · 675dd90a

Committed by Chuck Lever on Jun 19, 2019

Adapt and apply changes that were made to the TCP socket connect
code. See the following commits for details on the purpose of
these changes:

Commit 7196dbb0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly")
Commit 3851f1cd ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout")
Commit 02910177 ("SUNRPC: Fix reconnection timeouts")

Some common transport code is moved to xprt.c to satisfy the code
duplication police.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

07 Jul, 2019 · 1 commit

SUNRPC: Fix possible autodisconnect during connect due to old last_used · 80d3c45f

Committed by Dave Wysochanski on Jun 26, 2019

Ensure last_used is updated before calling mod_timer inside
xprt_schedule_autodisconnect. This avoids a possible xprt_autoclose
firing immediately after a successful connect when xprt_unlock_connect
calls xprt_schedule_autodisconnect with an old value of last_used.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
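
Why a stale last_used matters can be shown with a toy timer model. The names below (`toy_xprt`, `toy_schedule_autodisconnect`, `toy_timer_fires`) are hypothetical and time is a plain counter; the sketch only assumes that the timer expiry is computed as last_used plus an idle timeout.

```c
/* Hypothetical transport state; all values are in the same time unit. */
struct toy_xprt {
    unsigned long last_used;     /* last time the transport did useful work */
    unsigned long idle_timeout;  /* idle period before autodisconnect */
    unsigned long timer_expires; /* when the autodisconnect timer fires */
};

/* Refresh last_used first, then arm the timer from the fresh value. */
static void toy_schedule_autodisconnect(struct toy_xprt *x, unsigned long now)
{
    x->last_used = now;
    x->timer_expires = x->last_used + x->idle_timeout;
}

/* Returns 1 if the timer would have fired by time 'now'. */
static int toy_timer_fires(const struct toy_xprt *x, unsigned long now)
{
    return now >= x->timer_expires;
}
```

Had the expiry been computed from an old last_used, for example 0 with an idle timeout of 300 at time 1000, the timer would already be overdue at the moment it was armed.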

openeuler / Kernel · synced successfully more than 1 year ago
