Commits · 7c4310ff56422ea43418305d22bbc5fe19150ec4 · openeuler / Kernel

29 Apr, 2020 1 commit

SUNRPC: defer slow parts of rpc_free_client() to a workqueue. · 7c4310ff

Authored by NeilBrown on Apr 03, 2020

The rpciod workqueue is on the write-out path for freeing dirty memory,
so it is important that it never block waiting for memory to be
allocated - this can lead to a deadlock.

rpc_execute() - which is often called by an rpciod work item - calls
rpc_task_release_client() which can lead to rpc_free_client().

rpc_free_client() makes two calls which could potentially block waiting
for memory allocation.

rpc_clnt_debugfs_unregister() calls into debugfs and will block while
any of the debugfs files are being accessed.  In particular it can block
while any of the 'open' methods are being called and all of these use
malloc for one thing or another.  So this can deadlock if the memory
allocation waits for NFS to complete some writes via rpciod.

rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't
obvious that memory allocations can happen while the lock is held, it is
safer to assume they might and to not let rpciod call
rpc_clnt_remove_pipedir().

So this patch moves these two calls (together with the final kfree() and
rpciod_down()) into a work-item to be run from the system work-queue.
rpciod can continue its important work, and the final stages of the free
can happen whenever they happen.

I have seen this deadlock on a 4.12 based kernel where debugfs used
synchronize_srcu() when removing objects.  synchronize_srcu() requires a
workqueue and there were no free worker threads and none could be
allocated.  While debugfs no longer uses SRCU, I believe the deadlock
is still possible.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
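
The deferral described in this commit can be illustrated with a minimal user-space sketch. It is not the kernel patch: `work_item`, `queue_work_item()`, `run_pending_work()` and the `client` type are hypothetical stand-ins for the kernel workqueue machinery and the RPC client structure. The point is only that the release path queues an embedded work item and returns, while the steps that may block run later from a different context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a kernel work item. */
struct work_item {
    void (*fn)(struct work_item *);
    struct work_item *next;
};

/* A minimal FIFO standing in for the system workqueue. */
static struct work_item *queue_head, *queue_tail;

static void queue_work_item(struct work_item *w)
{
    w->next = NULL;
    if (queue_tail)
        queue_tail->next = w;
    else
        queue_head = w;
    queue_tail = w;
}

/* Run by the "system worker", never by the latency-critical thread.
 * Returns the number of work items executed. */
static int run_pending_work(void)
{
    int ran = 0;

    while (queue_head) {
        struct work_item *w = queue_head;

        queue_head = w->next;
        if (!queue_head)
            queue_tail = NULL;
        w->fn(w);
        ran++;
    }
    return ran;
}

struct client {
    struct work_item free_work;  /* embedded, so queuing allocates nothing */
    int *slow_steps_done;        /* counter the caller can observe */
};

/* Slow teardown: may block, so it only ever runs from the worker. */
static void client_free_slow(struct work_item *w)
{
    struct client *c =
        (struct client *)((char *)w - offsetof(struct client, free_work));

    (*c->slow_steps_done)++;     /* stands in for the blocking cleanup calls */
    free(c);
}

/* Fast path: called from the critical thread; it only queues the work. */
static void client_release(struct client *c)
{
    c->free_work.fn = client_free_slow;
    queue_work_item(&c->free_work);
}

static struct client *client_new(int *counter)
{
    struct client *c = malloc(sizeof(*c));

    assert(c != NULL);
    c->slow_steps_done = counter;
    return c;
}
```

Embedding the work item in the object being freed matters here: the release path itself must not allocate, since allocation is exactly what could block.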


22 Apr, 2020 1 commit

SUNRPC: Remove unreachable error condition · efe57fd5

Authored by Xiyu Yang on Apr 20, 2020

rpc_clnt_test_and_add_xprt() invokes rpc_call_null_helper(), which
returns the value of rpc_run_task() to "task". Since rpc_run_task()
cannot return an ERR pointer, the IS_ERR() check on "task" is
unreachable, so remove it.

Fixes: 7f554890 ("SUNRPC: Allow addition of new transports to a struct rpc_clnt")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


17 Mar, 2020 1 commit

svcrdma: Create a generic tracing class for displaying xdr_buf layout · b20dfc3f

Authored by Chuck Lever on Mar 02, 2020

This class can be used to create trace points in either the RPC
client or RPC server paths. It simply displays the length of each
part of an xdr_buf, which is useful to determine that the transport
and XDR codecs are operating correctly.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


16 Mar, 2020 2 commits

SUNRPC: Don't take a reference to the cred on synchronous tasks · 263fb9c2

Authored by Trond Myklebust on Feb 07, 2020

If the RPC call is synchronous, assume the cred is already pinned
by the caller.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


SUNRPC: Add a flag to avoid reference counts on credentials · 7eac5264

Authored by Trond Myklebust on Feb 07, 2020

Add a flag to signal to the RPC layer that the credential is already
pinned for the duration of the RPC call.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


15 Jan, 2020 1 commit

SUNRPC: call_connect_status should handle -EPROTO · b8457606

Authored by Chuck Lever on Dec 23, 2019

The xprtrdma connect logic can return -EPROTO if the underlying
device or network path does not support RDMA. This can happen
after a device removal/insertion.

- When SOFTCONN is set, EPROTO is a permanent error.

- When SOFTCONN is not set, EPROTO is treated as a temporary error.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
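
The two SOFTCONN cases listed in this commit can be summarised as a small decision function. This is a user-space sketch under assumptions: `connect_action` and `eproto_action()` are hypothetical names, and the sketch covers only -EPROTO rather than the full set of statuses that call_connect_status() deals with.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum connect_action {
    CONNECT_RETRY,   /* temporary error: try to connect again */
    CONNECT_FAIL     /* permanent error: complete the task with the error */
};

/* Decide what to do with -EPROTO from a connect attempt. */
static enum connect_action eproto_action(int status, bool softconn)
{
    assert(status == -EPROTO);   /* this sketch covers only -EPROTO */
    return softconn ? CONNECT_FAIL : CONNECT_RETRY;
}
```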


04 Nov, 2019 1 commit

NFSv4.1: Don't rebind to the same source port when reconnecting to the server · e6237b6f

Authored by Trond Myklebust on Oct 17, 2019

NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
at the source port when deciding whether or not an RPC call is a replay
of a previous call. This requires clients to perform strange TCP gymnastics
in order to ensure that when they reconnect to the server, they bind
to the same source port.

NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
that do not look at the source port of the connection. This patch therefore
ensures they can ignore the rebind requirement.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


24 Oct, 2019 1 commit

SUNRPC: Eliminate log noise in call_reserveresult · 5cd8b0d4

Authored by Chuck Lever on Oct 09, 2019

Sep 11 16:35:20 manet kernel:
		call_reserveresult: unrecognized error -512, exiting

Diagnostic error messages such as this likely have no value for NFS
client administrators.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


21 Sep, 2019 1 commit

SUNRPC: Don't try to parse incomplete RPC messages · 9ba82886

Authored by Trond Myklebust on Sep 16, 2019

If the copy of the RPC reply into our buffers did not complete, we
could end up with a truncated message. In that case, just resend
the call.

Fixes: a0584ee9 ("SUNRPC: Use struct xdr_stream when decoding...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


18 Sep, 2019 2 commits

SUNRPC: RPC level errors should always set task->tk_rpc_status · 714fbc73

Authored by Trond Myklebust on Sep 12, 2019

Ensure that we set task->tk_rpc_status for all RPC level errors so that
the caller can distinguish between those and server reply status errors.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Dequeue the request from the receive queue while we're re-encoding · cc204d01

Authored by Trond Myklebust on Sep 10, 2019

Ensure that we dequeue the request from the transport receive queue
while we're re-encoding to prevent issues like use-after-free when
we release the bvec.

Fixes: 75369089 ("SUNRPC: Ensure the bvecs are reset when we re-encode...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


27 Aug, 2019 3 commits

SUNRPC: Handle connection breakages correctly in call_status() · c82e5472

Authored by Trond Myklebust on Aug 16, 2019

If the connection breaks while we're waiting for a reply from the
server, then we want to immediately try to reconnect.

Fixes: ec6017d9 ("SUNRPC fix regression in umount of a secure mount")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


SUNRPC: Handle EADDRINUSE and ENOBUFS correctly · 80f455da

Authored by Trond Myklebust on Aug 15, 2019

If a connect or bind attempt returns EADDRINUSE, that means we want to
retry with a different port. It is not a fatal connection error.
Similarly, ENOBUFS is not fatal, but just indicates a memory allocation
issue. Retry after a short delay.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
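
The error handling described in this commit amounts to a classification of connect/bind results. The following user-space sketch shows that classification under assumptions: `conn_action` and `classify_connect_error()` are hypothetical names, statuses follow the kernel convention of negative errno values, and lumping every other error into a single fatal bucket is a simplification made only for illustration.

```c
#include <assert.h>
#include <errno.h>

enum conn_action {
    CONN_DONE,            /* success, nothing to do */
    CONN_RETRY_NEW_PORT,  /* pick a different source port and retry */
    CONN_RETRY_DELAYED,   /* wait briefly, then retry */
    CONN_FATAL            /* simplification: all other errors */
};

static enum conn_action classify_connect_error(int status)
{
    assert(status <= 0);  /* 0 or a negative errno value */

    switch (status) {
    case 0:
        return CONN_DONE;
    case -EADDRINUSE:     /* port already taken: not a fatal error */
        return CONN_RETRY_NEW_PORT;
    case -ENOBUFS:        /* transient memory pressure */
        return CONN_RETRY_DELAYED;
    default:
        return CONN_FATAL;
    }
}
```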


SUNRPC: Don't handle errors if the bind/connect succeeded · bd736ed3

Authored by Trond Myklebust on Aug 15, 2019

Don't handle errors in call_bind_status()/call_connect_status()
if it turns out that a previous call caused it to succeed.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1+


19 Jul, 2019 1 commit

SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request · 75369089

Authored by Trond Myklebust on Jul 17, 2019

The bvec tracks the list of pages, so if the number of pages changes
due to a re-encode, we need to reset the bvec as well.

Fixes: 277e4ab7 ("SUNRPC: Simplify TCP receive code by switching...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+


18 Jul, 2019 1 commit

SUNRPC: Fix up backchannel slot table accounting · 7402a4fe

Authored by Trond Myklebust on Jul 16, 2019

Add a per-transport maximum limit in the socket case, and add
helpers to allow the NFSv4 code to discover that limit.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


13 Jul, 2019 1 commit

SUNRPC: Fix transport accounting when caller specifies an rpc_xprt · a101b043

Authored by Trond Myklebust on Jul 11, 2019

Ensure that we do the required accounting for the round robin queue
when the caller to rpc_init_task() has passed in a transport to be
used.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Reported-by: Neil Brown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


07 Jul, 2019 3 commits

NFS: send state management on a single connection. · 5a0c257f

Authored by NeilBrown on May 30, 2019

With NFSv4.1, different network connections need to be explicitly
bound to a session.  During session startup, this is not possible
so only a single connection must be used for session startup.

So add a task flag to disable the default round-robin choice of
connections (when nconnect > 1) and force the use of a single
connection.
Then use that flag on all requests for session management. For
consistency, include NFSv4.0 management (SETCLIENTID) and session
destruction.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


SUNRPC: Allow creation of RPC clients with multiple connections · 612b41f8

Authored by Trond Myklebust on Apr 27, 2017

Add an argument to struct rpc_create_args that allows the specification
of how many transport connections you want to set up to the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


SUNRPC: Add basic load balancing to the transport switch · 21f0ffaf

Authored by Trond Myklebust on Apr 28, 2017

For now, just count the queue length. It is less accurate than counting
number of bytes queued, but easier to implement.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
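
One way queue-length counting can drive load balancing is sketched below in user-space C. The commit message says only that queue length is counted, so the selection policy here is an assumption chosen for illustration: walk the transports round-robin and skip any whose queue is longer than the average. `transport` and `next_transport()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct transport {
    unsigned long queue_len;   /* number of requests currently queued */
};

/*
 * Starting after index `cur`, walk the transports round-robin and return
 * the first one whose queue length is at or below the average.
 */
static size_t next_transport(const struct transport *xprts, size_t n,
                             size_t cur)
{
    unsigned long total = 0;

    assert(n > 0 && cur < n);
    for (size_t i = 0; i < n; i++)
        total += xprts[i].queue_len;

    for (size_t step = 1; step <= n; step++) {
        size_t i = (cur + step) % n;

        /* queue_len <= total / n, written without a division */
        if (xprts[i].queue_len * n <= total)
            return i;
    }
    return cur;   /* not reached: some queue is always <= the average */
}
```

With queue lengths {5, 1, 9, 1} the average is 4, so a walk starting from index 0 alternates between indices 1 and 3 and leaves the two busy transports alone until their queues drain.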


22 Jun, 2019 2 commits

SUNRPC: Fix a credential refcount leak · 19d55046

Authored by Trond Myklebust on Jun 20, 2019

All callers of __rpc_clone_client() pass in a value for args->cred,
meaning that the credential gets assigned and referenced in
the call to rpc_new_client().
Reported-by: Ido Schimmel <idosch@idosch.org>
Fixes: 79caa5fa ("SUNRPC: Cache cred of process creating the rpc_client")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


net :sunrpc :clnt :Fix xps refcount imbalance on the error path · b9622614

Authored by Lin Yi on Jun 10, 2019

rpc_clnt_add_xprt() takes a reference to struct rpc_xprt_switch but
forgets to release it before returning, which may lead to a memory leak.
Signed-off-by: Lin Yi <teroincn@163.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


31 May, 2019 2 commits

SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential · 7987b694

Authored by Trond Myklebust on May 29, 2019

The addition of rpc_check_timeout() to call_decode causes an Oops
when the RPCSEC_GSS credential is rejected.
The reason is that rpc_decode_header() will call xprt_release() in
order to free task->tk_rqstp, which is needed by rpc_check_timeout()
to check whether or not we should exit due to a soft timeout.

The fix is to move the call to xprt_release() into call_decode() so
we can perform it after rpc_check_timeout().
Reported-by: Olga Kornievskaia <olga.kornievskaia@gmail.com>
Reported-by: Nick Bowler <nbowler@draconx.ca>
Fixes: cea57789 ("SUNRPC: Clean up")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC fix regression in umount of a secure mount · ec6017d9

Authored by Olga Kornievskaia on May 29, 2019

If call_status returns ENOTCONN, we need to re-establish the connection
state after. Otherwise the client goes into an infinite loop of call_encode,
call_transmit, call_status (ENOTCONN), call_encode.

Fixes: c8485e4d ("SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit()")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Cc: stable@vger.kernel.org # v2.6.29+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


21 May, 2019 1 commit

treewide: Add SPDX license identifier for missed files · 457c8996

Authored by Thomas Gleixner on May 19, 2019

Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


10 May, 2019 1 commit

SUNRPC: task should be exit if encode return EKEYEXPIRED more times · 9c5948c2

Authored by ZhangXiaoxu on Apr 29, 2019

If rpc.gssd keeps returning success for the cred, but the cred has
already expired, then the task will loop in call_refresh and call_transmit.

Exit the RPC task after retrying.
Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


27 Apr, 2019 1 commit

SUNRPC: Cache cred of process creating the rpc_client · 79caa5fa

Authored by Trond Myklebust on Apr 24, 2019

When converting kuids to AUTH_UNIX creds, etc we will want to use the
same user namespace as the process that created the rpc client.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


26 Apr, 2019 8 commits

SUNRPC: Add the 'softerr' rpc_client flag · ae6ec918

Authored by Trond Myklebust on Apr 07, 2019

Add the 'softerr' rpc client flag that sets the RPC_TASK_TIMEOUT
flag on all new rpc tasks that are attached to that rpc client.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Ensure to ratelimit the "server not responding" syslog messages · 0729d995

Authored by Trond Myklebust on Apr 07, 2019

In particular, the timeout messages can be very noisy, so we ought to
ratelimit them in order to avoid spamming the syslog.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
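
Ratelimiting of this kind is commonly done with an interval-and-burst counter. The user-space sketch below illustrates the idea only; `ratelimit` and `ratelimit_allow()` are hypothetical names, time is an abstract tick count supplied by the caller, and the kernel's own ratelimit helpers are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

struct ratelimit {
    unsigned long interval;      /* window length, in ticks */
    unsigned int  burst;         /* messages allowed per window */
    unsigned long window_start;  /* tick at which the window began */
    unsigned int  printed;       /* messages emitted in this window */
};

/* Return true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, unsigned long now)
{
    assert(rl->burst > 0);

    /* Start a fresh window once the previous one has elapsed. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;            /* suppressed */
    rl->printed++;
    return true;
}
```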


SUNRPC: Make "no retrans timeout" soft tasks behave like softconn for timeouts · e4ec48d3

Authored by Trond Myklebust on Apr 07, 2019

If a soft NFSv4 request is sent, then we don't need it to time out unless
the connection breaks. The reason is that as long as the connection is
unbroken, the protocol states that the server is not allowed to drop the
request. IOW: as long as the connection remains unbroken, the client may
assume that all transmitted RPC requests are being processed by the server,
and that retransmissions and timeouts of those requests are unwarranted.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Add tracking of RPC level errors · 5ad64b36

Authored by Trond Myklebust on Apr 07, 2019

Add variables to track RPC level errors so that we can distinguish
between issues that arose in the RPC transport layer and those
arising from the reply message.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Fix up tracking of timeouts · 5efd1876

Authored by Trond Myklebust on Apr 07, 2019

Add a helper to ensure that debugfs and friends print out the
correct current task timeout value.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Add function rpc_sleep_on_timeout() · 6b2e6856

Authored by Trond Myklebust on Apr 07, 2019

Clean up the RPC task sleep interfaces by replacing the task->tk_timeout
'hidden parameter' to rpc_sleep_on() with a new function that takes an
absolute timeout.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
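
The interface change described in this commit, passing an absolute timeout explicitly instead of through a field set beforehand, can be sketched in user-space C as follows. All names (`ticks_t`, `sleeper`, `sleep_on_until()`, `sleep_expired()`) are hypothetical, and the signature of the real rpc_sleep_on_timeout() is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long ticks_t;   /* hypothetical monotonic tick counter */

struct sleeper {
    ticks_t deadline;            /* absolute expiry time, in ticks */
    bool    armed;
};

/* The deadline is an explicit argument, not a field the caller must
 * remember to set before calling. */
static void sleep_on_until(struct sleeper *s, ticks_t deadline)
{
    assert(s != NULL);
    s->deadline = deadline;
    s->armed = true;
}

/* Wrap-safe "now >= deadline", valid while the two values are less than
 * half the counter range apart. */
static bool sleep_expired(const struct sleeper *s, ticks_t now)
{
    return s->armed && (long)(now - s->deadline) >= 0;
}
```

An absolute deadline also stays correct if the sleep is restarted, whereas a relative timeout would be extended by every restart.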


SUNRPC: Refactor rpc_restart_call/rpc_restart_call_prepare · 9e6fa0bb

Authored by Trond Myklebust on Apr 07, 2019

Clean up.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


SUNRPC: Fix up task signalling · ae67bd38

Authored by Trond Myklebust on Apr 07, 2019

The RPC_TASK_KILLED flag should really not be set from another context
because it can clobber data in the struct task when task->tk_flags is
changed non-atomically.
Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add
a function to set that flag and safely wake up the task.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
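
The hazard described in this commit is that a plain read-modify-write of a flags word from a second context can overwrite a concurrent update made by the owner. A minimal user-space sketch of the remedy, using C11 atomics, is shown below. The names `task`, `TASK_SIGNALLED`, `task_signal()` and `task_is_signalled()` are hypothetical; the kernel uses its own atomic bit operations, and the wake-up step is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_SIGNALLED (1u << 0)   /* hypothetical bit name */

struct task {
    unsigned int flags;      /* owner-only: plain read-modify-write is fine */
    atomic_uint  runstate;   /* shared: touched only with atomic operations */
};

/* Safe to call from any thread. Returns true if this call set the bit. */
static bool task_signal(struct task *t)
{
    unsigned int old;

    assert(t != NULL);
    old = atomic_fetch_or(&t->runstate, TASK_SIGNALLED);
    return !(old & TASK_SIGNALLED);
}

static bool task_is_signalled(struct task *t)
{
    return (atomic_load(&t->runstate) & TASK_SIGNALLED) != 0;
}
```

Keeping the cross-context bit in a separate atomic word means the owner's non-atomic updates to `flags` can never be clobbered by a signaller, and vice versa.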


18 Apr, 2019 1 commit

SUNRPC: Ignore queue transmission errors on successful transmission · a7b1a483

Authored by Trond Myklebust on Apr 15, 2019

If a request transmission fails due to write space or slot unavailability
errors, but the queued task then gets transmitted before it has time to
process the error in call_transmit_status() or call_bc_transmit_status(),
we need to suppress the transmission error code to prevent it from leaking
out of the RPC layer.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>


12 Apr, 2019 1 commit

Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping" · af6b61d7

Authored by Trond Myklebust on Apr 11, 2019

This reverts commit 009a82f6.

The ability to optimise here relies on the compiler being able to optimise
away tail calls to avoid stack overflows. Unfortunately, we are seeing
reports of problems, so let's just revert.
Reported-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


23 Mar, 2019 1 commit

SUNRPC: Don't let RPC_SOFTCONN tasks time out if the transport is connected · d84dd3fb

Authored by Trond Myklebust on Mar 19, 2019

If the transport is still connected, then we do want to allow
RPC_SOFTCONN tasks to retry. They should time out if and only if
the connection is broken.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


16 Mar, 2019 2 commits

SUNRPC: Remove redundant check for the reply length in call_decode() · 5e3863fd

Authored by Trond Myklebust on Mar 15, 2019

Now that we're using the xdr_stream functions to decode the header,
the test for the minimum reply length is redundant.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


SUNRPC: Handle the SYSTEM_ERR rpc error · 928d42f7

Authored by Trond Myklebust on Mar 15, 2019

Handle the SYSTEM_ERR rpc error by retrying the RPC call as if it
were a garbage argument.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


openeuler / Kernel · synced successfully nearly 2 years ago
