- 29 4月, 2020 1 次提交
-
-
由 NeilBrown 提交于
The rpciod workqueue is on the write-out path for freeing dirty memory, so it is important that it never block waiting for memory to be allocated - this can lead to a deadlock. rpc_execute() - which is often called by an rpciod work item - calls rcp_task_release_client() which can lead to rpc_free_client(). rpc_free_client() makes two calls which could potentially block wating for memory allocation. rpc_clnt_debugfs_unregister() calls into debugfs and will block while any of the debugfs files are being accessed. In particular it can block while any of the 'open' methods are being called and all of these use malloc for one thing or another. So this can deadlock if the memory allocation waits for NFS to complete some writes via rpciod. rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't obvious that memory allocations can happen while the lock it held, it is safer to assume they might and to not let rpciod call rpc_clnt_remove_pipedir(). So this patch moves these two calls (together with the final kfree() and rpciod_down()) into a work-item to be run from the system work-queue. rpciod can continue its important work, and the final stages of the free can happen whenever they happen. I have seen this deadlock on a 4.12 based kernel where debugfs used synchronize_srcu() when removing objects. synchronize_srcu() requires a workqueue and there were no free workther threads and none could be allocated. While debugsfs no longer uses SRCU, I believe the deadlock is still possible. Signed-off-by: NNeilBrown <neilb@suse.de> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 22 4月, 2020 1 次提交
-
-
由 Xiyu Yang 提交于
rpc_clnt_test_and_add_xprt() invokes rpc_call_null_helper(), which return the value of rpc_run_task() to "task". Since rpc_run_task() is impossible to return an ERR pointer, there is no need to add the IS_ERR() condition on "task" here. So we need to remove it. Fixes: 7f554890 ("SUNRPC: Allow addition of new transports to a struct rpc_clnt") Signed-off-by: NXiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: NXin Tan <tanxin.ctf@gmail.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 17 3月, 2020 1 次提交
-
-
由 Chuck Lever 提交于
This class can be used to create trace points in either the RPC client or RPC server paths. It simply displays the length of each part of an xdr_buf, which is useful to determine that the transport and XDR codecs are operating correctly. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 16 3月, 2020 2 次提交
-
-
由 Trond Myklebust 提交于
If the RPC call is synchronous, assume the cred is already pinned by the caller. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Add a flag to signal to the RPC layer that the credential is already pinned for the duration of the RPC call. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 15 1月, 2020 1 次提交
-
-
由 Chuck Lever 提交于
The xprtrdma connect logic can return -EPROTO if the underlying device or network path does not support RDMA. This can happen after a device removal/insertion. - When SOFTCONN is set, EPROTO is a permanent error. - When SOFTCONN is not set, EPROTO is treated as a temporary error. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 04 11月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look at the source port when deciding whether or not an RPC call is a replay of a previous call. This requires clients to perform strange TCP gymnastics in order to ensure that when they reconnect to the server, they bind to the same source port. NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics, that do not look at the source port of the connection. This patch therefore ensures they can ignore the rebind requirement. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 24 10月, 2019 1 次提交
-
-
由 Chuck Lever 提交于
Sep 11 16:35:20 manet kernel: call_reserveresult: unrecognized error -512, exiting Diagnostic error messages such as this likely have no value for NFS client administrators. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 21 9月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
If the copy of the RPC reply into our buffers did not complete, and we could end up with a truncated message. In that case, just resend the call. Fixes: a0584ee9 ("SUNRPC: Use struct xdr_stream when decoding...") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 18 9月, 2019 2 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we set task->tk_rpc_status for all RPC level errors so that the caller can distinguish between those and server reply status errors. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Ensure that we dequeue the request from the transport receive queue while we're re-encoding to prevent issues like use-after-free when we release the bvec. Fixes: 75369089 ("SUNRPC: Ensure the bvecs are reset when we re-encode...") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 27 8月, 2019 3 次提交
-
-
由 Trond Myklebust 提交于
If the connection breaks while we're waiting for a reply from the server, then we want to immediately try to reconnect. Fixes: ec6017d9 ("SUNRPC fix regression in umount of a secure mount") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
If a connect or bind attempt returns EADDRINUSE, that means we want to retry with a different port. It is not a fatal connection error. Similarly, ENOBUFS is not fatal, but just indicates a memory allocation issue. Retry after a short delay. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Don't handle errors in call_bind_status()/call_connect_status() if it turns out that a previous call caused it to succeed. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v5.1+
-
- 19 7月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
The bvec tracks the list of pages, so if the number of pages changes due to a re-encode, we need to reset the bvec as well. Fixes: 277e4ab7 ("SUNRPC: Simplify TCP receive code by switching...") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+
-
- 18 7月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
Add a per-transport maximum limit in the socket case, and add helpers to allow the NFSv4 code to discover that limit. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 13 7月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
Ensure that we do the required accounting for the round robin queue when the caller to rpc_init_task() has passed in a transport to be used. Reported-by: NOlga Kornievskaia <aglo@umich.edu> Reported-by: NNeil Brown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 07 7月, 2019 3 次提交
-
-
由 NeilBrown 提交于
With NFSv4.1, different network connections need to be explicitly bound to a session. During session startup, this is not possible so only a single connection must be used for session startup. So add a task flag to disable the default round-robin choice of connections (when nconnect > 1) and force the use of a single connection. Then use that flag on all requests for session management - for consistence, include NFSv4.0 management (SETCLIENTID) and session destruction Reported-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Add an argument to struct rpc_create_args that allows the specification of how many transport connections you want to set up to the server. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
For now, just count the queue length. It is less accurate than counting number of bytes queued, but easier to implement. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 22 6月, 2019 2 次提交
-
-
由 Trond Myklebust 提交于
All callers of __rpc_clone_client() pass in a value for args->cred, meaning that the credential gets assigned and referenced in the call to rpc_new_client(). Reported-by: NIdo Schimmel <idosch@idosch.org> Fixes: 79caa5fa ("SUNRPC: Cache cred of process creating the rpc_client") Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Tested-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Lin Yi 提交于
rpc_clnt_add_xprt take a reference to struct rpc_xprt_switch, but forget to release it before return, may lead to a memory leak. Signed-off-by: NLin Yi <teroincn@163.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 31 5月, 2019 2 次提交
-
-
由 Trond Myklebust 提交于
The addition of rpc_check_timeout() to call_decode causes an Oops when the RPCSEC_GSS credential is rejected. The reason is that rpc_decode_header() will call xprt_release() in order to free task->tk_rqstp, which is needed by rpc_check_timeout() to check whether or not we should exit due to a soft timeout. The fix is to move the call to xprt_release() into call_decode() so we can perform it after rpc_check_timeout(). Reported-by: NOlga Kornievskaia <olga.kornievskaia@gmail.com> Reported-by: NNick Bowler <nbowler@draconx.ca> Fixes: cea57789 ("SUNRPC: Clean up") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Olga Kornievskaia 提交于
If call_status returns ENOTCONN, we need to re-establish the connection state after. Otherwise the client goes into an infinite loop of call_encode, call_transmit, call_status (ENOTCONN), call_encode. Fixes: c8485e4d ("SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit()") Signed-off-by: NOlga Kornievskaia <kolga@netapp.com> Cc: stable@vger.kernel.org # v2.6.29+ Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 21 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 5月, 2019 1 次提交
-
-
由 ZhangXiaoxu 提交于
If the rpc.gssd always return cred success, but now the cred is expired, then the task will loop in call_refresh and call_transmit. Exit the rpc task after retry. Signed-off-by: NZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 27 4月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
When converting kuids to AUTH_UNIX creds, etc we will want to use the same user namespace as the process that created the rpc client. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 26 4月, 2019 8 次提交
-
-
由 Trond Myklebust 提交于
Add the 'softerr' rpc client flag that sets the RPC_TASK_TIMEOUT flag on all new rpc tasks that are attached to that rpc client. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
In particular, the timeout messages can be very noisy, so we ought to ratelimit them in order to avoid spamming the syslog. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
If a soft NFSv4 request is sent, then we don't need it to time out unless the connection breaks. The reason is that as long as the connection is unbroken, the protocol states that the server is not allowed to drop the request. IOW: as long as the connection remains unbroken, the client may assume that all transmitted RPC requests are being processed by the server, and that retransmissions and timeouts of those requests are unwarranted. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Add variables to track RPC level errors so that we can distinguish between issue that arose in the RPC transport layer as opposed to those arising from the reply message. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Add a helper to ensure that debugfs and friends print out the correct current task timeout value. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Clean up the RPC task sleep interfaces by replacing the task->tk_timeout 'hidden parameter' to rpc_sleep_on() with a new function that takes an absolute timeout. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
Clean up. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Trond Myklebust 提交于
The RPC_TASK_KILLED flag should really not be set from another context because it can clobber data in the struct task when task->tk_flags is changed non-atomically. Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add a function to set that flag and safely wake up the task. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 18 4月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
If a request transmission fails due to write space or slot unavailability errors, but the queued task then gets transmitted before it has time to process the error in call_transmit_status() or call_bc_transmit_status(), we need to suppress the transmission error code to prevent it from leaking out of the RPC layer. Reported-by: NChuck Lever <chuck.lever@oracle.com> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com> Tested-by: NChuck Lever <chuck.lever@oracle.com>
-
- 12 4月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
This reverts commit 009a82f6. The ability to optimise here relies on compiler being able to optimise away tail calls to avoid stack overflows. Unfortunately, we are seeing reports of problems, so let's just revert. Reported-by: NDaniel Mack <daniel@zonque.org> Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 23 3月, 2019 1 次提交
-
-
由 Trond Myklebust 提交于
If the transport is still connected, then we do want to allow RPC_SOFTCONN tasks to retry. They should time out if and only if the connection is broken. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
- 16 3月, 2019 2 次提交
-
-
由 Trond Myklebust 提交于
Now that we're using the xdr_stream functions to decode the header, the test for the minimum reply length is redundant. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-
由 Trond Myklebust 提交于
Handle the SYSTEM_ERR rpc error by retrying the RPC call as if it were a garbage argument. Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
-