- 06 8月, 2016 2 次提交
-
-
由 Trond Myklebust 提交于
...and ensure that we propagate it to new transports on the same client. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
When the connect attempt fails and backs off, we should start the clock at the last connection attempt, not time at which we queue up the reconnect job. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 05 8月, 2016 1 次提交
-
-
由 NeilBrown 提交于
If the net.ipv6.conf.*.use_temp_addr sysctl is set to '2', then TCP connections over IPv6 will prefer a 'private' source address. These eventually expire and become invalid, typically after a week, but the time is configurable. When the local address becomes invalid the client will not be able to receive replies from the server. Eventually the connection will timeout or break and a new connection will be established, but this can take half an hour (typically TCP connection break time). RFC 4941, which describes private IPv6 addresses, acknowledges that some applications might not work well with them and that the application may explicitly a request non-temporary (i.e. "public") address. I believe this is correct for SUNRPC clients. Without this change, a client will occasionally experience a long delay if private addresses have been enabled. The privacy offered by private addresses is of little value for an NFS server which requires client authentication. For NFSv3 this will often not be a problem because idle connections are closed after 5 minutes. For NFSv4 connections never go idle due to the period RENEW (or equivalent) request. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 02 8月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
If the connect attempt immediately fails with an EADDRNOTAVAIL error, then that means our choice of source port number was bad. This error is expected when we set the SO_REUSEPORT socket option and we have 2 sockets sharing the same source and destination address and port combinations. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Fixes: 402e23b4 ("SUNRPC: Fix stupid typo in xs_sock_set_reuseport") Cc: stable@vger.kernel.org # v4.0+
-
- 20 7月, 2016 3 次提交
-
-
由 Frank Sorenson 提交于
The current min/max resvport settings are independently limited by the entire range of allowed ports, so max_resvport can be set to a port lower than min_resvport. Prevent inversion of min/max values when set through sysfs and module parameter by setting the limits dependent on each other. Signed-off-by: NFrank Sorenson <sorenson@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Frank Sorenson 提交于
The current min/max resvport settings are independently limited by the entire range of allowed ports, so max_resvport can be set to a port lower than min_resvport. Prevent inversion of min/max values when set through sysctl by setting the limits dependent on each other. Signed-off-by: NFrank Sorenson <sorenson@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Frank Sorenson 提交于
The range calculation for choosing the random reserved port will panic with divide-by-zero when min_resvport == max_resvport, a range of one port, not zero. Fix the reserved port range calculation by adding one to the difference. Signed-off-by: NFrank Sorenson <sorenson@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 15 6月, 2016 1 次提交
-
-
由 J. Bruce Fields 提交于
The spec allows backchannels for multiple clients to share the same tcp connection. When that happens, we need to use the same xprt for all of them. Similarly, we need the same xps. This fixes list corruption introduced by the multipath code. Cc: stable@vger.kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com> Acked-by: NTrond Myklebust <trondmy@primarydata.com>
-
- 14 6月, 2016 4 次提交
-
-
由 Trond Myklebust 提交于
The current test is racy when dealing with fast NICs. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
rpciod can easily get congested due to the long list of queued rpc_tasks. Having the receive queue wait in turn for those tasks to complete can therefore be a bottleneck. Address the problem by separating the workqueues into: - rpciod: manages rpc_tasks - xprtiod: manages transport related work. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
The only difference between the two at this point is the reset of the connection timeout, and since everyone expect tcp ignore that value, we can just throw it into the generic function. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Do not queue the client receive work if we're still processing. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 5月, 2016 1 次提交
-
-
由 Chuck Lever 提交于
RPC-over-RDMA transports have a limit on how large a backward direction (backchannel) RPC message can be. Ensure that the NFSv4.x CREATE_SESSION operation advertises this limit to servers. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Tested-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 13 5月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
sunrpc is using SOCKWQ_ASYNC_NOSPACE without setting SOCK_FASYNC, so the recent optimizations done in sk_set_bit() and sk_clear_bit() broke it. There is still the risk that a subsequent sock_fasync() call would clear SOCK_FASYNC, but sunrpc does not use this yet. Fixes: 9317bb69 ("net: SOCKWQ_ASYNC_NOSPACE optimizations") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NJiri Pirko <jiri@resnulli.us> Reported-by: NHuang, Ying <ying.huang@intel.com> Tested-by: NJiri Pirko <jiri@resnulli.us> Tested-by: NHuang, Ying <ying.huang@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 4月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(), and UDP6_INC_STATS_BH() to __UDP6_INC_STATS() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2016 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
sock_owned_by_user should not be used without socket lock held. It seems to be a common practice to check .owned before lock reclassification, so provide a little help to abstract this check away. Cc: linux-cifs@vger.kernel.org Cc: linux-bluetooth@vger.kernel.org Cc: linux-nfs@vger.kernel.org Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 4月, 2016 1 次提交
-
-
由 Willem de Bruijn 提交于
Commit e6afc8ac modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing") Reported-by: NFranklin S Cooper Jr. <fcooper@ti.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Tested-by: NThierry Reding <treding@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 07 1月, 2016 1 次提交
-
-
由 Trond Myklebust 提交于
We're seeing hangs in the NFS client code, with loops of the form: RPC: 30317 xmit incomplete (267368 left of 524448) RPC: 30317 call_status (status -11) RPC: 30317 call_transmit (status 0) RPC: 30317 xprt_prepare_transmit RPC: 30317 xprt_transmit(524448) RPC: xs_tcp_send_request(267368) = -11 RPC: 30317 xmit incomplete (267368 left of 524448) RPC: 30317 call_status (status -11) RPC: 30317 call_transmit (status 0) RPC: 30317 xprt_prepare_transmit RPC: 30317 xprt_transmit(524448) Turns out commit ceb5d58b ("net: fix sock_wake_async() rcu protection") moved SOCKWQ_ASYNC_NOSPACE out of sock->flags and into sk->sk_wq->flags, however it never tried to fix up the code in net/sunrpc. The new idiom is to use the flags in the RCU protected struct socket_wq. While we're at it, clear out the now redundant places where we set/clear SOCKWQ_ASYNC_NOSPACE and SOCK_NOSPACE. In principle, sk_stream_wait_memory() is supposed to set these for us, so we only need to clear them in the particular case of our ->write_space() callback. Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection") Cc: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org # 4.4 Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 28 12月, 2015 1 次提交
-
-
由 Stefan Hajnoczi 提交于
xs_reclassify_socket4() and friends used to be called directly. xs_reclassify_socket() is called instead nowadays. The xs_reclassify_socketX() helper functions are empty when CONFIG_DEBUG_LOCK_ALLOC is not defined. Drop them since they have no callers. Note that AF_LOCAL still calls xs_reclassify_socketu() directly but is easily converted to generic xs_reclassify_socket(). Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 02 12月, 2015 1 次提交
-
-
由 Eric Dumazet 提交于
This patch is a cleanup to make following patch easier to review. Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA from (struct socket)->flags to a (struct socket_wq)->flags to benefit from RCU protection in sock_wake_async() To ease backports, we rename both constants. Two new helpers, sk_set_bit(int nr, struct sock *sk) and sk_clear_bit(int net, struct sock *sk) are added so that following patch can change their implementation. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2015 1 次提交
-
-
由 Andrzej Hajda 提交于
Due to incorrect len type bc_send_request returned always zero. The problem has been detected using proposed semantic patch scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2046107Signed-off-by: NAndrzej Hajda <a.hajda@samsung.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 03 11月, 2015 2 次提交
-
-
由 Chuck Lever 提交于
Forechannel transports get their own "bc_up" method to create an endpoint for the backchannel service. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> [Anna Schumaker: Add forward declaration of struct net to xprt.h] Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
由 Chuck Lever 提交于
xprt_{setup,destroy}_backchannel() won't be adequate for RPC/RMDA bi-direction. In particular, receive buffers have to be pre- registered and posted in order to receive incoming backchannel requests. Add a virtual function call to allow the insertion of appropriate backchannel setup and destruction methods for each transport. In addition, freeing a backchannel request is a little different for RPC/RDMA. Introduce an rpc_xprt_op to handle the difference. Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Tested-By: NDevesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>
-
- 24 10月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
If we're sending more pages via kernel_sendpage(), then set MSG_SENDPAGE_NOTLAST. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 08 10月, 2015 5 次提交
-
-
由 Trond Myklebust 提交于
If we're sending more than one page via kernel_sendpage(), then set MSG_SENDPAGE_NOTLAST between the pages so that we don't send suboptimal frames (see commit 2f533844 and commit 35f9c09f). Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Now that we've done it for TCP and UDP, let's convert AF_LOCAL as well. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Now that we've done it for TCP, let's convert UDP as well. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Stream protocols such as TCP can often build up a backlog of data to be read due to ordering. Combine this with the fact that some workloads such as NFS read()-intensive workloads need to receive a lot of data per RPC call, and it turns out that receiving the data from inside a softirq context can cause starvation. The following patch moves the TCP data receive into a workqueue context. We still end up calling tcp_read_sock(), but we do so from a process context, meaning that softirqs are enabled for most of the time. With this patch, I see a doubling of read bandwidth when running a multi-threaded iozone workload between a virtual client and server setup. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
Move the TCP data receive loop out of xs_tcp_data_ready(). Doing so will allow us to move the data receive out of the softirq context in a set of followup patches. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 20 9月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
Under all conditions, it should be quite sufficient just to mark the socket as disconnected. It will then be closed by the transport shutdown or reconnect code. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 9月, 2015 2 次提交
-
-
由 Trond Myklebust 提交于
Commit 718ba5b8, moved the responsibility for unlocking the socket to xs_tcp_setup_socket, meaning that the socket will be unlocked before we know that it has finished trying to connect. The following patch is based on an initial patch by Russell King to ensure that we delay clearing the XPRT_CONNECTING flag until we either know that we failed to initiate a connection attempt, or the connection attempt itself failed. Fixes: 718ba5b8 ("SUNRPC: Add helpers to prevent socket create from racing") Reported-by: NRussell King <linux@arm.linux.org.uk> Reported-by: NRussell King <rmk+kernel@arm.linux.org.uk> Tested-by: NRussell King <rmk+kernel@arm.linux.org.uk> Tested-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
When we're destroying the socket transport, we need to ensure that we cancel any existing delayed connection attempts, and order them w.r.t. the call to xs_close(). Reported-by: N"Suzuki K. Poulose" <suzuki.poulose@arm.com> Acked-by: NJeff Layton <jlayton@poochiereds.net> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 30 8月, 2015 2 次提交
-
-
由 Trond Myklebust 提交于
Add a shutdown() call before we release the socket in order to ensure the reset is sent before we try to reconnect. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
由 Trond Myklebust 提交于
In case the reconnection attempt fails. Cc: stable@vger.kernel.org Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 20 8月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
Follow up to commit c4a7ca77 ("SUNRPC: Allow waiting on memory allocation"). Allows the RPC socket code to do non-IO blocking. Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 18 8月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
It is rather pointless to test the value of transport->inet after calling xs_reset_transport(), since it will always be zero, and so we will never see any exponential back off behaviour. Also don't force early connections for SOFTCONN tasks. If the server disconnects us, we should respect the exponential backoff. Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 28 7月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 27 7月, 2015 1 次提交
-
-
由 NeilBrown 提交于
The networking layer does not reliably report the distinction between a non-block write failing because: 1/ the queue is too full already and 2/ a memory allocation attempt failed. The distinction is important because in the first case it is appropriate to retry as soon as the socket reports that it is writable, and in the second case a small delay is required as the socket will most likely report as writable but kmalloc could still fail. sk_stream_wait_memory() exhibits this distinction nicely, setting 'vm_wait' if a small wait is needed. However in the non-blocking case it always returns -EAGAIN no matter the cause of the failure. This -EAGAIN call get all the way to sunrpc. The sunrpc layer expects EAGAIN to indicate the first cause, and ENOBUFS to indicate the second. Various documentation suggests that this is not unreasonable, but does not guarantee the desired error codes. The result of getting -EAGAIN when -ENOBUFS is expected is that the send is tried again in a tight loop and soft lockups are reported. so: add tests after calls to xs_sendpages() to translate -EAGAIN into -ENOBUFS if the socket is writable. This cannot happen inside xs_sendpages() as the test for "is socket writable" is different between TCP and UDP. With this change, the tight loop retrying xs_sendpages() becomes a loop which only retries every 250ms, and so will not trigger a soft-lockup warning. It is possible that the write did fail because the queue was too full and by the time xs_sendpages() completed, the queue was writable again. In this case an extra 250ms delay is inserted that isn't really needed. This circumstance suggests a degree of congestion so a delay is not necessarily a bad thing, and it can only cause a single 250ms delay, not a series of them. Signed-off-by: NNeilBrown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-
- 03 7月, 2015 1 次提交
-
-
由 Trond Myklebust 提交于
ENOBUFS means that memory allocations are failing due to an actual low memory situation. It should not be confused with being out of socket buffer space. Handle the problem by just punting to the delay in call_status. Reported-by: NNeil Brown <neilb@suse.com> Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
-