Commits · 99091700659f4df965e138b38b4fa26a29b7eade · openeuler / raspberrypi-kernel

06 Aug, 2016 2 commits

SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout · 3851f1cd

Authored by Trond Myklebust on Aug 04, 2016

...and ensure that we propagate it to new transports on the same
client.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Fix reconnection timeouts · 02910177

Authored by Trond Myklebust on Aug 04, 2016

When the connect attempt fails and backs off, we should start the clock
at the last connection attempt, not the time at which we queue up the
reconnect job.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
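
The two reconnect changes above can be pictured with a small userspace sketch. It assumes times are plain tick counts and uses hypothetical helper names (`reconnect_delay`, `next_backoff`); it is not the kernel's code, only an illustration of measuring the delay from the last attempt and capping the backoff.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not kernel functions. Times are arbitrary ticks.
 *
 * Delay still owed before the next connect attempt, measured from the
 * last connection attempt rather than from the moment the reconnect job
 * is queued.
 */
static uint64_t reconnect_delay(uint64_t now, uint64_t last_attempt,
                                uint64_t backoff)
{
    uint64_t due = last_attempt + backoff;

    return due > now ? due - now : 0;
}

/* Double the backoff after a failure, but never exceed the cap. */
static uint64_t next_backoff(uint64_t backoff, uint64_t max_backoff)
{
    uint64_t doubled = backoff * 2;

    return doubled > max_backoff ? max_backoff : doubled;
}
```

With a cap equal to the maximum RPC message timeout, repeated failures grow the delay only up to that cap, and time already spent since the last attempt is credited against the wait.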

05 Aug, 2016 1 commit

SUNRPC: disable the use of IPv6 temporary addresses. · d88e4d82

Authored by NeilBrown on Aug 04, 2016

If the net.ipv6.conf.*.use_tempaddr sysctl is set to '2',
then TCP connections over IPv6 will prefer a 'private' source
address.
These eventually expire and become invalid, typically after a week,
but the time is configurable.

When the local address becomes invalid the client will not be able to
receive replies from the server.  Eventually the connection will time out
or break and a new connection will be established, but this can take
half an hour (the typical TCP connection break time).

RFC 4941, which describes private IPv6 addresses, acknowledges that some
applications might not work well with them and that the application may
explicitly request a non-temporary (i.e. "public") address.

I believe this is correct for SUNRPC clients.  Without this change, a
client will occasionally experience a long delay if private addresses
have been enabled.

The privacy offered by private addresses is of little value for an NFS
server which requires client authentication.

For NFSv3 this will often not be a problem because idle connections are
closed after 5 minutes.  For NFSv4 connections never go idle due to the
periodic RENEW (or equivalent) request.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

02 Aug, 2016 1 commit

SUNRPC: Handle EADDRNOTAVAIL on connection failures · 1f4c17a0

Authored by Trond Myklebust on Aug 01, 2016

If the connect attempt immediately fails with an EADDRNOTAVAIL error, then
that means our choice of source port number was bad.
This error is expected when we set the SO_REUSEPORT socket option and we
have 2 sockets sharing the same source and destination address and port
combinations.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fixes: 402e23b4 ("SUNRPC: Fix stupid typo in xs_sock_set_reuseport")
Cc: stable@vger.kernel.org # v4.0+

20 Jul, 2016 3 commits

sunrpc: Prevent resvport min/max inversion via sysfs and module parameter · ffb6ca33

Authored by Frank Sorenson on Jul 08, 2016

The current min/max resvport settings are independently limited
by the entire range of allowed ports, so max_resvport can be
set to a port lower than min_resvport.

Prevent inversion of min/max values when set through sysfs and
module parameter by setting the limits dependent on each other.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

sunrpc: Prevent resvport min/max inversion via sysctl · e08ea3a9

Authored by Frank Sorenson on Jul 08, 2016

The current min/max resvport settings are independently limited
by the entire range of allowed ports, so max_resvport can be
set to a port lower than min_resvport.

Prevent inversion of min/max values when set through sysctl by
setting the limits dependent on each other.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
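
As a rough illustration of making the two limits depend on each other, the sketch below rejects any update that would invert the range. The struct, the setter names and the absolute bounds (1 and 65535) are assumptions made for this example; they are not taken from the patches.

```c
#include <stdbool.h>

/* Assumed absolute bounds for this sketch only. */
#define RESVPORT_LOW  1u
#define RESVPORT_HIGH 65535u

/* Hypothetical container for the two tunables. */
struct resvport_range {
    unsigned int min;
    unsigned int max;
};

/* The new minimum may not go below the floor or above the current maximum. */
static bool set_min_resvport(struct resvport_range *r, unsigned int v)
{
    if (v < RESVPORT_LOW || v > r->max)
        return false;
    r->min = v;
    return true;
}

/* The new maximum may not go above the ceiling or below the current minimum. */
static bool set_max_resvport(struct resvport_range *r, unsigned int v)
{
    if (v < r->min || v > RESVPORT_HIGH)
        return false;
    r->max = v;
    return true;
}
```

Starting from a range of 665 to 1023, an attempt to set the maximum to 600 is refused and the range is left unchanged, whereas setting the minimum to 1023 succeeds and yields a valid one-port range.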

sunrpc: Fix reserved port range calculation · 5d71899a

Authored by Frank Sorenson on Jul 08, 2016

The range calculation for choosing the random reserved port will panic
with divide-by-zero when min_resvport == max_resvport, a range of one
port, not zero.

Fix the reserved port range calculation by adding one to the difference.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
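
The off-by-one described above is easy to see in isolation. The following is a minimal sketch, assuming the caller supplies a random number and that min <= max with both values in the 16-bit port space; `pick_resvport` is a hypothetical name, not the kernel function.

```c
/*
 * Hypothetical helper: map a random number onto the inclusive range
 * [min, max]. The "+ 1" makes a one-port range (min == max) produce a
 * range of 1 instead of 0, so the modulo never divides by zero.
 */
static unsigned int pick_resvport(unsigned int min, unsigned int max,
                                  unsigned int rnd)
{
    unsigned int range = max - min + 1;

    return min + rnd % range;
}
```

Without the added one, `max - min` is zero when both limits are equal and the modulo operation divides by zero, which matches the panic the commit describes.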

15 Jun, 2016 1 commit

rpc: share one xps between all backchannels · 39a9beab

Authored by J. Bruce Fields on May 17, 2016

The spec allows backchannels for multiple clients to share the same tcp
connection.  When that happens, we need to use the same xprt for all of
them.  Similarly, we need the same xps.

This fixes list corruption introduced by the multipath code.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Trond Myklebust <trondmy@primarydata.com>

14 Jun, 2016 4 commits

SUNRPC: Fix suspicious enobufs issues. · 9ffadfbc

Authored by Trond Myklebust on May 29, 2016

The current test is racy when dealing with fast NICs.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: RPC transport queue must be low latency · 40a5f1b1

Authored by Trond Myklebust on May 27, 2016

rpciod can easily get congested due to the long list of queued rpc_tasks.
Having the receive queue wait in turn for those tasks to complete can
therefore be a bottleneck.

Address the problem by separating the workqueues into:
- rpciod: manages rpc_tasks
- xprtiod: manages transport related work.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Consolidate xs_tcp_data_ready and xs_data_ready · 5157b956

Authored by Trond Myklebust on May 29, 2016

The only difference between the two at this point is the reset of
the connection timeout, and since everyone except TCP ignores that value,
we can just throw it into the generic function.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Small optimisation of client receive · 42d42a5b

Authored by Trond Myklebust on May 23, 2016

Do not queue the client receive work if we're still processing.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

18 May, 2016 1 commit

sunrpc: Advertise maximum backchannel payload size · 6b26cc8c

Authored by Chuck Lever on May 02, 2016

RPC-over-RDMA transports have a limit on how large a backward
direction (backchannel) RPC message can be. Ensure that the NFSv4.x
CREATE_SESSION operation advertises this limit to servers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

13 May, 2016 1 commit

sunrpc: set SOCK_FASYNC · b4411457

Authored by Eric Dumazet on May 12, 2016

sunrpc is using SOCKWQ_ASYNC_NOSPACE without setting SOCK_FASYNC,
so the recent optimizations done in sk_set_bit() and sk_clear_bit()
broke it.

There is still the risk that a subsequent sock_fasync() call
would clear SOCK_FASYNC, but sunrpc does not use this yet.

Fixes: 9317bb69 ("net: SOCKWQ_ASYNC_NOSPACE optimizations")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jiri Pirko <jiri@resnulli.us>
Reported-by: Huang, Ying <ying.huang@intel.com>
Tested-by: Jiri Pirko <jiri@resnulli.us>
Tested-by: Huang, Ying <ying.huang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Apr, 2016 1 commit

net: udp: rename UDP_INC_STATS_BH() · 02c22347

Authored by Eric Dumazet on Apr 27, 2016

Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(),
and UDP6_INC_STATS_BH() to __UDP6_INC_STATS()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Apr, 2016 1 commit

sock: tigthen lockdep checks for sock_owned_by_user · fafc4e1e

Authored by Hannes Frederic Sowa on Apr 08, 2016

sock_owned_by_user should not be used without socket lock held. It seems
to be a common practice to check .owned before lock reclassification, so
provide a little help to abstract this check away.

Cc: linux-cifs@vger.kernel.org
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Apr, 2016 1 commit

sunrpc: do not pull udp headers on receive · 1da8c681

Authored by Willem de Bruijn on Apr 07, 2016

Commit e6afc8ac modified the udp receive path by pulling the udp
header before queuing an skbuff onto the receive queue.

Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp
socket. Modify this receive path to also no longer expect udp
headers.

Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
Reported-by: Franklin S Cooper Jr. <fcooper@ti.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Feb, 2016 1 commit

SUNRPC: Use the multipath iterator to assign a transport to each task · fb43d172

Authored by Trond Myklebust on Jan 30, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

07 Jan, 2016 1 commit

SUNRPC: Fixup socket wait for memory · 13331a55

Authored by Trond Myklebust on Jan 06, 2016

We're seeing hangs in the NFS client code, with loops of the form:

 RPC: 30317 xmit incomplete (267368 left of 524448)
 RPC: 30317 call_status (status -11)
 RPC: 30317 call_transmit (status 0)
 RPC: 30317 xprt_prepare_transmit
 RPC: 30317 xprt_transmit(524448)
 RPC:       xs_tcp_send_request(267368) = -11
 RPC: 30317 xmit incomplete (267368 left of 524448)
 RPC: 30317 call_status (status -11)
 RPC: 30317 call_transmit (status 0)
 RPC: 30317 xprt_prepare_transmit
 RPC: 30317 xprt_transmit(524448)

Turns out commit ceb5d58b ("net: fix sock_wake_async() rcu protection")
moved SOCKWQ_ASYNC_NOSPACE out of sock->flags and into sk->sk_wq->flags,
however it never tried to fix up the code in net/sunrpc.

The new idiom is to use the flags in the RCU protected struct socket_wq.
While we're at it, clear out the now redundant places where we set/clear
SOCKWQ_ASYNC_NOSPACE and SOCK_NOSPACE. In principle, sk_stream_wait_memory()
is supposed to set these for us, so we only need to clear them in the
particular case of our ->write_space() callback.

Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
Cc: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

28 Dec, 2015 1 commit

SUNRPC: drop unused xs_reclassify_socketX() helpers · d1358917

Authored by Stefan Hajnoczi on Dec 02, 2015

xs_reclassify_socket4() and friends used to be called directly.
xs_reclassify_socket() is called instead nowadays.

The xs_reclassify_socketX() helper functions are empty when
CONFIG_DEBUG_LOCK_ALLOC is not defined.  Drop them since they have no
callers.

Note that AF_LOCAL still calls xs_reclassify_socketu() directly but is
easily converted to generic xs_reclassify_socket().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

02 Dec, 2015 1 commit

net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072

Authored by Eric Dumazet on Nov 29, 2015

This patch is a cleanup to make following patch easier to
review.

Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async()

To ease backports, we rename both constants.

Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int nr, struct sock *sk) are added so that
following patch can change their implementation.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Nov, 2015 1 commit

SUNRPC: fix variable type · 7fc56136

Authored by Andrzej Hajda on Sep 24, 2015

Due to an incorrect type for len, bc_send_request always returned zero.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
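
The class of bug named above can be reproduced outside the kernel. The sketch below is a generic illustration with hypothetical function names, not the actual bc_send_request code: a negative error code stored in an unsigned variable turns into a large positive value, so a "success means return 0" check swallows the error.

```c
#include <stdint.h>

/*
 * Hypothetical illustration. With an unsigned len, a negative send
 * result such as -11 wraps to a huge positive number, the "len > 0"
 * test is true, and the function reports success.
 */
static int send_request_unsigned(int send_result)
{
    uint32_t len = (uint32_t)send_result;

    if (len > 0)
        len = 0;
    return (int)len;
}

/* With a signed len the error code survives and reaches the caller. */
static int send_request_signed(int send_result)
{
    int len = send_result;

    if (len > 0)
        len = 0;
    return len;
}
```

Passing -11 to the unsigned variant returns 0, while the signed variant returns -11. Both return 0 for a positive byte count.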

03 Nov, 2015 2 commits

NFS: Enable client side NFSv4.1 backchannel to use other transports · 76566773

Authored by Chuck Lever on Oct 24, 2015

Forechannel transports get their own "bc_up" method to create an
endpoint for the backchannel service.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[Anna Schumaker: Add forward declaration of struct net to xprt.h]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

SUNRPC: Abstract backchannel operations · 42e5c3e2

Authored by Chuck Lever on Oct 24, 2015

xprt_{setup,destroy}_backchannel() won't be adequate for RPC/RDMA
bi-direction. In particular, receive buffers have to be pre-registered
and posted in order to receive incoming backchannel requests.

Add a virtual function call to allow the insertion of appropriate
backchannel setup and destruction methods for each transport.

In addition, freeing a backchannel request is a little different
for RPC/RDMA. Introduce an rpc_xprt_op to handle the difference.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

24 Oct, 2015 1 commit

SUNRPC: Use MSG_SENDPAGE_NOTLAST when calling sendpage() · 226453d8

Authored by Trond Myklebust on Oct 07, 2015

If we're sending more pages via kernel_sendpage(), then set
MSG_SENDPAGE_NOTLAST.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

08 Oct, 2015 5 commits

SUNRPC: Use MSG_SENDPAGE_NOTLAST in xs_send_pagedata() · 31303d6c

Authored by Trond Myklebust on Oct 06, 2015

If we're sending more than one page via kernel_sendpage(), then set
MSG_SENDPAGE_NOTLAST between the pages so that we don't send suboptimal
frames (see commit 2f533844 and commit 35f9c09f).
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Move AF_LOCAL receive data path into a workqueue context · a2648094

Authored by Trond Myklebust on Oct 06, 2015

Now that we've done it for TCP and UDP, let's convert AF_LOCAL as well.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Move UDP receive data path into a workqueue context · f9b2ee71

Authored by Trond Myklebust on Oct 06, 2015

Now that we've done it for TCP, let's convert UDP as well.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Move TCP receive data path into a workqueue context · edc1b01c

Authored by Trond Myklebust on Oct 05, 2015

Stream protocols such as TCP can often build up a backlog of data to be
read due to ordering. Combine this with the fact that some workloads such
as NFS read()-intensive workloads need to receive a lot of data per RPC
call, and it turns out that receiving the data from inside a softirq
context can cause starvation.

The following patch moves the TCP data receive into a workqueue context.
We still end up calling tcp_read_sock(), but we do so from a process
context, meaning that softirqs are enabled for most of the time.

With this patch, I see a doubling of read bandwidth when running a
multi-threaded iozone workload between a virtual client and server setup.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Refactor TCP receive · 66d7a56a

Authored by Trond Myklebust on Oct 05, 2015

Move the TCP data receive loop out of xs_tcp_data_ready(). Doing so
will allow us to move the data receive out of the softirq context in
a set of followup patches.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

20 Sep, 2015 1 commit

SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose · 4b0ab51d

Authored by Trond Myklebust on Sep 18, 2015

Under all conditions, it should be quite sufficient just to mark
the socket as disconnected. It will then be closed by the
transport shutdown or reconnect code.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

18 Sep, 2015 2 commits

SUNRPC: Ensure that we wait for connections to complete before retrying · 0fdea1e8

Authored by Trond Myklebust on Sep 16, 2015

Commit 718ba5b8 moved the responsibility for unlocking the socket to
xs_tcp_setup_socket, meaning that the socket will be unlocked before we
know that it has finished trying to connect. The following patch is based on
an initial patch by Russell King to ensure that we delay clearing the
XPRT_CONNECTING flag until we either know that we failed to initiate
a connection attempt, or the connection attempt itself failed.

Fixes: 718ba5b8 ("SUNRPC: Add helpers to prevent socket create from racing")
Reported-by: Russell King <linux@arm.linux.org.uk>
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Fix races between socket connection and destroy code · 03c78827

Authored by Trond Myklebust on Sep 17, 2015

When we're destroying the socket transport, we need to ensure that
we cancel any existing delayed connection attempts, and order them
w.r.t. the call to xs_close().
Reported-by: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

30 Aug, 2015 2 commits

SUNRPC: Prevent SYN+SYNACK+RST storms · 09939204

Authored by Trond Myklebust on Aug 29, 2015

Add a shutdown() call before we release the socket in order to ensure the
reset is sent before we try to reconnect.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: xs_reset_transport must mark the connection as disconnected · 0c78789e

Authored by Trond Myklebust on Aug 29, 2015

In case the reconnection attempt fails.

Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

20 Aug, 2015 1 commit

SUNRPC: Allow sockets to do GFP_NOIO allocations · c2126157

Authored by Trond Myklebust on Aug 19, 2015

Follow up to commit c4a7ca77 ("SUNRPC: Allow waiting on memory
allocation"). Allows the RPC socket code to do non-IO blocking.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

18 Aug, 2015 1 commit

SUNRPC: Fix a thinko in xs_connect() · 99b1a4c3

Authored by Trond Myklebust on Aug 13, 2015

It is rather pointless to test the value of transport->inet after
calling xs_reset_transport(), since it will always be zero, and
so we will never see any exponential back off behaviour.
Also don't force early connections for SOFTCONN tasks. If the server
disconnects us, we should respect the exponential backoff.

Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

28 Jul, 2015 1 commit

SUNRPC: Report TCP errors to the caller · f580dd04

Authored by Trond Myklebust on Jul 11, 2015

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

27 Jul, 2015 1 commit

sunrpc: translate -EAGAIN to -ENOBUFS when socket is writable. · 743c69e7

Authored by NeilBrown on Jul 27, 2015

The networking layer does not reliably report the distinction between
a non-blocking write failing because:
 1/ the queue is too full already and
 2/ a memory allocation attempt failed.

The distinction is important because in the first case it is
appropriate to retry as soon as the socket reports that it is
writable, and in the second case a small delay is required as the
socket will most likely report as writable but kmalloc could still
fail.

sk_stream_wait_memory() exhibits this distinction nicely, setting
'vm_wait' if a small wait is needed.  However in the non-blocking case
it always returns -EAGAIN no matter the cause of the failure.  This
-EAGAIN can get all the way to sunrpc.

The sunrpc layer expects EAGAIN to indicate the first cause, and
ENOBUFS to indicate the second.  Various documentation suggests that
this is not unreasonable, but does not guarantee the desired error
codes.

The result of getting -EAGAIN when -ENOBUFS is expected is that the
send is tried again in a tight loop and soft lockups are reported.

So: add tests after calls to xs_sendpages() to translate -EAGAIN into
-ENOBUFS if the socket is writable.  This cannot happen inside
xs_sendpages() as the test for "is socket writable" is different
between TCP and UDP.

With this change, the tight loop retrying xs_sendpages() becomes a
loop which only retries every 250ms, and so will not trigger a
soft-lockup warning.

It is possible that the write did fail because the queue was too full
and by the time xs_sendpages() completed, the queue was writable
again.  In this case an extra 250ms delay is inserted that isn't
really needed.  This circumstance suggests a degree of congestion so a
delay is not necessarily a bad thing, and it can only cause a single
250ms delay, not a series of them.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
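
The translation described in this commit can be summarised in a few lines. This is a userspace sketch with a hypothetical function name and a caller-supplied "socket is writable" flag; the real check differs between TCP and UDP, as the message notes.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: if a send failed with -EAGAIN although the socket
 * reports itself writable, the likely cause is a failed memory
 * allocation rather than a full queue, so report -ENOBUFS and let the
 * caller back off briefly instead of retrying in a tight loop.
 */
static int translate_send_status(int status, bool socket_writable)
{
    if (status == -EAGAIN && socket_writable)
        return -ENOBUFS;
    return status;
}
```

A caller would apply this right after the send attempt: -EAGAIN with a non-writable socket still means "wait for write space", and any other status passes through unchanged.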

03 Jul, 2015 1 commit

SUNRPC: Don't confuse ENOBUFS with a write_space issue · b5872f0c

Authored by Trond Myklebust on Jul 03, 2015

ENOBUFS means that memory allocations are failing due to an actual
low memory situation. It should not be confused with being out of
socket buffer space.

Handle the problem by just punting to the delay in call_status.
Reported-by: Neil Brown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>