Commits · 4a3a0ebad1360696125bf34d89de55d71c4d0eaa · openanolis / cloud-kernel

29 Aug, 2014 1 commit

sunrpc: fix byte-swapping of displayed XID · 71efecb3

Authored by Chuck Lever on Aug 22, 2014

xprt_lookup_rqst() and bc_send_request() display a byte-swapped XID,
but receive_cb_reply() does not.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
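
The inconsistency described in this commit can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming the XID is held in network byte order and printed with `%08x`; `format_xid` is a hypothetical helper for illustration, not a SUNRPC function:

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a wire-order (big-endian) XID for a debug
 * message. Converting with ntohl() first makes the printed value the
 * same on little-endian and big-endian hosts. */
static int format_xid(char *buf, size_t len, uint32_t xid_be)
{
	return snprintf(buf, len, "%08x", (unsigned int)ntohl(xid_be));
}
```

If every call site converts before printing, the same request shows the same XID in every debug message, which makes log lines from different functions easy to correlate.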

18 Aug, 2014 1 commit

SUNRPC: Optimise away svc_recv_available · f8d1ff47

Authored by Trond Myklebust on Aug 03, 2014

We really do not want to do ioctls in the server's fast path. Instead, let's
use the fact that we managed to read a full record as the indicator that
we should try to read the socket again.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

30 Jul, 2014 2 commits

SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed · 51877680

Authored by Trond Myklebust on Jul 24, 2014

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

SUNRPC: svc_tcp_write_space: don't clear SOCK_NOSPACE prematurely · c7fb3f06

Authored by Trond Myklebust on Jul 24, 2014

If requests are queued in the socket inbuffer waiting for an
svc_tcp_has_wspace() requirement to be satisfied, then we do not want
to clear the SOCK_NOSPACE flag until we've satisfied that requirement.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

18 Jul, 2014 1 commit

svcrdma: Select NFSv4.1 backchannel transport based on forward channel · 3c45ddf8

Authored by Chuck Lever on Jul 16, 2014

The current code always selects XPRT_TRANSPORT_BC_TCP for the back
channel, even when the forward channel was not TCP (eg, RDMA). When
a 4.1 mount is attempted with RDMA, the server panics in the TCP BC
code when trying to send CB_NULL.

Instead, construct the transport protocol number from the forward
channel transport or'd with XPRT_TRANSPORT_BC. Transports that do
not support bi-directional RPC will not have registered a "BC"
transport, causing create_backchannel_client() to fail immediately.

Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
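
The idea of deriving the backchannel identifier from the forward channel can be sketched as below. The constant names and values are assumptions made for this illustration (modelled loosely on IP protocol numbers plus a high flag bit); they are not copied from the kernel headers:

```c
#include <stdint.h>

/* Illustrative identifiers, assumed for this sketch only. */
enum {
	TRANSPORT_TCP  = 6,    /* IP protocol number for TCP */
	TRANSPORT_RDMA = 256,  /* a private number outside the IP range */
};
#define TRANSPORT_BC (UINT32_C(1) << 31)  /* "backchannel" flag bit */

/* Derive the backchannel transport identifier from the forward channel
 * instead of hard-coding the TCP backchannel. */
static uint32_t backchannel_ident(uint32_t forward_ident)
{
	return forward_ident | TRANSPORT_BC;
}
```

With this scheme, a forward channel whose transport never registered a matching backchannel identifier simply fails the lookup, rather than being routed into the TCP backchannel code.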

31 May, 2014 1 commit

SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING · a48fd0f9

Authored by Kinglong Mee on May 29, 2014

When debugging, rpc prints messages from dprintk(KERN_WARNING ...)
with "^A4" prefixed,

[ 2780.339988] ^A4nfsd: connect from unprivileged port: 127.0.0.1, port=35316

Trond says,
> dprintk != printk. We have NEVER supported dprintk(KERN_WARNING...)

This patch removes the use of dprintk with KERN_WARNING.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
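
The stray `^A4` can be reproduced with plain string-literal concatenation. The sketch below assumes that a log level is encoded as an ASCII SOH byte (`\001`) followed by a level character, and that the debug macro prepends its own level; `my_dprintk_fmt` and `misused_format` are hypothetical stand-ins, not the kernel's `dprintk`:

```c
#define KERN_SOH     "\001"        /* ASCII start-of-header byte */
#define KERN_WARNING KERN_SOH "4"
#define KERN_DEFAULT KERN_SOH "d"

/* Hypothetical stand-in for a debug macro that prepends its own level.
 * A second level supplied by the caller lands inside the message body. */
#define my_dprintk_fmt(fmt) (KERN_DEFAULT fmt)

/* Returns the format string that the misuse would produce. */
static const char *misused_format(void)
{
	return my_dprintk_fmt(KERN_WARNING "nfsd: connect from unprivileged port\n");
}
```

Only the first level prefix is interpreted as a header; the second one is ordinary text, which is why it shows up in the log as `^A4`.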

23 May, 2014 2 commits

SUNRPC: track whether a request is coming from a loop-back interface. · ef11ce24

Authored by NeilBrown on May 12, 2014

If an incoming NFS request is coming from the local host, then
nfsd will need to perform some special handling.  So detect that
possibility and make the source visible in rq_local.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Ignore client's source port on RDMA transports · 16e4d93f

Authored by Chuck Lever on May 19, 2014

An NFS/RDMA client's source port is meaningless for RDMA transports.
The transport layer typically sets the source port value on the
connection to a random ephemeral port.

Currently, NFS server administrators must specify the "insecure"
export option to enable clients to access exports via RDMA.

But this means NFS clients can access such an export via IP using an
ephemeral port, which may not be desirable.

This patch eliminates the need to specify the "insecure" export
option to allow NFS/RDMA clients access to an export.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

12 Apr, 2014 1 commit

net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369

Authored by David S. Miller on Apr 11, 2014

Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->sk_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access is potentially
a read of freed memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about its
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.
Signed-off-by: David S. Miller <davem@davemloft.net>
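
The shape of the fix can be sketched with a simplified userspace model. Everything here (`struct buf`, `struct sock_model`, `queue_and_notify`, `count_wakeup`) is hypothetical and single-threaded; it only illustrates that the notification callback takes no length, so the producer has no reason to touch the buffer after queueing it:

```c
#include <stddef.h>

struct buf {
	size_t len;
	struct buf *next;
};

/* Hypothetical, simplified socket model: a queue plus a notification
 * callback that takes no length argument. */
struct sock_model {
	struct buf *head, *tail;
	void (*data_ready)(struct sock_model *sk);
	unsigned int wakeups;
};

static void count_wakeup(struct sock_model *sk)
{
	sk->wakeups++;
}

/* Queue a buffer and notify. Once the buffer is linked into the queue
 * the producer no longer owns it, so it is not dereferenced again. */
static void queue_and_notify(struct sock_model *sk, struct buf *b)
{
	b->next = NULL;
	if (sk->tail)
		sk->tail->next = b;
	else
		sk->head = b;
	sk->tail = b;
	/* In a concurrent system b may already be consumed and freed here. */
	sk->data_ready(sk);
}
```

Because the callback signature carries no length, there is nothing left for a call site to read from the buffer after handing it over.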

01 Apr, 2014 1 commit

nfsd: check passed socket's net matches NFSd superblock's one · 30646394

Authored by Stanislav Kinsbursky on Feb 26, 2014

The NFSd file system can be mounted in a network namespace different from
the socket's one, as in the following case:

"ip netns exec" creates a new network and mount namespace, which duplicates the
NFSd mount point created in the init_net context. Stopping the NFS server in the
nested network context thus leads to RPCBIND client destruction in init_net.
Then, on NFSd start in the nested network context, the rpc.nfsd process creates a
socket in the nested net and passes it into "write_ports", which leads to RPCBIND
socket creation in the init_net context for the same reason (the NFSd mount point
was created in the init_net context). An attempt to register the passed socket in
the nested net leads to a panic, because no RPCBIND client is present in the
nested network namespace.

This patch adds a check that the passed socket's net matches the NFSd
superblock's one, and returns an -EINVAL error to user space otherwise.

v2: Put socket on exit.
Reported-by: Weng Meiling <wengmeiling.weng@huawei.com>
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

10 Oct, 2013 1 commit

net: fix build errors if ipv6 is disabled · c2bb06db

Authored by Eric Dumazet on Oct 09, 2013

CONFIG_IPV6=n is still a valid choice ;)

It appears we can remove dead code.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Oct, 2013 1 commit

ipv6: make lookups simpler and faster · efe4208f

Authored by Eric Dumazet on Oct 03, 2013

TCP listener refactoring, part 4 :

To speed up inet lookups, we moved IPv4 addresses from inet to struct
sock_common.

Now it is time to do the same for IPv6, because it permits us to have fast
lookups for all kinds of sockets, including upcoming SYN_RECV.

Getting IPv6 addresses in TCP lookups currently requires two extra cache
lines, plus a dereference (and memory stall).

inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6.

This patch is way bigger than its IPv4 counterpart, because for IPv4,
we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
it's not doable easily.

inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

And timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
at the same offset.

We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
macro.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Aug, 2013 1 commit

NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure. · 447383d2

Authored by NeilBrown on Jul 25, 2013

Since we enabled auto-tuning for sunrpc TCP connections we do not
guarantee that there is enough write-space on each connection to
queue a reply.

If memory pressure causes the window to shrink too small, the request
throttling in sunrpc/svc will not accept any requests so no more requests
will be handled.  Even when pressure decreases the window will not
grow again until data is sent on the connection.
This means we get a deadlock:  no requests will be handled until there
is more space, and no space will be allocated until a request is
handled.

This can be simulated by modifying svc_tcp_has_wspace to inflate the
number of bytes required and removing the 'svc_sock_setbufsize' calls
in svc_setup_socket.

I found that multiplying by 16 was enough to make the requirement
exceed the default allocation.  With this modification in place:
   mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
would block and eventually time out because the nfs server could not
accept any requests.

This patch relaxes the request throttling to always allow at least one
request through per connection.  It does this by checking both
  sk_stream_min_wspace() and xprt->xpt_reserved
are zero.
The first is zero when the TCP transmit queue is empty.
The second is zero when there are no RPC requests being processed.
When both of these are zero the socket is idle and so one more
request can safely be allowed through.

Applying this patch allows the above mount command to succeed cleanly.
Tracing shows that the allocated write buffer space quickly grows and
after a few requests are handled, the extra tests are no longer needed
to permit further requests to be processed.

The main purpose of request throttling is to handle the case when one
client is slow at collecting replies and the send queue gets full of
replies that the client hasn't acknowledged (at the TCP level) yet.
As we only change behaviour when the send queue is empty this main
purpose is still preserved.
Reported-by: Ben Myers <bpm@sgi.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
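
The relaxed throttling rule can be sketched as a pure function. This is a simplified userspace model under stated assumptions: `struct conn_state` and `may_accept_request` are hypothetical, and computing the requirement as `reserved + max_mesg` is an assumption of this sketch rather than something stated in the commit message:

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the commit message refers to. */
struct conn_state {
	long wspace;      /* free write space on the socket */
	long min_wspace;  /* stand-in for sk_stream_min_wspace(): zero when
	                   * the TCP transmit queue is empty */
	long reserved;    /* stand-in for xprt->xpt_reserved: zero when no
	                   * RPC requests are being processed */
	long max_mesg;    /* largest reply the service may need to send */
};

static bool may_accept_request(const struct conn_state *c)
{
	long required = c->reserved + c->max_mesg;  /* assumed formula */

	if (c->wspace >= required)
		return true;
	/* Idle connection: always let one request through so the write
	 * buffer gets a chance to grow again. */
	return c->min_wspace == 0 && c->reserved == 0;
}
```

A connection with queued or in-flight replies is still throttled as before; only a fully idle connection gets the extra allowance, which is what breaks the deadlock.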

25 Jul, 2013 1 commit

net: add sk_stream_is_writeable() helper · 64dc6130

Authored by Eric Dumazet on Jul 22, 2013

Several call sites hardcode the following condition:

sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)

Let's use a helper because TCP_NOTSENT_LOWAT support will change this
condition for TCP sockets.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
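
A helper of this kind might look like the following userspace sketch. `struct stream_model` and `stream_is_writeable` are hypothetical names that stand in for the socket and the kernel helper; the two fields model the values the kernel functions would return:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the socket state the two kernel helpers read. */
struct stream_model {
	int wspace;      /* what sk_stream_wspace() would return */
	int min_wspace;  /* what sk_stream_min_wspace() would return */
};

/* One helper instead of the open-coded comparison at each call site, so
 * a later change to the condition only has to be made in one place. */
static bool stream_is_writeable(const struct stream_model *sk)
{
	return sk->wspace >= sk->min_wspace;
}
```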

02 Jul, 2013 2 commits

svcrpc: don't error out on small tcp fragment · 1f691b07

Authored by J. Bruce Fields on Jun 26, 2013

Though clients we care about mostly don't do this, it is possible for
rpc requests to be sent in multiple fragments.  Here we have a sanity
check to ensure that the final received rpc isn't too small--except that
the number we're actually checking is the length of just the final
fragment, not of the whole rpc.  So a perfectly legal rpc that's
unluckily fragmented could cause the server to close the connection
here.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: fix handling of too-short rpc's · cf3aa02c

Authored by J. Bruce Fields on Jun 26, 2013

If we detect that an rpc is too short, we abort and close the
connection.  Except, there's a bug here: we're leaving sk_datalen
nonzero without leaving any pages in the sk_pages array.  The most
likely result of the inconsistency is a subsequent crash in
svc_tcp_clear_pages.

Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Feb, 2013 1 commit

ipv6: rename datagram_send_ctl and datagram_recv_ctl · 73df66f8

Authored by Tom Parkin on Jan 31, 2013

The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific.  Since
datagram_send_ctl is publicly exported it should be appropriately named to
reflect the fact that it's for IPv6 only.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Dec, 2012 2 commits

nfsd4: cleanup: replace rq_resused count by rq_next_page pointer · afc59400

Authored by J. Bruce Fields on Dec 10, 2012

It may be a matter of personal taste, but I find this makes the code
clearer.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: fix some printks · 3a28e331

Authored by J. Bruce Fields on Dec 10, 2012

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

04 Dec, 2012 5 commits

svcrpc: support multiple-fragment rpc's · 836fbadb

Authored by J. Bruce Fields on Dec 03, 2012

Over TCP, RPC's are preceded by a single 4-byte field telling you how
long the rpc is (in bytes). The spec also allows you to send an RPC in
multiple such records (the high bit of the length field is used to tell
you whether this is the final record).

We've survived for years without supporting this because in practice the
clients we care about don't use it. But the userland rpc libraries do,
and every now and then an experimental client will run into this. (Most
recently I noticed it while trying to write a pynfs check.) And we're
really on the wrong side of the spec here--let's fix this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
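
The record format described above can be sketched in userspace C. This is an illustrative parser, not the kernel's receive path; `parse_marker` and `rpc_total_len` are hypothetical helpers that decode the 4-byte marker and add up fragment lengths until the final fragment:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LAST_FRAGMENT UINT32_C(0x80000000)  /* high bit: final record */
#define FRAGMENT_SIZE UINT32_C(0x7fffffff)  /* low 31 bits: length */

/* Decode one 4-byte big-endian record marker. */
static void parse_marker(const uint8_t m[4], bool *last, uint32_t *len)
{
	uint32_t v = ((uint32_t)m[0] << 24) | ((uint32_t)m[1] << 16) |
		     ((uint32_t)m[2] << 8) | (uint32_t)m[3];

	*last = (v & LAST_FRAGMENT) != 0;
	*len = v & FRAGMENT_SIZE;
}

/* Walk a byte stream and return the total payload length of the first
 * complete RPC (all fragments up to and including the final one), or -1
 * if the stream ends before the final fragment is complete. */
static long rpc_total_len(const uint8_t *stream, size_t n)
{
	size_t off = 0;
	long total = 0;

	while (n - off >= 4) {
		bool last;
		uint32_t len;

		parse_marker(stream + off, &last, &len);
		off += 4;
		if (len > n - off)
			return -1;  /* fragment body not fully received */
		off += len;
		total += (long)len;
		if (last)
			return total;
	}
	return -1;
}
```

Any sanity check on the size of an RPC belongs on the accumulated total, since an individual fragment of a valid RPC can be arbitrarily short.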

svcrpc: track rpc data length separately from sk_tcplen · 8af345f5

Authored by J. Bruce Fields on Dec 03, 2012

Keep a separate field, sk_datalen, that tracks only the data contained
in a fragment, not including the fragment header.

For now, this is always just max(0, sk_tcplen - 4), but after we allow
multiple fragments sk_datalen will accumulate the total rpc data size
while sk_tcplen only tracks progress receiving the current fragment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: fix off-by-4 error in "incomplete TCP record" dprintk · 6a72ae2e

Authored by J. Bruce Fields on Dec 03, 2012

The full reclen doesn't include the fragment header, but sk_tcplen does.
Fix this to make it an apples-to-apples comparison.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: delay minimum-rpc-size check till later · ad46ccf0

Authored by J. Bruce Fields on Dec 03, 2012

Soon we want to support multiple fragments, in which case it may be
legal for a single fragment to be smaller than 8 bytes, so we'll want to
delay this check till we've reached the last fragment.

Also fix an outdated comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: don't byte-swap sk_reclen in place · cc248d4b

Authored by J. Bruce Fields on Dec 03, 2012

Byte-swapping in place is always a little dubious.

Let's instead define this field to always be big-endian, and do the
swapping on demand where we need it.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
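
Swapping on demand can be sketched with small accessor functions. In this userspace illustration `struct recv_state`, `rec_len` and `rec_is_final` are hypothetical names; the stored field keeps the marker exactly as it arrived on the wire and is never rewritten:

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdbool.h>
#include <stdint.h>

#define REC_LAST_FLAG UINT32_C(0x80000000)
#define REC_LEN_MASK  UINT32_C(0x7fffffff)

/* Hypothetical receive state: the marker is kept in wire (big-endian)
 * order, so there is only ever one representation of the field. */
struct recv_state {
	uint32_t reclen_be;
};

/* Length of the current fragment, converted at the point of use. */
static uint32_t rec_len(const struct recv_state *rs)
{
	return ntohl(rs->reclen_be) & REC_LEN_MASK;
}

/* Whether the current fragment is the final one of the RPC. */
static bool rec_is_final(const struct recv_state *rs)
{
	return (ntohl(rs->reclen_be) & REC_LAST_FLAG) != 0;
}
```

Since readers always go through the accessors, no code path can observe the field half-converted or converted twice.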

05 Nov, 2012 1 commit

SUNRPC: remove BUG_ONs from *_reclassify_socket* · 1b7a1819

Authored by Weston Andros Adamson on Oct 23, 2012

Replace multiple BUG_ON() calls with WARN_ON_ONCE() and early return when
sanity checking socket ownership (lock). The bind call will fail if the
socket was unsuccessfully reclassified.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

10 Sep, 2012 1 commit

nfsd: remove unused listener-removal interfaces · eccf50c1

Authored by J. Bruce Fields on Aug 15, 2012

You can use nfsd/portlist to give nfsd additional sockets to listen on.
In theory you can also remove listening sockets this way.  But nobody's
ever done that as far as I can tell.

Also this was partially broken in 2.6.25, by
a217813f "knfsd: Support adding
transports by writing portlist file".

(Note that we decide whether to take the "delfd" case by checking for a
digit--but what's actually expected in that case is something made by
svc_one_sock_name(), which won't begin with a digit.)

So, let's just rip out this stuff.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

22 Aug, 2012 6 commits

svcrpc: make xpo_recvfrom return only >=0 · 9f9d2ebe

Authored by J. Bruce Fields on Aug 17, 2012

The only errors returned from xpo_recvfrom have been -EAGAIN and
-EAFNOSUPPORT.  The latter was removed by a previous patch.  That leaves
only -EAGAIN, which is treated just like 0 by the caller (svc_recv).

So, just ditch -EAGAIN and return 0 instead.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: don't bother checking bad svc_addr_len result · af6d5721

Authored by J. Bruce Fields on Aug 21, 2012

None of the callers should see an unsupported address family (only one
of them even bothers to check for that case), so just check for the
buggy case in svc_addr_len and don't bother elsewhere.
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: minor udp code cleanup · f23abfdb

Authored by J. Bruce Fields on Aug 17, 2012

Order the code in a more boring way.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: share some setup of listening sockets · 39b55301

Authored by J. Bruce Fields on Aug 14, 2012

There's some duplicate code here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

svcrpc: clean up control flow · a8e10078

Authored by J. Bruce Fields on Aug 13, 2012

Mainly, use the kernel standard

	err = -ERROR;
	if (something_bad)
		goto out;
	normal case;

rather than

	if (something_bad)
		err = -ERROR;
	else {
		normal case;
	}
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
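
As a concrete instance of the first pattern, here is a small compilable sketch. `set_port` is a hypothetical function invented for this illustration; it sets the error code before each check and jumps to a single exit label on failure:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical example: validate and store a port number. */
static int set_port(int value, int *port)
{
	int err;

	err = -EINVAL;
	if (port == NULL)
		goto out;
	err = -ERANGE;
	if (value < 1 || value > 65535)
		goto out;

	*port = value;  /* normal case, not nested inside an else branch */
	err = 0;
out:
	return err;
}
```

The normal path stays at the outermost indentation level, and every failure leaves through the same label, which is where cleanup code would go if the function acquired resources.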

svcrpc: standardize svc_setup_socket return convention · 72c35376

Authored by J. Bruce Fields on Aug 13, 2012

Use the kernel-standard ptr-or-error return convention instead of
passing a pointer to the error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
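
The pointer-or-error convention can be imitated in userspace as follows. The lowercase helpers `err_ptr`, `ptr_err` and `is_err` are imitations written for this sketch (the kernel provides its own `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()`), and `conn_create` is a hypothetical constructor:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations: small negative errno values are encoded in the
 * top page of the address space, which never holds a valid object. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct conn {
	int fd;
};

/* Hypothetical constructor: returns a valid pointer or an encoded errno,
 * instead of reporting the error through an extra out-parameter. */
static struct conn *conn_create(int fd)
{
	struct conn *c;

	if (fd < 0)
		return err_ptr(-EINVAL);
	c = malloc(sizeof(*c));
	if (!c)
		return err_ptr(-ENOMEM);
	c->fd = fd;
	return c;
}
```

A caller checks the result once with `is_err()` and extracts the code with `ptr_err()`, so the function signature needs no separate error argument.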

21 Aug, 2012 1 commit

svcrpc: fix BUG() in svc_tcp_clear_pages · be1e4444

Authored by J. Bruce Fields on Aug 09, 2012

Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is
consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if
sk_tcplen would lead us to expect n pages of data).

svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen.
This is OK, since both functions are serialized by XPT_BUSY.  However,
that means the inconsistency must be repaired before dropping XPT_BUSY.

Therefore we should be ensuring that svc_tcp_save_pages repairs the
problem before exiting svc_tcp_recv_record on error.

Symptoms were a BUG() in svc_tcp_clear_pages.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

28 Jun, 2012 1 commit

net: skb_free_datagram_locked() doesnt drop all packets · 22911fc5

Authored by Eric Dumazet on Jun 27, 2012

dropwatch wrongly diagnoses all received UDP packets as drops.

This patch removes trace_kfree_skb() done in skb_free_datagram_locked().

Locations calling skb_free_datagram_locked() should do it on their own.

As a result, drops are accounted on the right function.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 May, 2012 1 commit

net: Convert net_ratelimit uses to net_<level>_ratelimited · e87cc472

Authored by Joe Perches on May 13, 2012

Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Apr, 2012 1 commit

sock: Introduce named constants for sk_reuse · 4a17fd52

Authored by Pavel Emelyanov on Apr 19, 2012

Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
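
The naming scheme can be sketched as below. The values follow the commit message (0 for no reuse, 1 for reuse, 2 for forced reuse); the constant names are this sketch's assumption, and `sk_reuse_name` is a hypothetical helper, not a kernel function:

```c
/* Values as described in the commit message; names assumed here. */
#define SK_NO_REUSE    0
#define SK_CAN_REUSE   1
#define SK_FORCE_REUSE 2

/* Hypothetical helper that turns a reuse value into readable text. */
static const char *sk_reuse_name(int reuse)
{
	switch (reuse) {
	case SK_NO_REUSE:
		return "no-reuse";
	case SK_CAN_REUSE:
		return "can-reuse";
	case SK_FORCE_REUSE:
		return "force-reuse";
	default:
		return "unknown";
	}
}
```

Because the first two names keep the old numeric meanings, existing code that assigns a bare 0 or 1 behaves exactly as before.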

12 Mar, 2012 1 commit

SUNRPC: Fix a few sparse warnings · 09acfea5

Authored by Trond Myklebust on Mar 11, 2012

net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
(different address spaces)
 - svc_partial_recvfrom now takes a struct kvec, so the variable
   save_iovbase needs to be an ordinary (void *)

Make a bunch of variables in net/sunrpc/xprtsock.c static

Fix a couple of "warning: symbol 'foo' was not declared. Should it be
static?" reports.

Fix a couple of conflicting function declarations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

04 Feb, 2012 1 commit

SUNPRC: remove marking service temporary sockets with XPT_CHNGBUF · 9f912ceb

Authored by Stanislav Kinsbursky on Jan 20, 2012

This is a cleanup patch.
Service temporary sockets can be TCP or RDMA only. But XPT_CHNGBUF service
socket flag is checked only for UDP sockets on receive.
Thus (if I don't miss something non-obvious) this bit raising for temporary
sockets can be removed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Feb, 2012 2 commits

SUNRPC: fixup for namespace changes · d3b773e4

Authored by Trond Myklebust on Jan 23, 2012

Fixes this build error when CONFIG_NET_NS is not set:

net/sunrpc/svcsock.c: In function 'svc_setup_socket':
net/sunrpc/svcsock.c:1412:40: error: 'struct sock_common' has no member named 'skc_net'
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: pass network namespace to service registering routines · 5247fab5

Authored by Stanislav Kinsbursky on Jan 13, 2012

Lockd and NFSd services will handle requests from and to many network
namespaces, and thus have to be registered and unregistered per network
namespace.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

openanolis / cloud-kernel synced successfully more than 1 year ago
