提交 · 0f90be132cbf1537d87a6a8b9e80867adac892f6 · openanolis / cloud-kernel

01 8月, 2018 1 次提交

NFSv4 client live hangs after live data migration recovery · 0f90be13

由 Bill Baker 提交于 6月 19, 2018

After a live data migration event at the NFS server, the client may send
I/O requests to the wrong server, causing a live hang due to repeated
recovery events.  On the wire, this will appear as an I/O request failing
with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly.
NFS4ERR_BADSSESSION is returned because the session ID being used was
issued by the other server and is not valid at the old server.

The failure is caused by async worker threads having cached the transport
(xprt) in the rpc_task structure.  After the migration recovery completes,
the task is redispatched and the task resends the request to the wrong
server based on the old value still present in tk_xprt.

The solution is to recompute the tk_xprt field of the rpc_task structure
so that the request goes to the correct server.
Signed-off-by: NBill Baker <bill.baker@oracle.com>
Reviewed-by: NChuck Lever <chuck.lever@oracle.com>
Tested-by: NHelen Chao <helen.chao@oracle.com>
Fixes: fb43d172 ("SUNRPC: Use the multipath iterator to assign a ...")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

0f90be13

07 5月, 2018 1 次提交

SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock · 37ac86c3

由 Chuck Lever 提交于 5月 04, 2018

alloc_slot is a transport-specific op, but initializing an rpc_rqst
is common to all transports. In addition, the only part of initial-
izing an rpc_rqst that needs serialization is getting a fresh XID.

Move rpc_rqst initialization to common code in preparation for
adding a transport-specific alloc_slot to xprtrdma.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

37ac86c3

11 4月, 2018 2 次提交

sunrpc: Add static trace point to report result of RPC ping · a25a4cb3

由 Chuck Lever 提交于 3月 16, 2018

This information can help track down local misconfiguration issues
as well as network partitions and unresponsive servers.

There are several ways to send a ping, and with transport multi-
plexing, the exact rpc_xprt that is used is sometimes not known by
the upper layer. The rpc_xprt pointer passed to the trace point
call also has to be RCU-safe.

I found a spot inside the client FSM where an rpc_xprt pointer is
always available and safe to use.
Suggested-by: NBill Baker <Bill.Baker@oracle.com>
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

a25a4cb3

sunrpc: Simplify synopsis of some trace points · e671edb9

由 Chuck Lever 提交于 3月 16, 2018

Clean up: struct rpc_task carries a pointer to a struct rpc_clnt,
and in fact task->tk_client is always what is passed into trace
points that are already passing @task.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

e671edb9

13 2月, 2018 1 次提交

net: make getname() functions return length rather than use int* parameter · 9b2c45d4

由 Denys Vlasenko 提交于 2月 12, 2018

Changes since v1:
Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.

"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.

None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.

This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.

Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.

Userspace API is not changed.

    text    data     bss      dec     hex filename
30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
30108109 2633612  873672 33615393 200ee21 vmlinux.o
Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b2c45d4

15 1月, 2018 1 次提交

SUNRPC: Remove rpc_protocol() · 024fbf9c

由 Chuck Lever 提交于 12月 04, 2017

Since nfs4_create_referral_server was the only call site of
rpc_protocol, rpc_protocol can now be removed.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

024fbf9c

01 12月, 2017 1 次提交

SUNRPC: Handle ENETDOWN errors · eb5b46fa

由 Trond Myklebust 提交于 11月 30, 2017

Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

eb5b46fa

18 11月, 2017 2 次提交

sunrpc: Add rpc_request static trace point · c435da68

由 Chuck Lever 提交于 11月 03, 2017

Display information about the RPC procedure being requested in the
trace log. This sometimes critical information cannot always be
derived from other RPC trace entries.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

c435da68

net: sunrpc: mark expected switch fall-throughs · e9d47639

由 Gustavo A. R. Silva 提交于 10月 20, 2017

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

e9d47639

07 9月, 2017 1 次提交

SUNRPC: remove some dead code. · f1ecbc21

由 NeilBrown 提交于 8月 18, 2017

RPC_TASK_NO_RETRANS_TIMEOUT is set when cl_noretranstimeo
is set, which happens when  RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT is set,
which happens when NFS_CS_NO_RETRANS_TIMEOUT is set.

This flag means "don't resend on a timeout, only resend if the
connection gets broken for some reason".

cl_discrtry is set when RPC_CLNT_CREATE_DISCRTRY is set, which
happens when NFS_CS_DISCRTRY is set.

This flag means "always disconnect before resending".

NFS_CS_NO_RETRANS_TIMEOUT and NFS_CS_DISCRTRY are both only set
in nfs4_init_client(), and it always sets both.

So we will never have a situation where only one of the flags is set.
So this code, which tests if timeout retransmits are allowed, and
disconnection is required, will never run.

So it makes sense to remove this code as it cannot be tested and
could confuse people reading the code (like me).

(alternately we could leave it there with a comment saying
 it is never actually used).
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

f1ecbc21

21 8月, 2017 1 次提交

SUNRPC: ECONNREFUSED should cause a rebind. · fd01b259

由 NeilBrown 提交于 8月 18, 2017

If you
 - mount and NFSv3 filesystem
 - do some file locking which requires the server
   to make a GRANT call back
 - unmount
 - mount again and do the same locking

then the second attempt at locking suffers a 30 second delay.
Unmounting and remounting causes lockd to stop and restart,
which causes it to bind to a new port.
The server still thinks the old port is valid and gets ECONNREFUSED
when trying to contact it.
ECONNREFUSED should be seen as a hard error that is not worth
retrying.  Rebinding is the only reasonable response.

This patch forces a rebind if that makes sense.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

fd01b259

14 7月, 2017 4 次提交

sunrpc: mark all struct rpc_procinfo instances as const · 511e936b

由 Christoph Hellwig 提交于 5月 12, 2017

struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTrond Myklebust <trond.myklebust@primarydata.com>

511e936b

sunrpc: move p_count out of struct rpc_procinfo · c551858a

由 Christoph Hellwig 提交于 5月 08, 2017

p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

c551858a

sunrpc: properly type argument to kxdrdproc_t · 993328e2

由 Christoph Hellwig 提交于 5月 08, 2017

Pass struct rpc_request as the first argument instead of an untyped blob.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Acked-by: NTrond Myklebust <trond.myklebust@primarydata.com>

993328e2

sunrpc: properly type argument to kxdreproc_t · 0aebdc52

由 Christoph Hellwig 提交于 5月 08, 2017

Pass struct rpc_request as the first argument instead of an untyped blob,
and mark the data object as const.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJeff Layton <jlayton@redhat.com>

0aebdc52

15 5月, 2017 4 次提交

sunrpc: mark all struct rpc_procinfo instances as const · 499b4988

由 Christoph Hellwig 提交于 5月 12, 2017

struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTrond Myklebust <trond.myklebust@primarydata.com>

499b4988

sunrpc: move p_count out of struct rpc_procinfo · 1c5876dd

由 Christoph Hellwig 提交于 5月 08, 2017

p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

1c5876dd

sunrpc: properly type argument to kxdrdproc_t · 73c8dc13

由 Christoph Hellwig 提交于 5月 08, 2017

Pass struct rpc_request as the first argument instead of an untyped blob.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Acked-by: NTrond Myklebust <trond.myklebust@primarydata.com>

73c8dc13

sunrpc: properly type argument to kxdreproc_t · c512f36b

由 Christoph Hellwig 提交于 5月 08, 2017

Pass struct rpc_request as the first argument instead of an untyped blob,
and mark the data object as const.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJeff Layton <jlayton@redhat.com>

c512f36b

21 4月, 2017 1 次提交

sunrpc: don't check for failure from mempool_alloc() · 62b2417e

由 NeilBrown 提交于 4月 10, 2017

When mempool_alloc() is allowed to sleep (GFP_NOIO allows
sleeping) it cannot fail.
So rpc_alloc_task() cannot fail, so rpc_new_task doesn't need
to test for failure.
Consequently rpc_new_task() cannot fail, so the callers
don't need to test.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

62b2417e

10 2月, 2017 3 次提交

NFSv4: Set the connection timeout to match the lease period · 26ae102f

由 Trond Myklebust 提交于 2月 08, 2017

Set the timeout for TCP connections to be 1 lease period to ensure
that we don't lose our lease due to a faulty TCP connection.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

26ae102f

SUNRPC: Allow changing of the TCP timeout parameters on the fly · 7196dbb0

由 Trond Myklebust 提交于 2月 08, 2017

When the NFSv4 server tells us the lease period, we usually want
to adjust down the timeout parameters on the TCP connection to
ensure that we don't miss lease renewals due to a faulty connection.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

7196dbb0

SUNRPC: Remove unused function rpc_get_timeout() · d23bb113

由 Trond Myklebust 提交于 2月 08, 2017

Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

d23bb113

25 1月, 2017 1 次提交

SUNRPC: cleanup ida information when removing sunrpc module · c929ea0b

由 Kinglong Mee 提交于 1月 20, 2017

After removing sunrpc module, I get many kmemleak information as,
unreferenced object 0xffff88003316b1e0 (size 544):
  comm "gssproxy", pid 2148, jiffies 4294794465 (age 4200.081s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb0cfb58a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffffb03507fe>] kmem_cache_alloc+0x15e/0x1f0
    [<ffffffffb0639baa>] ida_pre_get+0xaa/0x150
    [<ffffffffb0639cfd>] ida_simple_get+0xad/0x180
    [<ffffffffc06054fb>] nlmsvc_lookup_host+0x4ab/0x7f0 [lockd]
    [<ffffffffc0605e1d>] lockd+0x4d/0x270 [lockd]
    [<ffffffffc06061e5>] param_set_timeout+0x55/0x100 [lockd]
    [<ffffffffc06cba24>] svc_defer+0x114/0x3f0 [sunrpc]
    [<ffffffffc06cbbe7>] svc_defer+0x2d7/0x3f0 [sunrpc]
    [<ffffffffc06c71da>] rpc_show_info+0x8a/0x110 [sunrpc]
    [<ffffffffb044a33f>] proc_reg_write+0x7f/0xc0
    [<ffffffffb038e41f>] __vfs_write+0xdf/0x3c0
    [<ffffffffb0390f1f>] vfs_write+0xef/0x240
    [<ffffffffb0392fbd>] SyS_write+0xad/0x130
    [<ffffffffb0d06c37>] entry_SYSCALL_64_fastpath+0x1a/0xa9
    [<ffffffffffffffff>] 0xffffffffffffffff

I found, the ida information (dynamic memory) isn't cleanup.
Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
Fixes: 2f048db4 ("SUNRPC: Add an identifier for struct rpc_clnt")
Cc: stable@vger.kernel.org # v3.12+
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

c929ea0b

02 12月, 2016 1 次提交

sunrpc: Don't engage exponential backoff when connection attempt is rejected. · 2c2ee6d2

由 NeilBrown 提交于 11月 23, 2016

xs_connect() contains an exponential backoff mechanism so the repeated
connection attempts are delayed by longer and longer amounts.

This is appropriate when the connection failed due to a timeout, but
it not appropriate when a definitive "no" answer is received.  In such
cases, call_connect_status() imposes a minimum 3-second back-off, so
not having the exponetial back-off will never result in immediate
retries.

The current situation is a problem when the NFS server tries to
register with rpcbind but rpcbind isn't running.  All connection
attempts are made on the same "xprt" and as the connection is never
"closed", the exponential back delays successive attempts to register,
or de-register, different protocols.  This results in a multi-minute
delay with no benefit.

So, when call_connect_status() receives a definitive "no", use
xprt_conditional_disconnect() to cancel the previous connection attempt.
This will set XPRT_CLOSE_WAIT so that xprt->ops->close() calls xs_close()
which resets the reestablish_timeout.

To ensure xprt_conditional_disconnect() does the right thing, we
ensure that rq_connect_cookie is set before a connection attempt, and
allow xprt_conditional_disconnect() to complete even when the
transport is not fully connected.
Signed-off-by: NNeilBrown <neilb@suse.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

2c2ee6d2

08 11月, 2016 1 次提交

SUNRPC: Fix suspicious RCU usage · bb29dd84

由 Anna Schumaker 提交于 10月 26, 2016

We need to hold the rcu_read_lock() when calling rcu_dereference(),
otherwise we can't guarantee that the object being dereferenced still
exists.

Fixes: 39e5d2df ("SUNRPC search xprt switch for sockaddr")
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

bb29dd84

20 9月, 2016 9 次提交

SUNRPC: Separate buffer pointers for RPC Call and Reply messages · 68778945

由 Chuck Lever 提交于 9月 15, 2016

For xprtrdma, the RPC Call and Reply buffers are involved in real
I/O operations.

To start with, the DMA direction of the I/O for a Call is opposite
that of a Reply.

In the current arrangement, the Reply buffer address is on a
four-byte alignment just past the call buffer. Would be friendlier
on some platforms if that was at a DMA cache alignment instead.

Because the current arrangement allocates a single memory region
which contains both buffers, the RPC Reply buffer often contains a
page boundary in it when the Call buffer is large enough (which is
frequent).

It would be a little nicer for setting up DMA operations (and
possible registration of the Reply buffer) if the two buffers were
separated, well-aligned, and contained as few page boundaries as
possible.

Now, I could just pad out the single memory region used for the pair
of buffers. But frequently that would mean a lot of unused space to
ensure the Reply buffer did not have a page boundary.

Add a separate pointer to rpc_rqst that points right to the RPC
Reply buffer. This makes no difference to xprtsock, but it will help
xprtrdma in subsequent patches.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

68778945

SUNRPC: Generalize the RPC buffer allocation API · 5fe6eaa1

由 Chuck Lever 提交于 9月 15, 2016

xprtrdma needs to allocate the Call and Reply buffers separately.
TBH, the reliance on using a single buffer for the pair of XDR
buffers is transport implementation-specific.

Transports that want to allocate separate Call and Reply buffers
will ignore the "size" argument anyway.  Don't bother passing it.

The buf_alloc method can't return two pointers. Instead, make the
method's return value an error code, and set the rq_buffer pointer
in the method itself.

This gives call_allocate an opportunity to terminate an RPC instead
of looping forever when a permanent problem occurs. If a request is
just bogus, or the transport is in a state where it can't allocate
resources for any request, there needs to be a way to kill the RPC
right there and not loop.

This immediately fixes a rare problem in the backchannel send path,
which loops if the server happens to send a CB request whose
call+reply size is larger than a page (which it shouldn't do yet).

One more issue: looks like xprt_inject_disconnect was incorrectly
placed in the failure path in call_allocate. It needs to be in the
success path, as it is for other call-sites.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

5fe6eaa1

SUNRPC: Refactor rpc_xdr_buf_init() · b9c5bc03

由 Chuck Lever 提交于 9月 15, 2016

Clean up: there is some XDR initialization logic that is common
to the forward channel and backchannel. Move it to an XDR header
so it can be shared.

rpc_rqst::rq_buffer points to a buffer containing big-endian data.
Update its annotation as part of the clean up.
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Signed-off-by: NAnna Schumaker <Anna.Schumaker@Netapp.com>

b9c5bc03

SUNRPC: rpc_clnt_add_xprt setup function for NFS layer · fda0ab41