Commits · 9f0c5124f4a82503ee5d55c60b0b9c6afc3af68b · openanolis / cloud-kernel

15 Sep, 2018 · 5 commits

NFS: Don't open code clearing of delegation state · 9f0c5124

Committed by Trond Myklebust on Sep 05, 2018

Add a helper for the case when the nfs4 open state has been set to use
a delegation stateid, and we want to revert to using the open stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4.1 fix infinite loop on I/O. · 994b15b9

Committed by Trond Myklebust on Sep 05, 2018

The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespective
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).

What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.

Fixes: 0e3d3e5d ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Fix a tracepoint Oops in initiate_file_draining() · 2edaead6

Committed by Trond Myklebust on Sep 05, 2018

Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.

Fixes: ce5624f7 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pNFS: Ensure we return the error if someone kills a waiting layoutget · d03360aa

Committed by Trond Myklebust on Sep 05, 2018

If someone interrupts a wait on one or more outstanding layoutgets in
pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Fix a tracepoint Oops in initiate_file_draining() · 2a534a74

Committed by Trond Myklebust on Aug 23, 2018

Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.

Fixes: ce5624f7 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

22 Aug, 2018 · 2 commits

pNFS: Remove unwanted optimisation of layoutget · 0af4c8be

Committed by Trond Myklebust on Aug 21, 2018

If we knew that the file was empty, we wouldn't be asking for a layout.
Any optimisation here is already done before calling pnfs_update_layout().
As it stands, we sometimes end up doing an unnecessary inband read to
the MDS even when holding a layout.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pNFS/flexfiles: ff_layout_pg_init_read should exit on error · 1c1aeaf1

Committed by Trond Myklebust on Aug 21, 2018

If we get an error while retrieving the layout, then we should
report it rather than falling back to I/O through the MDS.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

17 Aug, 2018 · 2 commits

pNFS: Treat RECALLCONFLICT like DELAY... · ea51f94b

Committed by Trond Myklebust on Aug 15, 2018

Yes, it is possible to get trapped in a loop, but the server should be
administratively revoking the recalled layout if it never gets returned.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pNFS: When updating the stateid in layoutreturn, also update the recall range · ecf84026

Committed by Trond Myklebust on Aug 15, 2018

When we update the layout stateid in nfs4_layoutreturn_refresh_stateid, we
should also update the range in order to let the server know we're actually
returning everything.

Fixes: 16c278dbfa63 ("pnfs: Fix handling of NFS4ERR_OLD_STATEID replies...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

16 Aug, 2018 · 1 commit

NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence() · 8618289c

Committed by Trond Myklebust on Aug 14, 2018

We must drop the lock before we can sleep in referring_call_exists().
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Fixes: 045d2a6d ("NFSv4.1: Delay callback processing...")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

15 Aug, 2018 · 1 commit

NFSv4: Fix locking in pnfs_generic_recover_commit_reqs · d0fbb1d8

Committed by Trond Myklebust on Aug 14, 2018

The use of the inode->i_lock was converted to a mutex, but we forgot
to remove the old inode unlock/lock() pair that allowed the layout
segment to be put inside the loop.
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Fixes: e824f99a ("NFSv4: Use a mutex to protect the per-inode commit...")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

14 Aug, 2018 · 3 commits

NFSv4: Fix a typo in nfs4_init_channel_attrs() · 62421cd9

Committed by Trond Myklebust on Aug 11, 2018

The back channel size is allowed to be 1 or greater.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Don't busy wait if NFSv4 session draining is interrupted · 8aafd2fd

Committed by Trond Myklebust on Aug 11, 2018

Catch the ERESTARTSYS error so that it can be processed by the callers.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS recover from destination server reboot for copies · e4648aa4

Committed by Olga Kornievskaia on Aug 13, 2018

Mark the destination state to indicate that a server-side copy is
in progress. On detecting a reboot and recovering open state, check
whether any state is engaged in a server-side copy; if so, find the
copy, mark it, and then signal the waiting thread. Upon wakeup,
if the copy was marked, propagate EAGAIN to nfsd_copy_file_range
and restart the copy from scratch.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

10 Aug, 2018 · 9 commits

NFS add a simple sync nfs4_proc_commit after async COPY · 6b8d84e2

Committed by Olga Kornievskaia on Jul 09, 2018

A COPY with unstable write data needs a simple sync commit.
The filehandle value is obtained as part of the inner loop, so in
case of a reboot retry it should get the new value.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS handle COPY ERR_OFFLOAD_NO_REQS · 539f57b3

Committed by Olga Kornievskaia on Jul 09, 2018

If the client sent an async COPY and the server replied with
ERR_OFFLOAD_NO_REQS, the client should retry with a synchronous copy.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS send OFFLOAD_CANCEL when COPY killed · c975c209

Committed by Olga Kornievskaia on Jul 09, 2018

When COPY is killed by the user, send OFFLOAD_CANCEL to the server
processing the copy.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS export nfs4_async_handle_error · 0f913a57

Committed by Olga Kornievskaia on Jul 09, 2018

Make this function available to nfs42proc.c
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS handle COPY reply CB_OFFLOAD call race · bc0c9079

Committed by Olga Kornievskaia on Jul 09, 2018

It's possible for the server to send the CB_OFFLOAD call and the
COPY reply at the same time, so that the client processes
CB_OFFLOAD before the reply to COPY. To handle that, keep a list of
pending callback stateids received, and check the pending list
before waiting on completion.

Clean up any pending copies on client shutdown.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
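
The pending-list idea in the commit message above can be sketched as a small user-space model. This is not the kernel code: it is a Python illustration, the class and method names (`CopyTracker`, `on_cb_offload`, `on_copy_reply`) are hypothetical, and string keys stand in for callback stateids.

```python
class CopyTracker:
    """Toy model of matching COPY replies with CB_OFFLOAD callbacks."""

    def __init__(self):
        self.waiting = {}  # stateid -> result (None until the callback arrives)
        self.pending = {}  # stateid -> result for callbacks that arrived early

    def on_cb_offload(self, stateid, result):
        """Callback path: complete a known copy, or park the result."""
        if stateid in self.waiting:
            self.waiting[stateid] = result
        else:
            self.pending[stateid] = result

    def on_copy_reply(self, stateid):
        """COPY reply path: check the pending list before waiting.

        Returns the result if CB_OFFLOAD already arrived, else None,
        meaning the caller still has to wait for the callback.
        """
        if stateid in self.pending:
            return self.pending.pop(stateid)
        self.waiting[stateid] = None
        return None


tracker = CopyTracker()
tracker.on_cb_offload("stateid-1", 4096)    # callback wins the race
early = tracker.on_copy_reply("stateid-1")  # found on the pending list
late = tracker.on_copy_reply("stateid-2")   # normal order: must wait
tracker.on_cb_offload("stateid-2", 8192)
print(early, late, tracker.waiting["stateid-2"])
```

In this model the early callback is never lost, because the COPY reply path consults the pending list before registering itself as a waiter.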

NFS add support for asynchronous COPY · 62164f31

Committed by Olga Kornievskaia on Jul 09, 2018

Change xdr to always send COPY asynchronously.

Keep the copies that have been sent in a list under the server structure.
Once a copy is sent, it waits on a completion structure that will
be signalled by the callback thread that receives CB_OFFLOAD.

If CB_OFFLOAD returned an error and even if it returned partial
bytes, ignore them (as we can't commit without a verifier to
match) and return an error.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS COPY xdr handle async reply · 67aa7444

Committed by Olga Kornievskaia on Jul 09, 2018

If the server returns an async reply, it must include a callback stateid,
wr_callback_id, in the write_response4.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS OFFLOAD_CANCEL xdr · cb95deea

Committed by Olga Kornievskaia on Jul 09, 2018

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS CB_OFFLOAD xdr · 5178a125

Committed by Olga Kornievskaia on Jul 09, 2018

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

09 Aug, 2018 · 10 commits

NFS: Use an appropriate work queue for direct-write completion · 46483c2e

Committed by NeilBrown on Aug 08, 2018

When a direct-write completes, a work_struct is scheduled to handle
the completion.
When NFS is being used for swap, the direct write might be a swap-out,
so memory allocation can block until the write completes.
The work queue currently used is not WQ_MEM_RECLAIM, so tasks
can block waiting for memory - this leads to deadlock.

So use nfsiod_workqueue instead.  This will always have a running
thread, and work items should never block waiting for memory.
Signed-off-by: Neil Brown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Fix error handling in nfs4_sp4_select_mode() · 72bf75cf

Committed by Wei Yongjun on Aug 02, 2018

Error code is set in the error handling cases but never used. Fix it.

Fixes: 937e3133 ("NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode()")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pnfs: Use true and false for boolean values · 10db5b7a

Committed by Gustavo A. R. Silva on Aug 01, 2018

Return statements in functions returning bool should use true or false
instead of an integer value.

This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

pnfs: pnfs_find_lseg() should not check NFS_LSEG_LAYOUTRETURN · 2230ca0d

Committed by Trond Myklebust on Aug 01, 2018

Layout segment validity is determined only by the NFS_LSEG_VALID flag. If
it is set, the layout segment is findable. As it is, when the flexfiles
driver sets NFS_LSEG_LAYOUTRETURN to indicate that we cannot discard
the layout segment, but that it must be returned, then this can result
in an unnecessary layoutget storm.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Mark expected switch fall-throughs · 01e03bdc

Committed by Gustavo A. R. Silva on Jul 31, 2018

In preparation for enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Warning level 2 was used: -Wimplicit-fallthrough=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Mark the inode change attribute up to date in update_changeattr() · c8d07159

Committed by Trond Myklebust on Jul 31, 2018

When we update the change attribute, we should also clear the flag that
says it is out of date.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: Detect nlink changes on cross-directory renames too · 5636ec4e

Committed by Trond Myklebust on Jul 31, 2018

If the object being renamed from one directory to another is also
a directory, then 'nlink' will change for both directories.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4: bump/drop the nlink count on the parent dir when we mkdir/rmdir · 3c591175

Committed by Trond Myklebust on Jul 31, 2018

Ensure that we always bump or drop the nlink count on the parent directory
when we do a mkdir or a rmdir(). This needs to be done by hand as we don't
have pre/post op attributes.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
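
The link-count bookkeeping behind this commit, and behind the cross-directory rename commit listed next to it, can be illustrated with a toy model. It assumes the conventional Unix rule that each subdirectory's ".." entry counts as one link on its parent; the `Dir` class and the helper functions are hypothetical and only mirror the arithmetic, not the NFS client code.

```python
class Dir:
    """Toy directory that tracks only its parent and its link count."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.nlink = 2  # "." plus the entry naming this directory


def mkdir(parent, name):
    child = Dir(name, parent)
    parent.nlink += 1  # the child's ".." is a new link to the parent
    return child


def rmdir(child):
    child.parent.nlink -= 1  # the child's ".." goes away
    child.parent = None


def rename_dir(child, new_parent):
    """Move a directory; both parents change nlink if they differ."""
    if child.parent is not new_parent:
        child.parent.nlink -= 1
        new_parent.nlink += 1
        child.parent = new_parent


root = Dir("/")
a = mkdir(root, "a")
b = mkdir(root, "b")
sub = mkdir(a, "sub")
print(root.nlink, a.nlink, b.nlink)
rename_dir(sub, b)
print(a.nlink, b.nlink)
rmdir(sub)
print(b.nlink)
```

A client that caches directory attributes has to apply the same adjustments by hand whenever the server does not return updated attributes for the affected directories.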

pnfs: Fix handling of NFS4ERR_OLD_STATEID replies to layoutreturn · c16467dc

Committed by Trond Myklebust on Jul 29, 2018

If the server tells us that our layoutreturn raced with another layout
update, then we must ensure that the new layout segments are not in use
before we resend with an updated layout stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Fix disconnect regression · 8d4fb8ff

Committed by Chuck Lever on Jul 28, 2018

I found that injecting disconnects with v4.18-rc resulted in
random failures of the multi-threaded git regression test.

The root cause appears to be that, after a reconnect, the
RPC/RDMA transport is waking pending RPCs before the transport has
posted enough Receive buffers to receive the Replies. If a Reply
arrives before enough Receive buffers are posted, the connection
is dropped. A few connection drops happen in quick succession as
the client and server struggle to regain credit synchronization.

This regression was introduced with commit 7c8d9e7c ("xprtrdma:
Move Receive posting to Receive handler"). The client is supposed to
post a single Receive when a connection is established because
it's not supposed to send more than one RPC Call before it gets
a fresh credit grant in the first RPC Reply [RFC 8166, Section
3.3.3].

Unfortunately there appears to be a longstanding bug in the Linux
client's credit accounting mechanism. On connect, it simply dumps
all pending RPC Calls onto the new connection. It's possible it has
done this ever since the RPC/RDMA transport was added to the kernel
ten years ago.

Servers have so far been tolerant of this bad behavior. Currently no
server implementation ever changes its credit grant over reconnects,
and servers always repost enough Receives before connections are
fully established.

The Linux client implementation used to post a Receive before each
of these Calls. This has covered up the flooding send behavior.

I could try to correct this old bug so that the client sends exactly
one RPC Call and waits for a Reply. Since we are so close to the
next merge window, I'm going to instead provide a simple patch to
post enough Receives before a reconnect completes (based on the
number of credits granted to the previous connection).

The spurious disconnects will be gone, but the client will still
send multiple RPC Calls immediately after a reconnect.

Addressing the latter problem will wait for a merge window because
a) I expect it to be a large change requiring lots of testing, and
b) obviously the Linux client has interoperated successfully since
day zero while still being broken.

Fixes: 7c8d9e7c ("xprtrdma: Move Receive posting to ... ")
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
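
The failure mode described in this commit message can be illustrated with a deliberately crude model. It assumes that all pending Calls are sent at once after a reconnect and that each Reply consumes one posted Receive buffer before any buffer is reposted; the function name and the numbers are hypothetical, and real RPC/RDMA credit accounting is more involved.

```python
def replay_after_reconnect(pending_calls, receives_posted):
    """Return "ok" if every Reply finds a posted Receive, else "dropped".

    Assumption: all pending Calls are sent at once and each Reply
    consumes one Receive buffer before any buffer is reposted.
    """
    for _ in range(pending_calls):
        if receives_posted == 0:
            return "dropped"
        receives_posted -= 1
    return "ok"


credits_from_previous_connection = 32
# A single Receive posted on connect cannot absorb eight Replies.
print(replay_after_reconnect(8, 1))
# Posting as many Receives as the previous credit grant can.
print(replay_after_reconnect(8, credits_from_previous_connection))
```

Under these assumptions the first call reports a dropped connection and the second does not, which is the effect the patch aims for while the client still sends multiple Calls at once.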

01 Aug, 2018 · 6 commits

sunrpc: whitespace fixes · 8fdee4cc

Committed by Stephen Hemminger on Jul 24, 2018

Remove trailing whitespace and blank line at EOF
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFSv4 client live hangs after live data migration recovery · 0f90be13

Committed by Bill Baker on Jun 19, 2018

After a live data migration event at the NFS server, the client may send
I/O requests to the wrong server, causing a live hang due to repeated
recovery events.  On the wire, this will appear as an I/O request failing
with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly.
NFS4ERR_BADSESSION is returned because the session ID being used was
issued by the other server and is not valid at the old server.

The failure is caused by async worker threads having cached the transport
(xprt) in the rpc_task structure.  After the migration recovery completes,
the task is redispatched and the task resends the request to the wrong
server based on the old value still present in tk_xprt.

The solution is to recompute the tk_xprt field of the rpc_task structure
so that the request goes to the correct server.
Signed-off-by: Bill Baker <bill.baker@oracle.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Helen Chao <helen.chao@oracle.com>
Fixes: fb43d172 ("SUNRPC: Use the multipath iterator to assign a ...")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
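
The stale-transport problem in this commit message can be sketched in a few lines. This is a Python model rather than the SUNRPC code: the attribute name `tk_xprt` comes from the commit message, while the classes and the `recompute` flag are hypothetical stand-ins for the before and after behaviour.

```python
class Transport:
    def __init__(self, server):
        self.server = server


class RpcClient:
    def __init__(self, xprt):
        self.xprt = xprt  # current transport, replaced after migration


class RpcTask:
    def __init__(self, client):
        self.client = client
        self.tk_xprt = client.xprt  # cached when the task is set up

    def redispatch(self, recompute):
        """Resend the request; optionally refresh the cached transport."""
        if recompute:
            self.tk_xprt = self.client.xprt
        return self.tk_xprt.server


client = RpcClient(Transport("old-server"))
task = RpcTask(client)
client.xprt = Transport("new-server")  # migration recovery completes

stale = task.redispatch(recompute=False)  # still targets the old server
fresh = task.redispatch(recompute=True)   # follows the client's transport
print(stale, fresh)
```

Without the refresh, the task keeps resending to the server named by its cached transport, which matches the repeated recovery loop the message describes.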

NFSv4.0 fix client reference leak in callback · 32cd3ee5

Committed by Olga Kornievskaia on Jul 26, 2018

If there is an error during processing of a callback message, it leads
to a reference leak on the client structure and eventually an unclean
superblock.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

sunrpc: kstrtoul() can also return -ERANGE · 1a54c0cf

Committed by Dan Carpenter on Jul 12, 2018

Smatch complains that "num" can be uninitialized when kstrtoul() returns
-ERANGE.  It's true enough, but basically harmless in this case.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
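
The point about -ERANGE can be shown with a loose user-space analogue. The sketch assumes a parser that reports either -EINVAL or -ERANGE and leaves the value unset on failure; `parse_ulong` and `set_port` are hypothetical names, and Python's `int()` is more permissive than the kernel's kstrtoul(), so this mirrors only the error handling.

```python
import errno

ULONG_MAX = 2**64 - 1  # assumes a 64-bit unsigned long


def parse_ulong(text):
    """Return (status, value); value is None when status is non-zero."""
    try:
        value = int(text, 10)
    except ValueError:
        return -errno.EINVAL, None
    if value < 0:
        return -errno.EINVAL, None
    if value > ULONG_MAX:
        return -errno.ERANGE, None
    return 0, value


def set_port(text, lo=1, hi=65535):
    """Return the parsed port, or a negative errno on failure."""
    status, num = parse_ulong(text)
    if status != 0:  # catches -EINVAL and -ERANGE alike
        return status
    if not lo <= num <= hi:
        return -errno.EINVAL
    return num


print(set_port("2049"))
print(set_port("abc") == -errno.EINVAL)
print(set_port(str(2**64)) == -errno.ERANGE)
```

Testing for any non-zero status, rather than for one specific error code, is what keeps the range check from ever reading an unset value.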

NFS: silence a harmless uninitialized variable warning · 379ebf07

Committed by Dan Carpenter on Jul 12, 2018

kstrtoul() can return -ERANGE so Smatch complains that "num" can be
uninitialized.  We check that it's within bounds so it's not a huge
deal.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

sunrpc: Change rpc_print_iostats to rpc_clnt_show_stats and handle rpc_clnt clones · 016583d7

Committed by Dave Wysochanski on Jul 31, 2018

The existing rpc_print_iostats has a few shortcomings. First, the naming
is not consistent with other functions in the kernel that display stats.
Second, it is really displaying stats for an rpc_clnt structure as it
displays both xprt stats and per-op stats. Third, it does not handle
rpc_clnt clones, which is important for the one in-kernel tree caller
of this function, the NFS client's nfs_show_stats function.

Fix all of the above by renaming the rpc_print_iostats to
rpc_clnt_show_stats and looping through any rpc_clnt clones via
cl_parent.

Once this interface is fixed, this addresses a problem with NFSv4.
Before this patch, the /proc/self/mountstats always showed incorrect
counts for NFSv4 lease and session related opcodes such as SEQUENCE,
RENEW, SETCLIENTID, CREATE_SESSION, etc. These counts were always 0
even though many ops would go over the wire. The reason for this is
there are multiple rpc_clnt structures allocated for any given NFSv4
mount, and inside nfs_show_stats() we called into rpc_print_iostats()
which only handled one of them, nfs_server->client. Fix these counts
by calling sunrpc's new rpc_clnt_show_stats() function, which handles
cloned rpc_clnt structs and prints the stats together.

Note that one side-effect of the above is that multiple mounts from
the same NFS server will show identical counts in the above ops due
to the fact the one rpc_clnt (representing the NFSv4 client state)
is shared across mounts.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
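
The clone walk described in this commit message can be modelled roughly as follows. This is a Python sketch, not the sunrpc implementation: `parent` stands in for cl_parent, the root is marked with `None` as a simplifying assumption, and per-op metrics are reduced to plain call counts.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RpcClient:
    name: str
    metrics: Counter = field(default_factory=Counter)  # op name -> call count
    parent: RpcClient | None = None                    # None marks the root


def show_stats(clnt):
    """Sum per-op counts over a client and every ancestor it was cloned from."""
    totals = Counter()
    current = clnt
    while current is not None:
        totals.update(current.metrics)
        current = current.parent
    return totals


# Lease and session ops are counted on the shared client state,
# file I/O ops on the per-mount clone.
shared = RpcClient("client-state", Counter(SEQUENCE=40, CREATE_SESSION=1))
mount = RpcClient("mount", Counter(READ=10, WRITE=5), parent=shared)

totals = show_stats(mount)
print(totals["SEQUENCE"], totals["READ"])
```

Reading only the `mount` object's metrics would report zero SEQUENCE calls, which is the symptom the commit message describes for /proc/self/mountstats.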

31 Jul, 2018 · 1 commit

sunrpc: Add _add_rpc_iostats() to add rpc_iostats metrics · 189e1955

Committed by Dave Wysochanski on Jul 10, 2018

Add a helper function to add the metrics in two rpc_iostats structures.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
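
As a rough illustration of what such a helper does, the sketch below adds one metrics record into another, field by field. It is a Python model: the field names are modelled on the kernel's rpc_iostats counters but only a subset is shown, and the `IoStats` class and `add_iostats` function are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class IoStats:
    """Simplified stand-in for per-operation RPC metrics."""
    om_ops: int = 0
    om_ntrans: int = 0
    om_timeouts: int = 0
    om_bytes_sent: int = 0
    om_bytes_recv: int = 0


def add_iostats(a: IoStats, b: IoStats) -> None:
    """Accumulate every counter of b into a, in place."""
    for f in fields(a):
        setattr(a, f.name, getattr(a, f.name) + getattr(b, f.name))


total = IoStats(om_ops=3, om_bytes_sent=120)
add_iostats(total, IoStats(om_ops=2, om_ntrans=2, om_bytes_sent=80))
print(total)
```

Iterating over the declared fields keeps the helper correct if a counter is added to the record later.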

openanolis / cloud-kernel · last synced successfully over 1 year ago