Commits · 79f687a3de9e3ba2518b4ea33f38ca6cbe9133eb · openeuler / Kernel

03 Dec, 2016 1 commit

NFS: Fix a performance regression in readdir · 79f687a3

Authored by Trond Myklebust on Nov 19, 2016

Ben Coddington reports that commit 311324ad, by adding the function
nfs_dir_mapping_need_revalidate() that checks page cache validity on
each call to nfs_readdir(), causes a performance regression when
the directory is being modified.

If the directory is changing while we're iterating through the directory,
POSIX does not require us to invalidate the page cache unless the user
calls rewinddir(). However, we still do want to ensure that we use
readdirplus in order to avoid a load of stat() calls when the user
is doing an 'ls -l' workload.

The fix should be to invalidate the page cache immediately when we're
setting the NFS_INO_ADVISE_RDPLUS bit.

Reported-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 311324ad ("NFS: Be more aggressive in using readdirplus...")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

79f687a3

02 Dec, 2016 39 commits

NFS: fix typo in parameter description · f36ab161

Authored by Wei Yongjun on Oct 28, 2016

Fix typo in parameter description.

Fixes: 5405fc44 ("NFSv4.x: Add kernel parameter to control the
callback server")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

f36ab161

NFS: discard nfs_lockowner structure. · d51fdb87

Authored by NeilBrown on Oct 13, 2016

It now has only one field and is only used in one structure.
So replace it in that structure with the field it contains.

Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d51fdb87

NFSv4: enhance nfs4_copy_lock_stateid to use a flock stateid if there is one · 8d424431

Authored by NeilBrown on Oct 13, 2016

A process can have two possible lock owners for a given open file:
a per-process Posix lock owner and a per-open-file flock owner.
Use both of these when searching for a suitable stateid to use.

With this patch, READ/WRITE requests will use the correct stateid
if a flock lock is active.

Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

8d424431
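
The two-owner lookup described in this commit can be sketched in plain user-space C. This is only an illustration, not the kernel's nfs4_copy_lock_stateid(): the names `lock_state`, `find_lock_state` and `select_lock_stateid`, and the integer standing in for a stateid, are hypothetical. Trying the Posix owner before the flock owner is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-owner lock state entry. */
struct lock_state {
    const void *owner;          /* opaque lock owner (Posix or flock) */
    unsigned int stateid;       /* integer stand-in for a lock stateid */
    struct lock_state *next;    /* next entry for the same open file */
};

/* Return the first lock state held by `owner`, or NULL if there is none. */
static const struct lock_state *
find_lock_state(const struct lock_state *head, const void *owner)
{
    if (owner == NULL)
        return NULL;
    for (; head != NULL; head = head->next)
        if (head->owner == owner)
            return head;
    return NULL;
}

/*
 * Search with both owners: the Posix owner first (an assumption of this
 * sketch), then the flock owner. Returns true and stores the stateid in
 * *out when either owner holds a lock state.
 */
static bool
select_lock_stateid(const struct lock_state *head,
                    const void *posix_owner, const void *flock_owner,
                    unsigned int *out)
{
    const struct lock_state *ls = find_lock_state(head, posix_owner);

    if (ls == NULL)
        ls = find_lock_state(head, flock_owner);
    if (ls == NULL)
        return false;
    *out = ls->stateid;
    return true;
}
```

With only a flock lock present, the second lookup is what lets a READ or WRITE pick up the lock stateid; a search by Posix owner alone would find nothing.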

NFSv4: change nfs4_select_rw_stateid to take a lock_context inplace of lock_owner · 17393475

Authored by NeilBrown on Oct 13, 2016

The only time that a lock_context is not immediately available is in
setattr, and now that it has an open_context, it can easily find one
with nfs_get_lock_context.
This removes the need for the on-stack nfs_lockowner.

This change is preparation for correctly supporting flock stateids.

Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

17393475

NFSv4: change nfs4_do_setattr to take an open_context instead of a nfs4_state. · 29b59f94

Authored by NeilBrown on Oct 13, 2016

The open_context can always lead directly to the state, and is always easily
available, so this is a straightforward change.
Doing this makes more information available to _nfs4_do_setattr() for use
in the next patch.

Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

29b59f94

NFSv4: add flock_owner to open context · 532d4def

Authored by NeilBrown on Oct 13, 2016

An open file description (struct file) in a given process can be
associated with two different lock owners.

It can have a Posix lock owner which will be different in each process
that has a fd on the file.
It can have a Flock owner which will be the same in all processes.

When searching for a lock stateid to use, we need to consider both of these
owners.

So add a new "flock_owner" to the "nfs_open_context" (of which there
is one for each open file description).

This flock_owner does not need to be reference-counted as there is a
1-1 relation between 'struct file' and nfs open contexts,
and it will never be part of a list of contexts.  So there is no need
for a 'flock_context' - just the owner is enough.

The io_count included in the (Posix) lock_context provides no
guarantee that all read-aheads that could use the state have
completed, so not supporting it for flock locks is not a serious
problem.  Synchronization between flock and read-ahead can be added
later if needed.

When creating an open_context for a non-opening create call, we don't have
a 'struct file' to pass in, so the lock context gets initialized with
a NULL owner, but this will never be used.

The flock_owner is not used at all in this patch; that will come later.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

532d4def

NFS: remove l_pid field from nfs_lockowner · b184b5c3

Authored by NeilBrown on Oct 13, 2016

This field is not used in any important way and probably should
have been removed by

Commit: 8003d3c4 ("nfs4: treat lock owners as opaque values")

which removed the pid argument from nfs4_get_lock_state.

Except in unusual and uninteresting cases, two threads with the same
->tgid will have the same ->files pointer, so keeping them both
for comparison brings no benefit.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b184b5c3

NFS: Remove unused argument from nfs_direct_write_complete() · 4d3b55d3

Authored by Anna Schumaker on Nov 23, 2016

This parameter hasn't been used since 2a009ec9 (Linux 3.13-rc3), so
let's remove it from this function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4d3b55d3

NFS: Remove unused authflavour parameter from nfs_get_client() · 7d38de3f

Authored by Anna Schumaker on Nov 17, 2016

This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so
let's remove it from this function and callers.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7d38de3f

nfs: fix false positives in nfs40_walk_client_list() · ced85a75

Authored by J. Bruce Fields on Nov 28, 2016

It's possible that two different servers can return the same (clientid,
verifier) pair purely by coincidence.  Both are 64-bit values, but
depending on the server implementation, they can be highly predictable
and collisions may be quite likely, especially when there are lots of
servers.

So, check for this case.  If the clientid and verifier both match, then
we actually know they *can't* be the same server, since a new
SETCLIENTID to an already-known server should have changed the verifier.

This helps fix a bug that could cause the client to mount a filesystem
from the wrong server.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ced85a75
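
The check described in this commit can be sketched as a small predicate. This is a user-space illustration under stated assumptions, not the code in nfs40_walk_client_list(): `client_record` and `may_be_same_server` are hypothetical names, and both values are modelled as plain 64-bit integers, following the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of what one server returned from SETCLIENTID. */
struct client_record {
    uint64_t clientid;   /* clientid returned by the server */
    uint64_t verifier;   /* verifier returned by the server */
};

/*
 * Decide whether `known` (an already-established client entry) can still
 * be the same server that just answered a fresh SETCLIENTID with `fresh`.
 */
static bool
may_be_same_server(const struct client_record *known,
                   const struct client_record *fresh)
{
    /* Different clientids: not a candidate at all. */
    if (known->clientid != fresh->clientid)
        return false;
    /*
     * A new SETCLIENTID to an already-known server should have changed
     * the verifier, so an identical pair must come from a different
     * server that produced the same values by coincidence.
     */
    if (known->verifier == fresh->verifier)
        return false;
    return true;
}
```

A true result only marks `known` as a candidate. The sketch leaves out any further confirmation the client performs before treating two entries as one server.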

sunrpc: Don't engage exponential backoff when connection attempt is rejected. · 2c2ee6d2

Authored by NeilBrown on Nov 23, 2016

xs_connect() contains an exponential backoff mechanism so the repeated
connection attempts are delayed by longer and longer amounts.

This is appropriate when the connection failed due to a timeout, but
it is not appropriate when a definitive "no" answer is received.  In such
cases, call_connect_status() imposes a minimum 3-second back-off, so
not having the exponential back-off will never result in immediate
retries.

The current situation is a problem when the NFS server tries to
register with rpcbind but rpcbind isn't running.  All connection
attempts are made on the same "xprt" and as the connection is never
"closed", the exponential back-off delays successive attempts to register,
or de-register, different protocols.  This results in a multi-minute
delay with no benefit.

So, when call_connect_status() receives a definitive "no", use
xprt_conditional_disconnect() to cancel the previous connection attempt.
This will set XPRT_CLOSE_WAIT so that xprt->ops->close() calls xs_close()
which resets the reestablish_timeout.

To ensure xprt_conditional_disconnect() does the right thing, we
ensure that rq_connect_cookie is set before a connection attempt, and
allow xprt_conditional_disconnect() to complete even when the
transport is not fully connected.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2c2ee6d2
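
The behaviour this commit aims for can be modelled in a few lines of C. This is a sketch only, not the sunrpc implementation: `backoff`, `backoff_init`, `backoff_next` and the `connect_result` values are hypothetical names. The 3-second floor mirrors the minimum back-off mentioned in the commit message, and the 300-second cap is an assumption chosen for the example.

```c
/* Hypothetical bounds in seconds. The floor mirrors the 3-second minimum
 * mentioned in the commit message; the cap is an assumption. */
#define BACKOFF_MIN_S 3u
#define BACKOFF_MAX_S 300u

enum connect_result { CONNECT_TIMED_OUT, CONNECT_REFUSED };

struct backoff {
    unsigned int delay_s;   /* delay to apply before the next attempt */
};

static void backoff_init(struct backoff *b)
{
    b->delay_s = BACKOFF_MIN_S;
}

/*
 * Return how long to wait before retrying after `result`.
 * A timeout doubles the delay for the following attempt (up to the cap).
 * A definitive refusal resets the back-off, so only the minimum applies.
 */
static unsigned int backoff_next(struct backoff *b, enum connect_result result)
{
    unsigned int wait;

    if (result == CONNECT_REFUSED) {
        b->delay_s = BACKOFF_MIN_S;
        return BACKOFF_MIN_S;
    }
    wait = b->delay_s;
    b->delay_s = (wait > BACKOFF_MAX_S / 2) ? BACKOFF_MAX_S : wait * 2;
    return wait;
}
```

In this model, repeated refusals (such as rpcbind not running) always wait the minimum, while repeated timeouts wait 3, 6, 12 seconds and so on up to the cap.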

pNFS: Skip invalid stateids when doing a bulk destroy · b85f5620

Authored by Trond Myklebust on Nov 30, 2016

If the layout stateid is already invalid, we have no work to do.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

b85f5620

pNFS: Wait on outstanding layoutreturns to complete in pnfs_roc() · 29ade5db

Authored by Trond Myklebust on Nov 30, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

29ade5db

pNFS: Don't mark the layout as freed if the last lseg is marked for return · abb3e1c8

Authored by Trond Myklebust on Nov 30, 2016

Address another memory leak.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

abb3e1c8

pNFS: Sync the layout state bits in pnfs_cache_lseg_for_layoutreturn · 4aab9732

Authored by Trond Myklebust on Nov 30, 2016

Ensure that the layout state bits are synced when we cache a layout
segment for layoutreturn using an appropriate call to
pnfs_set_plh_return_info.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

4aab9732

pNFS: Fix bugs in _pnfs_return_layout · 24408f52

Authored by Trond Myklebust on Nov 30, 2016

We need to honour the NFS_LAYOUT_RETURN_REQUESTED bit regardless of
whether or not there are layout segments pending.
Furthermore, we should ensure that we leave the plh_return_segs list
empty.

This patch fixes a memory leak of the layout segments on plh_return_segs.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

24408f52

pNFS: Clear all layout segment state in pnfs_mark_layout_stateid_invalid · fe1cf946

Authored by Trond Myklebust on Nov 30, 2016

When the layout state is invalidated, then so is the layout segment
state, and hence we do need to clean up the state bits.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

fe1cf946

pNFS: Prevent unnecessary layoutreturns after delegreturn · 53e6fc86

Authored by Trond Myklebust on Nov 19, 2016

If we cannot grab the inode or superblock, then we cannot pin the
layout header, and so we cannot send a layoutreturn as part of an
async delegreturn call. In this case, we currently end up sending
an extra layoutreturn after the delegreturn. Since the layout was
implicitly returned by the delegreturn, that just gets a BAD_STATEID.

The fix is to simply complete the return-on-close immediately.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

53e6fc86

pNFS: Enable layoutreturn operation for return-on-close · 1c5bd76d

Authored by Trond Myklebust on Nov 16, 2016

Amend the pnfs return on close helper functions to enable sending the
layoutreturn op in CLOSE/DELEGRETURN. This closes a potential race between
CLOSE/DELEGRETURN and parallel OPEN calls to the same file, and allows the
client and the server to agree on whether or not there is an outstanding
layout.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1c5bd76d

pNFS: Clean up - add a helper to initialise struct layoutreturn_args · 828ed9ec

Authored by Trond Myklebust on Nov 15, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

828ed9ec

NFSv4: Add encode/decode of the layoutreturn op in DELEGRETURN · 586f1c39

Authored by Trond Myklebust on Nov 15, 2016

Add XDR encoding for the layoutreturn op, and storage for the layoutreturn
arguments to the DELEGRETURN compound.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

586f1c39

NFSv4: Add encode/decode of the layoutreturn op in CLOSE · cf805165

Authored by Trond Myklebust on Nov 15, 2016

Add XDR encoding for the layoutreturn op, and storage for the layoutreturn
arguments to the CLOSE compound.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

cf805165

NFSv4: Fix missing operation accounting in NFS4_dec_delegreturn_sz · d8434d4c

Authored by Trond Myklebust on Nov 16, 2016

We need to account for the reply to the PUTFH operation in the
DELEGRETURN compound.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

d8434d4c

pNFS: Don't mark layout segments invalid on layoutreturn in pnfs_roc · 69820d22

Authored by Trond Myklebust on Nov 15, 2016

The layoutreturn call will take care of invalidating the layout segments
once the call is successful.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

69820d22

pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn callback · 94e5c571

Authored by Trond Myklebust on Sep 15, 2016

The parameter is already present in the "args" structure.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

94e5c571

pNFS: Skip checking for return-on-close if the layout is invalid · 0cdc329e

Authored by Trond Myklebust on Nov 21, 2016

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

0cdc329e

pNFS: Remove spurious wake up in pnfs_layout_remove_lseg() · e685d237

Authored by Trond Myklebust on Nov 18, 2016

There is no change to the value of NFS_LAYOUT_RETURN, so we should
not be waking up the RPC call.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

e685d237

NFSv4: Ignore LAYOUTRETURN result if the layout doesn't match or is invalid · 2a974425

Authored by Trond Myklebust on Nov 20, 2016

Fix a potential race with CB_LAYOUTRECALL in which the server recalls the
remaining layout segments while our LAYOUTRETURN is still in transit.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

2a974425

pNFS: Do not free layout segments that are marked for return · 68f74479

Authored by Trond Myklebust on Oct 12, 2016

We may want to process and transmit layout stat information for the
layout segments that are being returned, so we should defer freeing
them until after the layoutreturn has completed.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

68f74479

pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers · 7b410d9c

Authored by Trond Myklebust on Oct 31, 2016

Instead of grabbing the layout, we want to get the inode so that we
can reduce races between layoutget and layoutrecall when the server
does not support call referring.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

7b410d9c

pNFS: consolidate the different range intersection tests · 17822b20

Authored by Trond Myklebust on Oct 25, 2016

Both pnfs.c and the flexfiles code have their own versions of the
range intersection testing, and the "end_offset" helper.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

17822b20
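
For readers unfamiliar with the helpers being consolidated, a range intersection test with an "end_offset" helper might look like the following. This is a minimal user-space sketch, not the code from pnfs.c or the flexfiles driver: `RANGE_MAX`, `end_offset` and `ranges_intersect` are hypothetical names, and treating an overflowing or maximal end as "extends to end of file" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "the range extends to the end of the file". */
#define RANGE_MAX UINT64_MAX

/* Exclusive end of [start, start + len), clamped to RANGE_MAX on overflow. */
static uint64_t end_offset(uint64_t start, uint64_t len)
{
    uint64_t end = start + len;   /* unsigned arithmetic wraps on overflow */

    return end >= start ? end : RANGE_MAX;
}

/* True when [start1, start1 + len1) and [start2, start2 + len2) overlap. */
static bool ranges_intersect(uint64_t start1, uint64_t len1,
                             uint64_t start2, uint64_t len2)
{
    uint64_t end1 = end_offset(start1, len1);
    uint64_t end2 = end_offset(start2, len2);

    return (end1 == RANGE_MAX || start2 < end1) &&
           (end2 == RANGE_MAX || start1 < end2);
}
```

Two ranges that merely touch, such as [0, 10) and [10, 15), do not intersect here because the ends are exclusive.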

pNFS: Fix race in pnfs_wait_on_layoutreturn · ee284e35

Authored by Trond Myklebust on Nov 18, 2016

We must put the task to sleep while holding the inode->i_lock in order
to ensure atomicity with the test for NFS_LAYOUT_RETURN.

Fixes: 500d701f ("NFS41: make close wait for layoutreturn")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

ee284e35

pNFS: On error, do not send LAYOUTGET until the LAYOUTRETURN has completed · 6604b203

Authored by Trond Myklebust on Oct 17, 2016

If there is an I/O error, we should not call LAYOUTGET until the
LAYOUTRETURN that reports the error is complete.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.8+

6604b203

pNFS: Force a retry of LAYOUTGET if the stateid doesn't match our cache · 9888d837

Authored by Trond Myklebust on Nov 23, 2016

If the server sends us a completely new stateid, and the client thinks
it already holds a layout, then force a retry of the LAYOUTGET after
invalidating the existing layout in order to avoid corruption due to
races.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

9888d837

pNFS: Clear NFS_LAYOUT_RETURN_REQUESTED when invalidating the layout stateid · ae5a459d

Authored by Trond Myklebust on Nov 14, 2016

We must ensure that we don't schedule a layoutreturn if the layout stateid
has been marked as invalid.

Fixes: 2a59a041 ("pNFS: Fix pnfs_set_layout_stateid() to clear...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.8+

ae5a459d

pNFS: Don't clear the layout stateid if a layout return is outstanding · 7b650994

Authored by Trond Myklebust on Nov 14, 2016

If we no longer hold any layout segments, we're normally expected to
consider the layout stateid to be invalid. However, we cannot assume this
if we're about to send, or are in the process of sending, a layoutreturn.

Fixes: 334a8f37 ("pNFS: Don't forget the layout stateid if...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.8+

7b650994

pNFS: Fix a deadlock between read resends and layoutreturn · 54e4a0df

Authored by Trond Myklebust on Nov 27, 2016

We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor
while holding a reference to a layout segment, as that can deadlock
pnfs_update_layout().

Fixes: d67ae825 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org # v4.0+

54e4a0df

NFSv4.1: Fix regression in callback retry handling · 9a837856

Authored by Fred Isaman on Sep 27, 2016

When initializing a freshly created slot for the callback channel,
the seq_nr needs to be 0, not 1.  Otherwise validate_seqid
and nfs4_slot_wait_on_seqid get confused and believe that the
empty slot corresponds to a previously sent reply.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

9a837856
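
Why the initial value matters can be seen in a simplified model of slot sequence checking. This is a sketch under stated assumptions, not the kernel's validate_seqid(): `seq_status` and `classify_seqid` are hypothetical names, the first request on a new slot is assumed to carry sequence ID 1, and sequence wraparound is deliberately left out.

```c
#include <stdint.h>

enum seq_status { SEQ_NEW, SEQ_REPLAY, SEQ_MISORDERED };

/*
 * Classify an incoming request against the sequence number stored in
 * the slot. Wraparound handling is intentionally not modelled.
 */
static enum seq_status classify_seqid(uint32_t slot_seq_nr,
                                      uint32_t request_seqid)
{
    if (request_seqid == slot_seq_nr)
        return SEQ_REPLAY;        /* same ID as the last one recorded */
    if (request_seqid == slot_seq_nr + 1)
        return SEQ_NEW;           /* next ID in sequence */
    return SEQ_MISORDERED;
}
```

In this model, a slot initialised to 0 classifies the first request (ID 1) as new, while a slot initialised to 1 classifies it as a replay of a reply that was never sent.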

NFSv4: Optimise away forced revalidation when we know the attributes are OK · 1ad13dbc

Authored by Trond Myklebust on Oct 27, 2016

The NFS_INO_REVAL_FORCED flag needs to be set if we just got a delegation,
and we see that there might still be some ambiguity as to whether or not
our attribute or data caches are valid.
In practice, this means that a call to nfs_check_inode_attributes() will
have noticed a discrepancy between cached attributes and measured ones,
so let's move the setting of NFS_INO_REVAL_FORCED to there.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

1ad13dbc

openeuler / Kernel synced successfully about 1 year ago