Commits · cd6c5968582a273561464fe6b1e8cc8214be02df · openeuler / Kernel

18 Dec, 2012 1 commit

SUNRPC: continue run over clients list on PipeFS event instead of break · cd6c5968

Committed by Stanislav Kinsbursky on Dec 17, 2012

Some SUNRPC clients belong to a program that has no pipe_dir_name. These
clients can be skipped on PipeFS events, because nothing has to be created or
destroyed. But instead of breaking out of the loop when such a client is found,
the search for a suitable client over the clients list has to continue.
Otherwise some clients might not be covered by the PipeFS event handler.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org [>= v3.4]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
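
As a rough illustration of the fix described in this commit, the C sketch below walks a client list and skips clients whose program has no pipe_dir_name, using continue rather than break. The toy_program and toy_client types and the next_pipefs_client() helper are hypothetical stand-ins for illustration, not the kernel's actual structures or functions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's RPC client structures. */
struct toy_program {
	const char *pipe_dir_name;	/* NULL when the program needs no PipeFS directory */
};

struct toy_client {
	const struct toy_program *program;
	int handled;			/* non-zero once the event has been processed */
};

/* Return the first client that still needs PipeFS work, or NULL if none is left. */
struct toy_client *next_pipefs_client(struct toy_client *clients, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_client *clnt = &clients[i];

		if (clnt->program->pipe_dir_name == NULL)
			continue;	/* nothing to create or destroy: skip, do not stop */
		if (clnt->handled)
			continue;
		return clnt;
	}
	return NULL;
}
```

With break in place of the first continue, a list that starts with a client lacking pipe_dir_name would never reach the clients behind it.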

16 Dec, 2012 6 commits

NFS: Don't use SetPageError in the NFS writeback code · ada8e20d

Committed by Trond Myklebust on Dec 15, 2012

The writeback code is already capable of passing errors back to user space
by means of the open_context->error. In the case of ENOSPC, Neil Brown
is reporting seeing 2 errors being returned.

Neil writes:

"e.g. if /mnt2/ is an nfs mounted filesystem that has no space then

strace dd if=/dev/zero conv=fsync >> /mnt2/afile count=1

reported Input/output error and the relevant parts of the strace output are:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1)                                = -1 EIO (Input/output error)
close(1)                                = -1 ENOSPC (No space left on device)"

Neil then shows that the duplication of error messages appears to be due to
the use of the PageError() mechanism, which causes filemap_fdatawait_range
to return the extra EIO. The regression was introduced by
commit 7b281ee0 (NFS: fsync() must exit
with an error if page writeback failed).

Fix this by removing the call to SetPageError(), and just relying on
open_context->error reporting the ENOSPC back to fsync().
Reported-by: Neil Brown <neilb@suse.de>
Tested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [3.6+]

SUNRPC: variable 'svsk' is unused in function bc_send_request · 1efc2878

Committed by Trond Myklebust on Dec 15, 2012

Silence a compile time warning.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket · 4a20a988

Committed by Trond Myklebust on Dec 15, 2012

Silence the unnecessary warning "unhandled error (111) connecting to..."
and convert it to a dprintk for debugging purposes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Deal effectively with interrupted RPC calls. · ac20d163

Committed by Trond Myklebust on Dec 15, 2012

If an RPC call is interrupted, assume that the server hasn't processed
the RPC call so that the next time we use the slot, we know that if we
get an NFS4ERR_SEQ_MISORDERED or NFS4ERR_SEQ_FALSE_RETRY, we just have
to bump the sequence number.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Move the RPC timestamp out of the slot. · 8e63b6a8

Committed by Trond Myklebust on Dec 15, 2012

Shave a few bytes off the slot table size by moving the RPC timestamp
into the sequence results.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED. · e8794440

Committed by Trond Myklebust on Dec 15, 2012

If the server returns NFS4ERR_SEQ_MISORDERED, it could be a sign
that the slot was retired at some point. Retry the attempt after
reinitialising the slot sequence number to 1.

Also add a handler for NFS4ERR_SEQ_FALSE_RETRY. Just bump the slot
sequence number and retry...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
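
The recovery rule in this commit message can be pictured with the small C sketch below. It is only a sketch under assumptions: toy_slot, toy_status and toy_handle_sequence_status() are hypothetical names, the enum values are arbitrary rather than the protocol's numeric error codes, and the "already at 1" guard is an assumption added here so the example cannot retry forever.

```c
#include <stdint.h>

/* Illustrative status codes; the names echo the protocol errors, the values do not. */
enum toy_status {
	TOY_OK,
	TOY_SEQ_MISORDERED,
	TOY_SEQ_FALSE_RETRY
};

/* Hypothetical stand-in for a session slot. */
struct toy_slot {
	uint32_t seq_nr;
};

/* Adjust the slot for the given status; return 1 to retry the call, 0 otherwise. */
int toy_handle_sequence_status(struct toy_slot *slot, enum toy_status status)
{
	switch (status) {
	case TOY_SEQ_MISORDERED:
		if (slot->seq_nr == 1)
			return 0;	/* assumed guard: already reset once, give up */
		slot->seq_nr = 1;	/* the slot may have been retired: start over */
		return 1;
	case TOY_SEQ_FALSE_RETRY:
		slot->seq_nr++;		/* move past the number the server rejected */
		return 1;
	default:
		return 0;
	}
}
```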

15 Dec, 2012 3 commits

NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0 · 65a0c149

Committed by Trond Myklebust on Dec 14, 2012

If the inode has no links, then we should force a new lookup.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix calls to drop_nlink() · 1f018458

Committed by Trond Myklebust on Dec 14, 2012

It is almost always wrong for NFS to call drop_nlink() after removing a
file. What we really want is to mark the inode's attributes for
revalidation, and we want to ensure that the VFS drops it if we're
reasonably sure that this is the final unlink().
Do the former using the usual cache validity flags, and the latter
by testing if inode->i_nlink == 1, and clearing it in that case.

This also fixes the following warning reported by Neil Brown and
Jeff Layton (among others).

[634155.004438] WARNING: at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.5.0/lin
[634155.004442] Hardware name: Latitude E6510
[634155.004577]  crc_itu_t crc32c_intel snd_hwdep snd_pcm snd_timer snd soundcor
[634155.004609] Pid: 13402, comm: bash Tainted: G        W    3.5.0-36-desktop #
[634155.004611] Call Trace:
[634155.004630]  [<ffffffff8100444a>] dump_trace+0xaa/0x2b0
[634155.004641]  [<ffffffff815a23dc>] dump_stack+0x69/0x6f
[634155.004653]  [<ffffffff81041a0b>] warn_slowpath_common+0x7b/0xc0
[634155.004662]  [<ffffffff811832e4>] drop_nlink+0x34/0x40
[634155.004687]  [<ffffffffa05bb6c3>] nfs_dentry_iput+0x33/0x70 [nfs]
[634155.004714]  [<ffffffff8118049e>] dput+0x12e/0x230
[634155.004726]  [<ffffffff8116b230>] __fput+0x170/0x230
[634155.004735]  [<ffffffff81167c0f>] filp_close+0x5f/0x90
[634155.004743]  [<ffffffff81167cd7>] sys_close+0x97/0x100
[634155.004754]  [<ffffffff815c3b39>] system_call_fastpath+0x16/0x1b
[634155.004767]  [<00007f2a73a0d110>] 0x7f2a73a0d10f
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [3.3+]
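
The two-part rule from this commit message (always mark attributes for revalidation, and clear the link count only when it is exactly 1) might look roughly like the C sketch below. The toy_inode structure, the TOY_INVALID_ATTR flag value and toy_after_remove() are hypothetical and do not reproduce the kernel's definitions.

```c
/* Hypothetical stand-in for the two inode fields that matter here. */
struct toy_inode {
	unsigned int i_nlink;
	unsigned long cache_validity;
};

#define TOY_INVALID_ATTR 0x1UL	/* illustrative flag value, not the kernel's */

/* Called after a file has been removed on the server. */
void toy_after_remove(struct toy_inode *inode)
{
	/* Always ask for the attributes to be revalidated. */
	inode->cache_validity |= TOY_INVALID_ATTR;

	/* Only when this looks like the final unlink, let the VFS drop the inode. */
	if (inode->i_nlink == 1)
		inode->i_nlink = 0;
}
```

An inode with other hard links keeps its link count until revalidation fetches the real value from the server, and the count is never decremented below zero.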

NFS: Ensure that we always drop inodes that have been marked as stale · eed99357

Committed by Trond Myklebust on Dec 14, 2012

There is no need to cache stale inodes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

13 Dec, 2012 7 commits

nfs: Remove unused list nfs4_clientid_list · 48d7a576

Committed by Yanchuan Nian on Dec 13, 2012

This list was designed to store struct nfs4_client on the client side.
But nfs4_client was obsolete and has been removed from the source code.
So remove the unused list.
Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: Remove duplicate function declaration in internal.h · aaea7d2f

Committed by Yanchuan Nian on Dec 13, 2012

Remove duplicate function declaration in internal.h
Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
[Trond: Added nfs_pageio_init_read, which suffered from the same problem]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: avoid NULL dereference in nfs_destroy_server · f259613a

Committed by NeilBrown on Dec 13, 2012

In rare circumstances, nfs_clone_server() of a v2 or v3 server can get
an error between setting server->destroy (to nfs_destroy_server), and
calling nfs_start_lockd (which will set server->nlm_host).

If this happens, nfs_clone_server will call nfs_free_server which
will call nfs_destroy_server and thence nlmclnt_done(NULL).  This
causes the NULL to be dereferenced.

So add a guard to only call nlmclnt_done() if ->nlm_host is not NULL.

The other guards there are irrelevant as nlm_host can only be non-NULL
if one of these flags is set - so remove those tests.  (Thanks to Trond
for this suggestion).

This is suitable for any stable kernel since 2.6.25.

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
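
The guard described in this commit message can be sketched in a few lines of C. The toy_server and toy_nlm_host types and the toy_* functions are hypothetical stand-ins; only the shape of the NULL check is taken from the commit text.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the server and lock-manager host objects. */
struct toy_nlm_host {
	int refs;
};

struct toy_server {
	struct toy_nlm_host *nlm_host;	/* stays NULL until lockd has been started */
};

/* Releases one reference; dereferences host, so it must not be NULL. */
void toy_nlmclnt_done(struct toy_nlm_host *host)
{
	host->refs--;
}

/* Teardown that is safe even if an error occurred before lockd was started. */
void toy_destroy_server(struct toy_server *server)
{
	if (server->nlm_host != NULL)
		toy_nlmclnt_done(server->nlm_host);
}
```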

SUNRPC handle EKEYEXPIRED in call_refreshresult · eb96d5c9

Committed by Andy Adamson on Nov 27, 2012

Currently, when an RPCSEC_GSS context has expired or is non-existent
and the user's (Kerberos) credentials have also expired or are non-existent,
the client receives the -EKEYEXPIRED error and tries to refresh the context
forever. If an application is performing I/O, or other work against the share,
the application hangs, and the user is not prompted to refresh/establish their
credentials. This can result in a denial of service for other users.

Users are expected to manage their Kerberos credential lifetimes to mitigate
this issue.

Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number
of times to refresh the gss_context, and then return -EACCES to the application.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
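
A bounded refresh loop of the kind this commit message describes could be sketched in C as follows. Everything named toy_* is hypothetical, the retry budget field merely echoes tk_cred_retry, and the fallback definition of EKEYEXPIRED is there only for systems whose errno.h lacks it.

```c
#include <errno.h>

#ifndef EKEYEXPIRED
#define EKEYEXPIRED 127	/* Linux value, used only if errno.h does not provide one */
#endif

/* Hypothetical stand-in for the per-task retry budget. */
struct toy_task {
	int cred_retry;
};

/* Example refresh callbacks: one that always fails, one that succeeds. */
int toy_refresh_expired(void) { return -EKEYEXPIRED; }
int toy_refresh_ok(void) { return 0; }

/*
 * Try to refresh the credential. While the refresh reports -EKEYEXPIRED,
 * spend one retry per attempt; once the budget is used up, report -EACCES
 * to the caller instead of looping forever.
 */
int toy_refresh_cred(struct toy_task *task, int (*refresh)(void))
{
	for (;;) {
		int status = refresh();

		if (status != -EKEYEXPIRED)
			return status;
		if (task->cred_retry == 0)
			return -EACCES;
		task->cred_retry--;
	}
}
```

The application then sees a definite error it can act on, rather than hanging while the context is refreshed indefinitely.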

SUNRPC set gss gc_expiry to full lifetime · 620038f6

Committed by Andy Adamson on Nov 27, 2012

Only use the default GSSD_MIN_TIMEOUT if the gss downcall timeout is zero.
Store the full lifetime in gc_expiry (not 3/4 of the lifetime) as subsequent
patches will use the gc_expiry to determine buffered WRITE behavior in the
face of expired or soon to be expired gss credentials.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: fix page dirtying in NFS DIO read codepath · be7e9858

Committed by Jeff Layton on Dec 12, 2012

The NFS DIO code will dirty pages that catch read responses in order to
handle the case where someone is doing DIO reads into an mmapped buffer.
The existing code doesn't really do the right thing though since it
doesn't take into account the case where we might be attempting to read
past the EOF.

Fix the logic in that code to only dirty pages that ended up receiving
data from the read. Note too that it really doesn't matter if
NFS_IOHDR_ERROR is set or not. All that matters is if the page was
altered by the read.

Cc: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ · 67fad106

Committed by Jeff Layton on Dec 12, 2012

Eryu provided a test program that would segfault when attempting to read
past the EOF on a file that was opened O_DIRECT. The buffer given to the
read() call was on the stack, and when he attempted to read past the EOF,
the read would scribble over the rest of the stack page.

If we hit the end of the file on a DIO READ request, then we don't want
to zero out the rest of the buffer. These aren't pagecache pages after
all, and there's no guarantee that the buffers that were passed in
represent entire pages.

Cc: <stable@vger.kernel.org> # v3.5+
Cc: Fred Isaman <iisaman@netapp.com>
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

12 Dec, 2012 1 commit

NFSv4.1: Be conservative about the client highest slotid · b0ef9647

Committed by Trond Myklebust on Dec 11, 2012

If the server sends us a target that looks like an outlier, but
is lower than the existing target, then respect it anyway.
However, defer actually updating the generation counter until
we get a target that doesn't look like an outlier.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

11 Dec, 2012 4 commits

NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly · 85563073

Committed by Trond Myklebust on Dec 11, 2012

Most (all) NFS4ERR_BADSLOT errors are due to the client failing to
respect the server's sr_highest_slotid limit. This mainly happens
due to reordered RPC requests.
The way to handle it is simply to drop the slot that we're using,
and retry using the new highest_slotid limits.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Merge branch 'bugfixes' into nfs-for-next · 7ce0171d

Committed by Trond Myklebust on Dec 11, 2012

nfs: don't extend writes to cover entire page if pagecache is invalid · 81d9bce5

Committed by Jeff Layton on Dec 10, 2012

Jian reported that the following sequence would leave "testfile" with
corrupt data:

    # mount localhost:/export /mnt/nfs/ -o vers=3
    # echo abc > /mnt/nfs/testfile; echo def >> /export/testfile; echo ghi >> /mnt/nfs/testfile
    # cat -v /export/testfile
    abc
    ^@^@^@^@ghi

While there's no locking involved here, the operations are serialized,
so close-to-open (CTO) consistency should prevent corruption.

The first write to the file is fine and writes 4 bytes. The file is then
extended on the server. When it's reopened a GETATTR is issued and the
size change is noticed. This causes NFS_INO_INVALID_DATA to be set on
the file. Because the file is opened for write only,
nfs_want_read_modify_write() returns 0 to nfs_write_begin().
nfs_updatepage then calls nfs_write_pageuptodate() to see if it should
extend the nfs_page to cover the whole page. NFS_INO_INVALID_DATA is
still set on the file at that point, but that flag is ignored and
nfs_pageuptodate erroneously extends the write to cover the whole page,
with the write done on the server side filled in with zeroes.

This patch just has that function check for NFS_INO_INVALID_DATA in
addition to NFS_INO_REVAL_PAGECACHE. This fixes the bug, but looking
over the code, I wonder if we might have a similar bug in
nfs_revalidate_size(). The difference between those two flags is very
subtle, so it seems like we ought to be checking for
NFS_INO_INVALID_DATA in most of the places that we look for
NFS_INO_REVAL_PAGECACHE.

I believe this is a regression introduced by commit 8d197a56. The code
did check for NFS_INO_INVALID_DATA prior to that patch.

Original bug report is here:

    https://bugzilla.redhat.com/show_bug.cgi?id=885743

Cc: <stable@vger.kernel.org> # 3.5+
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
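
The check this commit message describes, treating the cached page as unreliable when either validity flag is set, can be illustrated with the C sketch below. The flag values, the toy_inode_state structure and toy_write_pageuptodate() are hypothetical; only the idea of testing both flags comes from the commit text.

```c
#include <stdbool.h>

/* Illustrative flag values, not the kernel's definitions. */
#define TOY_INO_INVALID_DATA	0x1UL
#define TOY_INO_REVAL_PAGECACHE	0x2UL

/* Hypothetical stand-in for the per-inode cache state and one page's status. */
struct toy_inode_state {
	unsigned long cache_validity;
	bool page_uptodate;
};

/*
 * A write may be extended to cover the whole page only if the page is up to
 * date and neither flag says the cached data can no longer be trusted.
 */
bool toy_write_pageuptodate(const struct toy_inode_state *st)
{
	if (st->cache_validity & (TOY_INO_INVALID_DATA | TOY_INO_REVAL_PAGECACHE))
		return false;
	return st->page_uptodate;
}
```

In the reported sequence the file had grown on the server, so the invalid-data flag was set; with a check of this shape the second client write would not be widened to the full page.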

NFSv4: Check for buffer length in __nfs4_get_acl_uncached · 7d3e91a8

Committed by Sven Wegener on Dec 08, 2012

Commit 1f1ea6c2 "NFSv4: Fix buffer overflow checking in
__nfs4_get_acl_uncached" accidentally dropped the check for a result
buffer length that is too small.

If someone uses getxattr on "system.nfs4_acl" on an NFSv4 mount
supporting ACLs, the ACL has not been cached and the buffer supplied is
too short, we still copy the complete ACL, resulting in kernel and user
space memory corruption.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
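
The missing check amounts to comparing the ACL length with the caller's buffer length before copying. The C sketch below shows that pattern in isolation; toy_copy_acl() is a hypothetical helper, and the choice of -ERANGE and the NULL-buffer size query are assumptions modelled on the usual getxattr conventions rather than taken from the patch.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/*
 * Copy an ACL of acl_len bytes into buf, which holds buflen bytes.
 * Returns the ACL length on success, or -ERANGE if the buffer is too small.
 * A NULL buf is treated as a size query (assumed convention).
 */
ssize_t toy_copy_acl(void *buf, size_t buflen, const void *acl, size_t acl_len)
{
	if (buf == NULL)
		return (ssize_t)acl_len;
	if (acl_len > buflen)
		return -ERANGE;		/* refuse to write past the end of buf */
	memcpy(buf, acl, acl_len);
	return (ssize_t)acl_len;
}
```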

06 Dec, 2012 18 commits

NFSv4.1: Try to eliminate outliers when updating target_highest_slotid · 1fa80644

Committed by Trond Myklebust on Dec 02, 2012

Look for sudden changes in the first and second derivatives in order
to eliminate outlier changes to target_highest_slotid (which are
due to out-of-order RPC replies).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones · c05eecf6

Committed by Trond Myklebust on Nov 30, 2012

Currently, the priority queues attempt to be 'fair' to lower priority
tasks by scheduling them after a certain number of higher priority tasks
have run. The problem is that both the transport send queue and
the NFSv4.1 session slot queue have strong ordering requirements.

This patch therefore removes the fairness code in favour of strong
ordering of task priorities.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
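
Strict priority ordering, as opposed to the 'fair' scheme this commit removes, can be sketched in C as below. The toy_queue structure, the number of priority levels and toy_next_priority() are hypothetical and far simpler than the real RPC wait queues.

```c
#define TOY_NR_PRIORITIES 3	/* illustrative number of priority levels */

/* Hypothetical wait queue: a count of pending tasks per priority level. */
struct toy_queue {
	int pending[TOY_NR_PRIORITIES];	/* index 0 is the lowest priority */
};

/*
 * Dequeue one task and return its priority, or -1 if the queue is empty.
 * A lower-priority task is chosen only when every higher level is empty.
 */
int toy_next_priority(struct toy_queue *q)
{
	for (int prio = TOY_NR_PRIORITIES - 1; prio >= 0; prio--) {
		if (q->pending[prio] > 0) {
			q->pending[prio]--;
			return prio;
		}
	}
	return -1;
}
```

There is no counter that lets a low-priority task jump ahead after some number of high-priority ones, which is the behaviour the ordering requirements of the send queue and slot queue call for.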

NFSv4.1: Ensure smooth handover of slots from one task to the next waiting · b75ad4cd

Committed by Trond Myklebust on Nov 29, 2012

Currently, we see a lot of bouncing for the value of highest_used_slotid
due to the fact that slots are getting freed, instead of getting instantly
transmitted to the next waiting task.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Reorder the XDR structures to put sequence at the top, not bottom · 62ae082d

Committed by Trond Myklebust on Nov 29, 2012

Pre-condition for optimising the slot allocation and reintroducing FIFO
behaviour.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Don't mess with task priorities in nfs41_setup_sequence · 1e1093c7

Committed by Trond Myklebust on Nov 01, 2012

We want to preserve the rpc_task priority for things like writebacks
that may have differing levels of urgency.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Remove _nfs_call_sync_session · 104287cd

Committed by Bryan Schumaker on Nov 12, 2012

All it does is pass its arguments through to another function.  Let's
cut out the middleman...
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Clean up handling of privileged operations · 8fe72bac

Committed by Trond Myklebust on Oct 29, 2012

Privileged rpc calls are those that are run by the state recovery thread,
in cases where we're trying to recover the system after a server reboot
or a network partition. In those cases, we want to fence off all other
rpc calls (see nfs4_begin_drain_session()) so that they don't end up
using stateids or clientids that are in the process of being recovered.

Prior to this patch, we had to set up special callback functions in
order to declare an rpc call as being privileged.
By adding a new field to the sequence arguments, this patch simplifies
things considerably, and allows us to declare the rpc call as privileged
before it is run.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Remove the 'FIFO' behaviour for nfs41_setup_sequence · 275e7e20

Committed by Trond Myklebust on Nov 01, 2012

It is more important to preserve the task priority behaviour, which ensures
that things like reclaim writes take precedence over background and kupdate
writes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Clean up nfs41_setup_sequence · 7b939a3f

Committed by Trond Myklebust on Nov 01, 2012

Move all the sleep-and-exit cases into a single section of code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Simplify the NFSv4/v4.1 synchronous call switch · fd0c0953

Committed by Trond Myklebust on Nov 01, 2012

We shouldn't need to pass the 'cache_reply' parameter if we
initialise the sequence_args/sequence_res in the caller.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Simplify the sequence setup · d9afbd1b

Committed by Trond Myklebust on Oct 22, 2012

Nobody calls nfs4_setup_sequence or nfs41_setup_sequence without
also calling rpc_call_start() on success. This commit therefore
folds the rpc_call_start call into nfs41_setup_sequence().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Use nfs41_setup_sequence where appropriate · 6ba7db34

Committed by Trond Myklebust on Oct 22, 2012

There is no point in using nfs4_setup_sequence or nfs4_sequence_done
in pure NFSv4.1 functions. We already know that those have sessions...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Ping server when our session table limits are too high · c10e4498

Committed by Trond Myklebust on Nov 26, 2012

If the server requests a lower target_highest_slotid, then ensure
that we ping it with at least one RPC call containing an
appropriate SEQUENCE op. This ensures that the server won't need to
send a recall callback in order to shrink the slot table.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Set the maximum slot table size to 1024 slots · 0ca3f482

Committed by Trond Myklebust on Nov 21, 2012

This means that we end up statically allocating 128 bytes for the
bitmap on each slot table.
For a server that supports 1MB write and read I/O sizes this means
that we can completely fill the maximum 1GB TCP send/receive
windows.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
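
The two figures in this commit message follow from simple arithmetic: one bit per slot gives 1024 / 8 = 128 bytes of bitmap, and 1024 slots each carrying a 1MB request add up to 1GB in flight. The C sketch below just spells that out; both helper functions are hypothetical and exist only to show the calculation.

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a bitmap that holds one bit per slot. */
size_t toy_bitmap_bytes(size_t nslots)
{
	return (nslots + CHAR_BIT - 1) / CHAR_BIT;
}

/* Upper bound on bytes in flight if every slot carries one I/O of io_size bytes. */
unsigned long long toy_max_in_flight(size_t nslots, unsigned long long io_size)
{
	return (unsigned long long)nslots * io_size;
}
```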

NFSv4.1: Move slot table and session struct definitions to nfs4session.h · 76e697ba

Committed by Trond Myklebust on Nov 26, 2012

Clean up. Gather NFSv4.1 slot definitions in fs/nfs/nfs4session.h.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Remove unused function slot_idx · c34309a4

Committed by Trond Myklebust on Nov 26, 2012

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Cleanup move session slot management to fs/nfs/nfs4session.c · 73e39aaa

Committed by Trond Myklebust on Nov 26, 2012

NFSv4.1 session management is getting complex enough to deserve
a separate file.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: Move nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease · 33021279

Committed by Trond Myklebust on Nov 26, 2012

nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease are both
generic state related functions. As such, they belong in nfs4state.c,
and not nfs4proc.c.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

openeuler / Kernel synced successfully almost 2 years ago
