Commits · 0d10c2c170e3384dd63f40216d7af4673d5ebb50 · openanolis / cloud-kernel

05 Aug, 2014 1 commit

nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm · 3234975f

Authored by Trond Myklebust on Jul 30, 2014

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Aug, 2014 1 commit

nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache · 58fb12e6

Authored by Jeff Layton on Jul 29, 2014

We don't want to rely on the client_mutex for protection in the case of
NFSv4 open owners. Instead, we add a mutex that will only be taken for
NFSv4.0 state mutating operations, and that will be released once the
entire compound is done.

Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay
take a reference to the stateowner when they are using it for NFSv4.0
open and lock replay caching.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

30 Jul, 2014 1 commit

nfsd: print status when nfsd4_open fails to open file it just created · b3fbfe0e

Authored by Jeff Layton on Jul 29, 2014

It's possible for nfsd to fail opening a file that it has just created.
When that happens, we throw a WARN but it doesn't include any info about
the error code. Print the status code to give us a bit more info.

Our QA group hit some of these warnings under some very heavy stress
testing. My suspicion is that they hit the file-max limit, but it's hard
to know for sure. Go ahead and add a -ENFILE mapping to
nfserr_serverfault to make the error more distinct (and correct).
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

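The errno-to-status mapping described in this commit can be pictured with a small user-space sketch. This is not the kernel's table: the `sketch_` names, the three table entries, and the fallback to an I/O error are assumptions made for illustration. Only the numeric status values (taken from RFC 7530) and the ENFILE to serverfault pairing come from the protocol and the message above.

```c
#include <errno.h>
#include <stddef.h>

/* NFSv4 status values as numbered in RFC 7530; names are hypothetical. */
enum sketch_nfsstat {
	SKETCH_NFS_OK             = 0,
	SKETCH_NFSERR_IO          = 5,
	SKETCH_NFSERR_NOSPC       = 28,
	SKETCH_NFSERR_SERVERFAULT = 10006,
};

struct sketch_errmap {
	int host_errno;
	enum sketch_nfsstat status;
};

/* Illustrative subset of a host-errno to NFS-status table. */
static const struct sketch_errmap sketch_table[] = {
	{ 0,      SKETCH_NFS_OK },
	{ ENOSPC, SKETCH_NFSERR_NOSPC },
	{ ENFILE, SKETCH_NFSERR_SERVERFAULT }, /* the mapping this commit adds */
};

/* Accepts either a positive errno or a kernel-style negative one. */
enum sketch_nfsstat sketch_errno_to_status(int err)
{
	size_t i;

	if (err < 0)
		err = -err;
	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++)
		if (sketch_table[i].host_errno == err)
			return sketch_table[i].status;
	/* Assumption of this sketch: unmapped errors become an I/O error. */
	return SKETCH_NFSERR_IO;
}
```

With an explicit entry, hitting the file-max limit shows up as a distinct status rather than being folded into the generic fallback.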

10 Jul, 2014 1 commit

nfsd: Convert nfs4_check_open_reclaim() to work with lookup_clientid() · 0fe492db

Authored by Trond Myklebust on Jun 30, 2014

lookup_clientid is preferable to find_confirmed_client since it's able
to use the cached client in the compound state.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

09 Jul, 2014 4 commits

NFSD: Remove iattr parameter from nfsd_symlink() · 1e444f5b

Authored by Kinglong Mee on Jul 01, 2014

Commit db2e747b (vfs: remove mode parameter from vfs_symlink())
removed the mode parameter from vfs_symlink(), so nfsd_symlink() no
longer needs iattr. Remove it.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: rename cr_linkname->cr_data · 7fb84306

Authored by J. Bruce Fields on Jun 24, 2014

The name of a link is currently stored in cr_name and cr_namelen, and
the content in cr_linkname and cr_linklen.  That's confusing.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: let nfsd_symlink assume null-terminated data · 52ee0433

Authored by J. Bruce Fields on Jun 20, 2014

Currently nfsd_symlink has a weird hack to serve callers who don't
null-terminate symlink data: it looks ahead at the next byte to see if
it's zero, and copies it to a new buffer to null-terminate if not.

That means callers don't have to null-terminate, but they *do* have to
ensure that the byte following the end of the data is theirs to read.

That's a bit subtle, and the NFSv4 code actually got this wrong.

So let's just throw out that code and let callers pass null-terminated
strings; we've already fixed them to do that.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: fix rare symlink decoding bug · b829e919

Authored by J. Bruce Fields on Jun 19, 2014

An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.

The vfs, on the other hand, wants null-terminated data.

The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.

The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.

But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.

Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.

In the NFSv2/v3 cases this actually appears to be safe:

	- nfs3svc_decode_symlinkargs explicitly null-terminates the data
	  (after first checking its length and copying it to a new
	  page).
	- NFSv2 limits symlinks to 1k.  The buffer holding the rpc
	  request is always at least a page, and the link data (and
	  previous fields) have maximum lengths that prevent the request
	  from reaching the end of a page.

In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.

The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though.  It should
really either do the copy itself every time or just require a
null-terminated string.
Reported-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

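The two ideas in this message, XDR padding to a 4-byte boundary and the "simple way" of copying into a buffer with room for the final null, can be sketched in user-space C. The function names are hypothetical and `malloc` stands in for whatever allocator the kernel code actually uses; this is not the patch itself.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Bytes the opaque data occupies on the wire: len rounded up to a
 * multiple of 4, i.e. the data plus 0 to 3 bytes of zero padding. */
size_t sketch_xdr_padded_len(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Copy exactly len bytes into a fresh buffer and null-terminate it.
 * The byte after data[len - 1] is never read or written, so it does
 * not matter who owns it. Returns NULL on allocation failure; the
 * caller frees the result. */
char *sketch_copy_terminated(const char *data, size_t len)
{
	char *copy = malloc(len + 1);

	if (!copy)
		return NULL;
	memcpy(copy, data, len);
	copy[len] = '\0';
	return copy;
}
```

The copy costs an allocation per symlink, but it removes any dependence on what follows the data in the request buffer.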

28 Jun, 2014 1 commit

nfsd: fix rare symlink decoding bug · 76f47128

Authored by J. Bruce Fields on Jun 19, 2014

An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.

The vfs, on the other hand, wants null-terminated data.

The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.

The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.

But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.

Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.

In the NFSv2/v3 cases this actually appears to be safe:

	- nfs3svc_decode_symlinkargs explicitly null-terminates the data
	  (after first checking its length and copying it to a new
	  page).
	- NFSv2 limits symlinks to 1k.  The buffer holding the rpc
	  request is always at least a page, and the link data (and
	  previous fields) have maximum lengths that prevent the request
	  from reaching the end of a page.

In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.

The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though.  It should
really either do the copy itself every time or just require a
null-terminated string.
Reported-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

23 Jun, 2014 4 commits

nfsd: add __force to opaque verifier field casts · f419992c

Authored by Jeff Layton on Jun 17, 2014

sparse complains that we're stuffing non-byte-swapped values into
__be32's here. Since they're supposed to be opaque, it doesn't matter
much. Just add __force to make sparse happy.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

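A rough picture of what a forced cast into a big-endian typed field looks like, assuming macros that mirror the kernel's `__bitwise` and `__force` annotations. The `sketch_` macro names, the struct, and the function are hypothetical, and outside a sparse run (`__CHECKER__` undefined) the annotations expand to nothing.

```c
#include <stdint.h>

/* Stand-ins for the kernel's sparse annotations; they only carry
 * meaning when the code is checked with sparse. */
#ifdef __CHECKER__
#define sketch_bitwise __attribute__((bitwise))
#define sketch_force   __attribute__((force))
#else
#define sketch_bitwise
#define sketch_force
#endif

typedef uint32_t sketch_bitwise sketch_be32;

/* Hypothetical opaque verifier: two words the peer never interprets. */
struct sketch_verifier {
	sketch_be32 data[2];
};

void sketch_fill_verifier(struct sketch_verifier *v, uint32_t a, uint32_t b)
{
	/* No byte swap is done: the values are opaque, so the forced cast
	 * only tells the checker that skipping the conversion is deliberate. */
	v->data[0] = (sketch_force sketch_be32)a;
	v->data[1] = (sketch_force sketch_be32)b;
}
```

The cast changes no generated code; it documents intent and silences the type-mismatch warning.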

NFSD: Using exp_get for export getting · bf18f163

Authored by Kinglong Mee on Jun 10, 2014

Don't use cache_get outside of export.h; use exp_get for exports instead.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

SUNRPC/NFSD: Change to type of bool for rq_usedeferral and rq_splice_ok · f15a5cf9

Authored by Kinglong Mee on Jun 10, 2014

rq_usedeferral and rq_splice_ok only ever hold 0 or 1, so define them as bool.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Using min/max/min_t/max_t for calculate · 3c7aa15d

Authored by Kinglong Mee on Jun 10, 2014

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

07 Jun, 2014 1 commit

nfsd4: simplify server xdr->next_page use · 05638dc7

Authored by J. Bruce Fields on Jun 02, 2014

The rpc code makes available to the NFS server an array of pages to
encode into.  The server represents its reply as an xdr buf, with the
head pointing into the first page in that array, the pages ** array
starting just after that, and the tail (if any) sharing any leftover
space in the page used by the head.

While encoding, we use xdr_stream->page_ptr to keep track of which page
we're currently using.

Currently we set xdr_stream->page_ptr to buf->pages, which makes the
head a weird exception to the rule that page_ptr always points to the
page we're currently encoding into.  So, instead set it to buf->pages -
1 (the page actually containing the head), and remove the need for a
little unintuitive logic in xdr_get_next_encode_buffer() and
xdr_truncate_encode.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

05 Jun, 2014 2 commits

nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound · 7025005d

Authored by Jeff Layton on May 30, 2014

The memset of resp in svc_process_common should ensure that these are
already zeroed by the time they get here.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open · ba5378b6

Authored by Jeff Layton on May 30, 2014

In the NFS4_OPEN_CLAIM_PREVIOUS case, we should only mark it confirmed
if the nfs4_check_open_reclaim check succeeds.

In the NFS4_OPEN_CLAIM_DELEG_PREV_FH and NFS4_OPEN_CLAIM_DELEGATE_PREV
cases, I see no point in declaring the openowner confirmed when the
operation is going to fail anyway, and doing so might allow the client
to game things such that it wouldn't need to confirm a subsequent open
with the same owner.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

31 May, 2014 10 commits

nfsd4: better reservation of head space for krb5 · a5cddc88

Authored by J. Bruce Fields on May 12, 2014

RPC_MAX_AUTH_SIZE is scattered around several places.  Better to set it
once in the auth code, where this kind of estimate should be made.  And
while we're at it we can leave it zero when we're not using krb5i or
krb5p.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: estimate sequence response size · ccae70a9

Authored by J. Bruce Fields on Mar 23, 2014

Otherwise a following patch would turn off all 4.1 zero-copy reads.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: better estimate of getattr response size · b86cef60

Authored by J. Bruce Fields on Mar 23, 2014

We plan to use this estimate to decide whether or not to allow zero-copy
reads. Currently we're assuming all getattr's are a page, which can be
both too small (ACLs e.g. may be arbitrarily long) and too large (after
an upcoming read patch this will unnecessarily prevent zero copy reads
in any read compound also containing a getattr).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: allow large readdirs · 561f0ed4

Authored by J. Bruce Fields on Jan 20, 2014

Currently we limit readdir results to a single page.  This can result in
a performance regression compared to NFSv3 when reading large
directories.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: more precise nfsd4_max_reply · 4f0cefbf

Authored by J. Bruce Fields on Mar 11, 2014

It will turn out to be useful to have a more accurate estimate of reply
size; so, piggyback on the existing op reply-size estimators.

Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct
nfsd4_operation and friends.  (Thanks to Christoph Hellwig for pointing
out that simplification.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: don't try to encode conflicting owner if low on space · 8c7424cf

Authored by J. Bruce Fields on Mar 10, 2014

I ran into this corner case in testing: in theory clients can provide
state owners up to 1024 bytes long.  In the sessions case there might be
a risk of this pushing us over the DRC slot size.

The conflicting owner isn't really that important, so let's humor a
client that provides a small maxresponsesize_cached by allowing ourselves
to return without the conflicting owner instead of outright failing the
operation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: allow encoding across page boundaries · 2825a7f9

Authored by J. Bruce Fields on Aug 26, 2013

After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: size-checking cleanup · a8095f7e

Authored by J. Bruce Fields on Mar 11, 2014

Better variable name, some comments, etc.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: remove redundant encode buffer size checking · ea8d7720

Authored by J. Bruce Fields on Mar 08, 2014

Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: teach encoders to handle reserve_space failures · d0a381dd

Authored by J. Bruce Fields on Jan 30, 2014

We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
	- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
	  more protocol.
	- BUG_ON or page faulting on failure seems overly fragile.
	- Especially in the 4.1 case, we prefer not to fail compounds
	  just because the returned result came *close* to session
	  limits.  (Though perfect enforcement here may be difficult.)
	- I'd prefer encoding to be uniform for all encoders instead of
	  having special exceptions for encoders containing, for
	  example, attributes.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

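The pattern argued for in the list above, where every encoder checks the result of a space reservation instead of relying on slack space or a BUG_ON, might look like the following user-space sketch. The stream type and function names are hypothetical simplifications and are not the kernel's `xdr_stream` API; the error value is NFS4ERR_RESOURCE as numbered in RFC 7530, chosen here as an assumption about what an encoder would report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-simplified encode stream: a cursor and a limit. */
struct sketch_stream {
	uint8_t *p;
	uint8_t *end;
};

enum {
	SKETCH_OK           = 0,
	SKETCH_ERR_RESOURCE = 10018, /* NFS4ERR_RESOURCE in RFC 7530 */
};

/* Hand out nbytes of buffer space, or NULL if they do not fit. */
uint8_t *sketch_reserve(struct sketch_stream *xdr, size_t nbytes)
{
	uint8_t *ret;

	if ((size_t)(xdr->end - xdr->p) < nbytes)
		return NULL;
	ret = xdr->p;
	xdr->p += nbytes;
	return ret;
}

/* An encoder that turns a failed reservation into an ordinary error. */
int sketch_encode_u32(struct sketch_stream *xdr, uint32_t val)
{
	uint8_t *p = sketch_reserve(xdr, 4);

	if (!p)
		return SKETCH_ERR_RESOURCE;
	p[0] = (uint8_t)(val >> 24); /* big-endian, as XDR requires */
	p[1] = (uint8_t)(val >> 16);
	p[2] = (uint8_t)(val >> 8);
	p[3] = (uint8_t)val;
	return SKETCH_OK;
}
```

Because the failure path is the same for every encoder, no operation needs its own special-case size check.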

29 May, 2014 3 commits

nfsd4: keep xdr buf length updated · 6ac90391

Authored by J. Bruce Fields on Feb 26, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: use xdr_stream throughout compound encoding · d3f627c8

Authored by J. Bruce Fields on Feb 26, 2014

Note this makes ADJUST_ARGS useless; we'll remove it in the following
patch.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: use xdr_reserve_space in attribute encoding · ddd1ea56

Authored by J. Bruce Fields on Aug 27, 2013

This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us, we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up as necessary after the fact by
read-like operations and by nfs4svc_encode_compoundres.  However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
head.

When the attribute encoding fails partway through we have to undo the
length adjustments made so far.  We do it manually for now, but later
patches will add an xdr_truncate_encode() helper to handle cases like
this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

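The "undo the length adjustments made so far" step can be illustrated with a toy buffer: remember the length at entry and restore it if encoding fails partway through. Everything here (the buffer type, its 16-byte capacity, the function names) is hypothetical; the real code works on xdr_buf and iovec lengths.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size reply buffer with a running length. */
struct sketch_buf {
	uint8_t data[16];
	size_t len;
};

static int sketch_put_byte(struct sketch_buf *b, uint8_t v)
{
	if (b->len >= sizeof(b->data))
		return -1; /* out of space */
	b->data[b->len++] = v;
	return 0;
}

/* Encode all n attribute bytes or none of them. */
int sketch_encode_attrs(struct sketch_buf *b, const uint8_t *attrs, size_t n)
{
	size_t start = b->len; /* remembered so a failure can be undone */
	size_t i;

	for (i = 0; i < n; i++) {
		if (sketch_put_byte(b, attrs[i]) < 0) {
			b->len = start; /* manual undo of the partial encode */
			return -1;
		}
	}
	return 0;
}
```

A truncation helper would fold the save-and-restore into one call, which is what the message says later patches add.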

27 May, 2014 1 commit

nfsd4: fix encoding of out-of-space replies · 07d1f802

Authored by J. Bruce Fields on Mar 06, 2014

If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

23 May, 2014 8 commits

nfsd4: reserve head space for krb5 integ/priv info · 1802a678

Authored by J. Bruce Fields on Jan 21, 2014

Currently if the nfs-level part of a reply would be too large, we'll
return an error to the client.  But if the nfs-level part fits and
leaves no room for krb5p or krb5i stuff, then we just drop the request
entirely.

That's no good.  Instead, reserve some slack space at the end of the
buffer and make sure we fail outright if we'd come close.

The slack space here is a massive overestimate of what's required; we
should probably try for a tighter limit at some point.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: move proc_compound xdr encode init to helper · 2d124dfa

Authored by J. Bruce Fields on Jan 15, 2014

Mechanical transformation with no change of behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: tweak nfsd4_encode_getattr to take xdr_stream · d5184658

Authored by J. Bruce Fields on Aug 26, 2013

Just change the nfsd4_encode_getattr api.  Not changing any code or
adding any new functionality yet.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: embed xdr_stream in nfsd4_compoundres · 4aea24b2

Authored by J. Bruce Fields on Jan 15, 2014

This is a mechanical transformation with no change in behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: decoding errors can still be cached and require space · e372ba60

Authored by J. Bruce Fields on May 19, 2014

Currently a non-idempotent op reply may be cached if it fails in the
proc code but not if it fails at xdr decoding.  I doubt there are any
xdr-decoding-time errors that would make this a problem in practice, so
this probably isn't a serious bug.

The space estimates should also take into account space required for
encoding of error returns.  Again, not a practical problem, though it
would become one after future patches which will tighten the space
estimates.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: fix write reply size estimate · f34e432b

Authored by J. Bruce Fields on May 16, 2014

The write reply also includes count and stable_how.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

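As a worked example of the arithmetic, here is one way to total a WRITE reply in 4-byte XDR words. It assumes the NFSv4 WRITE result layout from RFC 7530 (count, committed/stable_how, 8-byte write verifier) preceded by a two-word per-operation header (opcode and status). The constant and function names are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Sizes in 4-byte XDR words; names are illustrative only. */
enum {
	SKETCH_OP_HDR_WORDS     = 2, /* opcode + status */
	SKETCH_COUNT_WORDS      = 1, /* bytes written */
	SKETCH_STABLE_HOW_WORDS = 1, /* how the data was committed */
	SKETCH_VERIFIER_WORDS   = 2, /* 8-byte write verifier */
};

/* Estimated size in bytes of an encoded WRITE reply. */
size_t sketch_write_reply_size(void)
{
	return (SKETCH_OP_HDR_WORDS + SKETCH_COUNT_WORDS +
		SKETCH_STABLE_HOW_WORDS + SKETCH_VERIFIER_WORDS) *
	       sizeof(uint32_t);
}
```

Under these assumptions, leaving out count and stable_how would make the estimate two words (8 bytes) too small.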

nfsd4: read size estimate should include padding · 622f560e

Authored by J. Bruce Fields on May 16, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: READ, READDIR, etc., are idempotent · 5b648699

Authored by J. Bruce Fields on Mar 07, 2014

OP_MODIFIES_SOMETHING flags operations that we should be careful not to
initiate without being sure we have the buffer space to encode a reply.

None of these ops fall into that category.

We could probably remove a few more, but this isn't a very important
problem at least for ops whose reply size is easy to estimate.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

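The role of a flag like OP_MODIFIES_SOMETHING can be sketched as a gate applied before an operation starts: only operations that change state must be sure in advance that their reply will fit. The struct, the flag value, and the function below are hypothetical simplifications, not the kernel's operation table.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_OP_MODIFIES_SOMETHING 0x1u

/* Hypothetical per-operation descriptor. */
struct sketch_op {
	const char *name;
	unsigned int flags;
	size_t max_reply; /* estimated worst-case reply size in bytes */
};

/* Decide whether an operation may be started with space_left bytes of
 * reply buffer remaining. */
bool sketch_may_start(const struct sketch_op *op, size_t space_left)
{
	/* Idempotent ops can simply be attempted: if the reply does not
	 * fit, an error is returned and nothing needs to be undone. */
	if (!(op->flags & SKETCH_OP_MODIFIES_SOMETHING))
		return true;
	/* State-changing ops must not run unless the reply can be encoded. */
	return space_left >= op->max_reply;
}
```

Dropping the flag from READ-like operations therefore only removes an up-front check; it does not change how a too-large reply is reported.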

07 May, 2014 1 commit

NFSd: Clean up nfs4_preprocess_stateid_op · 14bcab1a

Authored by Trond Myklebust on Apr 18, 2014

Move the state locking and file descriptor reference out from the
callers and into nfs4_preprocess_stateid_op() itself.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

30 Mar, 2014 1 commit

NFSD: Clear wcc data between compound ops · 2336745e

Authored by Kinglong Mee on Mar 29, 2014

While testing NFSv4.0 with pynfs, I got messages like
"nfsd: inode locked twice during operation."

When one compound RPC contains two or more ops that lock the
filehandle, the second op triggers the message.

Take two SETATTR ops as an example: after the first SETATTR, nfsd does
not call fh_put() to release the current filehandle, so the filehandle
is left unlocked with fh_post_saved = 1.
The second SETATTR finds fh_post_saved = 1 and printks the message.

v2: introduce helper fh_clear_wcc().
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

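A simplified model of the situation may help. It assumes a filehandle that carries two "saved" flags and a lock step that complains when stale post-operation state is still set; the struct and function names are hypothetical, and only the flag name fh_post_saved and the helper name fh_clear_wcc come from the message above.

```c
#include <stdbool.h>

/* Hypothetical cut-down filehandle: just the two wcc bookkeeping flags. */
struct sketch_fh {
	bool fh_pre_saved;
	bool fh_post_saved;
};

/* Models locking for an op. Returns false where the real code would
 * print "inode locked twice during operation". */
bool sketch_fh_lock(struct sketch_fh *fhp)
{
	if (fhp->fh_post_saved)
		return false;
	fhp->fh_pre_saved = true;
	return true;
}

/* Models unlocking at the end of an op: post-op attributes get saved. */
void sketch_fh_unlock(struct sketch_fh *fhp)
{
	fhp->fh_post_saved = true;
}

/* Sketch of the helper idea: reset wcc state between compound ops. */
void sketch_fh_clear_wcc(struct sketch_fh *fhp)
{
	fhp->fh_post_saved = false;
	fhp->fh_pre_saved = false;
}
```

In this model, calling the clear helper between two SETATTR ops on the same filehandle lets the second one lock without tripping the check.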

openanolis / cloud-kernel: synced successfully about 1 year ago