Commits · 4f0cefbf389c28b0a2be34960797adb0c84ee43d · openeuler / Kernel

31 May, 2014 12 commits

nfsd4: more precise nfsd4_max_reply · 4f0cefbf

Authored by J. Bruce Fields on Mar 11, 2014

It will turn out to be useful to have a more accurate estimate of reply
size; so, piggyback on the existing op reply-size estimators.

Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct
nfsd4_operation and friends.  (Thanks to Christoph Hellwig for pointing
out that simplification.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

4f0cefbf

nfsd4: don't try to encode conflicting owner if low on space · 8c7424cf

Authored by J. Bruce Fields on Mar 10, 2014

I ran into this corner case in testing: in theory clients can provide
state owners up to 1024 bytes long.  In the sessions case there might be
a risk of this pushing us over the DRC slot size.

The conflicting owner isn't really that important, so let's humor a
client that provides a small maxresponsesize_cached by allowing ourselves
to return without the conflicting owner instead of outright failing the
operation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

8c7424cf

nfsd4: convert 4.1 replay encoding · f5236013

Authored by J. Bruce Fields on Mar 21, 2014

Limits on maxresp_sz mean that we only ever need to replay rpc's that
are contained entirely in the head.

The one exception is very small zero-copy reads.  That's an odd corner
case as clients wouldn't normally ask those to be cached.

In any case, this seems a little more robust.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f5236013

nfsd4: allow encoding across page boundaries · 2825a7f9

Authored by J. Bruce Fields on Aug 26, 2013

After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

2825a7f9

nfsd4: size-checking cleanup · a8095f7e

Authored by J. Bruce Fields on Mar 11, 2014

Better variable name, some comments, etc.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

a8095f7e

nfsd4: remove redundant encode buffer size checking · ea8d7720

Authored by J. Bruce Fields on Mar 08, 2014

Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ea8d7720

nfsd4: nfsd4_check_resp_size needn't recalculate length · 67492c99

Authored by J. Bruce Fields on Mar 08, 2014

We're keeping the length updated as we go now, so there's no need for
the extra calculation here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

67492c99

nfsd4: reserve space before inlining 0-copy pages · 4e21ac4b

Authored by J. Bruce Fields on Mar 22, 2014

Once we've included page-cache pages in the encoding it's difficult to
remove them and restart encoding.  (xdr_truncate_encode doesn't handle
that case.)  So, make sure we'll have adequate space to finish the
operation first.

For now COMPOUND_SLACK_SPACE checks should prevent this case happening,
but we want to remove those checks.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

4e21ac4b

nfsd4: teach encoders to handle reserve_space failures · d0a381dd

Authored by J. Bruce Fields on Jan 30, 2014

We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
	- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
	  more protocol.
	- BUG_ON or page faulting on failure seems overly fragile.
	- Especially in the 4.1 case, we prefer not to fail compounds
	  just because the returned result came *close* to session
	  limits.  (Though perfect enforcement here may be difficult.)
	- I'd prefer encoding to be uniform for all encoders instead of
	  having special exceptions for encoders containing, for
	  example, attributes.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d0a381dd

nfsd4: "backfill" using write_bytes_to_xdr_buf · 082d4bd7

Authored by J. Bruce Fields on Aug 29, 2013

Normally xdr encoding proceeds in a single pass from start of a buffer
to end, but sometimes we have to write a few bytes to an earlier
position.

Use write_bytes_to_xdr_buf for these cases rather than saving a pointer
to write to.  We plan to rewrite xdr_reserve_space to handle encoding
across page boundaries using a scratch buffer, and don't want to risk
writing to a pointer that was contained in a scratch buffer.

Also it will no longer be safe to calculate lengths by subtracting two
pointers, so use xdr_buf offsets instead.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

082d4bd7
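
The reasoning in the "backfill" commit above (remember an offset rather than a pointer, and compute lengths from offsets) can be illustrated outside the kernel. The following is only a minimal sketch under stated assumptions: `struct toy_buf`, `toy_append`, `toy_write_at` and `toy_encode_counted` are hypothetical names invented for this example, and a `realloc`-backed buffer stands in for the scratch-buffer case the message describes. It is not the kernel's `write_bytes_to_xdr_buf` implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; its storage may move whenever it grows. */
struct toy_buf {
    unsigned char *data;
    size_t len;
    size_t cap;
};

/* Append n bytes, growing (and possibly relocating) the storage. */
static int toy_append(struct toy_buf *b, const void *src, size_t n)
{
    if (n > b->cap - b->len) {
        size_t ncap = (b->len + n) * 2;
        unsigned char *p = realloc(b->data, ncap);

        if (!p)
            return -1;
        b->data = p;
        b->cap = ncap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Overwrite n already-encoded bytes at a saved offset (the "backfill"). */
static int toy_write_at(struct toy_buf *b, size_t off, const void *src, size_t n)
{
    if (off > b->len || n > b->len - off)
        return -1;
    memcpy(b->data + off, src, n);
    return 0;
}

/* Encode a 4-byte big-endian length followed by the payload it counts. */
static int toy_encode_counted(struct toy_buf *b, const char *payload)
{
    static const unsigned char zero[4];
    unsigned char be[4];
    size_t len_off = b->len;      /* remember an offset, not a pointer */
    size_t start;
    uint32_t n;

    if (toy_append(b, zero, sizeof(zero)))    /* placeholder for the length */
        return -1;
    start = b->len;
    if (toy_append(b, payload, strlen(payload)))
        return -1;
    n = (uint32_t)(b->len - start);           /* length from offsets */
    be[0] = (unsigned char)(n >> 24);
    be[1] = (unsigned char)(n >> 16);
    be[2] = (unsigned char)(n >> 8);
    be[3] = (unsigned char)n;
    return toy_write_at(b, len_off, be, sizeof(be));
}
```

A pointer saved before the second `toy_append` call could be left dangling by `realloc`; the saved offset stays valid wherever the storage ends up.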

nfsd4: use xdr_truncate_encode · 1fcea5b2

Authored by J. Bruce Fields on Feb 26, 2014

Now that lengths are reliable, we can use xdr_truncate_encode instead of
open-coding it everywhere.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

1fcea5b2

rpc: xdr_truncate_encode · 3e19ce76

Authored by J. Bruce Fields on Feb 25, 2014

This will be used in the server side in a few cases:
	- when certain operations (read, readdir, readlink) fail after
	  encoding a partial response.
	- when we run out of space after encoding a partial response.
	- in readlink, where we initially reserve PAGE_SIZE bytes for
	  data, then truncate to the actual size.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

3e19ce76
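
The readlink case listed in the xdr_truncate_encode commit above (reserve the maximum, then truncate to the actual size) can be sketched with a toy stream. This is a hedged illustration only: `struct toy_stream`, `toy_reserve`, `toy_truncate` and `toy_encode_link` are hypothetical names, and the flat buffer here ignores the pages and tail that the real `struct xdr_buf` has to deal with.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encode stream; not the kernel's xdr_stream. */
struct toy_stream {
    unsigned char *base;   /* start of the buffer */
    size_t len;            /* bytes encoded so far */
    size_t cap;            /* total buffer capacity */
};

/* Reserve nbytes at the end of the stream; return NULL when out of space. */
static unsigned char *toy_reserve(struct toy_stream *s, size_t nbytes)
{
    unsigned char *p;

    if (nbytes > s->cap - s->len)
        return NULL;
    p = s->base + s->len;
    s->len += nbytes;
    return p;
}

/* Drop everything encoded past new_len; a no-op if new_len is not shorter. */
static void toy_truncate(struct toy_stream *s, size_t new_len)
{
    if (new_len < s->len)
        s->len = new_len;
}

/* Readlink-style encode: reserve the maximum, then trim to the actual size. */
static int toy_encode_link(struct toy_stream *s, const char *target, size_t max)
{
    size_t start = s->len;
    size_t n = strlen(target);
    unsigned char *p = toy_reserve(s, max);

    if (!p)
        return -1;                /* out of space, nothing was encoded */
    if (n > max) {
        toy_truncate(s, start);   /* failure: undo the partial response */
        return -1;
    }
    memcpy(p, target, n);
    toy_truncate(s, start + n);   /* success: keep only what was used */
    return 0;
}
```

The same truncate helper serves both the failure path and the trim-to-size path, which is the kind of reuse the commit message lists.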

29 May, 2014 5 commits

nfsd4: keep xdr buf length updated · 6ac90391

Authored by J. Bruce Fields on Feb 26, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

6ac90391

nfsd4: no need for encode_compoundres to adjust lengths · dd97fdde

Authored by J. Bruce Fields on Feb 26, 2014

xdr_reserve_space should now be calculating the length correctly as we
go, so there's no longer any need to fix it up here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

dd97fdde

nfsd4: remove ADJUST_ARGS · f46d382a

Authored by J. Bruce Fields on Jan 31, 2014

It's just uninteresting debugging code at this point.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f46d382a

nfsd4: use xdr_stream throughout compound encoding · d3f627c8

Authored by J. Bruce Fields on Feb 26, 2014

Note this makes ADJUST_ARGS useless; we'll remove it in the following
patch.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d3f627c8

nfsd4: use xdr_reserve_space in attribute encoding · ddd1ea56

Authored by J. Bruce Fields on Aug 27, 2013

This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us, we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up as necessary after the fact by
read-like operations and by nfs4svc_encode_compoundres.  However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
head.

When the attribute encoding fails partway through we have to undo the
length adjustments made so far.  We do it manually for now, but later
patches will add an xdr_truncate_encode() helper to handle cases like
this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ddd1ea56

27 May, 2014 2 commits

nfsd4: allow space for final error return · 5f4ab945

Authored by J. Bruce Fields on Mar 07, 2014

This post-encoding check should be taking into account the need to
encode at least an out-of-space error to the following op (if any).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5f4ab945

nfsd4: fix encoding of out-of-space replies · 07d1f802

Authored by J. Bruce Fields on Mar 06, 2014

If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

07d1f802

23 May, 2014 16 commits

nfsd4: reserve head space for krb5 integ/priv info · 1802a678

Authored by J. Bruce Fields on Jan 21, 2014

Currently if the nfs-level part of a reply would be too large, we'll
return an error to the client.  But if the nfs-level part fits and
leaves no room for krb5p or krb5i stuff, then we just drop the request
entirely.

That's no good.  Instead, reserve some slack space at the end of the
buffer and make sure we fail outright if we'd come close.

The slack space here is a massive overestimate of what's required; we
should probably try for a tighter limit at some point.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

1802a678

nfsd4: move proc_compound xdr encode init to helper · 2d124dfa

Authored by J. Bruce Fields on Jan 15, 2014

Mechanical transformation with no change of behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

2d124dfa

nfsd4: tweak nfsd4_encode_getattr to take xdr_stream · d5184658

Authored by J. Bruce Fields on Aug 26, 2013

Just change the nfsd4_encode_getattr api.  Not changing any code or
adding any new functionality yet.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d5184658

nfsd4: embed xdr_stream in nfsd4_compoundres · 4aea24b2

Authored by J. Bruce Fields on Jan 15, 2014

This is a mechanical transformation with no change in behavior.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

4aea24b2

nfsd4: decoding errors can still be cached and require space · e372ba60

Authored by J. Bruce Fields on May 19, 2014

Currently a non-idempotent op reply may be cached if it fails in the
proc code but not if it fails at xdr decoding.  I doubt there are any
xdr-decoding-time errors that would make this a problem in practice, so
this probably isn't a serious bug.

The space estimates should also take into account space required for
encoding of error returns.  Again, not a practical problem, though it
would become one after future patches which will tighten the space
estimates.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

e372ba60

nfsd4: fix write reply size estimate · f34e432b

Authored by J. Bruce Fields on May 16, 2014

The write reply also includes count and stable_how.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

f34e432b

nfsd4: read size estimate should include padding · 622f560e

Authored by J. Bruce Fields on May 16, 2014

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

622f560e

nfsd4: allow larger 4.1 session drc slots · 24906f32

Authored by J. Bruce Fields on Mar 12, 2014

The client is actually asking for 2532 bytes. I suspect that's a
mistake. But maybe we can allow some more. In theory lock needs more
if it might return a maximum-length lockowner in the denied case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

24906f32

nfsd4: READ, READDIR, etc., are idempotent · 5b648699

Authored by J. Bruce Fields on Mar 07, 2014

OP_MODIFIES_SOMETHING flags operations that we should be careful not to
initiate without being sure we have the buffer space to encode a reply.

None of these ops fall into that category.

We could probably remove a few more, but this isn't a very important
problem at least for ops whose reply size is easy to estimate.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5b648699
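
The distinction drawn in the commit above, where only operations that modify something need a guaranteed reply buffer before they start, might be sketched roughly as follows. Everything here is an assumption made for illustration: `TOY_OP_MODIFIES`, `struct toy_op` and `toy_may_start` are hypothetical names loosely modeled on the OP_MODIFIES_SOMETHING flag, and the actual nfsd check is not reproduced.

```c
#include <stddef.h>

/* Hypothetical flag, loosely modeled on OP_MODIFIES_SOMETHING. */
#define TOY_OP_MODIFIES 0x1u

struct toy_op {
    unsigned int flags;
    size_t max_reply;      /* estimated worst-case reply size in bytes */
};

/*
 * Return 1 if the op may start. An op that changes state must be
 * guaranteed room for its reply up front, because it cannot be undone.
 * An idempotent op may start anyway and fail while encoding.
 */
static int toy_may_start(const struct toy_op *op, size_t space_left)
{
    if (!(op->flags & TOY_OP_MODIFIES))
        return 1;
    return op->max_reply <= space_left;
}
```

In this sketch a READ-like op with a large estimate is still allowed to start with little space left, while a state-changing op is refused unless its estimate fits.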

nfsd: Only set PF_LESS_THROTTLE when really needed. · 8658452e

Authored by NeilBrown on May 12, 2014

PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks
and live-locks while writing to the page cache in a loop-back
NFS mount situation.

It therefore makes sense to *only* set PF_LESS_THROTTLE in this
situation.
We now know when a request came from the local-host so it could be a
loop-back mount.  We already know when we are handling write requests,
and when we are doing anything else.

So combine those two to allow nfsd to still be throttled (like any
other process) in every situation except when it is known to be
problematic.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

8658452e

SUNRPC: track whether a request is coming from a loop-back interface. · ef11ce24

Authored by NeilBrown on May 12, 2014

If an incoming NFS request is coming from the local host, then
nfsd will need to perform some special handling.  So detect that
possibility and make the source visible in rq_local.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

ef11ce24

SUNRPC: Fix a module reference leak in svc_handle_xprt · c789102c

Authored by Trond Myklebust on May 18, 2014

If the accept() call fails, we need to put the module reference.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

c789102c

NFSD: Ignore client's source port on RDMA transports · 16e4d93f

Authored by Chuck Lever on May 19, 2014

An NFS/RDMA client's source port is meaningless for RDMA transports.
The transport layer typically sets the source port value on the
connection to a random ephemeral port.

Currently, NFS server administrators must specify the "insecure"
export option to enable clients to access exports via RDMA.

But this means NFS clients can access such an export via IP using an
ephemeral port, which may not be desirable.

This patch eliminates the need to specify the "insecure" export
option to allow NFS/RDMA clients access to an export.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

16e4d93f

nfsd: remove nfsd4_free_slab · abf1135b

Authored by Christoph Hellwig on May 21, 2014

No need for a kmem_cache_destroy wrapper in nfsd, just do proper
goto based unwinding.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

abf1135b
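
The "goto based unwinding" named in the commit above is a common C idiom for releasing resources in reverse order of acquisition. The following userspace sketch shows the general shape, under loud assumptions: `cache_a`, `cache_b`, `cache_c` and `toy_init_caches` are hypothetical, `malloc`/`free` stand in for slab cache creation and destruction, and the `fail_at` parameter exists only to simulate a failure. It is not the nfsd code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for several caches that must all exist together. */
static void *cache_a, *cache_b, *cache_c;

/* fail_at (1..3) simulates an allocation failure at that step; 0 means none. */
static int toy_init_caches(int fail_at)
{
    cache_a = fail_at == 1 ? NULL : malloc(16);
    if (!cache_a)
        goto out;
    cache_b = fail_at == 2 ? NULL : malloc(16);
    if (!cache_b)
        goto out_free_a;
    cache_c = fail_at == 3 ? NULL : malloc(16);
    if (!cache_c)
        goto out_free_b;
    return 0;

    /* Each label releases exactly what was acquired before the failure. */
out_free_b:
    free(cache_b);
    cache_b = NULL;
out_free_a:
    free(cache_a);
    cache_a = NULL;
out:
    return -1;
}
```

Because each failure jumps to the label matching what has been set up so far, no catch-all cleanup wrapper that checks every pointer is needed.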

nfsd: Remove assignments inside conditions · d40aa337

Authored by Benoit Taine on May 22, 2014

Assignments should not happen inside an if conditional, but in the line
before. This issue was reported by checkpatch.

The semantic patch that makes this change is as follows
(http://coccinelle.lip6.fr/):

// <smpl>

@@
identifier i1;
expression e1;
statement S;
@@
-if(!(i1 = e1)) S
+i1 = e1;
+if(!i1)
+S

// </smpl>

It has been tested by compilation.
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

d40aa337
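
For readers unfamiliar with semantic patches, the transformation in the commit above can be shown on a small C function. This is an illustrative sketch: `alloc_before` and `alloc_after` are hypothetical functions written for this example, not code touched by the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Form flagged by checkpatch: the assignment hides inside the condition. */
static int alloc_before(void **out, size_t n)
{
    void *p;

    if (!(p = malloc(n)))
        return -1;
    *out = p;
    return 0;
}

/* Form produced by the semantic patch: assign first, then test. */
static int alloc_after(void **out, size_t n)
{
    void *p;

    p = malloc(n);
    if (!p)
        return -1;
    *out = p;
    return 0;
}
```

Both functions behave identically; only the layout changes, which is consistent with the commit being tested by compilation alone.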

Merge 3.15 bugfixes for 3.16 · f35ea0d4

Authored by J. Bruce Fields on May 22, 2014

f35ea0d4

22 May, 2014 2 commits

nfsd4: fix delegation cleanup on error · cbf7a75b

Authored by J. Bruce Fields on Mar 03, 2014

We're not cleaning up everything we need to on error.  In particular,
we're not removing our lease.  Among other problems this can cause the
struct nfs4_file used as fl_owner to be referenced after it has been
destroyed.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

cbf7a75b

NFSD: Don't clear SUID/SGID after root writing data · 368fe39b

Authored by Kinglong Mee on Apr 19, 2014

We're clearing the SUID/SGID bits on write by hand in nfsd_vfs_write,
even though the subsequent vfs_writev() call will end up doing this for
us (through file system write methods eventually calling
file_remove_suid(), e.g., from __generic_file_aio_write).

So, remove the redundant nfsd code.

The only change in behavior is when the write is by root, in which case
we previously cleared SUID/SGID, but will now leave it alone.  The new
behavior is the behavior of every filesystem we've checked.

It seems better to be consistent with local filesystem behavior.  And
the security advantage seems limited as root could always restore these
bits by hand if it wanted.

SUID/SGID is not cleared after writing data with (root, local ext4),
   File: ‘test’
   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 803h/2051d      Inode: 1200137     Links: 1
Access: (4777/-rwsrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2014-04-18 21:36:31.016029014 +0800
Modify: 2014-04-18 21:36:31.016029014 +0800
Change: 2014-04-18 21:36:31.026030285 +0800
  Birth: -
   File: ‘test’
   Size: 5               Blocks: 8          IO Block: 4096   regular file
Device: 803h/2051d      Inode: 1200137     Links: 1
Access: (4777/-rwsrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2014-04-18 21:36:31.016029014 +0800
Modify: 2014-04-18 21:36:31.040032065 +0800
Change: 2014-04-18 21:36:31.040032065 +0800
  Birth: -

With no_root_squash, (root, remote ext4), SUID/SGID are cleared,
   File: ‘test’
   Size: 0               Blocks: 0          IO Block: 262144 regular empty file
Device: 24h/36d Inode: 786439      Links: 1
Access: (4777/-rwsrwxrwx)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-18 21:45:32.155805097 +0800
Modify: 2014-04-18 21:45:32.155805097 +0800
Change: 2014-04-18 21:45:32.168806749 +0800
  Birth: -
   File: ‘test’
   Size: 5               Blocks: 8          IO Block: 262144 regular file
Device: 24h/36d Inode: 786439      Links: 1
Access: (0777/-rwxrwxrwx)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-18 21:45:32.155805097 +0800
Modify: 2014-04-18 21:45:32.184808783 +0800
Change: 2014-04-18 21:45:32.184808783 +0800
  Birth: -
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

368fe39b

21 May, 2014 2 commits

nfsd4: warn on finding lockowner without stateid's · 27b11428

Authored by J. Bruce Fields on May 08, 2014

The current code assumes a one-to-one lockowner<->lock stateid
correspondence.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

27b11428

nfsd4: remove lockowner when removing lock stateid · a1b8ff4c

Authored by J. Bruce Fields on May 20, 2014

The nfsv4 state code has always assumed a one-to-one correspondence
between lock stateid's and lockowners even if it appears not to in some
places.

We may actually change that, but for now when FREE_STATEID releases a
lock stateid it also needs to release the parent lockowner.

Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
calls same_lockowner_ino on a lockowner that unexpectedly has an empty
so_stateids list.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

a1b8ff4c

16 May, 2014 1 commit

nfsd4: fix corruption on setting an ACL. · 5513a510

Authored by J. Bruce Fields on May 14, 2014

As of 06f9cc12 "nfsd4: don't create unnecessary mask acl", any
non-trivial ACL will be left with an uninitialized entry, and a trivial
ACL may write one entry beyond what's allocated.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

5513a510
