Commits · 6c342655022d5189c45e4f7ed0cc8048c9ad9815 · openanolis / cloud-kernel

09 Jun, 2018 1 commit

NFSv4: Return NFS4ERR_DELAY when a delegation recall fails due to igrab() · 6c342655

Authored by Trond Myklebust on Jun 07, 2018

If the attempt to recall the delegation fails because the inode is
in the process of being evicted from cache, then use NFS4ERR_DELAY
to ask the server to retry later.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

6c342655

06 Jun, 2018 2 commits

NFSv4.0: Remove transport protocol name from non-UCS client ID · 025bb9f8

Authored by Chuck Lever on Jun 04, 2018

Commit 69dd716c ("NFSv4: Add socket proto argument to
setclientid") (2007) added the transport protocol name to the client
ID string, but the patch description doesn't explain why this was
necessary.

At that time, the only transport protocol name that would have been
used is "tcp" (for both IPv4 and IPv6), resulting in no additional
distinctiveness of the client ID string.

Since there is one client instance, the server should recognize its
state whether the client is connecting via TCP or RDMA. Same client,
same lease.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

025bb9f8

NFSv4.0: Remove cl_ipaddr from non-UCS client ID · 848a4eb2

Authored by Chuck Lever on Jun 04, 2018

It is possible for two distinct clients to have the same cl_ipaddr:

 - if the client admin disables callback with clientaddr=0.0.0.0 on
   more than one client

 - if two clients behind separate NATs use the same private subnet
   number

 - if the client admin specifies the same address via clientaddr=
   mount option (pointing the server at the same NAT box, for
   example)

Because of the way the Linux NFSv4.0 client constructs its client
ID string by default, such clients could interfere with each other's
lease state when mounting the same server:

	scnprintf(str, len, "Linux NFSv4.0 %s/%s %s",
		clp->cl_ipaddr,
		rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR),
		rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_PROTO));

cl_ipaddr is set to the value of the clientaddr= mount option. Two
clients whose addresses are 192.168.3.77 that mount the same server
(whose public IP address is, say, 3.4.5.6) would both generate the
same client ID string when sending a SETCLIENTID:

  Linux NFSv4.0 192.168.3.77/3.4.5.6 tcp

and thus the server would not be able to distinguish the clients'
leases. If both clients are using AUTH_SYS when sending SETCLIENTID
then the server could possibly permit the two clients to interfere
with or purge each other's leases.

To better ensure that Linux's NFSv4.0 client ID strings are distinct
in these cases, remove cl_ipaddr from the client ID string and
replace it with something more likely to be unique. Note that the
replacement looks a lot like the uniform client ID string.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

848a4eb2
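
The collision described in 848a4eb2 can be reproduced in user space. The sketch below is only an illustration: `build_legacy_client_id()` and `legacy_ids_collide()` are hypothetical helpers, plain strings stand in for `clp->cl_ipaddr` and the `rpc_peeraddr2str()` results, and standard `snprintf()` replaces the kernel's `scnprintf()`. Only the format string is taken from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the old default client ID format.
 * The format string is the one quoted in the commit message; the
 * arguments are plain strings instead of kernel state.
 */
static int build_legacy_client_id(char *buf, size_t len,
                                  const char *client_addr,
                                  const char *server_addr,
                                  const char *proto)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Linux NFSv4.0 %s/%s %s",
                    client_addr, server_addr, proto);
}

/* Returns 1 when two clients would present identical ID strings. */
static int legacy_ids_collide(const char *addr_a, const char *addr_b,
                              const char *server_addr)
{
    char id_a[128], id_b[128];

    build_legacy_client_id(id_a, sizeof(id_a), addr_a, server_addr, "tcp");
    build_legacy_client_id(id_b, sizeof(id_b), addr_b, server_addr, "tcp");
    return strcmp(id_a, id_b) == 0;
}
```

Calling `legacy_ids_collide("192.168.3.77", "192.168.3.77", "3.4.5.6")` models two clients behind separate NATs that share a private address: nothing in the string distinguishes them, which is why the commit swaps cl_ipaddr for something more likely to be unique.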

05 Jun, 2018 12 commits

NFSv4: Fix a compiler warning when CONFIG_NFS_V4_1 is undefined · 977294c7

Authored by Trond Myklebust on Jun 05, 2018

Fix a compiler warning:
fs/nfs/nfs4proc.c:910:13: warning: 'nfs4_layoutget_release' defined but not used [-Wunused-function]
 static void nfs4_layoutget_release(void *calldata)
             ^~~~~~~~~~~~~~~~~~~~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

977294c7

NFS: Filter cache invalidation when holding a delegation · 3f0b3cf4

Authored by Trond Myklebust on Jun 03, 2018

If the client holds a delegation, then ensure we filter out attempts
to invalidate the size, owner, group owner, or mode unless we made the
change, in which case, check that NFS_INO_REVAL_FORCED is set by the
caller.

Always filter out attempts to invalidate the change attribute and
size, since we are authoritative for those.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

3f0b3cf4
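
The filtering rule in 3f0b3cf4 amounts to masking bits out of an invalidation request. The sketch below is a loose illustration under stated assumptions: the flag names and values (`INVAL_CHANGE`, `INVAL_SIZE`, `INVAL_OTHER`, `REVAL_FORCED`) and the function `filter_invalidation()` are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical flag bits; not the kernel's NFS_INO_* values. */
#define INVAL_CHANGE  (1u << 0)  /* change attribute */
#define INVAL_SIZE    (1u << 1)  /* file size */
#define INVAL_OTHER   (1u << 2)  /* owner, group owner, mode */
#define REVAL_FORCED  (1u << 3)  /* caller made the change itself */

/*
 * Return the invalidation flags that survive filtering.
 * Without a delegation nothing is filtered.
 */
static unsigned int filter_invalidation(unsigned int flags, int have_delegation)
{
    if (!have_delegation)
        return flags;

    /* Owner/group/mode are only invalidated when the caller forces it. */
    if (!(flags & REVAL_FORCED))
        flags &= ~INVAL_OTHER;

    /* The delegation holder is authoritative for these two. */
    flags &= ~(INVAL_CHANGE | INVAL_SIZE);

    assert(!(flags & (INVAL_CHANGE | INVAL_SIZE)));
    return flags;
}
```

With a delegation held, a request for `INVAL_OTHER | INVAL_SIZE` is filtered down to nothing, while the same request with `REVAL_FORCED` set keeps `INVAL_OTHER`.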

NFS: Ignore NFS_INO_REVAL_FORCED in nfs_check_inode_attributes() · 4ebe83af

Authored by Trond Myklebust on Jun 03, 2018

If we hold a delegation, we should not need to call
nfs_check_inode_attributes() since we already know which attributes
are valid, and which ones may still need revalidation. The state
of the NFS_INO_REVAL_FORCED flag is therefore irrelevant.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

4ebe83af

NFS: Improve caching while holding a delegation · c80d17c5

Authored by Trond Myklebust on Jun 03, 2018

Make sure that the client completely ignores change attribute and size
changes on the server when it holds a delegation.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

c80d17c5

NFS: Fix attribute revalidation · 0b467264

Authored by Trond Myklebust on Jun 03, 2018

Don't mark attributes as invalid just because they have changed. Instead,
for the purposes of adjusting the attribute cache timeout, keep a
separate variable that tracks whether or not a change occurred.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

0b467264

NFS: fix up nfs_setattr_update_inode · 6a97d02d

Authored by Trond Myklebust on Apr 08, 2018

Always try to set the attributes, even if we don't have a valid struct
nfs_fattr.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

6a97d02d

NFSv4: Ensure the inode is clean when we set a delegation · 97c2c17a

Authored by Trond Myklebust on Apr 07, 2018

If there are attributes that are still invalid when we set a delegation,
then we need to set the NFS_INO_REVAL_FORCED flag.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

97c2c17a

NFSv4: Ignore NFS_INO_REVAL_FORCED in nfs4_proc_access · 7c672654

Authored by Trond Myklebust on Jun 04, 2018

If we hold a delegation, we don't need to care about whether or not
the inode attributes are up to date. We know we can cache the results
of this call regardless.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

7c672654

NFSv4: Don't ask for delegated attributes when adding a hard link · 2f28dc38

Authored by Trond Myklebust on Apr 08, 2018

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

2f28dc38

NFSv4: Don't ask for delegated attributes when revalidating the inode · 771734f2

Authored by Trond Myklebust on Apr 07, 2018

Again, when revalidating the inode, we don't need to ask for attributes
for which we are authoritative.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

771734f2

NFS: Pass the inode down to the getattr() callback · a841b54d

Authored by Trond Myklebust on Apr 07, 2018

Allow the getattr() callback to check things like whether or not we hold
a delegation so that it can adjust the attributes that it is asking for.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

a841b54d

NFSv4: Don't request size+change attribute if they are delegated to us · 30846df0

Authored by Trond Myklebust on Apr 07, 2018

When we hold a delegation, we should not need to request attributes such
as the file size or the change attribute. For some servers, avoiding
asking for these unneeded attributes can improve the overall system
performance.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

30846df0

01 Jun, 2018 25 commits

pnfs: Don't release the sequence slot until we've processed layoutget on open · ae55e59d

Authored by Trond Myklebust on May 22, 2018

If the server recalls the layout that was just handed out, we risk hitting
a race as described in RFC5661 Section 2.10.6.3 unless we ensure that we
release the sequence slot after processing the LAYOUTGET operation that
was sent as part of the OPEN compound.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

ae55e59d

pnfs: Don't call commit on failed layoutget-on-open · 32f1c28f

Authored by Trond Myklebust on May 22, 2018

If the layoutget on open call failed, we can't really commit the inode,
so don't bother calling it.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

32f1c28f

pNFS: Don't send LAYOUTGET on OPEN for read, if we already have cached data · 64294b08

Authored by Trond Myklebust on Feb 02, 2017

If we're only opening the file for reading, and the file is empty and/or
we already have cached data, then heuristically optimise away the
LAYOUTGET.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

64294b08

NFSv4/pnfs: Don't switch off layoutget-on-open for transient errors · 8dc96566

Authored by Trond Myklebust on Feb 01, 2017

Ensure that we only switch off the LAYOUTGET operation in the OPEN
compound when the server is truly broken, and/or it is complaining
that the compound is too large.

Currently, we end up turning off the functionality permanently,
even for transient errors such as EACCES or ENOSPC.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

8dc96566

NFSv4/pnfs: Ensure pnfs_parse_lgopen() won't try to parse uninitialised data · d49e0d5b

Authored by Trond Myklebust on Feb 01, 2017

We need to ensure that pnfs_parse_lgopen() doesn't try to parse a
struct nfs4_layoutget_res that was not filled by a successful call
to decode_layoutget(). This can happen if we performed a cached open,
or if either the OP_ACCESS or OP_GETATTR operations preceding the
OP_LAYOUTGET in the compound returned an error.

By initialising the 'status' field to NFS4ERR_DELAY, we ensure that
pnfs_parse_lgopen() won't try to interpret the structure.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

d49e0d5b
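
The guard in d49e0d5b is a sentinel pattern: the result structure starts out in an error state, and only a successful decode clears it. The sketch below is a reduced illustration, not the kernel code. `struct lg_res` and the three helpers are hypothetical stand-ins for struct nfs4_layoutget_res, decode_layoutget() and pnfs_parse_lgopen(), and the sign convention the kernel uses for the stored status is not taken from the source.

```c
#include <assert.h>
#include <stddef.h>

#define NFS4ERR_DELAY 10008  /* numeric value from the NFSv4 protocol */

/* Hypothetical, heavily reduced stand-in for struct nfs4_layoutget_res. */
struct lg_res {
    int status;      /* 0 only after a successful decode */
    int layout_len;  /* meaningful only when status == 0 */
};

/* Mark the result as "not decoded" before the OPEN compound is sent. */
static void lg_res_init(struct lg_res *res)
{
    assert(res != NULL);
    res->status = NFS4ERR_DELAY;
    res->layout_len = 0;
}

/* Hypothetical decoder: only a successful decode clears the sentinel. */
static void lg_res_decode_ok(struct lg_res *res, int layout_len)
{
    assert(res != NULL);
    res->layout_len = layout_len;
    res->status = 0;
}

/* Hypothetical parser: returns the layout length, or -1 if nothing was decoded. */
static int lg_res_parse(const struct lg_res *res)
{
    assert(res != NULL);
    if (res->status != 0)
        return -1;
    return res->layout_len;
}
```

If the decoder never runs (a cached open, or an earlier operation in the compound failing), `lg_res_parse()` sees the sentinel and refuses to read `layout_len`.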

pnfs: Fix manipulation of NFS_LAYOUT_FIRST_LAYOUTGET · 30ae2412

Authored by Fred Isaman on Oct 18, 2016

The flag was not always being cleared after LAYOUTGET on OPEN.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

30ae2412

pnfs: Add barrier to prevent lgopen using LAYOUTGET during recall · c49b5209

Authored by Fred Isaman on Oct 05, 2016

Since the LAYOUTGET on OPEN can be sent without prior inode information,
existing methods to prevent LAYOUTGET from being sent while processing
CB_LAYOUTRECALL don't work. Track if a recall occurred while LAYOUTGET
was being sent, and if so ignore the results.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

c49b5209

pnfs: Stop attempting LAYOUTGET on OPEN on failure · 6e01260c

Authored by Fred Isaman on Oct 04, 2016

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

6e01260c

pnfs: Add LAYOUTGET to OPEN of an existing file · 78746a38

Authored by Fred Isaman on Sep 22, 2016

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

78746a38

pNFS: Refactor nfs4_layoutget_release() · 29a8bfe5

Authored by Trond Myklebust on May 30, 2018

Move the actual freeing of the struct nfs4_layoutget into fs/nfs/pnfs.c
where it can be reused by the layoutget on open code.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

29a8bfe5

pnfs: Add LAYOUTGET to OPEN of a new file · 2409a976

Authored by Fred Isaman on Oct 06, 2016

This triggers when we have no pre-existing inode to attach to.
The pre-existing case is saved for later.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

2409a976

pnfs: Change pnfs_alloc_init_layoutget_args call signature · 5e36e2a9

Authored by Fred Isaman on Oct 06, 2016

Don't send in a layout, instead use the (possibly NULL) inode.

This is needed for LAYOUTGET attached to an OPEN where the inode is not
yet set.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

5e36e2a9

pnfs: Move nfs4_opendata into nfs4_fs.h · 1b146fcf

Authored by Fred Isaman on Sep 21, 2016

It will be needed now by the pnfs code.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

1b146fcf

pnfs: Add conditional encode/decode of LAYOUTGET within OPEN compound · 56f487f8

Authored by Fred Isaman on Sep 21, 2016

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

56f487f8

pnfs: move allocations out of nfs4_proc_layoutget · dacb452d

Authored by Fred Isaman on Sep 19, 2016

They work better in the new alloc_init function.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

dacb452d

pnfs: refactor send_layoutget · 587f03de

Authored by Fred Isaman on Sep 21, 2016

Pull out the alloc/init part for eventual reuse by OPEN.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

587f03de

pnfs: Add layout driver flag PNFS_LAYOUTGET_ON_OPEN · f86c3ac5

Authored by Fred Isaman on Sep 20, 2016

A driver can set this flag to allow LAYOUTGET to be sent with OPEN.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

f86c3ac5

NFS4: move ctx into nfs4_run_open_task · 3b65a30d

Authored by Fred Isaman on Sep 19, 2016

In preparation for adding a conditional LAYOUTGET to the OPEN RPC; the
LAYOUTGET will need the ctx info.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

3b65a30d

pnfs: Store return value of decode_layoutget for later processing · 808ba32a

Authored by Fred Isaman on Oct 04, 2016

This will be needed to separate the return values of OPEN and LAYOUTGET
when they are combined into a single RPC.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

808ba32a

pnfs: Remove redundant assignment from nfs4_proc_layoutget(). · 34ec9aac

Authored by Fred Isaman on Sep 20, 2016

nfs_init_sequence() will clear this for us.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

34ec9aac

NFSv4: Don't add a new lock on an interrupted wait for LOCK · a3cf9bca

Authored by Benjamin Coddington on May 03, 2018

If the wait for a LOCK operation is interrupted, and then the file is
closed, the locks cleanup code will assume that no new locks will be added
to the inode after it has completed.  We already have a mechanism to detect
if there was a signal, so let's use that to avoid recreating the local lock
once the RPC completes.  Also skip re-sending the LOCK operation for the
various error cases if we were signaled.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
[Trond: Fix inverted test of locks_lock_inode_wait()]
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

a3cf9bca

NFSv4: Always clear the pNFS layout when handling ESTALE · cf61eb26

Authored by Trond Myklebust on May 29, 2018

If we get an ESTALE error in response to an RPC call operating on the
file on the MDS, we should immediately cancel the layout for that file.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

cf61eb26

NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message · d6889480

Authored by Dave Wysochanski on May 29, 2018

In nfs_idmap_read_and_verify_message there is an incorrect sprintf '%d'
that converts the __u32 'im_id' from struct idmap_msg to 'id_str', which
is a stack char array variable of length NFS_UINT_MAXLEN == 11.
If a uid or gid value is > 2147483647 = 0x7fffffff, the conversion
overflows into a negative value, for example:

  crash> p (unsigned) (0x80000000)
  $1 = 2147483648
  crash> p (signed) (0x80000000)
  $2 = -2147483648

The '-' sign is written to the buffer and this causes a 1 byte overflow
when the NUL byte is written, which corrupts kernel stack memory.  If
CONFIG_CC_STACKPROTECTOR_STRONG is set we see a stack-protector panic:

[11558053.616565] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa05b8a8c
[11558053.639063] CPU: 6 PID: 9423 Comm: rpc.idmapd Tainted: G        W      ------------ T 3.10.0-514.el7.x86_64 #1
[11558053.641990] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014
[11558053.644462]  ffffffff818c7bc0 00000000b1f3aec1 ffff880de0f9bd48 ffffffff81685eac
[11558053.646430]  ffff880de0f9bdc8 ffffffff8167f2b3 ffffffff00000010 ffff880de0f9bdd8
[11558053.648313]  ffff880de0f9bd78 00000000b1f3aec1 ffffffff811dcb03 ffffffffa05b8a8c
[11558053.650107] Call Trace:
[11558053.651347]  [<ffffffff81685eac>] dump_stack+0x19/0x1b
[11558053.653013]  [<ffffffff8167f2b3>] panic+0xe3/0x1f2
[11558053.666240]  [<ffffffff811dcb03>] ? kfree+0x103/0x140
[11558053.682589]  [<ffffffffa05b8a8c>] ? idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.689710]  [<ffffffff810855db>] __stack_chk_fail+0x1b/0x30
[11558053.691619]  [<ffffffffa05b8a8c>] idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.693867]  [<ffffffffa00209d6>] rpc_pipe_write+0x56/0x70 [sunrpc]
[11558053.695763]  [<ffffffff811fe12d>] vfs_write+0xbd/0x1e0
[11558053.702236]  [<ffffffff810acccc>] ? task_work_run+0xac/0xe0
[11558053.704215]  [<ffffffff811fec4f>] SyS_write+0x7f/0xe0
[11558053.709674]  [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b

Fix this by calling the internally defined nfs_map_numeric_to_string()
function which properly uses '%u' to convert this __u32.  For consistency,
also replace the one other place where snprintf is called.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Stephen Johnston <sjohnsto@redhat.com>
Fixes: cf4ab538 ("NFSv4: Fix the string length returned by the idmapper")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

d6889480
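
The arithmetic behind the 1-byte overflow in d6889480 can be checked in user space. The sketch below is an illustration only: `id_len_signed()`, `id_len_unsigned()` and `fits()` are hypothetical helpers, and `snprintf(NULL, 0, ...)` is used purely to measure the formatted length without writing anywhere. Converting an out-of-range unsigned value to `int` is implementation-defined in C; the comments assume the usual two's-complement result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NFS_UINT_MAXLEN 11  /* buffer size quoted in the commit message */

/* Characters needed (excluding the NUL) when the id is printed with "%d". */
static int id_len_signed(uint32_t id)
{
    /* On two's-complement targets 0x80000000 becomes -2147483648. */
    return snprintf(NULL, 0, "%d", (int)id);
}

/* Characters needed (excluding the NUL) when the id is printed with "%u". */
static int id_len_unsigned(uint32_t id)
{
    return snprintf(NULL, 0, "%u", (unsigned int)id);
}

/* Returns 1 if the text plus its NUL terminator fits in NFS_UINT_MAXLEN bytes. */
static int fits(int len)
{
    assert(len >= 0);
    return len + 1 <= NFS_UINT_MAXLEN;
}
```

For an id of 0x80000000 the signed form needs 11 characters plus the terminator, one byte more than the buffer holds, while the unsigned form never exceeds 10 characters for any 32-bit value.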

NFS: Fix up nfs_post_op_update_inode() to force ctime updates · d554168f

Authored by Trond Myklebust on May 29, 2018

We do not want to ignore ctime updates that originate from functions
such as link().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

d554168f

NFS: Ensure we revalidate the inode correctly after setacl · 472f761e

Authored by Trond Myklebust on Apr 08, 2018

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

472f761e

openanolis / cloud-kernel synced successfully more than 1 year ago
