Commits · 5283b03ee5cd28d516646298bead09b238d92ddc · openeuler / Kernel

25 Feb, 2017 3 commits

nfs/nfsd/sunrpc: enforce transport requirements for NFSv4 · 5283b03e

Authored by Jeff Layton on Feb 24, 2017

NFSv4 requires a transport "that is specified to avoid network
congestion" (RFC 7530, section 3.1, paragraph 2).  In practical terms,
that means that you should not run NFSv4 over UDP. The server has never
enforced that requirement, however.

This patchset fixes this by adding a new flag to the svc_version that
states that it has these transport requirements. With that, we can check
that the transport has XPT_CONG_CTRL set before processing an RPC. If it
doesn't, we reject it with RPC_PROG_MISMATCH.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
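
The check described in this commit message can be pictured with a small user-space sketch. Every name below (the `sk_` types, the flag bit, the status codes) is a hypothetical stand-in chosen for illustration, not the kernel's actual definitions; only the idea of a per-version flag tested against a transport flag comes from the message.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RPC accept/reject outcomes. */
enum sk_status { SK_ACCEPT, SK_PROG_MISMATCH };

/* Hypothetical flag bit: the transport provides congestion control. */
#define SK_XPT_CONG_CTRL (1u << 0)

struct sk_version {
    unsigned int number;
    bool need_cong_ctrl;   /* version requires a congestion-controlled transport */
};

struct sk_xprt {
    unsigned int flags;
};

/* Reject the request when the version needs congestion control
 * and the transport does not advertise it. */
static enum sk_status sk_check_transport(const struct sk_version *vers,
                                         const struct sk_xprt *xprt)
{
    if (vers->need_cong_ctrl && !(xprt->flags & SK_XPT_CONG_CTRL))
        return SK_PROG_MISMATCH;
    return SK_ACCEPT;
}
```

In this sketch an NFSv4-like version over a UDP-like transport is rejected, while a version without the requirement is still accepted on the same transport.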

sunrpc: turn bitfield flags in svc_version into bools · 05a45a2d

Authored by Jeff Layton on Feb 24, 2017

It's just simpler to read this way, IMO. Also, no need to explicitly
set vs_hidden to false in the nfsacl ones.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: remove superfluous KERN_INFO · 4ab495bf

Authored by Rasmus Villemoes on Feb 24, 2017

dprintk already provides a KERN_* prefix; this KERN_INFO just shows up
as some odd characters in the output.

Simplify the message a bit while we're there.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

21 Feb, 2017 2 commits

nfsd: special case truncates some more · 783112f7

Authored by Christoph Hellwig on Feb 20, 2017

Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributes to set various file attributes including the
file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates don't update random other attributes, and many other file
systems handle the case but do not update the other attributes in the
same transaction.  NFSD on the other hand passes the attributes it gets
on the wire more or less directly through to the VFS, leading to updates
the file systems don't expect.  XFS at least has an assert on the
allowed attributes, which caught an unusual NFS client setting the size
and group at the same time.

To handle this issue properly this splits the notify_change call in
nfsd_setattr into two separate ones.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
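
A simplified user-space sketch of the split described above: a combined request is turned into one size-only call followed by one call for the remaining attributes. The attribute bits and the `sk_notify_change` recorder are hypothetical stand-ins, and the real patch may carry additional attributes (such as timestamps) with the size change.

```c
#include <stddef.h>

/* Hypothetical attribute bits, for illustration only. */
#define SK_ATTR_SIZE (1u << 0)
#define SK_ATTR_UID  (1u << 1)
#define SK_ATTR_GID  (1u << 2)

/* Records the attribute mask passed to each stand-in VFS call. */
struct sk_call_log {
    unsigned int masks[2];
    size_t ncalls;
};

/* Stand-in for the VFS entry point: just remember what was asked. */
static void sk_notify_change(struct sk_call_log *log, unsigned int mask)
{
    log->masks[log->ncalls++] = mask;
}

/* Split one wire request into a size-only update and "everything else". */
static void sk_setattr(struct sk_call_log *log, unsigned int mask)
{
    if (mask & SK_ATTR_SIZE) {
        /* The size change goes down alone, as a local truncate would. */
        sk_notify_change(log, SK_ATTR_SIZE);
        mask &= ~SK_ATTR_SIZE;
    }
    if (mask)
        sk_notify_change(log, mask);
}
```

With this shape, a client that sets size and group together produces two calls, so the file system never sees the mixed update it does not expect.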

nfsd: minor nfsd_setattr cleanup · 758e99fe

Authored by Christoph Hellwig on Feb 20, 2017

Simplify exit paths, size_change use.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

18 Feb, 2017 6 commits

NFSD: Reserve adequate space for LOCKT operation · 7323f0d2

Authored by Kinglong Mee on Feb 03, 2017

After tightening the OP_LOCKT reply size estimate, we can get warnings
like:

[11512.783519] RPC request reserved 124 but used 152
[11512.813624] RPC request reserved 108 but used 136
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Get response size before operation for all RPCs · 2282cd2c

Authored by Kinglong Mee on Feb 03, 2017

NFSD uses PAGE_SIZE as the reply size estimate for RPCs which don't
support op_rsize_bop(). PAGE_SIZE (4096) is larger than many real
response sizes, e.g., access (op_encode_hdr_size + 2), seek
(op_encode_hdr_size + 3).

This patch just adds op_rsize_bop() for all RPCs getting response size.

An overestimate is generally safe but the tighter estimates are probably
better.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
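
To make the size comparison concrete, here is a small sketch of per-operation estimates. It assumes that sizes are counted in 4-byte XDR words and that `op_encode_hdr_size` is 2 words; both are assumptions for illustration, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the per-op reply header is 2 XDR words. */
#define SK_OP_ENCODE_HDR_SIZE 2

/* XDR words are 4 bytes each. */
static size_t sk_words_to_bytes(size_t words)
{
    return words * sizeof(uint32_t);
}

/* Estimate for ACCESS: header plus two words. */
static size_t sk_access_rsize(void)
{
    return sk_words_to_bytes(SK_OP_ENCODE_HDR_SIZE + 2);
}

/* Estimate for SEEK: header plus three words. */
static size_t sk_seek_rsize(void)
{
    return sk_words_to_bytes(SK_OP_ENCODE_HDR_SIZE + 3);
}
```

Under these assumptions both estimates come to a few dozen bytes at most, far below the 4096-byte PAGE_SIZE fallback mentioned in the message.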

nfsd/callback: Drop a useless data copy when comparing sessionid · 82743380

Authored by Kinglong Mee on Feb 05, 2017

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd/callback: skip the callback tag · e86a40bc

Authored by Kinglong Mee on Feb 05, 2017

The callback tag is NULL, and hdr->nops is unused too right now.
But if we were to ever test with a nonzero callback tag, nops will get a
bad value.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd/callback: Cleanup callback cred on shutdown · f7d1ddbe

Authored by Kinglong Mee on Feb 05, 2017

The rpccred gotten from rpc_lookup_machine_cred() should be put when
the state is shut down.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd/idmap: return nfserr_inval for 0-length names · c3821b34

Authored by Kinglong Mee on Feb 05, 2017

Tigran Mkrtchyan's new pynfs testcases for zero length principals fail:

SATT16   st_setattr.testEmptyPrincipal                            : FAILURE
           Setting empty owner should return NFS4ERR_INVAL,
           instead got NFS4ERR_BADOWNER
SATT17   st_setattr.testEmptyGroupPrincipal                       : FAILURE
           Setting empty owner_group should return NFS4ERR_INVAL,
           instead got NFS4ERR_BADOWNER

This patch checks the principal and returns nfserr_inval directly.  It
could check after decoding in nfs4xdr.c, but it's simpler to do it in
nfsd_map_xxxx.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
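
The behavior the tests expect can be sketched as an early length check in a name-mapping helper. The error codes, the function name, and the toy lookup that only knows "root" are all hypothetical; only the rule "empty name yields the inval error before any lookup" comes from the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the NFS error codes involved. */
enum sk_nfserr { SK_OK, SK_NFSERR_INVAL, SK_NFSERR_BADOWNER };

/* Map a principal name to an id. A zero-length name is rejected up front
 * with the "inval" error instead of falling through to a failed lookup. */
static enum sk_nfserr sk_map_name(const char *name, size_t namelen,
                                  unsigned int *id)
{
    if (name == NULL || namelen == 0)
        return SK_NFSERR_INVAL;

    /* Toy lookup for illustration: only "root" is known. */
    if (namelen == 4 && memcmp(name, "root", 4) == 0) {
        *id = 0;
        return SK_OK;
    }
    return SK_NFSERR_BADOWNER;
}
```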

10 Feb, 2017 1 commit

nfsd: Revert "nfsd: special case truncates some more" · 0839ffb8

Authored by J. Bruce Fields on Feb 09, 2017

This patch incorrectly attempted nested mnt_want_write, and incorrectly
disabled nfsd's owner override for truncate.  We'll fix those problems
and make another attempt soon; for the moment I think the safest course
is to revert.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

07 Feb, 2017 1 commit

NFSDv4: use export cache flushtime for changeid on V4ROOT objects. · b8800921

Authored by NeilBrown on Jan 30, 2017

If you change the set of filesystems that are exported, then
the contents of various directories in the NFSv4 pseudo-root
are likely to change.  However, the change-id of those
directories is currently tied to the underlying directory,
so the client may not see the changes in a timely fashion.

This patch changes the change-id number to be derived from the
"flush_time" of the export cache.  Whenever any changes are
made to the set of exported filesystems, this flush_time is
updated.  The result is that clients see changes to the set
of exported filesystems much more quickly, often immediately.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

01 Feb, 2017 9 commits

nfsd: opt in to labeled nfs per export · 32ddd944

Authored by J. Bruce Fields on Jan 03, 2017

Currently turning on NFSv4.2 results in 4.2 clients suddenly seeing the
individual file labels as they're set on the server.  This is not what
they've previously seen, and not appropriate in many cases.  (In
particular, if clients have heterogeneous security policies then one
client's labels may not even make sense to another.)  Labeled NFS should
be opted in only in those cases when the administrator knows it makes
sense.

It's helpful to be able to turn 4.2 on by default, and otherwise the
protocol upgrade seems free of regressions.  So, default labeled NFS to
off and provide an export flag to reenable it.

Users wanting labeled NFS support on an export will henceforth need to:

	- make sure 4.2 support is enabled on client and server (as
	  before),
	- upgrade the server nfs-utils to a version supporting the new
	  "security_label" export flag, and
	- set that "security_label" flag on the export.

This commit may be seen as a regression to anyone currently depending
on security labels.  We believe those cases are currently rare.

Reported-by: tibbs@math.uh.edu
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: constify nfsd_suppatttrs · 5cf23dbb

Authored by J. Bruce Fields on Jan 11, 2017

To keep me from accidentally writing to this again....
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: initialize sin6_scope_id in nfsd_inet6addr_event() · 7b19824d

Authored by Scott Mayhew on Jan 05, 2017

I noticed this was missing when I was testing with link local addresses.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Remove unused value inode in nfsd_vfs_write · 865d50b2

Authored by Kinglong Mee on Dec 31, 2016

This is just cleanup, no change in functionality.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: cleanup dead codes and values in nfsd_write · 52e380e0

Authored by Kinglong Mee on Dec 31, 2016

This is just cleanup, no change in functionality.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: pass an integer for stable type to nfsd_vfs_write · 54bbb7d2

Authored by Kinglong Mee on Dec 31, 2016

After fae5096a "nfsd: assume writeable exportabled filesystems have
f_sync" we no longer modify this argument.

This is just cleanup, no change in functionality.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: correctly range-check v4.x minor version when setting versions. · e35659f1

Authored by NeilBrown on Dec 21, 2016

Writing to /proc/fs/nfsd/versions allows individual major versions
and NFSv4 minor versions to be enabled or disabled.

However NFSv4.0 cannot currently be disabled, though there is no good reason.
Also the minor number is parsed as a 'long' but used as an 'int'
so '4294967297' will be incorrectly treated as '1'.

This patch removes the test on 'minor == 0' and switches to kstrtouint()
to get correct range checking.

When reading from /proc/fs/nfsd/versions, 4.0 is currently not reported.
To allow the disabling of v4.0 to be visible, while maintaining
backward compatibility, change code to report "-4.0" if appropriate, but
not "+4.0".
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
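
The truncation problem and the range-checked alternative can be shown in user space. This sketch uses `strtoull` with an explicit upper bound as a stand-in for the kernel's kstrtouint(); the helper name is hypothetical, and it assumes a 32-bit `unsigned int`.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal string into an unsigned int with range checking.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int sk_parse_uint(const char *s, unsigned int *out)
{
    char *end;
    unsigned long long v;

    /* Require a leading digit so signs and whitespace are rejected. */
    if (*s < '0' || *s > '9')
        return -1;

    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT_MAX)
        return -1;

    *out = (unsigned int)v;
    return 0;
}

/* What a parse-wide-then-narrow approach effectively does:
 * the high bits are silently dropped. */
static uint32_t sk_truncate_to_32(unsigned long long v)
{
    return (uint32_t)v;
}
```

Here 4294967297 (2^32 + 1) narrows to 1 when truncated, which is the misparse the message describes, while the range-checked parser refuses it.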

nfsd: special case truncates some more · 41f53350

Authored by Christoph Hellwig on Jan 24, 2017

Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributes to set various file attributes including the
file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates might not update random other attributes, and many other
file systems handle the case but do not update the different attributes
in the same transaction. NFSD on the other hand passes the attributes
it gets on the wire more or less directly through to the VFS, leading to
updates the file systems don't expect. XFS at least has an assert on
the allowed attributes, which caught an unusual NFS client setting the
size and group at the same time.

To handle this issue properly this switches nfsd to call vfs_truncate
for size changes, and then handle all other attributes through
notify_change. As a side effect this also means less boilerplate code
around the size change as we can now reuse the VFS code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Fix a null reference case in find_or_create_lock_stateid() · d19fb70d

Authored by Kinglong Mee on Jan 18, 2017

nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().

If nfsd doesn't go through init_lock_stateid() and puts the stateid at the
end, there is a NULL dereference of .sc_free when calling nfs4_put_stid(ns).

This patch moves the nfs4_stid.sc_free assignment into nfs4_alloc_stid().

Cc: stable@vger.kernel.org
Fixes: 356a95ec "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
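
The pattern behind this fix, setting the destructor callback at allocation time so that every later "put" is safe, can be sketched in user space as follows. The structure, the counter, and the function names are hypothetical stand-ins, not the nfsd code.

```c
#include <stdlib.h>

/* Hypothetical refcounted object with a type-specific destructor. */
struct sk_stid {
    int refcount;
    void (*sc_free)(struct sk_stid *);
};

/* Counts destructor invocations, purely so the sketch is observable. */
static int sk_frees;

static void sk_free_lock_stateid(struct sk_stid *s)
{
    sk_frees++;
    free(s);
}

/* The destructor is supplied to the allocator, so no code path can
 * obtain an object whose sc_free is still NULL. */
static struct sk_stid *sk_alloc_stid(void (*sc_free)(struct sk_stid *))
{
    struct sk_stid *s = calloc(1, sizeof(*s));

    if (!s)
        return NULL;
    s->refcount = 1;
    s->sc_free = sc_free;
    return s;
}

/* Drop a reference; the last put calls the destructor. */
static void sk_put_stid(struct sk_stid *s)
{
    if (--s->refcount == 0)
        s->sc_free(s);
}
```

An error path that allocates and immediately puts the object, without ever running a later init step, still reaches a valid destructor in this arrangement.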

13 Jan, 2017 1 commit

nfsd: fix supported attributes for acl & labels · dcd20869

Authored by J. Bruce Fields on Jan 11, 2017

Oops--in 916d2d84 I moved some constants into an array for
convenience, but here I'm accidentally writing to that array.

The effect is that if you ever encounter a filesystem lacking support
for ACLs or security labels, then all queries of supported attributes
will report that attribute as unsupported from then on.

Fixes: 916d2d84 "nfsd: clean up supported attribute handling"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
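
The class of bug described here, masking bits out of a shared table instead of a per-request copy, and its usual remedy can be sketched like this. The table contents, the ACL bit position, and the names are hypothetical values chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit for "ACL attribute supported" in bitmap word 0. */
#define SK_WORD0_ACL (1u << 12)

/* Shared table of supported attributes; const so writes fail to compile. */
static const uint32_t sk_suppattrs[3] = { 0xffffu, 0xffu, 0x7u };

/* Build the per-request answer in a private copy, then clear the bits
 * this particular file system lacks. The shared table is never modified. */
static void sk_supported_attrs(uint32_t out[3], int fs_has_acl)
{
    memcpy(out, sk_suppattrs, sizeof(sk_suppattrs));
    if (!fs_has_acl)
        out[0] &= ~SK_WORD0_ACL;
}
```

Because each caller works on its own copy, a query against a file system without ACL support no longer affects the answer for later queries.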

25 Dec, 2016 1 commit

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

Authored by Linus Torvalds on Dec 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

16 Dec, 2016 2 commits

vfs: call vfs_clone_file_range() under freeze protection · 031a072a

Authored by Amir Goldstein on Sep 23, 2016

Move sb_start_write()/sb_end_write() out of the vfs helper and up into the
ioctl handler.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

nfsd: add support for the umask attribute · 47057abd

Authored by Andreas Gruenbacher on Jan 12, 2016

Clients can set the umask attribute when creating files to cause the
server to apply it always except when inheriting permissions from the
parent directory.  That way, the new files will end up with the same
permissions as files created locally.

See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more
details.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

09 Dec, 2016 1 commit

vfs: replace calling i_op->readlink with vfs_readlink() · fd4a0edf

Authored by Miklos Szeredi on Dec 09, 2016

Also check d_is_symlink() in callers instead of inode->i_op->readlink
because following patches will allow NULL ->readlink for symlinks.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

18 Nov, 2016 1 commit

netns: make struct pernet_operations::id unsigned int · c7d03a00

Authored by Alexey Dobriyan on Nov 17, 2016

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into a zero-based array and
thus is an unsigned entity. Using a negative value is an out-of-bounds
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preferred to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is a 3-byte instruction which isn't necessary if the variable is
unsigned because x86_64 zero-extends by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a seemingly random artefact of code generation with the register
allocator being used differently. gcc decides that some variable
needs to live in the new r8+ registers and every access now requires a REX
prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
used, which is longer than [r8].

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Nov, 2016 2 commits

nfsd: constify reply_cache_stats_operations structure · 7ba630f5

Authored by Julia Lawall on Aug 28, 2016

reply_cache_stats_operations, of type struct file_operations, is never
modified, so declare it as const.

Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: update workqueue creation · 88382036

Authored by J. Bruce Fields on Nov 14, 2016

No real change in functionality, but the old interface seems to be
deprecated.

We don't actually care about ordering necessarily, but we do depend on
running at most one work item at a time: nfsd4_process_cb_update()
assumes that no other thread is running it, and that no new callbacks
are starting while it's running.
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

02 Nov, 2016 5 commits

nfsd: catch errors in decode_fattr earlier · e864c189

Authored by J. Bruce Fields on Jun 10, 2016

3c8e0316 "NFSv4: do exact check about attribute specified" fixed
some handling of unsupported-attribute errors, but it also delayed
checking for unwriteable attributes till after we decode them.  This
could lead to odd behavior in the case a client attempts to set an
attribute we don't know about followed by one we try to parse.  In that
case the parser for the known attribute will attempt to parse the
unknown attribute.  It should fail in some safe way, but the error might
at least be incorrect (probably bad_xdr instead of inval).  So, it's
better to do that check at the start.

As far as I know this doesn't cause any problems with current clients
but it might be a minor issue e.g. if we encounter a future client that
supports a new attribute that we currently don't.

Cc: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: clean up supported attribute handling · 916d2d84

Authored by J. Bruce Fields on Oct 18, 2016

Minor cleanup, no change in behavior.

Provide helpers for some common attribute bitmap operations.  Drop some
comments that just echo the code.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: fix error handling for clients that fail to return the layout · 851238a2

Authored by Jeff Layton on Oct 20, 2016

Currently, when the client continually returns NFS4ERR_DELAY on a
CB_LAYOUTRECALL, we'll give up trying to retransmit after two lease
periods, but leave the layout in place.

What we really need to do here is fence the client in this case. Have it
fall through to that code in that case instead of into the
NFS4ERR_NOMATCHING_LAYOUT case.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: more robust allocation failure handling in nfsd_reply_cache_init · 8f97514b

Authored by Jeff Layton on Oct 26, 2016

Currently, we try to allocate the cache as a single, large chunk, which
can fail if no big chunks of memory are available. We _do_ try to size
it according to the amount of memory in the box, but if the server is
started well after boot time, then the allocation can fail due to memory
fragmentation.

Fall back to doing a vzalloc if the kcalloc fails, and switch the
shutdown code to do a kvfree to handle freeing correctly.
Reported-by: Olaf Hering <olaf@aepfle.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: Fix general protection fault in release_lock_stateid() · f46c445b

Authored by Chuck Lever on Oct 29, 2016

When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
I get this crash on the server:

Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
Oct 28 22:04:30 klimt kernel: Stack:
Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
Oct 28 22:04:30 klimt kernel: Call Trace:
Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---

Jeff Layton says:
> Hm...now that I look though, this is a little suspicious:
>
>    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
>
> I wonder if it's possible for the openstateid to have already been
> destroyed at this point.
>
> We might be better off doing something like this to get the client pointer:
>
>    stp->st_stid.sc_client;
>
> ...which should be more direct and less dependent on other stateids
> staying valid.

With the suggested change, I am no longer able to reproduce the above oops.

v2: Fix unhash_lock_stateid() as well
Fix-suggested-by: Jeff Layton <jlayton@redhat.com>
Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
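
Jeff Layton's suggestion above amounts to reading the client pointer stored in the lock stateid itself rather than walking through the open stateid, which may already be gone. A reduced user-space sketch, with hypothetical structures that keep only the two fields named in the quote:

```c
#include <stddef.h>

struct sk_client {
    int id;
};

/* Generic stateid part: carries its own pointer to the owning client. */
struct sk_stid_base {
    struct sk_client *sc_client;
};

/* Lock stateid: embeds the generic part and also refers to an open
 * stateid whose lifetime it does not control. */
struct sk_lock_stateid {
    struct sk_stid_base st_stid;
    struct sk_lock_stateid *st_openstp; /* may be stale or NULL */
};

/* Direct lookup: does not dereference st_openstp at all. */
static struct sk_client *
sk_lock_stateid_client(const struct sk_lock_stateid *stp)
{
    return stp->st_stid.sc_client;
}
```

The lookup stays valid even when `st_openstp` no longer points at a live object, which is the dependency the suggestion removes.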

25 Oct, 2016 1 commit

nfsd: move blocked lock handling under a dedicated spinlock · 0cc11a61

Authored by Jeff Layton on Oct 20, 2016

Bruce was hitting some lockdep warnings in testing, showing that we
could hit a deadlock with the new CB_NOTIFY_LOCK handling, in a
rather complex situation involving four different spinlocks.

The crux of the matter is that we end up taking the nn->client_lock in
the lm_notify handler. The simplest fix is to just declare a new
per-nfsd_net spinlock to protect the new CB_NOTIFY_LOCK structures.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

08 Oct, 2016 4 commits

cred: simpler, 1D supplementary groups · 81243eac

Authored by Alexey Dobriyan on Oct 07, 2016

Current supplementary groups code can massively overallocate memory and
is implemented in a way so that access to individual gid is done via 2D
array.

If number of gids is <= 32, memory allocation is more or less tolerable
(140/148 bytes).  But if it is not, code allocates full page (!)
regardless and, what's even more fun, doesn't reuse small 32-entry
array.

2D array means dependent shifts, loads and LEAs without possibility to
optimize them (gid is never known at compile time).

All of the above is unnecessary.  Switch to the usual
trailing-zero-len-array scheme.  Memory is allocated with
kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
(LEA 8(gi,idx,4) or even without displacement).

Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
think kernel can handle such allocation.

On my usual desktop system with whole 9 (nine) aux groups, struct
group_info shrinks from 148 bytes to 44 bytes, yay!

Nice side effects:

 - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,

 - fix little mess in net/ipv4/ping.c
   should have been using GROUP_AT macro but this point becomes moot,

 - aux group allocation is persistent and should be accounted as such.

Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
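
The trailing-array layout and the sizes quoted in this message can be reproduced with a short user-space sketch. The structure below is a hypothetical simplification (two 4-byte header fields followed by 4-byte gids, assuming a 4-byte `int`), not the kernel's actual definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One allocation: a small header followed directly by the gid array. */
struct sk_group_info {
    int usage;
    int ngroups;
    uint32_t gid[];   /* flexible array member, sized at allocation time */
};

/* Bytes needed for a given number of groups. */
static size_t sk_group_info_bytes(size_t ngroups)
{
    return sizeof(struct sk_group_info) + ngroups * sizeof(uint32_t);
}

/* Allocate exactly as much as needed; the caller releases it with free(). */
static struct sk_group_info *sk_groups_alloc(size_t ngroups)
{
    struct sk_group_info *gi = malloc(sk_group_info_bytes(ngroups));

    if (!gi)
        return NULL;
    gi->usage = 1;
    gi->ngroups = (int)ngroups;
    return gi;
}
```

Under these assumptions nine groups need 8 + 9 * 4 = 44 bytes and 65536 groups need 256KB + 8 bytes, matching the figures in the message, and an element is reached with a plain `gi->gid[i]`.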

NFSD: Implement the COPY call · 29ae7f9d

Authored by Anna Schumaker on Sep 07, 2016

I only implemented the sync version of this call, since it's the
easiest.  I can simply call vfs_copy_range() and have the vfs do the
right thing for the filesystem being exported.
Signed-off-by: Anna Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: handle EUCLEAN · 42e61616

Authored by J. Bruce Fields on Oct 04, 2016

Eric Sandeen reports that xfs can return this if filesystem corruption
prevented completing the operation.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: only WARN once on unmapped errors · ff30f08c

Authored by J. Bruce Fields on Oct 04, 2016

No need to spam the logs here.

The only drawback is losing information if we ever encounter two
different unmapped errors, but in practice we've rarely seen even one.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

openeuler / Kernel synced successfully more than 1 year ago