提交 · 90478939dce096ed5b239cad16237dca0a59d66f · openanolis / cloud-kernel

31 3月, 2014 4 次提交

locks: pass the cmd value to fcntl_getlk/getlk64 · c1e62b8f

由 Jeff Layton 提交于 2月 03, 2014

Once we introduce file private locks, we'll need to know what cmd value
was used, as that affects the ownership and whether a conflict would
arise.
Signed-off-by: NJeff Layton <jlayton@redhat.com>

c1e62b8f

locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT" · c918d42a

由 Jeff Layton 提交于 2月 03, 2014

In a later patch, we'll be adding a new type of lock that's owned by
the struct file instead of the files_struct. Those sorts of locks
will be flagged with a new FL_FILE_PVT flag.

Report these types of locks as "FLPVT" in /proc/locks to distinguish
them from "classic" POSIX locks.
Acked-by: NJ. Bruce Fields <bfields@fieldses.org>
Signed-off-by: NJeff Layton <jlayton@redhat.com>

c918d42a

locks: rename locks_remove_flock to locks_remove_file · 78ed8a13

由 Jeff Layton 提交于 2月 03, 2014

This function currently removes leases in addition to flock locks and in
a later patch we'll have it deal with file-private locks too. Rename it
to locks_remove_file to indicate that it removes locks that are
associated with a particular struct file, and not just flock locks.
Acked-by: NJ. Bruce Fields <bfields@fieldses.org>
Signed-off-by: NJeff Layton <jlayton@redhat.com>

78ed8a13

locks: close potential race between setlease and open · 24cbe784

由 Jeff Layton 提交于 2月 03, 2014

As Al Viro points out, there is an unlikely, but possible race between
opening a file and setting a lease on it. generic_add_lease is done with
the i_lock held, but the inode->i_flock check in break_lease is
lockless. It's possible for another task doing an open to do the entire
pathwalk and call break_lease between the point where generic_add_lease
checks for a conflicting open and adds the lease to the list. If this
occurs, we can end up with a lease set on the file with a conflicting
open.

To guard against that, check again for a conflicting open after adding
the lease to the i_flock list. If the above race occurs, then we can
simply unwind the lease setting and return -EAGAIN.

Because we take dentry references and acquire write access on the file
before calling break_lease, we know that if the i_flock list is empty
when the open caller goes to check it then the necessary refcounts have
already been incremented. Thus the additional check for a conflicting
open will see that there is one and the setlease call will fail.

Cc: Bruce Fields <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@fieldses.org>

24cbe784

26 1月, 2014 1 次提交

fs: add a set_acl inode operation · 893d46e4

由 Christoph Hellwig 提交于 12月 20, 2013

This will allow moving all the Posix ACL handling into the VFS and clean
up tons of cruft in the filesystems.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

893d46e4

16 11月, 2013 1 次提交

consolidate simple ->d_delete() instances · b26d4cd3

由 Al Viro 提交于 10月 25, 2013

Rename simple_delete_dentry() to always_delete_dentry() and export it.
Export simple_dentry_operations, while we are at it, and get rid of
their duplicates
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b26d4cd3

09 11月, 2013 12 次提交

locks: break delegations on any attribute modification · 27ac0ffe

由 J. Bruce Fields 提交于 9月 20, 2011

NFSv4 uses leases to guarantee that clients can cache metadata as well
as data.

Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

27ac0ffe

locks: break delegations on link · 146a8595

由 J. Bruce Fields 提交于 9月 20, 2011

Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

146a8595

locks: break delegations on rename · 8e6d782c

由 J. Bruce Fields 提交于 9月 20, 2011

Cc: David Howells <dhowells@redhat.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8e6d782c

locks: helper functions for delegation breaking · 5a14696c

由 J. Bruce Fields 提交于 8月 28, 2012

We'll need the same logic for rename and link.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5a14696c

locks: break delegations on unlink · b21996e3

由 J. Bruce Fields 提交于 9月 20, 2011

We need to break delegations on any operation that changes the set of
links pointing to an inode.  Start with unlink.

Such operations also hold the i_mutex on a parent directory.  Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of a unresponsive NFS client.  To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:

	acquire locks
	...
	test for delegation; if found:
		take reference on inode
		release locks
		wait for delegation break
		drop reference on inode
		retry

It is possible this could never terminate.  (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.)  But this seems very unlikely.

The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack.  We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.

Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b21996e3

locks: implement delegations · df4e8d2c

由 J. Bruce Fields 提交于 3月 05, 2012

Implement NFSv4 delegations at the vfs level using the new FL_DELEG lock
type.

Note nfsd is the only delegation user and is only using read
delegations.  Warn on any attempt to set a write delegation for now.
We'll come back to that case later.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

df4e8d2c

locks: introduce new FL_DELEG lock flag · 617588d5

由 J. Bruce Fields 提交于 7月 01, 2011

For now FL_DELEG is just a synonym for FL_LEASE.  So this patch doesn't
change behavior.

Next we'll modify break_lease to treat FL_DELEG leases differently, to
account for the fact that NFSv4 delegations should be broken in more
situations than Windows oplocks.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

617588d5

vfs: rename I_MUTEX_QUOTA now that it's not used for quotas · 40bd22c9

由 J. Bruce Fields 提交于 4月 18, 2012

I_MUTEX_QUOTA is now just being used whenever we want to lock two
non-directories.  So the name isn't right.  I_MUTEX_NONDIR2 isn't
especially elegant but it's the best I could think of.

Also fix some outdated documentation.
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

40bd22c9

vfs: pull ext4's double-i_mutex-locking into common code · 375e289e

由 J. Bruce Fields 提交于 4月 18, 2012

We want to do this elsewhere as well.

Also catch any attempts to use it for directories (where this ordering
would conflict with ancestor-first directory ordering in lock_rename).

Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Acked-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

375e289e

vfs: split out vfs_getattr_nosec · b7a6ec52

由 J. Bruce Fields 提交于 10月 02, 2013

The filehandle lookup code wants this version of getattr.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b7a6ec52

A
take anon inode allocation to libfs.c · 6987843f
由 Al Viro 提交于 10月 02, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
6987843f

get rid of s_files and files_lock · eee5cc27

由 Al Viro 提交于 10月 04, 2013

The only thing we need it for is alt-sysrq-r (emergency remount r/o)
and these days we can do just as well without going through the
list of files.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

eee5cc27

25 10月, 2013 3 次提交
- A
  make freeing super_block rcu-delayed · e2fec7c3
  由 Al Viro 提交于 10月 04, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  e2fec7c3
- A
  consolidate the reassignments of ->f_op in ->open() instances · e84f9e57
  由 Al Viro 提交于 9月 22, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  e84f9e57
- A
  new helper: kfree_put_link() · 87dc800b
  由 Al Viro 提交于 9月 16, 2013
```
duplicated to hell and back...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  87dc800b
28 9月, 2013 1 次提交

NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() · f1fe29b4

由 David Howells 提交于 9月 27, 2013

Use i_writecount to control whether to get an fscache cookie in nfs_open() as
NFS does not do write caching yet.  I *think* this is the cause of a problem
encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL
pointer dereference because cookie->def is NULL:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
PGD 0
Thread overran stack, or stack corrupted
Oops: 0000 [#1] SMP
Modules linked in: ...
CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 #1
Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012
task: ffff8804203460c0 ti: ffff880420346640
RIP: 0010:[<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
RSP: 0018:ffff8801053af878 EFLAGS: 00210286
RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8
RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780
RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440
R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0
Stack:
...
Call Trace:
[<ffffffff81365a72>] __nfs_fscache_invalidate_page+0x42/0x70
[<ffffffff813553d5>] nfs_invalidate_page+0x75/0x90
[<ffffffff811b8f5e>] truncate_inode_page+0x8e/0x90
[<ffffffff811b90ad>] truncate_inode_pages_range.part.12+0x14d/0x620
[<ffffffff81d6387d>] ? __mutex_lock_slowpath+0x1fd/0x2e0
[<ffffffff811b95d3>] truncate_inode_pages_range+0x53/0x70
[<ffffffff811b969d>] truncate_inode_pages+0x2d/0x40
[<ffffffff811b96ff>] truncate_pagecache+0x4f/0x70
[<ffffffff81356840>] nfs_setattr_update_inode+0xa0/0x120
[<ffffffff81368de4>] nfs3_proc_setattr+0xc4/0xe0
[<ffffffff81357f78>] nfs_setattr+0xc8/0x150
[<ffffffff8122d95b>] notify_change+0x1cb/0x390
[<ffffffff8120a55b>] do_truncate+0x7b/0xc0
[<ffffffff8121f96c>] do_last+0xa4c/0xfd0
[<ffffffff8121ffbc>] path_openat+0xcc/0x670
[<ffffffff81220a0e>] do_filp_open+0x4e/0xb0
[<ffffffff8120ba1f>] do_sys_open+0x13f/0x2b0
[<ffffffff8126aaf6>] compat_SyS_open+0x36/0x50
[<ffffffff81d7204c>] sysenter_dispatch+0x7/0x24

The code at the instruction pointer was disassembled:

> (gdb) disas __fscache_uncache_page
> Dump of assembler code for function __fscache_uncache_page:
> ...
> 0xffffffff812a18ff <+31>: mov 0x48(%rbx),%rax
> 0xffffffff812a1903 <+35>: cmpb $0x0,0x10(%rax)
> 0xffffffff812a1907 <+39>: je 0xffffffff812a19cd <__fscache_uncache_page+237>

These instructions make up:

	ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);

That cmpb is the faulting instruction (%rax is 0).  So cookie->def is NULL -
which presumably means that the cookie has already been at least partway
through __fscache_relinquish_cookie().

What I think may be happening is something like a three-way race on the same
file:

	PROCESS 1	PROCESS 2	PROCESS 3
	===============	===============	===============
	open(O_TRUNC|O_WRONLY)
			open(O_RDONLY)
					open(O_WRONLY)
	-->nfs_open()
	-->nfs_fscache_set_inode_cookie()
	nfs_fscache_inode_lock()
	nfs_fscache_disable_inode_cookie()
	__fscache_relinquish_cookie()
	nfs_inode->fscache = NULL
	<--nfs_fscache_set_inode_cookie()

			-->nfs_open()
			-->nfs_fscache_set_inode_cookie()
			nfs_fscache_inode_lock()
			nfs_fscache_enable_inode_cookie()
			__fscache_acquire_cookie()
			nfs_inode->fscache = cookie
			<--nfs_fscache_set_inode_cookie()
	<--nfs_open()
	-->nfs_setattr()
	...
	...
	-->nfs_invalidate_page()
	-->__nfs_fscache_invalidate_page()
	cookie = nfsi->fscache
					-->nfs_open()
					-->nfs_fscache_set_inode_cookie()
					nfs_fscache_inode_lock()
					nfs_fscache_disable_inode_cookie()
					-->__fscache_relinquish_cookie()
	-->__fscache_uncache_page(cookie)
	<crash>
					<--__fscache_relinquish_cookie()
					nfs_inode->fscache = NULL
					<--nfs_fscache_set_inode_cookie()

What is needed is something to prevent process #2 from reacquiring the cookie
- and I think checking i_writecount should do the trick.

It's also possible to have a two-way race on this if the file is opened
O_TRUNC|O_RDONLY instead.
Reported-by: NMark Moseley <moseleymark@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

f1fe29b4

11 9月, 2013 8 次提交

fs: convert inode and dentry shrinking to be node aware · 9b17c623