提交 · 3235b40303b6f609c446275d0e7f6f9f4fe94156 · openanolis / cloud-kernel

20 11月, 2014 2 次提交
- A
  assorted conversions to %p[dD] · a455589f
  由 Al Viro 提交于 10月 21, 2014
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  a455589f
- A
  switch d_materialise_unique() users to d_splice_alias() · 41d28bca
  由 Al Viro 提交于 10月 12, 2014
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  41d28bca
05 11月, 2014 1 次提交

NFSv4: Ensure nfs_atomic_open set the dentry verifier on ENOENT · 809fd143

由 Trond Myklebust 提交于 10月 23, 2014

If the OPEN rpc call to the server fails with an ENOENT call, nfs_atomic_open
will create a negative dentry for that file, however it currently fails
to call nfs_set_verifier(), thus causing the dentry to be immediately
revalidated on the next call to nfs_lookup_revalidate() instead of following
the usual lookup caching rules.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

809fd143

09 10月, 2014 2 次提交

vfs: Make d_invalidate return void · 5542aa2f

由 Eric W. Biederman 提交于 2月 13, 2014

Now that d_invalidate can no longer fail, stop returning a useless
return code.  For the few callers that checked the return code update
remove the handling of d_invalidate failure.
Reviewed-by: NMiklos Szeredi <miklos@szeredi.hu>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

5542aa2f

vfs: Remove unnecessary calls of check_submounts_and_drop · 9b053f32

由 Eric W. Biederman 提交于 2月 13, 2014

Now that check_submounts_and_drop can not fail and is called from
d_invalidate there is no longer a need to call check_submounts_and_drom
from filesystem d_revalidate methods so remove it.
Reviewed-by: NMiklos Szeredi <miklos@szeredi.hu>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9b053f32

04 8月, 2014 8 次提交

NFS: fix two problems in lookup_revalidate in RCU-walk · 50d77739

由 NeilBrown 提交于 8月 04, 2014

1/ rcu_dereference isn't correct: that field isn't
   RCU protected.   It could potentially change at any time
   so ACCESS_ONCE might be justified.

   changes to ->d_parent are protected by ->d_seq.  However
   that isn't always checked after ->d_revalidate is called,
   so it is safest to keep the double-check that ->d_parent
   hasn't changed at the end of these functions.

2/ in nfs4_lookup_revalidate, "->d_parent" was forgotten.
   So 'parent' was not the parent of 'dentry'.
   This fails safe is the context is that dentry->d_inode is
   NULL, and the result of parent->d_inode being NULL is
   that ECHILD is returned, which is always safe.
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

50d77739

NFS: allow lockless access to access_cache · f682a398

由 NeilBrown 提交于 7月 14, 2014

The access cache is used during RCU-walk path lookups, so it is best
to avoid locking if possible as taking a lock kills concurrency.

The rbtree is not rcu-safe and cannot easily be made so.
Instead we simply check the last (i.e. most recent) entry on the LRU
list.  If this doesn't match, then we return -ECHILD and retry in
lock/refcount mode.

This requires freeing the nfs_access_entry struct with rcu, and
requires using rcu access primatives when adding entries to the lru, and
when examining the last entry.

Calling put_rpccred before kfree_rcu looks a bit odd, but as
put_rpccred already provides rcu protection, we know that the cred will
not actually be freed until the next grace period, so any concurrent
access will be safe.

This patch provides about 5% performance improvement on a stat-heavy
synthetic work load with 4 threads on a 2-core CPU.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

f682a398

NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU · 1fa1e384

由 NeilBrown 提交于 7月 14, 2014

It fails with -ECHILD rather than make an RPC call.

This allows nfs_lookup_revalidate to call it in RCU-walk mode.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

1fa1e384

NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU · 912a108d

由 NeilBrown 提交于 7月 14, 2014

This requires nfs_check_verifier to take an rcu_walk flag, and requires
an rcu version of nfs_revalidate_inode which returns -ECHILD rather
than making an RPC call.

With this, nfs_lookup_revalidate can call nfs_neg_need_reval in
RCU-walk mode.

We can also move the LOOKUP_RCU check past the nfs_check_verifier()
call in nfs_lookup_revalidate.

If RCU_WALK prevents nfs_check_verifier or nfs_neg_need_reval from
doing a full check, they return a status indicating that a revalidation
is required.  As this revalidation will not be possible in RCU_WALK
mode, -ECHILD will ultimately be returned, which is the desired result.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

912a108d

NFS: support RCU_WALK in nfs_permission() · f3324a2a

由 NeilBrown 提交于 7月 14, 2014

nfs_permission makes two calls which are not always safe in RCU_WALK,
rpc_lookup_cred and nfs_do_access.

The second can easily be made rcu-safe by aborting with -ECHILD before
making the RPC call.

The former can be made rcu-safe by calling rpc_lookup_cred_nonblock()
instead.
As this will almost always succeed, we use it even when RCU_WALK
isn't being used as it still saves some spinlocks in a common case.
We only fall back to rpc_lookup_cred() if rpc_lookup_cred_nonblock()
fails and MAY_NOT_BLOCK isn't set.

This optimisation (always trying rpc_lookup_cred_nonblock()) is
particularly important when a security module is active.
In that case inode_permission() may return -ECHILD from
security_inode_permission() even though ->permission() succeeded in
RCU_WALK mode.
This leads to may_lookup() retrying inode_permission after performing
unlazy_walk().  The spinlock that rpc_lookup_cred() takes is often
more expensive than anything security_inode_permission() does, so that
spinlock becomes the main bottleneck.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

f3324a2a

NFS: prepare for RCU-walk support but pushing tests later in code. · d51ac1a8

由 NeilBrown 提交于 7月 14, 2014

nfs_lookup_revalidate, nfs4_lookup_revalidate, and nfs_permission
all need to understand and handle RCU-walk for NFS to gain the
benefits of RCU-walk for cached information.

Currently these functions all immediately return -ECHILD
if the relevant flag (LOOKUP_RCU or MAY_NOT_BLOCK) is set.

This patch pushes those tests later in the code so that we only abort
immediately before we enter rcu-unsafe code.  As subsequent patches
make that rcu-unsafe code rcu-safe, several of these new tests will
disappear.

With this patch there are several paths through the code which will no
longer return -ECHILD during an RCU-walk.  However these are mostly
error paths or other uninteresting cases.

A noteworthy change in nfs_lookup_revalidate is that we don't take
(or put) the reference to ->d_parent when LOOKUP_RCU is set.
Rather we rcu_dereference ->d_parent, and check that ->d_inode
is not NULL.  We also check that ->d_parent hasn't changed after
all the tests.

In nfs4_lookup_revalidate we simply avoid testing LOOKUP_RCU on the
path that only calls nfs_lookup_revalidate() as that function
already performs the required test.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

d51ac1a8

NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used. · 49317a7f

由 NeilBrown 提交于 7月 14, 2014

nfs4_lookup_revalidate only uses 'parent' to get 'dir', and only
uses 'dir' if 'inode == NULL'.

So we don't need to find out what 'parent' or 'dir' is until we
know that 'inode' is NULL.

By moving 'dget_parent' inside the 'if', we can reduce the number of
call sites for 'dput(parent)'.
Signed-off-by: NNeilBrown <neilb@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

49317a7f

NFS: Enforce an upper limit on the number of cached access call · 3a505845

由 Trond Myklebust 提交于 7月 21, 2014

This may be used to limit the number of cached credentials building up
inside the access cache.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

3a505845

18 4月, 2014 1 次提交

arch: Mass conversion of smp_mb__*() · 4e857c58

由 Peter Zijlstra 提交于 3月 17, 2014

Mostly scripted conversion of the smp_mb__* barriers.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4e857c58

05 4月, 2014 1 次提交

nfs: pass string length to pr_notice message about readdir loops · 9581a4ae

由 Jeff Layton 提交于 4月 05, 2014

There is no guarantee that the strings in the nfs_cache_array will be
NULL-terminated. In the event that we end up hitting a readdir loop, we
need to ensure that we pass the warning message the length of the
string.
Reported-by: NLachlan McIlroy <lmcilroy@redhat.com>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

9581a4ae

18 3月, 2014 1 次提交

nfs: convert nfs_rename to use async_rename infrastructure · 80a491fd

由 Jeff Layton 提交于 3月 17, 2014

There isn't much sense in maintaining two separate versions of rename
code. Convert nfs_rename to use the asynchronous rename infrastructure
that nfs_sillyrename uses, and emulate synchronous behavior by having
the task just wait on the reply.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Tested-by: NAnna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

80a491fd

12 2月, 2014 1 次提交

NFS: Be more aggressive in using readdirplus for 'ls -l' situations · 311324ad

由 Trond Myklebust 提交于 2月 07, 2014

Try to detect 'ls -l' by having nfs_getattr() look at whether or not
there is an opendir() file descriptor for the parent directory.
If so, then assume that we want to force use of readdirplus in order
to avoid the multiple GETATTR calls over the wire.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

311324ad

11 2月, 2014 1 次提交

mm: fix page leak at nfs_symlink() · a0b54add

由 Rafael Aquini 提交于 2月 10, 2014

Changes in commit a0b8cab3 ("mm: remove lru parameter from
__pagevec_lru_add and remove parts of pagevec API") have introduced a
call to add_to_page_cache_lru() which causes a leak in nfs_symlink() as
now the page gets an extra refcount that is not dropped.

Jan Stancek observed and reported the leak effect while running test8
from Connectathon Testsuite.  After several iterations over the test
case, which creates several symlinks on a NFS mountpoint, the test
system was quickly getting into an out-of-memory scenario.

This patch fixes the page leak by dropping that extra refcount
add_to_page_cache_lru() is grabbing.
Signed-off-by: NJan Stancek <jstancek@redhat.com>
Signed-off-by: NRafael Aquini <aquini@redhat.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: <stable@vger.kernel.org>	[3.11.x+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a0b54add

29 1月, 2014 1 次提交

nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING · 4db72b40

由 Jeff Layton 提交于 1月 28, 2014

If the setting of NFS_INO_INVALIDATING gets reordered to before the
clearing of NFS_INO_INVALID_DATA, then another task may hit a race
window where both appear to be clear, even though the inode's pages are
still in need of invalidation. Fix this by adding the appropriate memory
barriers.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

4db72b40

28 1月, 2014 1 次提交

NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping · d529ef83

由 Jeff Layton 提交于 1月 27, 2014

There is a possible race in how the nfs_invalidate_mapping function is
handled. Currently, we go and invalidate the pages in the file and then
clear NFS_INO_INVALID_DATA.

The problem is that it's possible for a stale page to creep into the
mapping after the page was invalidated (i.e., via readahead). If another
writer comes along and sets the flag after that happens but before
invalidate_inode_pages2 returns then we could clear the flag
without the cache having been properly invalidated.

So, we must clear the flag first and then invalidate the pages. Doing
this however, opens another race:

It's possible to have two concurrent read() calls that end up in
nfs_revalidate_mapping at the same time. The first one clears the
NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.

Just before calling that though, the other task races in, checks the
flag and finds it cleared. At that point, it trusts that the mapping is
good and gets the lock on the page, allowing the read() to be satisfied
from the cache even though the data is no longer valid.

These effects are easily manifested by running diotest3 from the LTP
test suite on NFS. That program does a series of DIO writes and buffered
reads. The operations are serialized and page-aligned but the existing
code fails the test since it occasionally allows a read to come out of
the cache incorrectly. While mixing direct and buffered I/O isn't
recommended, I believe it's possible to hit this in other ways that just
use buffered I/O, though that situation is much harder to reproduce.

The problem is that the checking/clearing of that flag and the
invalidation of the mapping really need to be atomic. Fix this by
serializing concurrent invalidations with a bitlock.

At the same time, we also need to allow other places that check
NFS_INO_INVALID_DATA to check whether we might be in the middle of
invalidating the file, so fix up a couple of places that do that
to look for the new NFS_INO_INVALIDATING flag.

Doing this requires us to be careful not to set the bitlock
unnecessarily, so this code only does that if it believes it will
be doing an invalidation.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

d529ef83

06 1月, 2014 1 次提交

NFS: dprintk() should not print negative fileids and inode numbers · 1e8968c5

由 Niels de Vos 提交于 12月 17, 2013

A fileid in NFS is a uint64. There are some occurrences where dprintk()
outputs a signed fileid. This leads to confusion and more difficult to
read debugging (negative fileids matching positive inode numbers).
Signed-off-by: NNiels de Vos <ndevos@redhat.com>
CC: Santosh Pradhan <spradhan@redhat.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

1e8968c5

29 10月, 2013 1 次提交

nfs: use IS_ROOT not DCACHE_DISCONNECTED · a3f432bf

由 J. Bruce Fields 提交于 10月 15, 2013

This check was added by Al Viro with
d9e80b7d "nfs d_revalidate() is too
trigger-happy with d_drop()", with the explanation that we don't want to
remove the root of a disconnected tree, which will still be included on
the s_anon list.

But DCACHE_DISCONNECTED does *not* actually identify dentries that are
disconnected from the dentry tree or hashed on s_anon.  IS_ROOT() is the
way to do that.

Also add a comment from Al's commit to remind us why this check is
there.
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

a3f432bf

25 10月, 2013 1 次提交
- A
  nfs: use %p[dD] instead of open-coded (and often racy) equivalents · 6de1472f
  由 Al Viro 提交于 9月 16, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  6de1472f
28 9月, 2013 1 次提交

NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() · f1fe29b4

由 David Howells 提交于 9月 27, 2013

Use i_writecount to control whether to get an fscache cookie in nfs_open() as
NFS does not do write caching yet.  I *think* this is the cause of a problem
encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL
pointer dereference because cookie->def is NULL:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
PGD 0
Thread overran stack, or stack corrupted
Oops: 0000 [#1] SMP
Modules linked in: ...
CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 #1
Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012
task: ffff8804203460c0 ti: ffff880420346640
RIP: 0010:[<ffffffff812a1903>] __fscache_uncache_page+0x23/0x160
RSP: 0018:ffff8801053af878 EFLAGS: 00210286
RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8
RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780
RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440
R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0
Stack:
...
Call Trace:
[<ffffffff81365a72>] __nfs_fscache_invalidate_page+0x42/0x70
[<ffffffff813553d5>] nfs_invalidate_page+0x75/0x90
[<ffffffff811b8f5e>] truncate_inode_page+0x8e/0x90
[<ffffffff811b90ad>] truncate_inode_pages_range.part.12+0x14d/0x620
[<ffffffff81d6387d>] ? __mutex_lock_slowpath+0x1fd/0x2e0
[<ffffffff811b95d3>] truncate_inode_pages_range+0x53/0x70
[<ffffffff811b969d>] truncate_inode_pages+0x2d/0x40
[<ffffffff811b96ff>] truncate_pagecache+0x4f/0x70
[<ffffffff81356840>] nfs_setattr_update_inode+0xa0/0x120
[<ffffffff81368de4>] nfs3_proc_setattr+0xc4/0xe0
[<ffffffff81357f78>] nfs_setattr+0xc8/0x150
[<ffffffff8122d95b>] notify_change+0x1cb/0x390
[<ffffffff8120a55b>] do_truncate+0x7b/0xc0
[<ffffffff8121f96c>] do_last+0xa4c/0xfd0
[<ffffffff8121ffbc>] path_openat+0xcc/0x670
[<ffffffff81220a0e>] do_filp_open+0x4e/0xb0
[<ffffffff8120ba1f>] do_sys_open+0x13f/0x2b0
[<ffffffff8126aaf6>] compat_SyS_open+0x36/0x50
[<ffffffff81d7204c>] sysenter_dispatch+0x7/0x24

The code at the instruction pointer was disassembled:

> (gdb) disas __fscache_uncache_page
> Dump of assembler code for function __fscache_uncache_page:
> ...
> 0xffffffff812a18ff <+31>: mov 0x48(%rbx),%rax
> 0xffffffff812a1903 <+35>: cmpb $0x0,0x10(%rax)
> 0xffffffff812a1907 <+39>: je 0xffffffff812a19cd <__fscache_uncache_page+237>

These instructions make up:

	ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);

That cmpb is the faulting instruction (%rax is 0).  So cookie->def is NULL -
which presumably means that the cookie has already been at least partway
through __fscache_relinquish_cookie().

What I think may be happening is something like a three-way race on the same
file:

	PROCESS 1	PROCESS 2	PROCESS 3
	===============	===============	===============
	open(O_TRUNC|O_WRONLY)
			open(O_RDONLY)
					open(O_WRONLY)
	-->nfs_open()
	-->nfs_fscache_set_inode_cookie()
	nfs_fscache_inode_lock()
	nfs_fscache_disable_inode_cookie()
	__fscache_relinquish_cookie()
	nfs_inode->fscache = NULL
	<--nfs_fscache_set_inode_cookie()

			-->nfs_open()
			-->nfs_fscache_set_inode_cookie()
			nfs_fscache_inode_lock()
			nfs_fscache_enable_inode_cookie()
			__fscache_acquire_cookie()
			nfs_inode->fscache = cookie
			<--nfs_fscache_set_inode_cookie()
	<--nfs_open()
	-->nfs_setattr()
	...
	...
	-->nfs_invalidate_page()
	-->__nfs_fscache_invalidate_page()
	cookie = nfsi->fscache
					-->nfs_open()
					-->nfs_fscache_set_inode_cookie()
					nfs_fscache_inode_lock()
					nfs_fscache_disable_inode_cookie()
					-->__fscache_relinquish_cookie()
	-->__fscache_uncache_page(cookie)
	<crash>
					<--__fscache_relinquish_cookie()
					nfs_inode->fscache = NULL
					<--nfs_fscache_set_inode_cookie()

What is needed is something to prevent process #2 from reacquiring the cookie
- and I think checking i_writecount should do the trick.

It's also possible to have a two-way race on this if the file is opened
O_TRUNC|O_RDONLY instead.
Reported-by: NMark Moseley <moseleymark@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

f1fe29b4

26 9月, 2013 1 次提交

NFSv4: Honour the 'opened' parameter in the atomic_open() filesystem method · 5bc2afc2

由 Trond Myklebust 提交于 9月 23, 2013

Determine if we've created a new file by examining the directory change
attribute and/or the O_EXCL flag.

This fixes a regression when doing a non-exclusive create of a new file.
If the FILE_CREATED flag is not set, the atomic_open() command will
perform full file access permissions checks instead of just checking
for MAY_OPEN.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

5bc2afc2

17 9月, 2013 1 次提交

nfs: set FILE_CREATED · 01c919ab

由 Miklos Szeredi 提交于 9月 16, 2013

Set FILE_CREATED on O_CREAT|O_EXCL.  If the NFS server honored our request
for exclusivity then this must be correct.

Currently this is a no-op, since the VFS sets FILE_CREATED anyway.  The
next patch will, however, require this flag to be always set by
filesystems.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

01c919ab

11 9月, 2013 2 次提交

fs: convert fs shrinkers to new scan/count API · 1ab6c499

由 Dave Chinner 提交于 8月 28, 2013

Convert the filesystem shrinkers to use the new API, and standardise some
of the behaviours of the shrinkers at the same time.  For example,
nr_to_scan means the number of objects to scan, not the number of objects
to free.

I refactored the CIFS idmap shrinker a little - it really needs to be
broken up into a shrinker per tree and keep an item count with the tree
root so that we don't need to walk the tree every time the shrinker needs
to count the number of objects in the tree (i.e.  all the time under
memory pressure).

[glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
[assorted fixes folded in]
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NGlauber Costa <glommer@openvz.org>
Acked-by: NMel Gorman <mgorman@suse.de>
Acked-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: NJan Kara <jack@suse.cz>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1ab6c499

super: fix calculation of shrinkable objects for small numbers · 55f841ce

由 Glauber Costa 提交于 8月 28, 2013

The sysctl knob sysctl_vfs_cache_pressure is used to determine which
percentage of the shrinkable objects in our cache we should actively try
to shrink.

It works great in situations in which we have many objects (at least more
than 100), because the aproximation errors will be negligible.  But if
this is not the case, specially when total_objects < 100, we may end up
concluding that we have no objects at all (total / 100 = 0, if total <
100).

This is certainly not the biggest killer in the world, but may matter in
very low kernel memory situations.
Signed-off-by: NGlauber Costa <glommer@openvz.org>
Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

55f841ce

06 9月, 2013 1 次提交

nfs: use check_submounts_and_drop() · 13caa9fb

由 Miklos Szeredi 提交于 9月 05, 2013

Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.

check_submounts_and_drop() can deal with negative dentries and
non-directories as well.

Non-directories can also be mounted on.  And just like directories we don't
want these to disappear with invalidation.
Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

13caa9fb

04 9月, 2013 1 次提交

NFS: Ensure that rmdir() waits for sillyrenames to complete · ba6c0592

由 Trond Myklebust 提交于 8月 30, 2013

If an NFS client does

	mkdir("dir");
	fd = open("dir/file");
	unlink("dir/file");
	close(fd);
	rmdir("dir");

then the asynchronous nature of the sillyrename operation means that
we can end up getting EBUSY for the rmdir() in the above test. Fix
that by ensuring that we wait for any in-progress sillyrenames
before sending the rmdir() to the server.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

ba6c0592

30 8月, 2013 1 次提交

NFS: Fix up two use-after-free issues with the new tracing code · 2d9db750

由 Trond Myklebust 提交于 8月 30, 2013

We don't want to pass the context argument to trace_nfs_atomic_open_exit()
after it has been released.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

2d9db750

22 8月, 2013 7 次提交

T
NFS: Add tracepoints for debugging NFS hard links · 1fd1085b
由 Trond Myklebust 提交于 8月 21, 2013
```
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
```
1fd1085b
T
NFS: Add tracepoints for debugging NFS rename and sillyrename issues · 70ded201
由 Trond Myklebust 提交于 8月 21, 2013
```
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
```
70ded201

NFS: Add tracepoints for debugging directory changes · 1ca42382

由 Trond Myklebust 提交于 8月 21, 2013

Add tracepoints for mknod, mkdir, rmdir, remove (unlink) and symlink.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

1ca42382

T
NFS: Add tracepoints for debugging generic file create events · 8b0ad3d4
由 Trond Myklebust 提交于 8月 21, 2013
```
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
```
8b0ad3d4

NFS: Add event tracing for generic NFS lookups · 6e0d0be7

由 Trond Myklebust 提交于 8月 20, 2013

Add tracepoints for lookup, lookup_revalidate and atomic_open
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

6e0d0be7

NFS: Pass in lookup flags from nfs_atomic_open to nfs_lookup · 1472b83e

由 Trond Myklebust 提交于 8月 20, 2013

When doing an open of a directory, ensure that we do pass the lookup flags
from nfs_atomic_open into nfs_lookup.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

1472b83e

NFS: Add event tracing for generic NFS events · f4ce1299

由 Trond Myklebust 提交于 8月 19, 2013

Add tracepoints for inode attribute updates, attribute revalidation,
writeback start/end fsync start/end, attribute change start/end,
permission check start/end.

The intention is to enable performance tracing using 'perf'as well as
improving debugging.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

f4ce1299

21 8月, 2013 1 次提交

NFS: Remove the NFSv4 "open optimisation" from nfs_permission · 5948a401

由 Trond Myklebust 提交于 8月 20, 2013

Ever since commit 6168f62c (Add ACCESS operation to OPEN compound)
the NFSv4 atomic open has primed the access cache, and so nfs_permission
will no longer do an RPC call on the wire.
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

5948a401

08 8月, 2013 1 次提交

nfs: verify open flags before allowing an atomic open · 9597c13b

由 Jeff Layton 提交于 8月 02, 2013

Currently, you can open a NFSv4 file with O_APPEND|O_DIRECT, but cannot
fcntl(F_SETFL,...) with those flags. This flag combination is explicitly
forbidden on NFSv3 opens, and it seems like it should also be on NFSv4.
Reported-by: NChao Ye <cye@redhat.com>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

9597c13b

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功