Commits · fa0d7e3de6d6fc5004ad9dea0dd6b286af8f03e9 · openeuler / raspberrypi-kernel

07 Jan, 2011 6 commits

Committed by Nick Piggin on Jan 07, 2011

RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page->mapping.

The downside of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (i.e. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fa0d7e3d

fs: dcache remove dcache_lock · b5c84bf6

Committed by Nick Piggin on Jan 07, 2011

dcache_lock no longer protects anything. Remove it.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b5c84bf6

fs: Use rename lock and RCU for multi-step operations · 949854d0

Committed by Nick Piggin on Jan 07, 2011

The remaining uses of dcache_lock are to allow atomic, multi-step read-side
operations over the directory tree by excluding modifications to the tree,
and to walk in the leaf->root direction in the tree, where we don't have
a natural d_lock ordering.

This could be accomplished by taking every d_lock, but this would mean a
huge number of locks and actually gets very tricky.

Solve this instead by using the rename seqlock for multi-step read-side
operations, retrying in case of a rename so we don't walk up the wrong parent.
Concurrent dentry insertions are not serialised against. Concurrent deletes
are tricky when walking up the directory: our parent might have been deleted
when dropping locks, so we also need to check and retry for that.
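
The retry pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct seq_counter`, the `seq_*` helpers, `struct node` and `depth_of()` are hypothetical names, not the kernel's seqlock API, writers are assumed to be serialised by some other lock, and the plain pointer reads are only safe here because the example is single-threaded (the kernel relies on RCU for that part).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the rename sequence counter. */
struct seq_counter {
    atomic_uint seq; /* odd while a writer (a "rename") is in progress */
};

/* Reader side: wait for an even value and remember it. */
unsigned seq_read_begin(struct seq_counter *s)
{
    unsigned v;

    while ((v = atomic_load_explicit(&s->seq, memory_order_acquire)) & 1u)
        ; /* a writer is active, spin until it finishes */
    return v;
}

/* Reader side: true if a writer ran since seq_read_begin(). */
bool seq_read_retry(struct seq_counter *s, unsigned start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Writer side: callers are assumed to hold an exclusive lock (not shown). */
void seq_write_begin(struct seq_counter *s)
{
    atomic_fetch_add_explicit(&s->seq, 1u, memory_order_acq_rel);
}

void seq_write_end(struct seq_counter *s)
{
    atomic_fetch_add_explicit(&s->seq, 1u, memory_order_release);
}

struct node {
    struct node *parent; /* NULL at the root */
};

/* Walk leaf->root and count ancestors, retrying if a rename interfered. */
int depth_of(struct seq_counter *rename_seq, const struct node *n)
{
    unsigned start;
    int depth;

    do {
        start = seq_read_begin(rename_seq);
        depth = 0;
        for (const struct node *p = n->parent; p != NULL; p = p->parent)
            depth++;
    } while (seq_read_retry(rename_seq, start));

    return depth;
}
```

The reader never blocks the writer: it simply discards its result and walks again whenever the counter has moved.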

We can also use the rename lock in cases where livelock is a worry (and it
is introduced in a subsequent patch).

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

949854d0

fs: scale inode alias list · b23fb0a6

Committed by Nick Piggin on Jan 07, 2011

Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list
from concurrent modification. d_alias is also protected by d_lock.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b23fb0a6

fs: dcache scale dentry refcount · b7ab39f6

Committed by Nick Piggin on Jan 07, 2011

Make d_count non-atomic and protect it with d_lock. This allows us to ensure a
0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when
we start protecting many other dentry members with d_lock.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

b7ab39f6

fs: change d_delete semantics · fe15ce44

Committed by Nick Piggin on Jan 07, 2011

Change d_delete from a dentry deletion notification to a dentry caching
advice, more like ->drop_inode. Require it to be constant and idempotent,
and not take d_lock. This is how all existing filesystems use the callback
anyway.

This makes fine grained dentry locking of dput and dentry lru scanning
much simpler.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>

fe15ce44

11 Dec, 2010 1 commit

NFS: Fix panic after nfs_umount() · 5b362ac3

Committed by Chuck Lever on Dec 10, 2010

After a few unsuccessful NFS mount attempts in which the client and
server cannot agree on an authentication flavor both support, the
client panics.  nfs_umount() is invoked in the kernel in this case.

Turns out nfs_umount()'s UMNT RPC invocation causes the RPC client to
write off the end of the rpc_clnt's iostat array.  This is because the
mount client's nrprocs field is initialized with the count of defined
procedures (two: MNT and UMNT), rather than the size of the client's
proc array (four).

The fix is to use the same initialization technique used by most other
upper layer clients in the kernel.
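
As a rough illustration of the difference between the two counts, the sketch below uses a table indexed by procedure number, so it has holes. All names (`ex_procedures`, `EX_PROC_*`, `ex_index_in_bounds`) and the procedure numbers are hypothetical and chosen only to reproduce the "two defined, four slots" situation described above; this is not the kernel's mount client code.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical procedure numbers; note the gap between MNT and UMNT. */
enum { EX_PROC_NULL = 0, EX_PROC_MNT = 1, EX_PROC_UMNT = 3 };

struct ex_procinfo {
    const char *name; /* NULL for slots without a defined procedure */
};

/* Table indexed by procedure number: 4 slots, only 2 of them defined. */
const struct ex_procinfo ex_procedures[] = {
    [EX_PROC_MNT]  = { "MNT" },
    [EX_PROC_UMNT] = { "UMNT" },
};

/* Count of defined procedures: the value that was mistakenly used. */
size_t ex_count_defined(void)
{
    size_t n = 0;

    for (size_t i = 0; i < ARRAY_SIZE(ex_procedures); i++)
        if (ex_procedures[i].name != NULL)
            n++;
    return n;
}

/* Size of the table: what a per-procedure stats array must be sized by. */
size_t ex_table_size(void)
{
    return ARRAY_SIZE(ex_procedures);
}

/* A stats array with nrprocs slots is indexed by procedure number. */
int ex_index_in_bounds(size_t nrprocs, size_t proc)
{
    return proc < nrprocs;
}
```

With `nrprocs` set to the count of defined procedures (2), indexing by `EX_PROC_UMNT` (3) falls outside the stats array; sizing by `ARRAY_SIZE()` keeps every valid procedure number in bounds.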

Introduced by commit 0b524123, which failed to update nrprocs when
support was added for UMNT in the kernel.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=24302
BugLink: http://bugs.launchpad.net/bugs/683938
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Tested-by: Stefan Bader <stefan.bader@canonical.com>
Cc: stable@kernel.org # >= 2.6.32
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

5b362ac3

08 Dec, 2010 6 commits

nfs: remove extraneous and problematic calls to nfs_clear_request · 2df485a7

Committed by Trond Myklebust on Dec 07, 2010

When a nfs_page is freed, nfs_free_request is called which also calls
nfs_clear_request to clean out the lock and open contexts and free the
pagecache page.

However, a couple of places in the nfs code call nfs_clear_request
themselves. What happens here if the refcount on the request is still high?
We'll be releasing contexts and freeing pointers while the request is
possibly still in use.

Remove those bare calls to nfs_clear_request. That should only be done when
the request is being freed.

Note that when doing this, we need to watch out for tests of req->wb_page.
Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked()
would check the value of req->wb_page to figure out if the page is mapped
into the nfsi->nfs_page_tree. We now indicate the page is mapped using
the new bit PG_MAPPED in req->wb_flags.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

2df485a7

nfs: kernel should return EPROTONOSUPPORT when not support NFSv4 · 0de1b7e8

Committed by Mi Jinlong on Oct 30, 2010

  When the NFS client (kernel) does not support NFSv4, for example because
  the user built the kernel without NFSv4, there is a problem.

  Using the command "mount SERVER-IP:/nfsv3 /mnt/" to mount an NFSv3
  filesystem, the mount should succeed, but it fails with the error:

    "mount.nfs: an incorrect mount option was specified"

  The mount system call is made with type "nfs" (not "nfs4") and "vers=4".
  If CONFIG_NFS_V4 is not defined, "vers=4" is parsed as an invalid
  argument and the kernel returns EINVAL to nfs-utils.

  In that case we really want EPROTONOSUPPORT rather than EINVAL.
  This patch makes sure the kernel parses the argument successfully
  and returns EPROTONOSUPPORT from nfs_validate_mount_data().
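
The split between "the option parses" and "the version is supported" can be sketched as two separate steps. This is a hypothetical userspace illustration: `ex_parse_vers()` and `ex_validate()` are made-up helpers, not the kernel's mount option parser, and the `v4_built_in` flag stands in for CONFIG_NFS_V4.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: accepts "vers=2", "vers=3" and "vers=4" regardless
 * of what is compiled in, and returns -EINVAL only for malformed options.
 */
int ex_parse_vers(const char *opt, int *version)
{
    char *end;
    long v;

    if (strncmp(opt, "vers=", 5) != 0)
        return -EINVAL;

    v = strtol(opt + 5, &end, 10);
    if (end == opt + 5 || *end != '\0' || v < 2 || v > 4)
        return -EINVAL;

    *version = (int)v;
    return 0;
}

/*
 * Hypothetical validation step: a well-formed but unsupported version is
 * reported as -EPROTONOSUPPORT rather than -EINVAL.
 */
int ex_validate(int version, bool v4_built_in)
{
    if (version == 4 && !v4_built_in)
        return -EPROTONOSUPPORT;
    return 0;
}
```

Keeping the two error codes apart lets userspace tell "bad option string" from "valid request for an unsupported protocol version" and react differently to each.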

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

0de1b7e8

NFS: Fix fcntl F_GETLK not reporting some conflicts · 21ac19d4

Committed by Sergey Vlasov on Nov 28, 2010

The commit 129a84de (locks: fix F_GETLK
regression (failure to find conflicts)) fixed the posix_test_lock()
function by itself, however, its usage in NFS changed by the commit
9d6a8c5c (locks: give posix_test_lock
same interface as ->lock) remained broken - subsequent NFS-specific
locking code received F_UNLCK instead of the user-specified lock type.
To fix the problem, fl->fl_type needs to be saved before the
posix_test_lock() call and restored if no local conflicts were reported.
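
A minimal sketch of the save-and-restore step, assuming a simplified lock structure. `ex_test_local()` is a hypothetical stand-in for posix_test_lock() that mimics only the behaviour described above (it overwrites the caller's lock with the conflict, or with an "unlocked" type if there is none); none of these names are the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>

enum ex_lock_type { EX_RDLCK, EX_WRLCK, EX_UNLCK };

struct ex_file_lock {
    enum ex_lock_type type;
};

/*
 * Hypothetical stand-in for the local test: on conflict, *fl is replaced
 * by the conflicting lock; otherwise fl->type is set to EX_UNLCK.
 */
void ex_test_local(struct ex_file_lock *fl, const struct ex_file_lock *conflict)
{
    if (conflict != NULL)
        *fl = *conflict;
    else
        fl->type = EX_UNLCK;
}

/*
 * Returns true if a local conflict was found (reported in *fl). Otherwise
 * restores the caller's requested type so that a later server-side check
 * sees the real request instead of EX_UNLCK.
 */
bool ex_getlk_local(struct ex_file_lock *fl, const struct ex_file_lock *conflict)
{
    enum ex_lock_type saved = fl->type; /* save before the local test */

    ex_test_local(fl, conflict);
    if (fl->type != EX_UNLCK)
        return true;

    fl->type = saved; /* no local conflict: restore the requested type */
    return false;
}
```

Without the restore, the follow-up check would be asked about an unlock request, which can never conflict with anything.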

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23892
Tested-by: Alexander Morozov <amorozov@etersoft.ru>
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Cc: <stable@kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

21ac19d4

nfs: Discard ACL cache on mode update · 08a22b39

Committed by Aneesh Kumar K.V on Dec 01, 2010

An update of the mode bits can result in the ACL value being changed. We need
to mark the ACL cache invalid when we update the mode. Similarly, we need
to update the file attributes when we change the ACL value.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

08a22b39

NFS: Readdir cleanups · 47c716cb

Committed by Trond Myklebust on Dec 07, 2010

No functional changes, but clarify the code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

47c716cb

NFS: nfs_readdir_search_for_cookie() don't mark as eof if cookie not found · 18fb5fe4

Committed by Trond Myklebust on Dec 07, 2010

If we're searching for a specific cookie, and it isn't found in the page
cache, we should try an uncached_readdir(). To do so, we return EBADCOOKIE,
but we don't set desc->eof.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

18fb5fe4

02 Dec, 2010 1 commit

NFS: Fix a memory leak in nfs_readdir · 11de3b11

Committed by Trond Myklebust on Dec 01, 2010

We need to ensure that the entries in the nfs_cache_array get cleared
when the page is removed from the page cache. To do so, we use the
freepage address_space operation.

Change nfs_readdir_clear_array to use kmap_atomic(), so that the
function can be safely called from all contexts.

Finally, modify the cache_page_release helper to call
nfs_readdir_clear_array directly, when dealing with an anonymous
page from 'uncached_readdir'.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

11de3b11

01 Dec, 2010 2 commits

NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler · 0aded708

Committed by Trond Myklebust on Nov 30, 2010

We need to use the cookie from the previous array entry, not the
actual cookie that we are searching for (except for the case of
uncached_readdir).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

0aded708

NFS: Fix a readdirplus bug · 37a09f07

Committed by Trond Myklebust on Nov 30, 2010

When comparing filehandles in the helper nfs_same_file(), we should not be
using 'strncmp()': filehandles are not null terminated strings.

Instead, we should just use the existing helper nfs_compare_fh().
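
To show why a string comparison is the wrong tool for opaque handles, here is a small sketch. `struct ex_fh` and both comparison helpers are hypothetical simplifications written for this example; `ex_same_fh()` only mirrors the general idea of a size-plus-bytes comparison and is not the kernel's nfs_compare_fh().

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical opaque handle: a length plus raw bytes, any of which may be 0. */
struct ex_fh {
    unsigned short size;
    unsigned char data[64];
};

/* Buggy: strncmp() stops at the first zero byte, ignoring the rest. */
bool ex_same_fh_strncmp(const struct ex_fh *a, const struct ex_fh *b)
{
    return strncmp((const char *)a->data, (const char *)b->data, a->size) == 0;
}

/* Compares the length and then every byte, zero bytes included. */
bool ex_same_fh(const struct ex_fh *a, const struct ex_fh *b)
{
    return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}
```

For two handles whose bytes are `01 00 02 03` and `01 00 09 09`, the strncmp-based check reports a match because it stops at the embedded zero byte, while the byte-wise check correctly reports a difference.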

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

37a09f07

23 Nov, 2010 9 commits

NFS: Ensure we return the dirent->d_type when it is known · 0b26a0bf

Committed by Trond Myklebust on Nov 20, 2010

Store the dirent->d_type in the struct nfs_cache_array_entry so that we
can use it in getdents() calls.

This fixes a regression with the new readdir code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

0b26a0bf

NFS: Correct the array bound calculation in nfs_readdir_add_to_array · 3020093f

Committed by Trond Myklebust on Nov 20, 2010

It looks as if the array size calculation in MAX_READDIR_ARRAY does not
take the alignment of struct nfs_cache_array_entry into account.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

3020093f

NFS: Don't ignore errors from nfs_do_filldir() · ece0b423

Committed by Trond Myklebust on Nov 20, 2010

We should ignore the errors from the filldir callback, and just interpret
them as meaning we should exit; however, we should definitely pass back
ENOMEM errors.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

ece0b423

NFS: Fix the error handling in "uncached_readdir()" · 85f8607e

Committed by Trond Myklebust on Nov 20, 2010

Currently, uncached_readdir() is broken because it fails to handle
the results from nfs_readdir_xdr_to_array() correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

85f8607e

NFS: Fix a page leak in uncached_readdir() · 7a8e1dc3

Committed by Trond Myklebust on Nov 20, 2010

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

7a8e1dc3

NFS: Fix a page leak in nfs_do_filldir() · e7c58e97

Committed by Trond Myklebust on Nov 20, 2010

nfs_do_filldir() must always free desc->page when it is done, otherwise
we end up leaking the page.

Also remove unused variable 'dentry'.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

e7c58e97

NFS: Assume eof if the server returns no readdir records · 5c346854

Committed by Trond Myklebust on Nov 20, 2010

Some servers are known to be buggy w.r.t. this. Deal with them...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

5c346854

NFS: Buffer overflow in ->decode_dirent() should not be fatal · 463a376e

Committed by Trond Myklebust on Nov 20, 2010

Overflowing the buffer in the readdir ->decode_dirent() should not lead to
a fatal error, but rather to an attempt to reread the record in question.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

463a376e

Pure nfs client performance using odirect. · b47d19de

Committed by Arun Bharadwaj on Nov 18, 2010

When an application opens a file with O_DIRECT flag, if the size of
the data that is written is equal to wsize, the client sends a
WRITE RPC with stable flag set to UNSTABLE followed by a single
COMMIT RPC rather than sending a single WRITE RPC with the stable
flag set to FILE_SYNC. This is a bug.

This patch fixes it.

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

b47d19de

18 Nov, 2010 1 commit

BKL: remove extraneous #include <smp_lock.h> · 451a3c24

Committed by Arnd Bergmann on Nov 17, 2010

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

451a3c24

17 Nov, 2010 1 commit

nfs: Ignore kmemleak false positive in nfs_readdir_make_qstr · 04e4bd1c

Committed by Catalin Marinas on Nov 11, 2010

Strings allocated via kmemdup() in nfs_readdir_make_qstr() are
referenced from the nfs_cache_array which is stored in a page cache
page. Kmemleak does not scan such pages and it reports several false
positives. This patch annotates the string->name pointer so that
kmemleak does not consider it a real leak.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

04e4bd1c

16 Nov, 2010 4 commits

NFS: readdir shouldn't read beyond the reply returned by the server · ac396128

Committed by Trond Myklebust on Nov 15, 2010

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

ac396128

NFS: Fix a couple of regressions in readdir. · 8cd51a0c

Committed by Trond Myklebust on Nov 15, 2010

Fix up the issue that array->eof_index needs to be able to be set
even if array->size == 0.

Ensure that we catch all important memory allocation error conditions
and/or kmap() failures.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

8cd51a0c

Revert "NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR" · 23ebbd9a

Committed by Trond Myklebust on Nov 03, 2010

This reverts commit 80e60639.

This change requires further fixes to ensure that the open doesn't
succeed if the lookup later results in a regular file being created.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

23ebbd9a

Regression: fix mounting NFS when NFSv3 support is not compiled · 1e657bd5

Committed by Paulius Zaleckas on Oct 31, 2010

Trying to mount NFS (root partition in my case) fails if CONFIG_NFS_V3
is not selected. nfs_validate_mount_data() returns EPROTONOSUPPORT,
because of this check:

#ifndef CONFIG_NFS_V3
	if (args->version == 3)
		goto out_v3_not_compiled;
#endif /* !CONFIG_NFS_V3 */

and args->version was always initialized to 3.

It was working in 2.6.36.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

1e657bd5

31 Oct, 2010 2 commits

locks: let the caller free file_lock on ->setlease failure · 51ee4b84

Committed by Christoph Hellwig on Oct 31, 2010

The caller allocated it, the caller should free it.

The only issue so far is that we could change the flp pointer even on an
error return if the fl_change callback failed. But we can simply move
the flp assignment after the fl_change invocation, as the callers don't
care about the flp return value if the setlease call failed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

51ee4b84

locks: fix setlease methods to free passed-in lock · 05fa3135

Committed by J. Bruce Fields on Oct 30, 2010

We modified setlease to require the caller to allocate the new lease in
the case of creating a new lease, but forgot to fix up the filesystem
methods.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05fa3135

29 Oct, 2010 3 commits

convert simple cases of nfs-related ->get_sb() to ->mount() · 31f43471

Committed by Al Viro on Oct 29, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

31f43471

a couple of open-coded ihold() introduced by nfs merge · a4118ee1

Committed by Al Viro on Oct 27, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

a4118ee1

nfs4: The difference of 2 pointers is ptrdiff_t · 12364a4f

Committed by Geert Uytterhoeven on Oct 28, 2010

On m68k, which is 32-bit:

fs/nfs/nfs4proc.c: In function ‘nfs41_sequence_done’:
fs/nfs/nfs4proc.c:432: warning: format ‘%ld’ expects type ‘long int’, but argument 3 has type ‘int’
fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’:
fs/nfs/nfs4proc.c:576: warning: format ‘%ld’ expects type ‘long int’, but argument 5 has type ‘int’

On 32-bit, ptrdiff_t is int; on 64-bit, ptrdiff_t is long.

Introduced by commit dfb4f309 ("NFSv4.1: keep
seq_res.sr_slot as pointer rather than an index")
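
One portable way to print a pointer difference is the `t` length modifier, which standard C defines for ptrdiff_t. The sketch below is a userspace illustration with a hypothetical helper `ex_format_slot()`; it is not taken from fs/nfs/nfs4proc.c.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the index of a slot within its table.
 * The subtraction of two pointers yields a ptrdiff_t, so "%td" matches
 * the argument type on both 32-bit and 64-bit targets.
 */
int ex_format_slot(char *buf, size_t len, const int *slot, const int *table)
{
    ptrdiff_t idx = slot - table;

    return snprintf(buf, len, "slot=%td", idx);
}
```

Calling it with `&table[5]` and `table` writes `slot=5` into the buffer without a format warning on either word size.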

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

12364a4f

28 Oct, 2010 4 commits

nfs: testing the wrong variable · 8f0d97b4

Committed by Dan Carpenter on Oct 28, 2010

The intent was to test "*desc" for allocation failures, but it tests
"desc" which is always a valid pointer here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

8f0d97b4
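
The mistake fixed by 8f0d97b4 (checking the out-parameter instead of what it points to) can be illustrated with a small sketch. The names `ex_desc`, `ex_alloc_desc()` and `ex_failing_alloc()` are hypothetical, and the allocator is passed in only so that a failure can be simulated.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct ex_desc {
    int unused;
};

/* Hypothetical allocator that always fails, to exercise the error path. */
void *ex_failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/*
 * Allocate a descriptor into *out. 'out' itself is the address of the
 * caller's pointer and is always valid here, so testing it would never
 * detect a failed allocation; the result to test is '*out'.
 */
int ex_alloc_desc(struct ex_desc **out, void *(*alloc)(size_t))
{
    *out = alloc(sizeof(**out));
    if (*out == NULL) /* not: if (out == NULL) */
        return -ENOMEM;
    return 0;
}
```

With the wrong test, a failed allocation would be reported as success and the caller would go on to dereference a NULL pointer.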

nfs: handle lock context allocation failures in nfs_create_request · 015f0212

Committed by Jeff Layton on Oct 28, 2010

nfs_get_lock_context can return NULL on an allocation failure.
Regression introduced by commit f11ac8db.

Reported-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

015f0212

Fixed Regression in NFS Direct I/O path · 568a810d

Committed by Steve Dickson on Oct 28, 2010

A typo, introduced by commit f11ac8db, in the nfs_direct_write()
routine causes writes with O_DIRECT set to fail with an ENOMEM error.

Found-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

568a810d

lockd: push lock_flocks down · 763641d8

Committed by Arnd Bergmann on Oct 26, 2010

lockd should use lock_flocks() instead of lock_kernel()
to lock against posix locks accessing the i_flock list.

This is a prerequisite to turning lock_flocks into a
spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>

763641d8