- 21 Apr, 2012: 1 commit
-
-
Authored by Fred Isaman
Cc: <stable@vger.kernel.org>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 20 Apr, 2012: 3 commits
-
-
Authored by Trond Myklebust
If the file wasn't opened for writing, then truncate and ftruncate need to report the appropriate errors.

Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
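For illustration, a minimal sketch of the kind of open-mode check this describes; the helper name and the choice of errno are assumptions, not the actual NFS change:

    #include <linux/fs.h>

    /* Hypothetical helper: a size change arriving through a descriptor
     * that was never opened for writing must fail; ftruncate() is
     * expected to return EINVAL in that case. */
    static int may_ftruncate(const struct file *filp)
    {
        if (!(filp->f_mode & FMODE_WRITE))
            return -EINVAL;
        return 0;
    }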
-
Authored by Trond Myklebust
Since we may be simulating flock() locks using NFS byte range locks, we can't rely on the VFS having checked the file open mode for us.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
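The check this implies looks roughly like the following; the function name is illustrative and the exact errno is an assumption:

    #include <linux/fs.h>

    /* Illustrative only: a read lock needs a descriptor opened for
     * reading, a write lock needs one opened for writing. */
    static int flock_mode_ok(struct file *filp, struct file_lock *fl)
    {
        if (fl->fl_type == F_RDLCK && !(filp->f_mode & FMODE_READ))
            return -EBADF;
        if (fl->fl_type == F_WRLCK && !(filp->f_mode & FMODE_WRITE))
            return -EBADF;
        return 0;
    }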
-
Authored by Trond Myklebust
All callers of nfs4_handle_exception() that need to handle NFS4ERR_OPENMODE correctly should set exception->inode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
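The calling convention this implies is sketched in the fragment below; _nfs4_do_operation() is a placeholder, only the exception/retry loop shape and the .inode field come from the commit text:

    struct nfs4_exception exception = {
        .inode = inode,   /* lets the handler recover from NFS4ERR_OPENMODE */
    };
    int err;

    do {
        err = _nfs4_do_operation(inode, state);               /* placeholder */
        err = nfs4_handle_exception(server, err, &exception);
    } while (exception.retry);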
-
- 18 Apr, 2012: 1 commit
-
-
Authored by Fred Isaman
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 01 Apr, 2012: 20 commits
-
-
Authored by J. Bruce Fields
64252c75 "vfs: remove dget() from dentry_unhash()" changed the implementation but not the comment.

Cc: Sage Weil <sage@newdream.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Miklos Szeredi
Split __lookup_hash into two component functions:

    lookup_dcache - tries cached lookup, returns whether real lookup is needed
    lookup_real   - calls i_op->lookup

This eliminates code duplication between d_alloc_and_lookup() and d_inode_lookup().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
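An approximate sketch of that split (signatures simplified, not the exact fs/namei.c prototypes):

    static struct dentry *lookup_dcache(struct qstr *name, struct dentry *dir,
                                        bool *need_lookup);
    static struct dentry *lookup_real(struct inode *dir, struct dentry *dentry);

    static struct dentry *__lookup_hash(struct qstr *name, struct dentry *base)
    {
        bool need_lookup;
        struct dentry *dentry;

        /* try the dcache first; need_lookup says whether ->lookup() is
         * still required */
        dentry = lookup_dcache(name, base, &need_lookup);
        if (!need_lookup)
            return dentry;
        /* slow path: allocate if necessary and call i_op->lookup */
        return lookup_real(base->d_inode, dentry);
    }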
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Now we have __lookup_hash() open-coded in the !dentry case; just call the damn thing instead...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Reorder the if-else cases for starters...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Everything arriving into if (!dentry) will have need_reval = 1. Indeed, the only way to get there with need_reval reset to 0 would be via

    if (unlikely(d_need_lookup(dentry)))
        goto unlazy;
    if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE)) {
        status = d_revalidate(dentry, nd);
        if (unlikely(status <= 0)) {
            if (status != -ECHILD)
                need_reval = 0;
            goto unlazy;
    ...
    unlazy:
        /* no assignments to dentry */
        if (dentry && unlikely(d_need_lookup(dentry))) {
            dput(dentry);
            dentry = NULL;
        }

and if d_need_lookup() had already been false the first time around, it will remain false on the second call as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
d_lookup() *will* fail after a successful d_invalidate(), if we are holding i_mutex all along. IOW, we don't need to jump back to the l: label - we know what path will be taken there and can do that (i.e. d_alloc_and_lookup()) directly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Keep holding ->i_mutex over the revalidation parts.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Duplicate the revalidation-related parts into the if (!dentry) branch. The next step will be to pull them under i_mutex.

This and the next 8 commits are more or less a splitup of a patch by Miklos; folks, when you are working with something that convoluted, carve your patches up into easily reviewed steps, especially when a lot of the codepaths involved are rarely hit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Miklos Szeredi
The only caller of __lookup_hash() that needs the exec permission check on the parent is lookup_one_len(). All lookup_hash() callers have already checked permission in the LOOKUP_PARENT walk.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Miklos Szeredi
__lookup_hash() calls ->lookup() if the dentry needs lookup and, on success, revalidates the dentry (all under dir->i_mutex). While this is harmless, it doesn't make a lot of sense.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Miklos Szeredi
Doing revalidation on a dentry which has not yet been looked up makes no sense. Move the d_need_lookup() check before d_revalidate().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
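In outline, the reordering looks like this (a fragment paraphrasing the idea, not the exact fs/namei.c code; d_invalidate() handling is omitted):

    dentry = d_lookup(dir, name);
    if (dentry && d_need_lookup(dentry)) {
        /* found in the dcache but never been through ->lookup():
         * revalidating it would be meaningless, go do the real lookup */
        goto real_lookup;
    }
    if (dentry && (dentry->d_flags & DCACHE_OP_REVALIDATE)) {
        status = d_revalidate(dentry, nd);
        if (status <= 0) {
            /* stale: drop it and fall through to a fresh lookup */
            dput(dentry);
            dentry = NULL;
        }
    }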
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Move the mode-dependent parts to the callers, kill unused arguments.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Authored by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 31 Mar, 2012: 1 commit
-
-
Authored by Oleg Nesterov
1. TRACE_EVENT(sched_process_exec) forgets to actually use the old pid argument; it sets ->old_pid = p->pid.

2. search_binary_handler() uses the wrong pid number. The tracepoint needs the global pid_t from the root namespace, while old_pid is the virtual pid number as seen by the tracer/parent.

With this patch we have two pid_t's in search_binary_handler(), which is not really nice. Perhaps we should switch to "struct pid*", but in that case it would be better to clean up the current code first and move the "depth == 0" code outside.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Smith <dsmith@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Link: http://lkml.kernel.org/r/20120330162636.GA4857@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 30 Mar, 2012: 3 commits
-
-
Authored by Linus Torvalds
This reverts commit b43d17f3.

Dave Jones reports that it causes lockups on his laptop, and his debug output showed a lot of processes hung waiting for page_writeback (or more commonly - processes hung waiting for a lock that was held during that writeback wait).

The page_writeback hint made Ted suggest that Dave look at this commit, and Dave verified that reverting it makes his problems go away.

Ted says:

    "That commit fixes a race which is seen when you write into fallocated (and hence uninitialized) disk blocks under *very* heavy memory pressure. Furthermore, although theoretically it could trigger under normal direct I/O writes, it only seems to trigger if you are issuing a huge number of AIO writes, such that a just-written page can get evicted from memory, and then read back into memory, before the workqueue has a chance to update the extent tree.

    This race has been around for a little over a year, and no one noticed until two months ago; it only happens under fairly exotic conditions, and in fact even after trying very hard to create a simple repro under lab conditions, we could only reproduce the problem and confirm the fix on production servers running MySQL on very fast PCIe-attached flash devices.

    Given that Dave was able to hit this problem pretty quickly, if we confirm that this commit is at fault, the only reasonable thing to do is to revert it IMO."

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Authored by Naoya Horiguchi
Commit 025c5b24 ("thp: optimize away unnecessary page table locking") moves spin_lock() into pmd_trans_huge_lock() in order to avoid locking unless the pmd is for thp. So this spin_lock() is a bug.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
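The usage pattern that change implies is roughly the following; this is a sketch based on the commit description, not the exact caller that was fixed:

    /* pmd_trans_huge_lock() takes page_table_lock itself when the pmd
     * really maps a huge page, so the caller must not take it first. */
    if (pmd_trans_huge_lock(pmd, vma) == 1) {
        /* ... operate on the huge pmd ... */
        spin_unlock(&vma->vm_mm->page_table_lock);
    }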
-
Authored by Chris Mason
Dave Sterba had put in patches to look for mixed data/metadata groups with metadata bigger than 4KB, but these ended up in the wrong place and weren't testing the feature flag correctly. This updates the tests to make sure our sizes match.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
- 29 Mar, 2012: 11 commits
-
-
Authored by Liu Bo
When we use autodefrag, we forget to update the index which indicates the last page we've dirtied, so we set dirty flags on the same set of pages again and again.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Liu Bo
    $ mkfs.btrfs /dev/sdb7
    $ mount /dev/sdb7 /mnt/btrfs/ -oautodefrag
    $ dd if=/dev/zero of=/mnt/btrfs/foobar bs=4k count=10 oflag=direct 2>/dev/null
    $ filefrag -v /mnt/btrfs/foobar
    Filesystem type is: 9123683e
    File size of /mnt/btrfs/foobar is 40960 (10 blocks, blocksize 4096)
     ext  logical  physical  expected  length  flags
       0        0      3072                10  eof
    /mnt/btrfs/foobar: 1 extent found

Now we have a big real extent [0, 40960), but autodefrag will still defrag it.

    $ sync
    $ filefrag -v /mnt/btrfs/foobar
    Filesystem type is: 9123683e
    File size of /mnt/btrfs/foobar is 40960 (10 blocks, blocksize 4096)
     ext  logical  physical  expected  length  flags
       0        0      3082                10  eof
    /mnt/btrfs/foobar: 1 extent found

So if we already find a big real extent, we're ok about that, just skip it.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Liu Bo
If our file's layout is as follows:

    | hole | data1 | hole | data2 |

we do not need to defrag this file, because this file has holes and cannot be merged into one extent.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Liu Bo
    $ mkfs.btrfs disk
    $ mount disk /mnt -o autodefrag
    $ dd if=/dev/zero of=/mnt/foobar bs=4k count=10 2>/dev/null && sync
    $ for i in `seq 9 -2 0`; do dd if=/dev/zero of=/mnt/foobar bs=4k count=1 \
          seek=$i conv=notrunc 2> /dev/null; done && sync

Then we'll get to defrag "foobar" again and again. The same happens with "-o autodefrag,compress".

Reasons: when the cleaner kthread gets to fetch inodes from the defrag tree and defrag them, it will dirty pages and submit them, which leads to another data COW where the inode being processed is inserted into the defrag tree again.

This patch sets a rule for the COW code, i.e. insert an inode only when we're really going to make some defragments.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Liu Bo
Commit 600a45e1 (Btrfs: fix deadlock on page lock when doing auto-defragment) fixes the deadlock on the page lock, but it also introduces another bug: a page may have been truncated after unlock & lock, so we need to find it again to get the right one. And since we hold the i_mutex lock, the inode size remains unchanged and we can drop the isize overflow checks.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
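The "find it again" step follows the usual pattern for a page that may have been truncated while its lock was dropped; this is a simplified sketch, not the exact defrag code:

    lock_page(page);
    if (page->mapping != inode->i_mapping) {
        /* truncated or reclaimed while we had it unlocked: drop the
         * stale page and look it up again (NULL return must be handled) */
        unlock_page(page);
        page_cache_release(page);
        page = find_or_create_page(inode->i_mapping, index, GFP_NOFS);
    }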
-
Authored by Liu Bo
The bug is from running xfstests 209 with autodefrag. The race is as follows:

    t1                          t2(autodefrag)
    direct IO
      invalidate pagecache
      dio(old data)
                                add_inode_defrag
      invalidate pagecache
      endio

    direct IO
      invalidate pagecache
                                run_defrag
                                  readpage(old data)
                                  set page dirty (old data)
      dio(new data, rewrite)
      invalidate pagecache (*)
      endio

t2 (autodefrag) will get old data into the pagecache via readpage and mark the pagecache dirty. Meanwhile, invalidate pagecache (*) will fail due to the dirty flags on the pages, so the old data may be flushed to disk by the flush thread, which leads to data loss. The same applies to user defragment progs.

The patch fixes this race by holding i_mutex while we readpage and set the page dirty.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Liu Bo
This deadlock comes from xfstests 251.

We hold the chunk_mutex throughout the whole of a chunk allocation. But if we find that we've used up the system chunk space, we need to allocate a new system chunk, which leads to a recursion of chunk allocation and ends up deadlocking on chunk_mutex. So instead we need to allocate the system chunk first if we find we're in ENOSPC.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
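Paraphrased as code, the ordering the fix describes looks something like this; the helper names are placeholders, only chunk_mutex is real:

    /* Make room in the system chunk array *before* taking chunk_mutex,
     * so allocating the data/metadata chunk never has to recurse into
     * chunk allocation while the mutex is already held. */
    if (system_chunk_space_low(fs_info))            /* placeholder */
        allocate_system_chunk(fs_info);             /* placeholder */

    mutex_lock(&fs_info->chunk_mutex);
    ret = do_chunk_alloc_locked(fs_info, flags);    /* placeholder */
    mutex_unlock(&fs_info->chunk_mutex);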
-
Authored by Liu Bo
o For space info, the type of the space info is useful for debugging.
o For the transaction handle, its transid is useful.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Authored by Jeff Layton
Otherwise, we get a warning or error similar to this when building with CONFIG_NFSD_V4 disabled:

    ERROR: "nfsd4_cld_block" [fs/nfsd/nfsd.ko] undefined!

Fix this by wrapping the calls to rpc_pipefs_notifier_register and ..._unregister in another function and providing no-op replacements when CONFIG_NFSD_V4 is disabled.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
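The wrapper-plus-stub arrangement is the usual kernel pattern for this kind of fix; the function names below are illustrative, not the ones added by the patch:

    #ifdef CONFIG_NFSD_V4
    int nfsd4_notifier_init(void);      /* calls rpc_pipefs_notifier_register() */
    void nfsd4_notifier_exit(void);     /* calls the matching unregister */
    #else
    static inline int nfsd4_notifier_init(void) { return 0; }
    static inline void nfsd4_notifier_exit(void) {}
    #endif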
-
Authored by Chris Mason
With the big metadata blocks, we can have crc items that are much bigger than a page. There are a few places where we try to kmalloc memory to hold the items during a split.

Items bigger than 4KB don't really have a huge benefit in efficiency, but they do trigger larger order allocations. This commit changes the csums to make sure they stay under 4KB. This is not a format change, just a #define to limit huge items.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
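The shape of such a limit is just a clamp; the macro name below is made up for illustration and is not btrfs's actual define:

    #include <linux/kernel.h>     /* min_t() */
    #include <linux/pagemap.h>    /* PAGE_CACHE_SIZE */

    /* However large the leaf is, cap the csum item payload at one page
     * so splits never need a higher-order kmalloc(). */
    #define CSUM_ITEM_BYTES(raw_size) \
            min_t(u32, (raw_size), PAGE_CACHE_SIZE)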
-
Authored by Chris Mason
Btrfs puts the filesystem metadata into its own address space, and somehow the block device address space isn't getting onto disk properly before a mount. The end result is that a loop of mkfs and mounting the filesystem will sometimes find stale or incorrect data.

This commit should fix it by sprinkling fdatawrites and invalidate_bdev calls around. This is a short term measure to make sure it is fixed. The block devices really should be flushed and cleaned up higher in the stack.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
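A minimal sketch of what "sprinkling fdatawrites and invalidate_bdev calls around" amounts to; the helper name is made up, and placement and error handling are simplified:

    #include <linux/fs.h>
    #include <linux/blkdev.h>

    /* Flush any dirty pages in the block device's address space and then
     * drop its page cache, so the next reader sees what is on disk. */
    static void sync_and_invalidate_bdev(struct block_device *bdev)
    {
        filemap_write_and_wait(bdev->bd_inode->i_mapping);
        invalidate_bdev(bdev);
    }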
-