提交 · 9bae113a085b790de384bf86f09e15b42a65a985 · openeuler / Kernel

27 7月, 2011 9 次提交

ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC · 9bae113a

由 Sage Weil 提交于 7月 26, 2011

We only need to put these on the directory unsafe list if they have
side effects that fsync(2) should flush out.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

9bae113a

ceph: fix bad parent_inode calc in ceph_lookup_open · acda7657

由 Sage Weil 提交于 7月 26, 2011

We were always getting NULL here because the intent file f_dentry is always
NULL at this point, which means we were always passing NULL to
ceph_mdsc_do_request.  In reality, this was fine, since this isn't
currently ever a write operation that needs to get strung on the dir's
unsafe list.

Use the dir explicitly, and only pass it if this open has side-effects that
a dir fsync should flush.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

acda7657

ceph: avoid carrying Fw cap during write into page cache · d8de9ab6

由 Sage Weil 提交于 7月 26, 2011

The generic_file_aio_write call may block on balance_dirty_pages while we
flush data to the OSDs.  If we hold a reference to the FILE_WR cap during
that interval revocation by the MDS (e.g., to do a stat(2)) may be very
slow.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

d8de9ab6

ceph: report f_bfree based on kb_avail rather than diffing. · 8f04d422

由 Greg Farnum 提交于 7月 26, 2011

Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NGreg Farnum <gregory.farnum@dreamhost.com>

8f04d422

ceph: only queue capsnap if caps are dirty · e77dc3e9

由 Sage Weil 提交于 7月 26, 2011

We used to go into this branch if i_wrbuffer_ref_head was non-zero.  This
was an ancient check from before we were careful about dealing with all
kinds of caps (and not just dirty pages).  It is cleaner to only queue a
capsnap if there is an actual dirty cap.  If we are racing with...
something...we will end up here with ci->i_wrbuffer_refs but no dirty
caps.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

e77dc3e9

ceph: fix snap writeback when racing with writes · af0ed569

由 Sage Weil 提交于 7月 26, 2011

There are two problems that come up when we try to queue a capsnap while a
write is in progress:

 - The FILE_WR cap is held, but not yet dirty, so we may queue a capsnap
   with dirty == 0.  That will crash later in __ceph_flush_snaps().  Or
   on the FILE_WR cap if a write is in progress.
 - We may not have i_head_snapc set, which causes problems pretty quickly.
   Look to the snaprealm in this case.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

af0ed569

ceph: use flag bit for at_end readdir flag · 9cfa1098

由 Sage Weil 提交于 7月 26, 2011

This saves us a word of memory per file.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

9cfa1098

ceph: add F_SYNC file flag to force sync (non-O_DIRECT) io · 4918b6d1

由 Sage Weil 提交于 7月 26, 2011

This allows us to force IO through the sync path which you normally only
get when multiple clients are reading/writing to the same file or by
mounting with -o sync.  Among other things, this lets test programs verify
correctness with a single mount.
Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

4918b6d1

ceph: add flags field to file_info · 252c6728

由 Sage Weil 提交于 7月 26, 2011

Reviewed-by: NYehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: NSage Weil <sage@newdream.net>

252c6728

22 7月, 2011 2 次提交

vfs: drop conditional inode prefetch in __do_lookup_rcu · b91da88f

由 Linus Torvalds 提交于 7月 21, 2011

It seems to hurt performance in real life.  Yes, the inode will be used
later, but the conditional doesn't seem to predict all that well
(negative dentries are not uncommon) and it looks like the cost of
prefetching is simply higher than depending on the cache doing the right
thing.

As usual.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b91da88f

FS-Cache: Fix __fscache_uncache_all_inode_pages()'s outer loop · b307d465

由 Jan Beulich 提交于 7月 21, 2011

The compiler, at least for ix86 and m68k, validly warns that the
comparison:

	next <= (loff_t)-1

is always true (and it's always true also for x86-64 and probably all
other arches - as long as pgoff_t isn't wider than loff_t).  The
intention appears to be to avoid wrapping of "next", so rather than
eliminating the pointless comparison, fix the loop to indeed get exited
when "next" would otherwise wrap.

On m68k the following warning is observed:

  fs/fscache/page.c: In function '__fscache_uncache_all_inode_pages':
  fs/fscache/page.c:979: warning: comparison is always false due to limited range of data type
Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Reported-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Cc: Suresh Jayaraman <sjayaraman@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b307d465

21 7月, 2011 1 次提交

CIFS: Fix wrong length in cifs_iovec_read · 2cebaa58

由 Pavel Shilovsky 提交于 7月 20, 2011

Signed-off-by: NPavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

2cebaa58

20 7月, 2011 2 次提交

fs/libfs.c: fix simple_attr_write() on 32bit machines · f7b88631

由 Akinobu Mita 提交于 7月 19, 2011

Assume that /sys/kernel/debug/dummy64 is debugfs file created by
debugfs_create_x64().

	# cd /sys/kernel/debug
	# echo 0x1234567812345678 > dummy64
	# cat dummy64
	0x0000000012345678

	# echo 0x80000000 > dummy64
	# cat dummy64
	0xffffffff80000000

A value larger than INT_MAX cannot be written to the debugfs file created
by debugfs_create_u64 or debugfs_create_x64 on 32bit machine.  Because
simple_attr_write() uses simple_strtol() for the conversion.

To fix this, use simple_strtoll() instead.
Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f7b88631

vfs: fix race in rcu lookup of pruned dentry · 59430262

由 Linus Torvalds 提交于 7月 18, 2011

Don't update *inode in __follow_mount_rcu() until we'd verified that
there is mountpoint there.  Kudos to Hugh Dickins for catching that
one in the first place and eventually figuring out the solution (and
catching a braino in the earlier version of patch).
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

59430262

19 7月, 2011 1 次提交

Fix cifs_get_root() · fec11dd9

由 Al Viro 提交于 7月 18, 2011

Add missing ->i_mutex, convert to lookup_one_len() instead of
(broken) open-coded analog, cope with getting something like
a//b as relative pathname.  Simplify the hell out of it, while
we are there...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
Reviewed-by: NJeff Layton <jlayton@redhat.com>

fec11dd9

18 7月, 2011 5 次提交

hppfs_lookup(): don't open-code lookup_one_len() · 0916a5e4

由 Al Viro 提交于 7月 17, 2011

... and it's getting it wrong, too - missing ->d_revalidate() calls when
it's dealing with filesystem (procfs) that has non-trivial ->d_revalidate()...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

0916a5e4

A
hppfs: fix dentry leak · 3cc0658e
由 Al Viro 提交于 7月 17, 2011
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
3cc0658e

cramfs: get_cramfs_inode() returns ERR_PTR() on failure · 0577d1ba

由 Al Viro 提交于 7月 17, 2011

... and we want to report these failures in ->lookup() anyway.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

0577d1ba

A
ufs should use d_splice_alias() · 642c937b
由 Al Viro 提交于 7月 17, 2011
```
it's NFS-exportable, so...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
642c937b

fix exofs ->get_parent() · a803b806

由 Al Viro 提交于 7月 08, 2011

NULL is not a possible return value for that method, TYVM...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

a803b806

17 7月, 2011 2 次提交

ceph analog of cifs build_path_from_dentry() race fix · 1b71fe2e

由 Al Viro 提交于 7月 16, 2011

... unfortunately, cifs bug got copied.  Fix is essentially the same.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1b71fe2e

cifs: build_path_from_dentry() race fix · dc137bf5

由 Al Viro 提交于 7月 16, 2011

deal with d_move() races properly; rename_lock read-retry loop,
rcu_read_lock() held while walking to root, d_lock held over
subtraction from namelen and copying the component to stabilize
->d_name.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

dc137bf5

15 7月, 2011 1 次提交

fix loop checks in d_materialise_unique() · 18367501

由 Al Viro 提交于 7月 12, 2011

Both __d_unalias() and __d_materialise_dentry() need loop prevention.
Grab rename_lock in caller, check for loops there...

As a side benefit, we have dentry_lock_for_move() called only under
rename_lock, which seriously reduces deadlock potential of the
execrable "locking order" used for ->d_lock.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

18367501

14 7月, 2011 1 次提交

GFS2: Resolve inode eviction and ail list interaction bug · 380f7c65

由 Steven Whitehouse 提交于 7月 14, 2011

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock thats added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Tested-by: NBob Peterson <rpeterso@redhat.com>
Tested-by: NAbhijith Das <adas@redhat.com>
Reported-by: NBarry J. Marson <bmarson@redhat.com>
Reported-by: NDavid Teigland <teigland@redhat.com>

380f7c65

13 7月, 2011 4 次提交

Fix ->d_lock locking order in unlazy_walk() · 94c0d4ec

由 Al Viro 提交于 7月 12, 2011

Make sure that child is still a child of parent before nested locking
of child->d_lock in unlazy_walk(); otherwise we are risking a violation
of locking order and deadlocks.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

94c0d4ec

S
[CIFS] update cifs to version 1.74 · c2ec9471
由 Steve French 提交于 7月 12, 2011
```
Signed-off-by: NSteve French <sfrench@us.ibm.com>
```
c2ec9471

[CIFS] update limit for snprintf in cifs_construct_tcon · ea1be1a3

由 Steve French 提交于 7月 12, 2011

In 34c87901 "Shrink stack space usage in cifs_construct_tcon" we
change the size of the username name buffer from MAX_USERNAME_SIZE
(256) to 28.  This call to snprintf() needs to be updated as well.
Reported by Dan Carpenter.
Reviewed-by: NDan Carpenter <error27@gmail.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

ea1be1a3

cifs: Fix signing failure when server mandates signing for NTLMSSP · 62411ab2

由 Shirish Pargaonkar 提交于 7月 10, 2011

When using NTLMSSP authentication mechanism, if server mandates
signing, keep the flags in type 3 messages of the NTLMSSP exchange
same as in type 1 messages (i.e. keep the indicated capabilities same).

Some of the servers such as Samba, expect the flags such as
Negotiate_Key_Exchange in type 3 message of NTLMSSP exchange as well.
Some servers like Windows do not.

https://bugzilla.samba.org/show_bug.cgi?id=8212Signed-off-by: NShirish Pargaonkar <shirishpargaonkar@gmail>
Acked-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

62411ab2

12 7月, 2011 4 次提交

GFS2: Fix race during filesystem mount · 3942ae53

由 Steven Whitehouse 提交于 7月 11, 2011

There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.

This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.

There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Reported-by: NDavid Booher <dbooher@adams.net>

3942ae53

GFS2: force a log flush when invalidating the rindex glock · 1ce53368

由 Benjamin Marzinski 提交于 6月 13, 2011

Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1ce53368

NFSv4.1: update nfs4_fattr_bitmap_maxsz · e5012d1f

由 Andy Adamson 提交于 7月 11, 2011

Attribute IDs assigned in RFC 5661 now require three bitmaps.
Fixes hitting a BUG_ON in xdr_shrink_bufhead when getting ACLs.
Signed-off-by: NAndy Adamson <andros@netapp.com>
Cc:stable@kernel.org [2.6.39]
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>

e5012d1f

cifs: drop spinlock before calling cifs_put_tlink · f484b5d0

由 Jeff Layton 提交于 7月 11, 2011

...as that function can sleep.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

f484b5d0

10 7月, 2011 2 次提交

cifs: fix expand_dfs_referral · b9bce2e9

由 Jeff Layton 提交于 7月 06, 2011

Regression introduced in commit 724d9f1c.

Prior to that, expand_dfs_referral would regenerate the mount data string
and then call cifs_parse_mount_options to re-parse it (klunky, but it
worked). The above commit moved cifs_parse_mount_options out of cifs_mount,
so the re-parsing of the new mount options no longer occurred. Fix it by
making expand_dfs_referral re-parse the mount options.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

b9bce2e9

cifs: move bdi_setup_and_register outside of CONFIG_CIFS_DFS_UPCALL · 20547490

由 Jeff Layton 提交于 7月 09, 2011

This needs to be done regardless of whether that KConfig option is set
or not.
Reported-by: NSven-Haegar Koch <haegar@sdinet.de>
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

20547490

08 7月, 2011 2 次提交

cifs: factor smb_vol allocation out of cifs_setup_volume_info · 04db79b0

由 Jeff Layton 提交于 7月 06, 2011

Signed-off-by: NJeff Layton <jlayton@redhat.com>
Reviewed-by: NPavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: NSteve French <sfrench@us.ibm.com>

04db79b0

FS-Cache: Add a helper to bulk uncache pages on an inode · c902ce1b

由 David Howells 提交于 7月 07, 2011

Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601
Reported-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: NSuresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c902ce1b

07 7月, 2011 4 次提交

btrfs: fix oops when doing space balance · 149e2d76

由 Miao Xie 提交于 7月 06, 2011

We need to make sure the data relocation inode doesn't go through
the delayed metadata updates, otherwise we get an oops during balance:

kernel BUG at fs/btrfs/relocation.c:4303!
[SNIP]
Call Trace:
 [<ffffffffa03143fd>] ? update_ref_for_cow+0x22d/0x330 [btrfs]
 [<ffffffffa0314951>] __btrfs_cow_block+0x451/0x5e0 [btrfs]
 [<ffffffffa031355d>] ? read_block_for_search+0x14d/0x4d0 [btrfs]
 [<ffffffffa0314beb>] btrfs_cow_block+0x10b/0x240 [btrfs]
 [<ffffffffa031acae>] btrfs_search_slot+0x49e/0x7a0 [btrfs]
 [<ffffffffa032d8af>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
 [<ffffffff8147bf0e>] ? mutex_lock+0x1e/0x50
 [<ffffffffa0380cf1>] btrfs_update_delayed_inode+0x71/0x160 [btrfs]
 [<ffffffffa037ff27>] ? __btrfs_release_delayed_node+0x67/0x190 [btrfs]
 [<ffffffffa0381cf8>] btrfs_run_delayed_items+0xe8/0x120 [btrfs]
 [<ffffffffa03365e0>] btrfs_commit_transaction+0x250/0x850 [btrfs]
 [<ffffffff810f91d9>] ? find_get_pages+0x39/0x130
 [<ffffffffa0336cd5>] ? join_transaction+0x25/0x250 [btrfs]
 [<ffffffff81081de0>] ? wake_up_bit+0x40/0x40
 [<ffffffffa03785fa>] prepare_to_relocate+0xda/0xf0 [btrfs]
 [<ffffffffa037f2bb>] relocate_block_group+0x4b/0x620 [btrfs]
 [<ffffffffa0334cf5>] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
 [<ffffffffa037fa43>] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
 [<ffffffffa0368ec0>] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
 [<ffffffffa035e39b>] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
 [<ffffffffa031303d>] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
 [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [<ffffffffa031bea1>] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [<ffffffffa03577d8>] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [<ffffffffa035f5aa>] btrfs_balance+0x21a/0x2b0 [btrfs]
 [<ffffffffa0368898>] btrfs_ioctl+0x798/0xd20 [btrfs]
 [<ffffffff8111e358>] ? handle_mm_fault+0x148/0x270
 [<ffffffff814809e8>] ? do_page_fault+0x1d8/0x4b0
 [<ffffffff81160d6a>] do_vfs_ioctl+0x9a/0x540
 [<ffffffff811612b1>] sys_ioctl+0xa1/0xb0
 [<ffffffff81484ec2>] system_call_fastpath+0x16/0x1b
[SNIP]
RIP  [<ffffffffa037c1cc>] btrfs_reloc_cow_block+0x22c/0x270 [btrfs]
Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

149e2d76

Btrfs: don't panic if we get an error while balancing V2 · 508794eb

由 Josef Bacik 提交于 7月 02, 2011

A user reported an error where if we try to balance an fs after a device has
been removed it will blow up. This is because we get an EIO back and this is
where BUG_ON(ret) bites us in the ass. To fix we just exit. Thanks,
Reported-by: NAnand Jain <Anand.Jain@oracle.com>
Signed-off-by: NJosef Bacik <josef@redhat.com>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

508794eb

btrfs: add missing options displayed in mount output · 0942caa3

由 David Sterba 提交于 6月 28, 2011

There are three missed mount options settable by user which are not
currently displayed in mount output.
Signed-off-by: NDavid Sterba <dsterba@suse.cz>
Signed-off-by: NChris Mason <chris.mason@oracle.com>

0942caa3

xfs: unpin stale inodes directly in IOP_COMMITTED · 1316d4da

由 Dave Chinner 提交于 7月 04, 2011

When inodes are marked stale in a transaction, they are treated
specially when the inode log item is being inserted into the AIL.
It tries to avoid moving the log item forward in the AIL due to a
race condition with the writing the underlying buffer back to disk.
The was "fixed" in commit de25c181 ("xfs: avoid moving stale inodes
in the AIL").

To avoid moving the item forward, we return a LSN smaller than the
commit_lsn of the completing transaction, thereby trying to trick
the commit code into not moving the inode forward at all. I'm not
sure this ever worked as intended - it assumes the inode is already
in the AIL, but I don't think the returned LSN would have been small
enough to prevent moving the inode. It appears that the reason it
worked is that the lower LSN of the inodes meant they were inserted
into the AIL and flushed before the inode buffer (which was moved to
the commit_lsn of the transaction).

The big problem is that with delayed logging, the returning of the
different LSN means insertion takes the slow, non-bulk path.  Worse
yet is that insertion is to a position -before- the commit_lsn so it
is doing a AIL traversal on every insertion, and has to walk over
all the items that have already been inserted into the AIL. It's
expensive.

To compound the matter further, with delayed logging inodes are
likely to go from clean to stale in a single checkpoint, which means
they aren't even in the AIL at all when we come across them at AIL
insertion time. Hence these were all getting inserted into the AIL
when they simply do not need to be as inodes marked XFS_ISTALE are
never written back.

Transactional/recovery integrity is maintained in this case by the
other items in the unlink transaction that were modified (e.g. the
AGI btree blocks) and committed in the same checkpoint.

So to fix this, simply unpin the stale inodes directly in
xfs_inode_item_committed() and return -1 to indicate that the AIL
insertion code does not need to do any further processing of these
inodes.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlex Elder <aelder@sgi.com>

1316d4da

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功