Commits · 5b4aacefb8fbfc996e68b9b083d30f8bc0972449 · openanolis / cloud-kernel

30 Nov, 2016 4 commits

btrfs: return early from failed memory allocations in ioctl handlers · 7b9ea627

Authored by Shailendra Verma on Nov 10, 2016

There is no need to call kfree() if memdup_user() fails, as no memory
was allocated and the error in the error-valued pointer should be returned.

Signed-off-by: Shailendra Verma <shailendra.v@samsung.com>
[ edit subject ]
Signed-off-by: David Sterba <dsterba@suse.com>

7b9ea627
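
As a rough illustration of the pattern this commit describes, the following userspace C sketch mimics an error-valued pointer and an early return. `ERR_PTR`, `IS_ERR`, and `PTR_ERR` are re-implemented here as simplified stand-ins for the kernel helpers, and `dup_buffer()` and `handle_args()` are hypothetical names that do not appear in the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for memdup_user(): a copy or an error-valued pointer. */
static void *dup_buffer(const void *src, size_t len)
{
	void *p;

	if (!src)
		return ERR_PTR(-EFAULT);
	p = malloc(len);
	if (!p)
		return ERR_PTR(-ENOMEM);
	memcpy(p, src, len);
	return p;
}

/* Hypothetical handler: on failure return the encoded error directly. */
static long handle_args(const void *user_args, size_t len)
{
	void *args = dup_buffer(user_args, len);

	if (IS_ERR(args))
		return PTR_ERR(args);	/* nothing was allocated, nothing to free */

	/* ... use args ... */
	free(args);
	return 0;
}
```

The point of the sketch is the failure branch: because the helper never hands back allocated memory on error, the caller has nothing to release and can return immediately.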

btrfs: remove constant parameter to memset_extent_buffer and rename it · b159fa28

Authored by David Sterba on Nov 08, 2016

The only memset we do is to 0, so sink the parameter to the function and
simplify all calls. Rename the function to reflect the behaviour.

Signed-off-by: David Sterba <dsterba@suse.com>

b159fa28

btrfs: use new helpers to set uuids in eb · d24ee97b

Authored by David Sterba on Nov 09, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

d24ee97b

btrfs: remove unused headers, statfs.h · 926b9233

Authored by David Sterba on Oct 05, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

926b9233

25 Oct, 2016 1 commit

btrfs: make file clone aware of fatal signals · 69ae5e44

Authored by Wang Xiaoguang on Oct 13, 2016

Indeed this just makes the behavior similar to xfs when the process has
fatal signals pending, and it'll make fstests/generic/298 happy.

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

69ae5e44

28 Sep, 2016 1 commit

fs: Replace current_fs_time() with current_time() · c2050a45

Authored by Deepa Dinamani on Sep 14, 2016

current_fs_time() uses struct super_block* as an argument.
As per Linus's suggestion, this is changed to take struct
inode* as a parameter instead. This is because the function
is primarily meant for vfs inode timestamps.
Also the function was renamed as per Arnd's suggestion.

Change all calls to current_fs_time() to use the new
current_time() function instead. current_fs_time() will be
deleted.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c2050a45

27 Sep, 2016 1 commit

btrfs: unsplit printed strings · 5d163e0e

Authored by Jeff Mahoney on Sep 20, 2016

CodingStyle chapter 2:
"[...] never break user-visible strings such as printk messages,
because that breaks the ability to grep for them."

This patch unsplits user-visible strings.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

5d163e0e

22 Sep, 2016 1 commit

btrfs: ensure that file descriptor used with subvol ioctls is a dir · 325c50e3

Authored by Jeff Mahoney on Sep 21, 2016

If the subvol/snapshot create/destroy ioctls are passed a regular file
with execute permissions set, we'll eventually Oops while trying to do
inode->i_op->lookup via lookup_one_len.

This patch ensures that the file descriptor refers to a directory.

Fixes: cb8e7090 (Btrfs: Fix subvolume creation locking rules)
Fixes: 76dda93c (Btrfs: add snapshot/subvolume destroy ioctl)
Cc: <stable@vger.kernel.org> #v2.6.29+
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

325c50e3
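
The kind of check this commit describes can be sketched in userspace C as follows. This is only an analogue under stated assumptions: it uses `fstat()` and `S_ISDIR()` on a descriptor, the function names are hypothetical, and returning `-ENOTDIR` is an assumption about a sensible error code rather than a quote from the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical check: accept only descriptors that refer to a directory. */
static int require_directory_fd(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -errno;
	if (!S_ISDIR(st.st_mode))
		return -ENOTDIR;	/* assumed error code for illustration */
	return 0;
}

/* Convenience wrapper: open a path read-only and run the check on it. */
static int path_is_directory(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -errno;
	ret = require_directory_fd(fd);
	close(fd);
	return ret;
}
```

Validating the descriptor type up front means later code that assumes directory semantics is never reached with a regular file.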

25 Aug, 2016 1 commit

btrfs: waiting on qgroup rescan should not always be interruptible · d06f23d6

Authored by Jeff Mahoney on Aug 08, 2016

We wait on qgroup rescan completion in three places: file system
shutdown, the quota disable ioctl, and the rescan wait ioctl.  If the
user sends a signal while we're waiting, we continue happily along.  This
is expected behavior for the rescan wait ioctl.  It's racy in the shutdown
path but mostly works due to other unrelated synchronization points.
In the quota disable path, it Oopses the kernel pretty much immediately.

Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

d06f23d6

26 Jul, 2016 3 commits

btrfs: btrfs_abort_transaction, drop root parameter · 66642832

Authored by Jeff Mahoney on Jun 10, 2016

__btrfs_abort_transaction doesn't use its root parameter except to
obtain an fs_info pointer.  We can obtain that from trans->root->fs_info
for now and from trans->fs_info in a later patch.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

66642832

btrfs: copy_to_sk drop unused root parameter · df397565

Authored by Jeff Mahoney on Jun 21, 2016

The root parameter for copy_to_sk is not used at all.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

df397565

btrfs: btrfs_test_opt and friends should take a btrfs_fs_info · 3cdde224

Authored by Jeff Mahoney on Jun 09, 2016

btrfs_test_opt and friends only use the root pointer to access
the fs_info.  Let's pass the fs_info directly in preparation to
eliminate similar patterns all over btrfs.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

3cdde224

26 May, 2016 2 commits

restore killability of old mutex_lock_killable(&inode->i_mutex) users · 00235411

Authored by Al Viro on May 26, 2016

The ones that are taking it exclusive, that is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

00235411

btrfs: fix string and comment grammatical issues and typos · 01327610

Authored by Nicholas D Steeves on May 19, 2016

Signed-off-by: Nicholas D Steeves <nsteeves@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

01327610

13 May, 2016 1 commit

Btrfs: don't wait for unrelated IO to finish before relocation · 578def7c

Authored by Filipe Manana on Apr 26, 2016

Before the relocation process of a block group starts, it sets the block
group to readonly mode, then flushes all delalloc writes and then finally
it waits for all ordered extents to complete. This last step includes
waiting for ordered extents destined for extents allocated in other block
groups, making us waste time unnecessarily.

So improve this by waiting only for ordered extents that fall into the
block group's range.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>

578def7c

06 May, 2016 5 commits

btrfs: ioctl: reorder exclusive op check in RM_DEV · 58d7bbf8

Authored by David Sterba on May 04, 2016

Move the op exclusivity check before the other code (same as in
ADD_DEV).

Signed-off-by: David Sterba <dsterba@suse.com>

58d7bbf8

btrfs: add write protection to SET_FEATURES ioctl · 7ab19625

Authored by David Sterba on May 04, 2016

Perform the want_write check if we get far enough to do any writes.

Signed-off-by: David Sterba <dsterba@suse.com>

7ab19625

btrfs: use dynamic allocation for root item in create_subvol · 49a3c4d9

Authored by David Sterba on Mar 24, 2016

The size of root item is more than 400 bytes, which is quite a lot of
stack space. As we do IO from inside the subvolume ioctls, we should
keep the stack usage low in case the filesystem is on top of other
layers (NFS, device mapper, iscsi, etc).

Reviewed-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

49a3c4d9
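
The trade-off described above can be sketched in userspace C. The struct name, its size, and the function are hypothetical; in the kernel the allocation would go through a kernel allocator rather than `calloc()`.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large item; the size is for illustration only. */
struct example_root_item {
	unsigned char bytes[439];
};

static int create_example_subvol(void)
{
	/* Heap allocation keeps the 400+ byte item off the call stack. */
	struct example_root_item *item = calloc(1, sizeof(*item));

	if (!item)
		return -ENOMEM;

	item->bytes[0] = 1;	/* ... fill in and use the item ... */

	free(item);
	return 0;
}
```

The cost is one extra failure path (allocation failure) in exchange for a smaller stack frame in a deep call chain.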

btrfs: clone: use vmalloc only as fallback for nodesize bufer · 15351955

Authored by David Sterba on Apr 11, 2016

Signed-off-by: David Sterba <dsterba@suse.com>

15351955

btrfs: ioctl: reorder exclusive op check in RM_DEV · 2355ac84

Authored by David Sterba on Apr 28, 2016

Move the op exclusivity check before the other code (same as in
ADD_DEV).

Signed-off-by: David Sterba <dsterba@suse.com>

2355ac84

03 May, 2016 1 commit

parallel lookups: actual switch to rwsem · 9902af79

Authored by Al Viro on Apr 15, 2016

ta-da!

The main issue is the lack of down_write_killable(), so the places
like readdir.c switched to plain inode_lock(); once killable
variants of rwsem primitives appear, that'll be dealt with.

lockdep side also might need more work

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9902af79

02 May, 2016 1 commit

btrfs: pass the right error code to the btrfs_std_error · ad8403df

Authored by Anand Jain on Mar 10, 2016

Also drop the newline from the message.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ad8403df

29 Apr, 2016 1 commit

btrfs: sink gfp parameter to set_extent_defrag · 018ed4f7

Authored by David Sterba on Apr 26, 2016

Single caller passes GFP_NOFS.

Signed-off-by: David Sterba <dsterba@suse.com>

018ed4f7

28 Apr, 2016 6 commits

btrfs: refactor btrfs_dev_replace_start for reuse · b5255456

Authored by Anand Jain on Mar 24, 2016

A refactoring patch: it avoids user input verification in
btrfs_dev_replace_start(), so that the function can be reused.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

b5255456

btrfs: rename flags for vol args v2 · 735654ea

Authored by David Sterba on Feb 15, 2016

Rename BTRFS_DEVICE_BY_ID so that it better describes that we specify the
device by id; it will be part of the public API. The mask of supported
flags is also renamed; it is for internal use only.

The error code for unknown flags is fixed to be EOPNOTSUPP.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

735654ea
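
A minimal C sketch of rejecting unknown flag bits with EOPNOTSUPP, as the message above describes. The flag names and values here are hypothetical and are not the identifiers introduced by the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define EXAMPLE_FLAG_READONLY	(1ULL << 0)
#define EXAMPLE_FLAG_BY_ID	(1ULL << 1)

/* Internal-only mask of every flag the handler understands. */
#define EXAMPLE_FLAGS_SUPPORTED	(EXAMPLE_FLAG_READONLY | EXAMPLE_FLAG_BY_ID)

/* Return 0 when all set bits are known, -EOPNOTSUPP otherwise. */
static int check_flags(uint64_t flags)
{
	if (flags & ~EXAMPLE_FLAGS_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}
```

Keeping the mask separate from the individual flags lets the public flag names stay stable while the set of accepted bits grows.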

btrfs: introduce device delete by devid · 6b526ed7

Authored by Anand Jain on Feb 13, 2016

This introduces a new ioctl BTRFS_IOC_RM_DEV_V2, which uses the enhanced
struct btrfs_ioctl_vol_args_v2 to carry devid as a user argument.

The patch does not delete the old ioctl interface, so the kernel remains
backward compatible with userland programs.

Test case/script:
echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk
mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk
mount /dev/sdd /btrfs
dmsetup suspend bad_disk
echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk
dmsetup resume bad_disk
echo "bad disk failed. now deleting/replacing"
btrfs dev del  3  /btrfs
echo $?
btrfs fi show /btrfs
umount /btrfs
btrfs-show-super /dev/sdd | egrep num_device
dmsetup remove bad_disk
wipefs -a /dev/sdf

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Martin <m_btrfs@ml1.co.uk>
[ adjust messages, s/disk/device/ ]
Signed-off-by: David Sterba <dsterba@suse.com>

6b526ed7

btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl · 4c63c245

Authored by Luke Dashjr on Oct 29, 2015

32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can
be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr
fail.

Signed-off-by: Luke Dashjr <luke-jr+git@utopios.org>
Cc: stable@vger.kernel.org
Reviewed-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4c63c245
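
One way to picture "handled using the same code" is a translation step that maps each 32-bit request number to its native counterpart before dispatch. The sketch below assumes a Linux system where `<linux/fs.h>` provides the `FS_IOC_*` and `FS_IOC32_*` constants; `translate_compat_cmd()` is a hypothetical helper and is not claimed to be how the patch is structured.

```c
#include <linux/fs.h>	/* FS_IOC_* and FS_IOC32_* request numbers (Linux only) */

/* Hypothetical helper: map a 32-bit request to its native equivalent. */
static unsigned long translate_compat_cmd(unsigned long cmd)
{
	/*
	 * An if-chain is used instead of a switch because on 32-bit builds
	 * each pair of constants has the same value, which would produce
	 * duplicate case labels.
	 */
	if (cmd == FS_IOC32_GETFLAGS)
		return FS_IOC_GETFLAGS;
	if (cmd == FS_IOC32_SETFLAGS)
		return FS_IOC_SETFLAGS;
	if (cmd == FS_IOC32_GETVERSION)
		return FS_IOC_GETVERSION;
	return cmd;	/* everything else passes through unchanged */
}
```

After translation, a single handler per operation can serve both 32-bit and 64-bit callers.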

btrfs: Simplify conditions about compress while mapping btrfs flags to inode flags · 13f48dc9

Authored by Satoru Takeuchi on Mar 15, 2016

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

13f48dc9

btrfs: rename btrfs_std_error to btrfs_handle_fs_error · 34d97007

Authored by Anand Jain on Mar 16, 2016

btrfs_std_error() handles errors and (as of now) puts the FS into
readonly mode, so it is a good idea to rename it to btrfs_handle_fs_error().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ edit changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

34d97007

05 Apr, 2016 1 commit

mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf

Authored by Kirill A. Shutemov on Apr 01, 2016

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
ago with the promise that one day it would be possible to implement the
page cache with bigger chunks than PAGE_SIZE.

This promise never materialized, and it is unlikely that it ever will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE.  And it's a constant source of confusion whether a
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
the script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is a revert of the changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are a few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation will
also be addressed in a separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09cbfeaf
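
The first two rewrite rules listed in this commit are safe only because the two shift constants are equal, which makes the shift a no-op. A small C sketch of that reasoning follows; the value 12 (4 KiB pages) and the function names are assumptions made for illustration.

```c
/* Assumed values for illustration: 4 KiB pages, and the two shifts equal. */
#define PAGE_SHIFT		12
#define PAGE_CACHE_SHIFT	PAGE_SHIFT

/* Old style: scale a page-cache index by the difference of the shifts. */
static unsigned long old_style_index(unsigned long index)
{
	return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* New style after the conversion: the shift by zero is simply dropped. */
static unsigned long new_style_index(unsigned long index)
{
	return index;
}
```

Since `PAGE_CACHE_SHIFT - PAGE_SHIFT` evaluates to zero, both functions return the same value for every input, so replacing the expression with the bare operand does not change behavior.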

04 Apr, 2016 1 commit

Btrfs: don't use src fd for printk · c79b4713

Authored by Josef Bacik on Mar 25, 2016

The fd we pass in may not be on a btrfs file system, so don't try to do
BTRFS_I() on it.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c79b4713

12 Mar, 2016 1 commit

btrfs: move btrfs_compression_type to compression.h · ebb8765b

Authored by Anand Jain on Mar 10, 2016

So that it's better organized.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ebb8765b

02 Mar, 2016 3 commits

Btrfs: fix extent_same allowing destination offset beyond i_size · f4dfe687

Authored by Filipe Manana on Feb 12, 2016

When using the same file as the source and destination for a dedup
(extent_same ioctl) operation we were allowing it to dedup to a
destination offset beyond the file's size, which doesn't make sense and
it's not allowed for the case where the source and destination files are
not the same file. This made the deduplication operation successful only
when the source range corresponded to a hole, a prealloc extent or an
extent with all bytes having a value of 0x00. This was also leaving a
file hole (between i_size and destination offset) without the
corresponding file extent items, which can be reproduced with the
following steps for example:

  $ mkfs.btrfs -f /dev/sdi
  $ mount /dev/sdi /mnt/sdi

  $ xfs_io -f -c "pwrite -S 0xab 304457 404990" /mnt/sdi/foobar
  wrote 404990/404990 bytes at offset 304457
  395 KiB, 99 ops; 0.0000 sec (31.150 MiB/sec and 7984.5149 ops/sec)

  $ /git/hub/duperemove/btrfs-extent-same 24576 /mnt/sdi/foobar 28672 /mnt/sdi/foobar 929792
  Deduping 2 total files
  (28672, 24576): /mnt/sdi/foobar
  (929792, 24576): /mnt/sdi/foobar
  1 files asked to be deduped
  i: 0, status: 0, bytes_deduped: 24576
  24576 total bytes deduped in this operation

  $ umount /mnt/sdi
  $ btrfsck /dev/sdi
  Checking filesystem on /dev/sdi
  UUID: 98c528aa-0833-427d-9403-b98032ffbf9d
  checking extents
  checking free space cache
  checking fs roots
  root 5 inode 257 errors 100, file extent discount
  Found file extent holes:
          start: 712704, len: 217088
  found 540673 bytes used err is 1
  total csum bytes: 400
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 123675
  file data blocks allocated: 671744
    referenced 671744
  btrfs-progs v4.2.3

So fix this by not allowing the destination to go beyond the file's size,
just as we do for the case where the source and destination files are not
the same.

A test for xfstests follows.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

f4dfe687

Btrfs: fix file loss on log replay after renaming a file and fsync · 2be63d5c

Authored by Filipe Manana on Feb 12, 2016

We have two cases where we end up deleting a file at log replay time
when we should not. For this to happen the file must have been renamed
and a directory inode must have been fsynced/logged.

Two examples that exercise these two cases are listed below.

  Case 1)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir -p /mnt/a/b
  $ mkdir /mnt/c
  $ touch /mnt/a/b/foo
  $ sync
  $ mv /mnt/a/b/foo /mnt/c/
  # Create file bar just to make sure the fsync on directory a/ does
  # something and it's not a no-op.
  $ touch /mnt/a/bar
  $ xfs_io -c "fsync" /mnt/a
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file foo.

  Case 2)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir /mnt/a
  $ mkdir /mnt/b
  $ mkdir /mnt/c
  $ touch /mnt/a/foo
  $ ln /mnt/a/foo /mnt/b/foo_link
  $ touch /mnt/b/bar
  $ sync
  $ unlink /mnt/b/foo_link
  $ mv /mnt/b/bar /mnt/c/
  $ xfs_io -c "fsync" /mnt/a/foo
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file bar.

The reason why the files are deleted is that when we log inodes
other than the fsync target inode, we ignore their last_unlink_trans
value and leave the log without enough information to later replay the
rename operations. So we need to look at the last_unlink_trans values
and fall back to a transaction commit if they are greater than the
id of the last committed transaction.

So fix this by looking at the last_unlink_trans values and falling back to
transaction commits when needed. Also, when logging other inodes (for
case 1 we logged descendants of the fsync target inode while for case 2
we logged ascendants) we need to care about concurrent tasks updating
the last_unlink_trans of inodes we are logging (which was already an
existing problem in check_parent_dirs_for_sync()). Since we can not
acquire their inode mutex (vfs' struct inode ->i_mutex), as that causes
deadlocks with other concurrent operations that acquire the i_mutex of
2 inodes (other fsyncs or renames for example), we need to serialize on
the log_mutex of the inode we are logging. A task setting a new value for
an inode's last_unlink_trans must acquire the inode's log_mutex and it
must do this update before doing the actual unlink operation (which is
already the case except when deleting a snapshot). Conversely the task
logging the inode must first log the inode and then check the inode's
last_unlink_trans value while holding its log_mutex, as if its value is
not greater than the id of the last committed transaction it means it
logged a safe state of the inode's items, while if its value is not
smaller than the id of the last committed transaction it means the inode
state it has logged might not be safe (the concurrent task might have
just updated last_unlink_trans but hasn't yet done the unlink operation)
and therefore a transaction commit must be done.

Test cases for xfstests follow in separate patches.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>

2be63d5c
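
The comparison at the heart of this fix can be sketched as a small C predicate. This is a simplification under stated assumptions: the struct and function names are hypothetical, the rule used is the "greater than the id of the last committed transaction" condition from the message, and the log_mutex serialization the message describes is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the per-inode state discussed above. */
struct example_inode {
	uint64_t last_unlink_trans;	/* id of the last transaction with an unlink/rename */
};

/*
 * Return true when logging this inode is not enough and the caller
 * should fall back to a full transaction commit.
 */
static bool needs_full_commit(const struct example_inode *inode,
			      uint64_t last_committed)
{
	return inode->last_unlink_trans > last_committed;
}
```

In the scenario from the message, an unlink or rename recorded in a transaction that has not been committed yet makes the predicate true, so the fsync falls back to committing the transaction instead of relying on the log.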

Btrfs: fix unreplayable log after snapshot delete + parent dir fsync · 1ec9a1ae

Authored by Filipe Manana on Feb 10, 2016

If we delete a snapshot, fsync its parent directory and crash/power fail
before the next transaction commit, on the next mount when we attempt to
replay the log tree of the root containing the parent directory we will
fail and prevent the filesystem from mounting, which is solvable by wiping
out the log trees with the btrfs-zero-log tool but very inconvenient as
we will lose any data and metadata fsynced before the parent directory
was fsynced.

For example:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt
  $ mkdir /mnt/testdir
  $ btrfs subvolume snapshot /mnt /mnt/testdir/snap
  $ btrfs subvolume delete /mnt/testdir/snap
  $ xfs_io -c "fsync" /mnt/testdir
  < crash / power failure and reboot >
  $ mount /dev/sdc /mnt
  mount: mount(2) failed: No such file or directory

And in dmesg/syslog we get the following message and trace:

[192066.361162] BTRFS info (device dm-0): failed to delete reference to snap, inode 257 parent 257
[192066.363010] ------------[ cut here ]------------
[192066.365268] WARNING: CPU: 4 PID: 5130 at fs/btrfs/inode.c:3986 __btrfs_unlink_inode+0x17a/0x354 [btrfs]()
[192066.367250] BTRFS: Transaction aborted (error -2)
[192066.368401] Modules linked in: btrfs dm_flakey dm_mod ppdev sha256_generic xor raid6_pq hmac drbg ansi_cprng aesni_intel acpi_cpufreq tpm_tis aes_x86_64 tpm ablk_helper evdev cryptd sg parport_pc i2c_piix4 psmouse lrw parport i2c_core pcspkr gf128mul processor serio_raw glue_helper button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs]
[192066.377154] CPU: 4 PID: 5130 Comm: mount Tainted: G        W       4.4.0-rc6-btrfs-next-20+ #1
[192066.378875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[192066.380889]  0000000000000000 ffff880143923670 ffffffff81257570 ffff8801439236b8
[192066.382561]  ffff8801439236a8 ffffffff8104ec07 ffffffffa039dc2c 00000000fffffffe
[192066.384191]  ffff8801ed31d000 ffff8801b9fc9c88 ffff8801086875e0 ffff880143923710
[192066.385827] Call Trace:
[192066.386373]  [<ffffffff81257570>] dump_stack+0x4e/0x79
[192066.387387]  [<ffffffff8104ec07>] warn_slowpath_common+0x99/0xb2
[192066.388429]  [<ffffffffa039dc2c>] ? __btrfs_unlink_inode+0x17a/0x354 [btrfs]
[192066.389236]  [<ffffffff8104ec68>] warn_slowpath_fmt+0x48/0x50
[192066.389884]  [<ffffffffa039dc2c>] __btrfs_unlink_inode+0x17a/0x354 [btrfs]
[192066.390621]  [<ffffffff81184b55>] ? iput+0xb0/0x266
[192066.391200]  [<ffffffffa039ea25>] btrfs_unlink_inode+0x1c/0x3d [btrfs]
[192066.391930]  [<ffffffffa03ca623>] check_item_in_log+0x1fe/0x29b [btrfs]
[192066.392715]  [<ffffffffa03ca827>] replay_dir_deletes+0x167/0x1cf [btrfs]
[192066.393510]  [<ffffffffa03cccc7>] replay_one_buffer+0x417/0x570 [btrfs]
[192066.394241]  [<ffffffffa03ca164>] walk_up_log_tree+0x10e/0x1dc [btrfs]
[192066.394958]  [<ffffffffa03cac72>] walk_log_tree+0xa5/0x190 [btrfs]
[192066.395628]  [<ffffffffa03ce8b8>] btrfs_recover_log_trees+0x239/0x32c [btrfs]
[192066.396790]  [<ffffffffa03cc8b0>] ? replay_one_extent+0x50a/0x50a [btrfs]
[192066.397891]  [<ffffffffa0394041>] open_ctree+0x1d8b/0x2167 [btrfs]
[192066.398897]  [<ffffffffa03706e1>] btrfs_mount+0x5ef/0x729 [btrfs]
[192066.399823]  [<ffffffff8108ad98>] ? trace_hardirqs_on+0xd/0xf
[192066.400739]  [<ffffffff8108959b>] ? lockdep_init_map+0xb9/0x1b3
[192066.401700]  [<ffffffff811714b9>] mount_fs+0x67/0x131
[192066.402482]  [<ffffffff81188560>] vfs_kern_mount+0x6c/0xde
[192066.403930]  [<ffffffffa03702bd>] btrfs_mount+0x1cb/0x729 [btrfs]
[192066.404831]  [<ffffffff8108ad98>] ? trace_hardirqs_on+0xd/0xf
[192066.405726]  [<ffffffff8108959b>] ? lockdep_init_map+0xb9/0x1b3
[192066.406621]  [<ffffffff811714b9>] mount_fs+0x67/0x131
[192066.407401]  [<ffffffff81188560>] vfs_kern_mount+0x6c/0xde
[192066.408247]  [<ffffffff8118ae36>] do_mount+0x893/0x9d2
[192066.409047]  [<ffffffff8113009b>] ? strndup_user+0x3f/0x8c
[192066.409842]  [<ffffffff8118b187>] SyS_mount+0x75/0xa1
[192066.410621]  [<ffffffff8147e517>] entry_SYSCALL_64_fastpath+0x12/0x6b
[192066.411572] ---[ end trace 2de42126c1e0a0f0 ]---
[192066.412344] BTRFS: error (device dm-0) in __btrfs_unlink_inode:3986: errno=-2 No such entry
[192066.413748] BTRFS: error (device dm-0) in btrfs_replay_log:2464: errno=-2 No such entry (Failed to recover log tree)
[192066.415458] BTRFS error (device dm-0): cleaner transaction attach returned -30
[192066.444613] BTRFS: open_ctree failed

This happens because when we are replaying the log and processing the
directory entry pointing to the snapshot in the subvolume tree, we treat
its btrfs_dir_item item as having a location with a key type matching
BTRFS_INODE_ITEM_KEY, which is wrong because the type matches
BTRFS_ROOT_ITEM_KEY and therefore must be processed differently, as the
object id refers to a root number and not to an inode in the root
containing the parent directory.

So fix this by triggering a transaction commit if an fsync against the
parent directory is requested after deleting a snapshot. This is the
simplest approach for a rare use case. Some alternative that avoids the
transaction commit would require more code to explicitly delete the
snapshot at log replay time (factoring out common code from ioctl.c:
btrfs_ioctl_snap_destroy()), special care at fsync time to remove the
log tree of the snapshot's root from the log root of the root of tree
roots, amongst other steps.

A test case for xfstests that triggers the issue follows.

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      _cleanup_flakey
      cd /
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_dm_target flakey
  _require_metadata_journaling $SCRATCH_DEV

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create a snapshot at the root of our filesystem (mount point path), delete it,
  # fsync the mount point path, crash and mount to replay the log. This should
  # succeed and after the filesystem is mounted the snapshot should not be visible
  # anymore.
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap1
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap1
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT
  _flakey_drop_and_remount
  [ -e $SCRATCH_MNT/snap1 ] && \
      echo "Snapshot snap1 still exists after log replay"

  # Similar scenario as above, but this time the snapshot is created inside a
  # directory and not directly under the root (mount point path).
  mkdir $SCRATCH_MNT/testdir
  _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/testdir/snap2
  _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/testdir/snap2
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir
  _flakey_drop_and_remount
  [ -e $SCRATCH_MNT/testdir/snap2 ] && \
      echo "Snapshot snap2 still exists after log replay"

  _unmount_flakey

  echo "Silence is golden"
  status=0
  exit
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>

1ec9a1ae

23 Feb, 2016 3 commits

btrfs: drop unused argument in btrfs_ioctl_get_supported_features · d5131b65

Authored by David Sterba on Feb 17, 2016

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d5131b65

btrfs: add GET_SUPPORTED_FEATURES to the control device ioctls · c5868f83

Authored by David Sterba on Feb 17, 2016

The control device is accessible when no filesystem is mounted and we
may want to query features supported by the module. This is already
possible using the sysfs files, this ioctl is for parity and
convenience.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c5868f83

btrfs: remove error message from search ioctl for nonexistent tree · 11ea474f

Authored by David Sterba on Feb 11, 2016

Let's remove the error message that appears when the tree_id is not
present. This can happen with the quota tree and has been observed in
practice. The applications are supposed to handle -ENOENT and we don't
need to report that in the system log as it's not a fatal error.

Reported-by: Vlastimil Babka <vbabka@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

11ea474f

18 Feb, 2016 2 commits

btrfs: Replace CURRENT_TIME by current_fs_time() · 04b285f3

Authored by Deepa Dinamani on Feb 06, 2016

CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_fs_time() instead.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: linux-btrfs@vger.kernel.org
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

04b285f3

btrfs: remove redundant error check · ac1407ba

Authored by Byongho Lee on Jan 27, 2016

While running btrfs_mksubvol(), d_really_is_positive() is called twice.
First in btrfs_mksubvol() and second inside btrfs_may_create().  So I
remove the first one.

Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ac1407ba

openanolis / cloud-kernel synced successfully more than 1 year ago