1. 26 May, 2016 1 commit
  2. 29 Apr, 2016 13 commits
  3. 07 Apr, 2016 1 commit
    • Btrfs: fix file/data loss caused by fsync after rename and new inode · 56f23fdb
      Committed by Filipe Manana
      If we rename an inode A (be it a file or a directory), create a new
      inode B with the old name of inode A and under the same parent directory,
      fsync inode B and then power fail, at log tree replay time we end up
      removing inode A completely. If inode A is a directory then all its files
      are gone too.
      
      Example scenarios where this happens, taken from a couple of test
      cases written for fstests which are going to be submitted upstream
      soon:
      
         # Scenario 1
      
         mkfs.btrfs -f /dev/sdc
         mount /dev/sdc /mnt
         mkdir -p /mnt/a/x
         echo "hello" > /mnt/a/x/foo
         echo "world" > /mnt/a/x/bar
         sync
         mv /mnt/a/x /mnt/a/y
         mkdir /mnt/a/x
         xfs_io -c fsync /mnt/a/x
         <power failure happens>
      
         The next time the fs is mounted, log tree replay happens and
         the directory "y" does not exist, nor do the files "foo" and
         "bar" exist anywhere (neither in "y", nor in "x", nor in the
         root, nor anywhere else).
      
         # Scenario 2
      
         mkfs.btrfs -f /dev/sdc
         mount /dev/sdc /mnt
         mkdir /mnt/a
         echo "hello" > /mnt/a/foo
         sync
         mv /mnt/a/foo /mnt/a/bar
         echo "world" > /mnt/a/foo
         xfs_io -c fsync /mnt/a/foo
         <power failure happens>
      
         The next time the fs is mounted, log tree replay happens and the
         file "bar" no longer exists. A file with the name "foo" exists
         and it matches the second file we created.
      
      Another related problem that does not involve file/data loss is when a
      new inode is created with the name of a deleted snapshot and we fsync it:
      
         mkfs.btrfs -f /dev/sdc
         mount /dev/sdc /mnt
         mkdir /mnt/testdir
         btrfs subvolume snapshot /mnt /mnt/testdir/snap
         btrfs subvolume delete /mnt/testdir/snap
         rmdir /mnt/testdir
         mkdir /mnt/testdir
         xfs_io -c fsync /mnt/testdir # or fsync some file inside /mnt/testdir
         <power failure>
      
         The next time the fs is mounted the log replay procedure fails because
         it attempts to delete the snapshot entry (which has dir item key type
         of BTRFS_ROOT_ITEM_KEY) as if it were a regular (non-root) entry,
         resulting in the following error that causes mount to fail:
      
         [52174.510532] BTRFS info (device dm-0): failed to delete reference to snap, inode 257 parent 257
         [52174.512570] ------------[ cut here ]------------
         [52174.513278] WARNING: CPU: 12 PID: 28024 at fs/btrfs/inode.c:3986 __btrfs_unlink_inode+0x178/0x351 [btrfs]()
         [52174.514681] BTRFS: Transaction aborted (error -2)
         [52174.515630] Modules linked in: btrfs dm_flakey dm_mod overlay crc32c_generic ppdev xor raid6_pq acpi_cpufreq parport_pc tpm_tis sg parport tpm evdev i2c_piix4 proc
         [52174.521568] CPU: 12 PID: 28024 Comm: mount Tainted: G        W       4.5.0-rc6-btrfs-next-27+ #1
         [52174.522805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
         [52174.524053]  0000000000000000 ffff8801df2a7710 ffffffff81264e93 ffff8801df2a7758
         [52174.524053]  0000000000000009 ffff8801df2a7748 ffffffff81051618 ffffffffa03591cd
         [52174.524053]  00000000fffffffe ffff88015e6e5000 ffff88016dbc3c88 ffff88016dbc3c88
         [52174.524053] Call Trace:
         [52174.524053]  [<ffffffff81264e93>] dump_stack+0x67/0x90
         [52174.524053]  [<ffffffff81051618>] warn_slowpath_common+0x99/0xb2
         [52174.524053]  [<ffffffffa03591cd>] ? __btrfs_unlink_inode+0x178/0x351 [btrfs]
         [52174.524053]  [<ffffffff81051679>] warn_slowpath_fmt+0x48/0x50
         [52174.524053]  [<ffffffffa03591cd>] __btrfs_unlink_inode+0x178/0x351 [btrfs]
         [52174.524053]  [<ffffffff8118f5e9>] ? iput+0xb0/0x284
         [52174.524053]  [<ffffffffa0359fe8>] btrfs_unlink_inode+0x1c/0x3d [btrfs]
         [52174.524053]  [<ffffffffa038631e>] check_item_in_log+0x1fe/0x29b [btrfs]
         [52174.524053]  [<ffffffffa0386522>] replay_dir_deletes+0x167/0x1cf [btrfs]
         [52174.524053]  [<ffffffffa038739e>] fixup_inode_link_count+0x289/0x2aa [btrfs]
         [52174.524053]  [<ffffffffa038748a>] fixup_inode_link_counts+0xcb/0x105 [btrfs]
         [52174.524053]  [<ffffffffa038a5ec>] btrfs_recover_log_trees+0x258/0x32c [btrfs]
         [52174.524053]  [<ffffffffa03885b2>] ? replay_one_extent+0x511/0x511 [btrfs]
         [52174.524053]  [<ffffffffa034f288>] open_ctree+0x1dd4/0x21b9 [btrfs]
         [52174.524053]  [<ffffffffa032b753>] btrfs_mount+0x97e/0xaed [btrfs]
         [52174.524053]  [<ffffffff8108e1b7>] ? trace_hardirqs_on+0xd/0xf
         [52174.524053]  [<ffffffff8117bafa>] mount_fs+0x67/0x131
         [52174.524053]  [<ffffffff81193003>] vfs_kern_mount+0x6c/0xde
         [52174.524053]  [<ffffffffa032af81>] btrfs_mount+0x1ac/0xaed [btrfs]
         [52174.524053]  [<ffffffff8108e1b7>] ? trace_hardirqs_on+0xd/0xf
         [52174.524053]  [<ffffffff8108c262>] ? lockdep_init_map+0xb9/0x1b3
         [52174.524053]  [<ffffffff8117bafa>] mount_fs+0x67/0x131
         [52174.524053]  [<ffffffff81193003>] vfs_kern_mount+0x6c/0xde
         [52174.524053]  [<ffffffff8119590f>] do_mount+0x8a6/0x9e8
         [52174.524053]  [<ffffffff811358dd>] ? strndup_user+0x3f/0x59
         [52174.524053]  [<ffffffff81195c65>] SyS_mount+0x77/0x9f
         [52174.524053]  [<ffffffff814935d7>] entry_SYSCALL_64_fastpath+0x12/0x6b
         [52174.561288] ---[ end trace 6b53049efb1a3ea6 ]---
      
      Fix this by forcing a transaction commit when such cases happen.
      This means we check in the commit root of the subvolume tree if there
      was any other inode with the same reference when the inode we are
      fsync'ing is a new inode (created in the current transaction).
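
The check described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct committed_ref`, `struct toy_inode` and `must_commit_transaction()` are hypothetical names, and the commit root is modelled as a flat array of (parent, name, inode) entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one directory entry in the last committed tree. */
struct committed_ref {
    uint64_t parent_ino;
    const char *name;
    uint64_t ino;
};

/* Hypothetical model of the inode being fsync'ed. */
struct toy_inode {
    uint64_t ino;
    uint64_t parent_ino;
    const char *name;
    uint64_t generation;   /* transaction in which the inode was created */
};

/*
 * Returns true when the fsync should fall back to a full transaction
 * commit: the inode was created in the running transaction and the
 * committed tree still maps (parent, name) to a different inode.
 */
bool must_commit_transaction(const struct toy_inode *inode,
                             uint64_t running_transid,
                             const struct committed_ref *refs, size_t nrefs)
{
    if (inode->generation != running_transid)
        return false;   /* not a new inode, this check does not apply */

    for (size_t i = 0; i < nrefs; i++) {
        if (refs[i].parent_ino == inode->parent_ino &&
            strcmp(refs[i].name, inode->name) == 0 &&
            refs[i].ino != inode->ino)
            return true;   /* an older inode owned this name */
    }
    return false;
}
```

In scenario 1, the committed tree still holds the entry "x" for the renamed directory, so fsync'ing the newly created "x" would return true in this model and take the transaction commit path instead of writing to the log.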
      
      Test cases for fstests, covering all the scenarios given above, were
      submitted upstream:
      
        * fstests: generic test for fsync after renaming directory
          https://patchwork.kernel.org/patch/8694281/
      
        * fstests: generic test for fsync after renaming file
          https://patchwork.kernel.org/patch/8694301/
      
        * fstests: add btrfs test for fsync after snapshot deletion
          https://patchwork.kernel.org/patch/8670671/
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  4. 05 Apr, 2016 2 commits
    • mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0
      Committed by Kirill A. Shutemov
      Mostly direct substitution with occasional adjustment or removing
      outdated comments.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Committed by Kirill A. Shutemov
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.

      This promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.

      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
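
Why these substitutions are behavior-preserving can be seen in a tiny user-space sketch. It assumes, as the message states, that the page cache unit equals the page size. The `TOY_` prefixed macros and the helper functions are stand-ins invented for this illustration (the prefix avoids clashing with system headers), and the shift value 12 is an arbitrary example.

```c
/* Stand-in values; the real ones come from the kernel's arch headers. */
#define TOY_PAGE_SHIFT        12
#define TOY_PAGE_SIZE         (1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_CACHE_SHIFT  TOY_PAGE_SHIFT   /* old macro: a plain alias */
#define TOY_PAGE_CACHE_SIZE   (1UL << TOY_PAGE_CACHE_SHIFT)

/* Old style: scale an index between "page cache" and "page" units. */
unsigned long old_style_index(unsigned long index)
{
    return index << (TOY_PAGE_CACHE_SHIFT - TOY_PAGE_SHIFT);
}

/* New style: the shift distance is zero, so the expression is just <foo>. */
unsigned long new_style_index(unsigned long index)
{
    return index;
}

/* Old style: byte offset of a page index, using the PAGE_CACHE_* name. */
unsigned long old_style_bytes(unsigned long index)
{
    return index * TOY_PAGE_CACHE_SIZE;
}

/* New style: same computation with the PAGE_* name. */
unsigned long new_style_bytes(unsigned long index)
{
    return index * TOY_PAGE_SIZE;
}
```

Because the two shift constants are identical in this model, each old/new pair returns the same value for every input, which is what makes a mechanical rewrite safe.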
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.

      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.

      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 04 Apr, 2016 8 commits
  6. 31 Mar, 2016 1 commit
    • btrfs: fix crash/invalid memory access on fsync when using overlayfs · de17e793
      Committed by Filipe Manana
      If the lower or upper directory of an overlayfs mount belongs to a btrfs
      file system and we fsync a file through the overlayfs merged directory,
      we end up accessing an inode that doesn't belong to btrfs as if it were
      a btrfs inode in btrfs_sync_file(), resulting in a crash like the following:
      
      [ 7782.588845] BUG: unable to handle kernel NULL pointer dereference at 0000000000000544
      [ 7782.590624] IP: [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
      [ 7782.591931] PGD 4d954067 PUD 1e878067 PMD 0
      [ 7782.592016] Oops: 0002 [#6] PREEMPT SMP DEBUG_PAGEALLOC
      [ 7782.592016] Modules linked in: btrfs overlay ppdev crc32c_generic evdev xor raid6_pq psmouse pcspkr sg serio_raw acpi_cpufreq parport_pc parport tpm_tis i2c_piix4 tpm i2c_core processor button loop autofs4 ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix virtio_pci libata virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
      [ 7782.592016] CPU: 10 PID: 16437 Comm: xfs_io Tainted: G      D         4.5.0-rc6-btrfs-next-26+ #1
      [ 7782.592016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
      [ 7782.592016] task: ffff88001b8d40c0 ti: ffff880137488000 task.ti: ffff880137488000
      [ 7782.592016] RIP: 0010:[<ffffffffa030b7ab>]  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
      [ 7782.592016] RSP: 0018:ffff88013748be40  EFLAGS: 00010286
      [ 7782.592016] RAX: 0000000080000000 RBX: ffff880133b30c88 RCX: 0000000000000001
      [ 7782.592016] RDX: 0000000000000001 RSI: ffffffff8148fec0 RDI: 00000000ffffffff
      [ 7782.592016] RBP: ffff88013748bec0 R08: 0000000000000001 R09: 0000000000000000
      [ 7782.624248] R10: ffff88013748be40 R11: 0000000000000246 R12: 0000000000000000
      [ 7782.624248] R13: 0000000000000000 R14: 00000000009305a0 R15: ffff880015e3be40
      [ 7782.624248] FS:  00007fa83b9cb700(0000) GS:ffff88023ed40000(0000) knlGS:0000000000000000
      [ 7782.624248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 7782.624248] CR2: 0000000000000544 CR3: 00000001fa652000 CR4: 00000000000006e0
      [ 7782.624248] Stack:
      [ 7782.624248]  ffffffff8108b5cc ffff88013748bec0 0000000000000246 ffff8800b005ded0
      [ 7782.624248]  ffff880133b30d60 8000000000000000 7fffffffffffffff 0000000000000246
      [ 7782.624248]  0000000000000246 ffffffff81074f9b ffffffff8104357c ffff880015e3be40
      [ 7782.624248] Call Trace:
      [ 7782.624248]  [<ffffffff8108b5cc>] ? arch_local_irq_save+0x9/0xc
      [ 7782.624248]  [<ffffffff81074f9b>] ? ___might_sleep+0xce/0x217
      [ 7782.624248]  [<ffffffff8104357c>] ? __do_page_fault+0x3c0/0x43a
      [ 7782.624248]  [<ffffffff811a2351>] vfs_fsync_range+0x8c/0x9e
      [ 7782.624248]  [<ffffffff811a237f>] vfs_fsync+0x1c/0x1e
      [ 7782.624248]  [<ffffffff811a24d6>] do_fsync+0x31/0x4a
      [ 7782.624248]  [<ffffffff811a2700>] SyS_fsync+0x10/0x14
      [ 7782.624248]  [<ffffffff81493617>] entry_SYSCALL_64_fastpath+0x12/0x6b
      [ 7782.624248] Code: 85 c0 0f 85 e2 02 00 00 48 8b 45 b0 31 f6 4c 29 e8 48 ff c0 48 89 45 a8 48 8d 83 d8 00 00 00 48 89 c7 48 89 45 a0 e8 fc 43 18 e1 <f0> 41 ff 84 24 44 05 00 00 48 8b 83 58 ff ff ff 48 c1 e8 07 83
      [ 7782.624248] RIP  [<ffffffffa030b7ab>] btrfs_sync_file+0x11b/0x3e9 [btrfs]
      [ 7782.624248]  RSP <ffff88013748be40>
      [ 7782.624248] CR2: 0000000000000544
      [ 7782.661994] ---[ end trace 721e14960eb939bc ]---
      
      This has been happening since commit 4bacc9c9 (overlayfs: Make f_path
      always point to the overlay and f_inode to the underlay). Even though
      after that change we could still access the btrfs inode through
      struct file->f_mapping->host or struct file->f_inode, we would run into
      similar issues later on at check_parent_dirs_for_sync(), because the
      dentry we got (from struct file->f_path.dentry) was from overlayfs and
      not from btrfs. That is, we had no way of getting the dentry that
      belonged to btrfs (we always got the dentry that belonged to overlayfs).
      
      The new patch from Miklos Szeredi, titled "vfs: add file_dentry()" and
      recently submitted to linux-fsdevel, adds a file_dentry() API that allows
      us to get the btrfs dentry from the input file, and therefore to fsync
      when the upper and lower directories belong to btrfs filesystems.

      This issue has been reported several times by users on the mailing list
      and in bugzilla. A test case for xfstests is being submitted as well.
      
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101951
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109791
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Cc: stable@vger.kernel.org
  7. 22 Mar, 2016 4 commits
  8. 21 Mar, 2016 1 commit
    • btrfs: make sure we stay inside the bvec during __btrfs_lookup_bio_sums · 389f239c
      Committed by Chris Mason
      Commit c40a3d38 (Btrfs: Compute and look up csums based on
      sectorsized blocks) changed how we walk the bios while looking up
      crcs.  There's an inner loop that jumps to the next bvec based on
      sectors, and before it derefs the next bvec it needs to make sure we're
      still in the bio.
      
      In this case, the outer loop would have decided to stop moving forward
      too, and the bvec deref is never actually used for anything.  But
      CONFIG_DEBUG_PAGEALLOC catches it because we're outside our bio.
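
The pattern being fixed, checking the index before touching the next element, can be sketched in user space as follows. This is a simplified illustration, not the code in __btrfs_lookup_bio_sums: `struct toy_vec` and `count_sectors()` are hypothetical, and segment lengths are assumed to be multiples of the sector size.

```c
#include <stddef.h>

/* Hypothetical stand-in for one segment of an I/O vector. */
struct toy_vec {
    unsigned int len;   /* bytes in this segment */
};

/*
 * Walk an array of segments in sector-sized steps and count the sectors.
 * When a segment is exhausted the index advances, and the next segment's
 * length is read only if the index is still inside the array.
 */
size_t count_sectors(const struct toy_vec *vecs, size_t nr_vecs,
                     unsigned int sectorsize)
{
    size_t idx = 0, sectors = 0;
    unsigned int bytes_left;

    if (nr_vecs == 0 || sectorsize == 0)
        return 0;

    bytes_left = vecs[0].len;
    while (idx < nr_vecs) {
        sectors++;
        bytes_left = bytes_left > sectorsize ? bytes_left - sectorsize : 0;
        if (bytes_left == 0) {
            idx++;
            /* Guard: never read past the last segment. */
            if (idx < nr_vecs)
                bytes_left = vecs[idx].len;
        }
    }
    return sectors;
}
```

Without the inner `idx < nr_vecs` guard the result would be the same, since the outer loop stops anyway, but the final read would land outside the array, which is the kind of access a debugging allocator flags.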
      Signed-off-by: Chris Mason <clm@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
  9. 18 Mar, 2016 1 commit
  10. 14 Mar, 2016 2 commits
  11. 12 Mar, 2016 4 commits
  12. 11 Mar, 2016 1 commit
  13. 04 Mar, 2016 1 commit
    • Btrfs: fix loading of orphan roots leading to BUG_ON · 909c3a22
      Committed by Filipe Manana
      When looking for orphan roots during mount we can end up hitting a
      BUG_ON() (at root-tree.c:btrfs_find_orphan_roots()) if a log tree is
      replayed and qgroups are enabled. This is because after a log tree is
      replayed, a transaction commit is made, which triggers qgroup extent
      accounting which in turn does backref walking which ends up reading and
      inserting all roots in the radix tree fs_info->fs_root_radix, including
      orphan roots (deleted snapshots). So after the log tree is replayed, when
      finding orphan roots we hit the BUG_ON with the following trace:
      
      [118209.182438] ------------[ cut here ]------------
      [118209.183279] kernel BUG at fs/btrfs/root-tree.c:314!
      [118209.184074] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [118209.185123] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic ppdev xor raid6_pq evdev sg parport_pc parport acpi_cpufreq tpm_tis tpm psmouse
      processor i2c_piix4 serio_raw pcspkr i2c_core button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata
      virtio_pci virtio_ring virtio scsi_mod e1000 floppy [last unloaded: btrfs]
      [118209.186318] CPU: 14 PID: 28428 Comm: mount Tainted: G        W       4.5.0-rc5-btrfs-next-24+ #1
      [118209.186318] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
      [118209.186318] task: ffff8801ec131040 ti: ffff8800af34c000 task.ti: ffff8800af34c000
      [118209.186318] RIP: 0010:[<ffffffffa04237d7>]  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
      [118209.186318] RSP: 0018:ffff8800af34faa8  EFLAGS: 00010246
      [118209.186318] RAX: 00000000ffffffef RBX: 00000000ffffffef RCX: 0000000000000001
      [118209.186318] RDX: 0000000080000000 RSI: 0000000000000001 RDI: 00000000ffffffff
      [118209.186318] RBP: ffff8800af34fb08 R08: 0000000000000001 R09: 0000000000000000
      [118209.186318] R10: ffff8800af34f9f0 R11: 6db6db6db6db6db7 R12: ffff880171b97000
      [118209.186318] R13: ffff8801ca9d65e0 R14: ffff8800afa2e000 R15: 0000160000000000
      [118209.186318] FS:  00007f5bcb914840(0000) GS:ffff88023edc0000(0000) knlGS:0000000000000000
      [118209.186318] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [118209.186318] CR2: 00007f5bcaceb5d9 CR3: 00000000b49b5000 CR4: 00000000000006e0
      [118209.186318] Stack:
      [118209.186318]  fffffbffffffffff 010230ffffffffff 0101000000000000 ff84000000000000
      [118209.186318]  fbffffffffffffff 30ffffffffffffff 0000000000000101 ffff880082348000
      [118209.186318]  0000000000000000 ffff8800afa2e000 ffff8800afa2e000 0000000000000000
      [118209.186318] Call Trace:
      [118209.186318]  [<ffffffffa042e2db>] open_ctree+0x1e37/0x21b9 [btrfs]
      [118209.186318]  [<ffffffffa040a753>] btrfs_mount+0x97e/0xaed [btrfs]
      [118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
      [118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
      [118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
      [118209.186318]  [<ffffffffa0409f81>] btrfs_mount+0x1ac/0xaed [btrfs]
      [118209.186318]  [<ffffffff8108e1c0>] ? trace_hardirqs_on+0xd/0xf
      [118209.186318]  [<ffffffff8108c26b>] ? lockdep_init_map+0xb9/0x1b3
      [118209.186318]  [<ffffffff8117b87e>] mount_fs+0x67/0x131
      [118209.186318]  [<ffffffff81192d2b>] vfs_kern_mount+0x6c/0xde
      [118209.186318]  [<ffffffff81195637>] do_mount+0x8a6/0x9e8
      [118209.186318]  [<ffffffff8119598d>] SyS_mount+0x77/0x9f
      [118209.186318]  [<ffffffff81493017>] entry_SYSCALL_64_fastpath+0x12/0x6b
      [118209.186318] Code: 64 00 00 85 c0 89 c3 75 24 f0 41 80 4c 24 20 20 49 8b bc 24 f0 01 00 00 4c 89 e6 e8 e8 65 00 00 85 c0 89 c3 74 11 83 f8 ef 75 02 <0f> 0b
      4c 89 e7 e8 da 72 00 00 eb 1c 41 83 bc 24 00 01 00 00 00
      [118209.186318] RIP  [<ffffffffa04237d7>] btrfs_find_orphan_roots+0x1fc/0x244 [btrfs]
      [118209.186318]  RSP <ffff8800af34faa8>
      [118209.230735] ---[ end trace 83938f987d85d477 ]---
      
      So fix this by not treating the error -EEXIST, returned when attempting
      to insert a root already inserted by the backref walking code, as an error.
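
The shape of the fix, tolerating -EEXIST from an insert that another code path may already have performed, can be sketched like this. It is a toy model only: `struct toy_root_cache`, `toy_insert_root()` and `toy_load_orphan_root()` are hypothetical names, and a fixed array stands in for the fs_info->fs_root_radix radix tree.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ROOTS 16

/* Hypothetical stand-in for the cache of loaded roots. */
struct toy_root_cache {
    uint64_t ids[TOY_MAX_ROOTS];
    size_t count;
};

/* Insert a root id; fails with -EEXIST if it is already cached. */
int toy_insert_root(struct toy_root_cache *cache, uint64_t id)
{
    for (size_t i = 0; i < cache->count; i++) {
        if (cache->ids[i] == id)
            return -EEXIST;
    }
    if (cache->count == TOY_MAX_ROOTS)
        return -ENOMEM;
    cache->ids[cache->count++] = id;
    return 0;
}

/*
 * Load an orphan root. If an earlier step (in the commit message, the
 * backref walking code) already inserted it, that is not an error.
 */
int toy_load_orphan_root(struct toy_root_cache *cache, uint64_t id)
{
    int ret = toy_insert_root(cache, id);

    if (ret == -EEXIST)
        ret = 0;
    return ret;
}
```

Any other error from the insert is still propagated to the caller; only the "already present" case is treated as success.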
      
      The following test case for xfstests reproduces the bug:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            _cleanup_flakey
            cd /
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
        . ./common/dmflakey
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_dm_target flakey
        _require_metadata_journaling $SCRATCH_DEV
      
        rm -f $seqres.full
      
        _scratch_mkfs >>$seqres.full 2>&1
        _init_flakey
        _mount_flakey
      
        _run_btrfs_util_prog quota enable $SCRATCH_MNT
      
        # Create 2 directories with one file in one of them.
        # We use these just to trigger a transaction commit later, moving the file from
        # directory a to directory b and doing an fsync against directory a.
        mkdir $SCRATCH_MNT/a
        mkdir $SCRATCH_MNT/b
        touch $SCRATCH_MNT/a/f
        sync
      
        # Create our test file with 2 4K extents.
        $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
      
        # Create a snapshot and delete it. This doesn't really delete the snapshot
        # immediately, just makes it inaccessible and invisible to user space, the
        # snapshot is deleted later by a dedicated kernel thread (cleaner kthread)
        # which is woken up at the next transaction commit.
        # A root orphan item is inserted into the tree of tree roots, so that if a
        # power failure happens before the dedicated kernel thread does the snapshot
        # deletion, the next time the filesystem is mounted it resumes the snapshot
        # deletion.
        _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
        _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap
      
        # Now overwrite half of the extents we wrote before. Because we made a snapshot
        # before, which isn't really deleted yet (since no transaction commit happened
        # after we did the snapshot delete request), the non overwritten extents get
        # referenced twice, once by the default subvolume and once by the snapshot.
        $XFS_IO_PROG -c "pwrite -S 0xbb 4K 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
      
        # Now move file f from directory a to directory b and fsync directory a.
        # The fsync on the directory a triggers a transaction commit (because a file
        # was moved from it to another directory) and the file fsync leaves a log tree
        # with file extent items to replay.
        mv $SCRATCH_MNT/a/f $SCRATCH_MNT/b/f
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
      
        echo "File digest before power failure:"
        md5sum $SCRATCH_MNT/foobar | _filter_scratch
      
        # Now simulate a power failure and mount the filesystem to replay the log tree.
        # After the log tree was replayed, we used to hit a BUG_ON() when processing
        # the root orphan item for the deleted snapshot. This is because when processing
        # an orphan root the code expected to be the first code inserting the root into
        # the fs_info->fs_root_radix radix tree, while in reality it was the second
        # caller attempting to do it - the first caller was the transaction commit that
        # took place after replaying the log tree, when updating the qgroup counters.
        _flakey_drop_and_remount
      
        echo "File digest after power failure:"
        # Must match what we got before the power failure.
        md5sum $SCRATCH_MNT/foobar | _filter_scratch
      
        _unmount_flakey
        status=0
        exit
      
      Fixes: 2d9e9776 ("Btrfs: use btrfs_get_fs_root in resolve_indirect_ref")
      Cc: stable@vger.kernel.org  # 4.4+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>