1. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  2. 12 3月, 2016 1 次提交
  3. 11 2月, 2016 1 次提交
  4. 01 1月, 2016 1 次提交
    • F
      Btrfs: send, don't BUG_ON() when an empty symlink is found · a879719b
      Filipe Manana 提交于
      When a symlink is successfully created it always has an inline extent
      containing the source path. However if an error happens when creating
      the symlink, we can leave in the subvolume's tree a symlink inode without
      any such inline extent item - this happens if after btrfs_symlink() calls
      btrfs_end_transaction() and before it calls the inode eviction handler
      (through the final iput() call), the transaction gets committed and a
      crash happens before the eviction handler gets called, or if a snapshot
      of the subvolume is made before the eviction handler gets called. Sadly
      we can't just avoid this by making btrfs_symlink() call
      btrfs_end_transaction() after it calls the eviction handler, because the
      later can commit the current transaction before it removes any items from
      the subvolume tree (if it encounters ENOSPC errors while reserving space
      for removing all the items).
      
      So make send fail more gracefully, with an -EIO error, and print a
      message to dmesg/syslog informing that there's an empty symlink inode,
      so that the user can delete the empty symlink or do something else
      about it.
      Reported-by: NStephen R. van den Berg <srb@cuci.nl>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      a879719b
  5. 14 10月, 2015 1 次提交
    • R
      btrfs: fix resending received snapshot with parent · b96b1db0
      Robin Ruede 提交于
      This fixes a regression introduced by 37b8d27d between v4.1 and v4.2.
      
      When a snapshot is received, its received_uuid is set to the original
      uuid of the subvolume. When that snapshot is then resent to a third
      filesystem, it's received_uuid is set to the second uuid
      instead of the original one. The same was true for the parent_uuid.
      This behaviour was partially changed in 37b8d27d, but in that patch
      only the parent_uuid was taken from the real original,
      not the uuid itself, causing the search for the parent to fail in
      the case below.
      
      This happens for example when trying to send a series of linked
      snapshots (e.g. created by snapper) from the backup file system back
      to the original one.
      
      The following commands reproduce the issue in v4.2.1
      (no error in 4.1.6)
      
          # setup three test file systems
          for i in 1 2 3; do
      	    truncate -s 50M fs$i
      	    mkfs.btrfs fs$i
      	    mkdir $i
      	    mount fs$i $i
          done
          echo "content" > 1/testfile
          btrfs su snapshot -r 1/ 1/snap1
          echo "changed content" > 1/testfile
          btrfs su snapshot -r 1/ 1/snap2
      
          # works fine:
          btrfs send 1/snap1 | btrfs receive 2/
          btrfs send -p 1/snap1 1/snap2 | btrfs receive 2/
      
          # ERROR: could not find parent subvolume
          btrfs send 2/snap1 | btrfs receive 3/
          btrfs send -p 2/snap1 2/snap2 | btrfs receive 3/
      Signed-off-by: NRobin Ruede <rruede+git@gmail.com>
      Fixes: 37b8d27d ("Btrfs: use received_uuid of parent during send")
      Cc: stable@vger.kernel.org # v4.2+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Tested-by: NEd Tomlinson <edt@aei.ca>
      b96b1db0
  6. 13 10月, 2015 1 次提交
    • F
      Btrfs: send, fix file corruption due to incorrect cloning operations · d906d49f
      Filipe Manana 提交于
      If we have a file that shares an extent with other files, when processing
      the extent item relative to a shared extent, we blindly issue a clone
      operation that will target a length matching the length in the extent item
      and uses as a source some other file the receiver already has and points
      to the same extent. However that range in the other file might not
      exclusively point only to the shared extent, and so using that length
      will result in the receiver getting a file with different data from the
      one in the send snapshot. This issue happened both for incremental and
      full send operations.
      
      So fix this by issuing clone operations with lengths that don't cover
      regions of the source file that point to different extents (or have holes).
      
      The following test case for fstests reproduces the problem.
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -fr $send_files_dir
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _need_to_be_root
        _require_cp_reflink
        _require_xfs_io_command "fpunch"
      
        send_files_dir=$TEST_DIR/btrfs-test-$seq
      
        rm -f $seqres.full
        rm -fr $send_files_dir
        mkdir $send_files_dir
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        # Create our test file with a single 100K extent.
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 100K" \
           $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Clone our file into a new file named bar.
        cp --reflink=always $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      
        # Now overwrite parts of our foo file.
        $XFS_IO_PROG -c "pwrite -S 0xbb 50K 10K" \
           -c "pwrite -S 0xcc 90K 10K" \
           -c "fpunch 70K 10k" \
           $SCRATCH_MNT/foo | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
           $SCRATCH_MNT/snap
      
        echo "File digests in the original filesystem:"
        md5sum $SCRATCH_MNT/snap/foo | _filter_scratch
        md5sum $SCRATCH_MNT/snap/bar | _filter_scratch
      
        _run_btrfs_util_prog send $SCRATCH_MNT/snap -f $send_files_dir/1.snap
      
        # Now recreate the filesystem by receiving the send stream and verify
        # we get the same file contents that the original filesystem had.
        _scratch_unmount
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
      
        # We expect the destination filesystem to have exactly the same file
        # data as the original filesystem.
        # The btrfs send implementation had a bug where it sent a clone
        # operation from file foo into file bar covering the whole [0, 100K[
        # range after creating and writing the file foo. This was incorrect
        # because the file bar now included the updates done to file foo after
        # we cloned foo to bar, breaking the COW nature of reflink copies
        # (cloned extents).
        echo "File digests in the new filesystem:"
        md5sum $SCRATCH_MNT/snap/foo | _filter_scratch
        md5sum $SCRATCH_MNT/snap/bar | _filter_scratch
      
        status=0
        exit
      
      Another test case that reproduces the problem when we have compressed
      extents:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -fr $send_files_dir
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _need_to_be_root
        _require_cp_reflink
      
        send_files_dir=$TEST_DIR/btrfs-test-$seq
      
        rm -f $seqres.full
        rm -fr $send_files_dir
        mkdir $send_files_dir
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount "-o compress"
      
        # Create our file with an extent of 100K starting at file offset 0K.
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 100K"       \
                        -c "fsync"                        \
                        $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Rewrite part of the previous extent (its first 40K) and write a new
        # 100K extent starting at file offset 100K.
        $XFS_IO_PROG -c "pwrite -S 0xbb 0K 40K"    \
                -c "pwrite -S 0xcc 100K 100K"      \
                $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Our file foo now has 3 file extent items in its metadata:
        #
        # 1) One covering the file range 0 to 40K;
        # 2) One covering the file range 40K to 100K, which points to the first
        #    extent we wrote to the file and has a data offset field with value
        #    40K (our file no longer uses the first 40K of data from that
        #    extent);
        # 3) One covering the file range 100K to 200K.
      
        # Now clone our file foo into file bar.
        cp --reflink=always $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      
        # Create our snapshot for the send operation.
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
                $SCRATCH_MNT/snap
      
        echo "File digests in the original filesystem:"
        md5sum $SCRATCH_MNT/snap/foo | _filter_scratch
        md5sum $SCRATCH_MNT/snap/bar | _filter_scratch
      
        _run_btrfs_util_prog send $SCRATCH_MNT/snap -f $send_files_dir/1.snap
      
        # Now recreate the filesystem by receiving the send stream and verify we
        # get the same file contents that the original filesystem had.
        # Btrfs send used to issue a clone operation from foo's range
        # [80K, 140K[ to bar's range [40K, 100K[ when cloning the extent pointed
        # to by foo's second file extent item, this was incorrect because of bad
        # accounting of the file extent item's data offset field. The correct
        # range to clone from should have been [40K, 100K[.
        _scratch_unmount
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount "-o compress"
      
        _run_btrfs_util_prog receive $SCRATCH_MNT -f $send_files_dir/1.snap
      
        echo "File digests in the new filesystem:"
        # Must match the digests we got in the original filesystem.
        md5sum $SCRATCH_MNT/snap/foo | _filter_scratch
        md5sum $SCRATCH_MNT/snap/bar | _filter_scratch
      
        status=0
        exit
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      d906d49f
  7. 08 10月, 2015 1 次提交
  8. 06 10月, 2015 1 次提交
    • F
      Btrfs: send, fix corner case for reference overwrite detection · b786f16a
      Filipe Manana 提交于
      When the inode given to did_overwrite_ref() matches the current progress
      and has a reference that collides with the reference of other inode that
      has the same number as the current progress, we were always telling our
      caller that the inode's reference was overwritten, which is incorrect
      because the other inode might be a new inode (different generation number)
      in which case we must return false from did_overwrite_ref() so that its
      callers don't use an orphanized path for the inode (as it will never be
      orphanized, instead it will be unlinked and the new inode created later).
      
      The following test case for fstests reproduces the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -fr $send_files_dir
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _need_to_be_root
      
        send_files_dir=$TEST_DIR/btrfs-test-$seq
      
        rm -f $seqres.full
        rm -fr $send_files_dir
        mkdir $send_files_dir
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        # Create our test file with a single extent of 64K.
        mkdir -p $SCRATCH_MNT/foo
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 64K" $SCRATCH_MNT/foo/bar \
            | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap1
        _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap2
      
        echo "File digest before being replaced:"
        md5sum $SCRATCH_MNT/mysnap1/foo/bar | _filter_scratch
      
        # Remove the file and then create a new one in the same location with
        # the same name but with different content. This new file ends up
        # getting the same inode number as the previous one, because that inode
        # number was the highest inode number used by the snapshot's root and
        # therefore when attempting to find the a new inode number for the new
        # file, we end up reusing the same inode number. This happens because
        # currently btrfs uses the highest inode number summed by 1 for the
        # first inode created once a snapshot's root is loaded (done at
        # fs/btrfs/inode-map.c:btrfs_find_free_objectid in the linux kernel
        # tree).
        # Having these two different files in the snapshots with the same inode
        # number (but different generation numbers) caused the btrfs send code
        # to emit an incorrect path for the file when issuing an unlink
        # operation because it failed to realize they were different files.
        rm -f $SCRATCH_MNT/mysnap2/foo/bar
        $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 96K" \
            $SCRATCH_MNT/mysnap2/foo/bar | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT/mysnap2 \
            $SCRATCH_MNT/mysnap2_ro
      
        _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
        _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 \
            $SCRATCH_MNT/mysnap2_ro -f $send_files_dir/2.snap
      
        echo "File digest in the original filesystem after being replaced:"
        md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch
      
        # Now recreate the filesystem by receiving both send streams and verify
        # we get the same file contents that the original filesystem had.
        _scratch_unmount
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/1.snap
        _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/2.snap
      
        echo "File digest in the new filesystem:"
        # Must match the digest from the new file.
        md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch
      
        status=0
        exit
      Reported-by: NMartin Raiber <martin@urbackup.org>
      Fixes: 8b191a68 ("Btrfs: incremental send, check if orphanized dir inode needs delayed rename")
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      b786f16a
  9. 13 6月, 2015 1 次提交
    • J
      Btrfs: use received_uuid of parent during send · 37b8d27d
      Josef Bacik 提交于
      Neil Horman pointed out a problem where if he did something like this
      
      receive A
      snap A B
      change B
      send -p A B
      
      and then on another box do
      
      recieve A
      receive B
      
      the receive B would fail because we use the UUID of A for the clone sources for
      B.  This makes sense most of the time because normally you are sending from the
      original sources, not a received source.  However when you use a recieved subvol
      its UUID is going to be something completely different, so if you then try to
      receive the diff on a different volume it won't find the UUID because the new A
      will be something else.  The only constant is the received uuid.  So instead
      check to see if we have received_uuid set on the root, and if so use that as the
      clone source, as btrfs receive looks for matches either in received_uuid or
      uuid.  Thanks,
      Reported-by: NNeil Horman <nhorman@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Reviewed-by: NHugo Mills <hugo@carfax.org.uk>
      Signed-off-by: NChris Mason <clm@fb.com>
      37b8d27d
  10. 03 6月, 2015 3 次提交
    • F
      Btrfs: incremental send, fix clone operations for compressed extents · 619d8c4e
      Filipe Manana 提交于
      Marc reported a problem where the receiving end of an incremental send
      was performing clone operations that failed with -EINVAL. This happened
      because, unlike for uncompressed extents, we were not checking if the
      source clone offset and length, after summing the data offset, falls
      within the source file's boundaries.
      
      So make sure we do such checks when attempting to issue clone operations
      for compressed extents.
      
      Problem reproducible with the following steps:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount -o compress /dev/sdb /mnt
        $ mkfs.btrfs -f /dev/sdc
        $ mount -o compress /dev/sdc /mnt2
      
        # Create the file with a single extent of 128K. This creates a metadata file
        # extent item with a data start offset of 0 and a logical length of 128K.
        $ xfs_io -f -c "pwrite -S 0xaa 64K 128K" -c "fsync" /mnt/foo
      
        # Now rewrite the range 64K to 112K of our file. This will make the inode's
        # metadata continue to point to the 128K extent we created before, but now
        # with an extent item that points to the extent with a data start offset of
        # 112K and a logical length of 16K.
        # That metadata file extent item is associated with the logical file offset
        # at 176K and covers the logical file range 176K to 192K.
        $ xfs_io -c "pwrite -S 0xbb 64K 112K" -c "fsync" /mnt/foo
      
        # Now rewrite the range 180K to 12K. This will make the inode's metadata
        # continue to point the the 128K extent we created earlier, with a single
        # extent item that points to it with a start offset of 112K and a logical
        # length of 4K.
        # That metadata file extent item is associated with the logical file offset
        # at 176K and covers the logical file range 176K to 180K.
        $ xfs_io -c "pwrite -S 0xcc 180K 12K" -c "fsync" /mnt/foo
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      
        $ touch /mnt/bar
        # Calls the btrfs clone ioctl.
        $ ~/xfstests/src/cloner -s $((176 * 1024)) -d $((176 * 1024)) \
          -l $((4 * 1024)) /mnt/foo /mnt/bar
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      
        $ btrfs send /mnt/snap1 | btrfs receive /mnt2
        At subvol /mnt/snap1
        At subvol snap1
      
        $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
        At subvol /mnt/snap2
        At snapshot snap2
        ERROR: failed to clone extents to bar
        Invalid argument
      
      A test case for fstests follows soon.
      Reported-by: NMarc MERLIN <marc@merlins.org>
      Tested-by: NMarc MERLIN <marc@merlins.org>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Tested-by: NDavid Sterba <dsterba@suse.cz>
      Tested-by: NJan Alexander Steffens (heftig) <jan.steffens@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      619d8c4e
    • F
      Btrfs: incremental send, check if orphanized dir inode needs delayed rename · 8b191a68
      Filipe Manana 提交于
      If a directory inode is orphanized, because some inode previously
      processed has a new name that collides with the old name of the current
      inode, we need to check if it needs its rename operation delayed too,
      as its ancestor-descendent relationship with some other inode might
      have been reversed between the parent and send snapshots and therefore
      its rename operation needs to happen after that other inode is renamed.
      
      For example, for the following reproducer where this is needed (provided
      by Robbie Ko):
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt2
      
        $ mkdir -p /mnt/data/n1/n2
        $ mkdir /mnt/data/n4
        $ mkdir -p /mnt/data/t6/t7
        $ mkdir /mnt/data/t5
        $ mkdir /mnt/data/t7
        $ mkdir /mnt/data/n4/t2
        $ mkdir /mnt/data/t4
        $ mkdir /mnt/data/t3
        $ mv /mnt/data/t7 /mnt/data/n4/t2
        $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
        $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
        $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
        $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
        $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
        $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
        $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      
        $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
        $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
        $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
        $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
        $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
        $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
        $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
        $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      
        $ btrfs send /mnt/snap1 | btrfs receive /mnt2
        $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
        ERROR: send ioctl failed with -12: Cannot allocate memory
      
      Where the parent snapshot directory hierarchy is the following:
      
        .                                                        (ino 256)
        |-- data/                                                (ino 257)
              |-- n4/                                            (ino 260)
                   |-- t2/                                       (ino 265)
                        |-- t7/                                  (ino 264)
                             |-- t4/                             (ino 266)
                                  |-- t5/                        (ino 263)
                                       |-- t6/                   (ino 261)
                                            |-- n1/              (ino 258)
                                            |-- n2/              (ino 259)
                                                 |-- t7/         (ino 262)
                                                      |-- t3/    (ino 267)
      
      And the send snapshot's directory hierarchy is the following:
      
        .                                                        (ino 256)
        |-- data/                                                (ino 257)
              |-- n4/                                            (ino 260)
                   |-- n1/                                       (ino 258)
                        |-- t2/                                  (ino 265)
                             |-- n2/                             (ino 259)
                             |-- t3/                             (ino 267)
                             |    |-- t7                         (ino 264)
                             |
                             |-- t6/                             (ino 261)
                             |    |-- t4/                        (ino 266)
                             |         |-- t5/                   (ino 263)
                             |
                             |-- t7/                             (ino 262)
      
      While processing inode 262 we orphanize inode 264 and later attempt
      to rename inode 264 to its new name/location, which resulted in building
      an incorrect destination path string for the rename operation with the
      value "data/n4/t2/t7/t4/t5/t6/n2/t7/t3/t7". This rename operation must
      have been done only after inode 267 is processed and renamed, as the
      ancestor-descendent relationship between inodes 264 and 267 was reversed
      between both snapshots, because otherwise it results in an infinite loop
      when building the path string for inode 264 when we are processing an
      inode with a number larger than 264. That loop is the following:
      
        start inode 264, send progress of 265 for example
        parent of 264 -> 267
        parent of 267 -> 262
        parent of 262 -> 259
        parent of 259 -> 261
        parent of 261 -> 263
        parent of 263 -> 266
        parent of 266 -> 264
          |--> back to first iteration while current path string length
               is <= PATH_MAX, and fail with -ENOMEM otherwise
      
      So fix this by making the check if we need to delay a directory rename
      regardless of the current inode having been orphanized or not.
      
      A test case for fstests follows soon.
      
      Thanks to Robbie Ko for providing a reproducer for this problem.
      Reported-by: NRobbie Ko <robbieko@synology.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      8b191a68
    • F
      Btrfs: incremental send, don't delay directory renames unnecessarily · 80aa6027
      Filipe Manana 提交于
      Even though we delay the rename of directories when they become
      descendents of other directories that were also renamed in the send
      root to prevent infinite path build loops, we were doing it in cases
      where this was not needed and was actually harmful resulting in
      infinite path build loops as we ended up with a circular dependency
      of delayed directory renames.
      
      Consider the following reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt2
      
        $ mkdir /mnt/data
        $ mkdir /mnt/data/n1
        $ mkdir /mnt/data/n1/n2
        $ mkdir /mnt/data/n4
        $ mkdir /mnt/data/n1/n2/p1
        $ mkdir /mnt/data/n1/n2/p1/p2
        $ mkdir /mnt/data/t6
        $ mkdir /mnt/data/t7
        $ mkdir -p /mnt/data/t5/t7
        $ mkdir /mnt/data/t2
        $ mkdir /mnt/data/t4
        $ mkdir -p /mnt/data/t1/t3
        $ mkdir /mnt/data/p1
        $ mv /mnt/data/t1 /mnt/data/p1
        $ mkdir -p /mnt/data/p1/p2
        $ mv /mnt/data/t4 /mnt/data/p1/p2/t1
        $ mv /mnt/data/t5 /mnt/data/n4/t5
        $ mv /mnt/data/n1/n2/p1/p2 /mnt/data/n4/t5/p2
        $ mv /mnt/data/t7 /mnt/data/n4/t5/p2/t7
        $ mv /mnt/data/t2 /mnt/data/n4/t1
        $ mv /mnt/data/p1 /mnt/data/n4/t5/p2/p1
        $ mv /mnt/data/n1/n2 /mnt/data/n4/t5/p2/p1/p2/n2
        $ mv /mnt/data/n4/t5/p2/p1/p2/t1 /mnt/data/n4/t5/p2/p1/p2/n2/t1
        $ mv /mnt/data/n4/t5/t7 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t7
        $ mv /mnt/data/n4/t5/p2/p1/t1/t3 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t3
        $ mv /mnt/data/n4/t5/p2/p1/p2/n2/p1 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t7/p1
        $ mv /mnt/data/t6 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t3/t5
        $ mv /mnt/data/n4/t5/p2/p1/t1 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t3/t1
        $ mv /mnt/data/n1 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t7/p1/n1
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      
        $ mv /mnt/data/n4/t1 /mnt/data/n4/t5/p2/p1/p2/n2/t1/t7/p1/t1
        $ mv /mnt/data/n4/t5/p2/p1/p2/n2/t1 /mnt/data/n4/
        $ mv /mnt/data/n4/t5/p2/p1/p2/n2 /mnt/data/n4/t1/n2
        $ mv /mnt/data/n4/t1/t7/p1 /mnt/data/n4/t1/n2/p1
        $ mv /mnt/data/n4/t1/t3/t1 /mnt/data/n4/t1/n2/t1
        $ mv /mnt/data/n4/t1/t3 /mnt/data/n4/t1/n2/t1/t3
        $ mv /mnt/data/n4/t5/p2/p1/p2 /mnt/data/n4/t1/n2/p1/p2
        $ mv /mnt/data/n4/t1/t7 /mnt/data/n4/t1/n2/p1/t7
        $ mv /mnt/data/n4/t5/p2/p1 /mnt/data/n4/t1/n2/p1/p2/p1
        $ mv /mnt/data/n4/t1/n2/t1/t3/t5 /mnt/data/n4/t1/n2/p1/p2/t5
        $ mv /mnt/data/n4/t5 /mnt/data/n4/t1/n2/p1/p2/p1/t5
        $ mv /mnt/data/n4/t1/n2/p1/p2/p1/t5/p2 /mnt/data/n4/t1/n2/p1/p2/p1/p2
        $ mv /mnt/data/n4/t1/n2/p1/p2/p1/p2/t7 /mnt/data/n4/t1/t7
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      
        $ btrfs send /mnt/snap1 | btrfs receive /mnt2
        $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive -vv /mnt2
        ERROR: send ioctl failed with -12: Cannot allocate memory
      
      This reproducer resulted in an infinite path build loop when building the
      path for inode 266 because the following circular dependency of delayed
      directory renames was created:
      
         ino 272 <- ino 261 <- ino 259 <- ino 268 <- ino 267 <- ino 261
      
      Where the notation "X <- Y" means the rename of inode X is delayed by the
      rename of inode Y (X will be renamed after Y is renamed). This resulted
      in an infinite path build loop of inode 266 because that inode has inode
      261 as an ancestor in the send root and inode 261 is in the circular
      dependency of delayed renames listed above.
      
      Fix this by not delaying the rename of a directory inode if an ancestor of
      the inode in the send root, which has a delayed rename operation, is not
      also a descendent of the inode in the parent root.
      
      Thanks to Robbie Ko for sending the reproducer example.
      A test case for xfstests follows soon.
      Reported-by: NRobbie Ko <robbieko@synology.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      80aa6027
  11. 27 3月, 2015 4 次提交
    • F
      Btrfs: incremental send, remove dead code · 5f806c3a
      Filipe Manana 提交于
      The logic to detect path loops when attempting to apply a pending
      directory rename, introduced in commit
      f959492f (Btrfs: send, fix more issues related to directory renames)
      is no longer needed, and the respective fstests test case for that commit,
      btrfs/045, now passes without this code (as well as all the other test
      cases for send/receive).
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      5f806c3a
    • F
      Btrfs: incremental send, clear name from cache after orphanization · 8996a48c
      Filipe Manana 提交于
      If a directory's reference ends up being orphanized, because the inode
      currently being processed has a new path that matches that directory's
      path, make sure we evict the name of the directory from the name cache.
      This is because there might be descendent inodes (either directories or
      regular files) that will be orphanized later too, and therefore the
      orphan name of the ancestor must be used, otherwise we send issue rename
      operations with a wrong path in the send stream.
      
      Reproducer:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ mkdir -p /mnt/data/n1/n2/p1/p2
        $ mkdir /mnt/data/n4
        $ mkdir -p /mnt/data/p1/p2
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      
        $ mv /mnt/data/p1/p2 /mnt/data
        $ mv /mnt/data/n1/n2/p1/p2 /mnt/data/p1
        $ mv /mnt/data/p2 /mnt/data/n1/n2/p1
        $ mv /mnt/data/n1/n2 /mnt/data/p1
        $ mv /mnt/data/p1 /mnt/data/n4
        $ mv /mnt/data/n4/p1/n2/p1 /mnt/data
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      
        $ btrfs send /mnt/snap1 -f /tmp/1.send
        $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.send
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt2
        $ btrfs receive /mnt2 -f /tmp/1.send
        $ btrfs receive /mnt2 -f /tmp/2.send
        ERROR: rename data/p1/p2 -> data/n4/p1/p2 failed. no such file or directory
      
      Directories data/p1 (inode 263) and data/p1/p2 (inode 264) in the parent
      snapshot are both orphanized during the incremental send, and as soon as
      data/p1 is orphanized, we must make sure that when orphanizing data/p1/p2
      we use a source path of o263-6-o/p2 for the rename operation instead of
      the old path data/p1/p2 (the one before the orphanization of inode 263).
      
      A test case for xfstests follows soon.
      Reported-by: NRobbie Ko <robbieko@synology.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      8996a48c
    • F
      Btrfs: send, don't leave without decrementing clone root's send_progress · 2f1f465a
      Filipe Manana 提交于
      If the clone root was not readonly or the dead flag was set on it, we were
      leaving without decrementing the root's send_progress counter (and before
      we just incremented it). If a concurrent snapshot deletion was in progress
      and ended up being aborted, it would be impossible to later attempt to
      delete again the snapshot, since the root's send_in_progress counter could
      never go back to 0.
      
      We were also setting clone_sources_to_rollback to i + 1 too early - if we
      bailed out because the clone root we got is not readonly or flagged as dead
      we ended up later derreferencing a null pointer because we didn't assign
      the clone root to sctx->clone_roots[i].root:
      
      		for (i = 0; sctx && i < clone_sources_to_rollback; i++)
      			btrfs_root_dec_send_in_progress(
      					sctx->clone_roots[i].root);
      
      So just don't increment the send_in_progress counter if the root is readonly
      or flagged as dead.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      2f1f465a
    • F
      Btrfs: send, add missing check for dead clone root · 5cc2b17e
      Filipe Manana 提交于
      After we locked the root's root item, a concurrent snapshot deletion
      call might have set the dead flag on it. So check if the dead flag
      is set and abort if it is, just like we do for the parent root.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      5cc2b17e
  12. 03 3月, 2015 1 次提交
    • F
      Btrfs: incremental send, don't rename a directory too soon · 84471e24
      Filipe Manana 提交于
      There's one more case where we can't issue a rename operation for a
      directory as soon as we process it. We used to delay directory renames
      only if they have some ancestor directory with a higher inode number
      that got renamed too, but there's another case where we need to delay
      the rename too - when a directory A is renamed to the old name of a
      directory B but that directory B has its rename delayed because it
      has now (in the send root) an ancestor with a higher inode number that
      was renamed. If we don't delay the directory rename in this case, the
      receiving end of the send stream will attempt to rename A to the old
      name of B before B got renamed to its new name, which results in a
      "directory not empty" error. So fix this by delaying directory renames
      for this case too.
      
      Steps to reproduce:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        $ mkdir /mnt/a
        $ mkdir /mnt/b
        $ mkdir /mnt/c
        $ touch /mnt/a/file
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
      
        $ mv /mnt/c /mnt/x
        $ mv /mnt/a /mnt/x/y
        $ mv /mnt/b /mnt/a
      
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
      
        $ btrfs send /mnt/snap1 -f /tmp/1.send
        $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.send
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt2
        $ btrfs receive /mnt2 -f /tmp/1.send
        $ btrfs receive /mnt2 -f /tmp/2.send
        ERROR: rename b -> a failed. Directory not empty
      
      A test case for xfstests follows soon.
      Reported-by: NAmes Cornish <ames@cornishes.net>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      84471e24
  13. 03 2月, 2015 1 次提交
  14. 25 11月, 2014 1 次提交
  15. 04 10月, 2014 1 次提交
    • F
      Btrfs: send, don't delay dir move if there's a new parent inode · bf8e8ca6
      Filipe Manana 提交于
      If between two snapshots we rename an existing directory named X to Y and
      make it a child (direct or not) of a new inode named X, we were delaying
      the move/rename of the former directory unnecessarily, which would result
      in attempting to rename the new directory from its orphan name to name X
      prematurely.
      
      Minimal reproducer:
      
          $ mkfs.btrfs -f /dev/vdd
          $ mount /dev/vdd /mnt
          $ mkdir -p /mnt/merlin/RC/OSD/Source
      
          $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
      
          $ mkdir /mnt/OSD
          $ mv /mnt/merlin/RC/OSD /mnt/OSD/OSD-Plane_788
          $ mv /mnt/OSD /mnt/merlin/RC
      
          $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
      
          $ btrfs send /mnt/mysnap1 -f /tmp/1.snap
          $ btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap
      
          $ mkfs.btrfs -f /dev/vdc
          $ mount /dev/vdc /mnt2
      
          $ btrfs receive /mnt2 -f /tmp/1.snap
          $ btrfs receive /mnt2 -f /tmp/2.snap
      
      The second receive (from an incremental send) failed with the following
      error message: "rename o261-7-0 -> merlin/RC/OSD failed".
      This is a regression introduced in the 3.16 kernel.
      
      A test case for xfstests follows.
      Reported-by: NMarc Merlin <marc@merlins.org>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      bf8e8ca6
  16. 02 10月, 2014 1 次提交
  17. 18 9月, 2014 2 次提交
  18. 10 6月, 2014 8 次提交
    • F
      Btrfs: send, use the right limits for xattr names and values · 7e3ae33e
      Filipe Manana 提交于
      We were limiting the sum of the xattr name and value lengths to PATH_MAX,
      which is not correct, specially on filesystems created with btrfs-progs
      v3.12 or higher, where the default leaf size is max(16384, PAGE_SIZE), or
      systems with page sizes larger than 4096 bytes.
      
      Xattrs have their own specific maximum name and value lengths, which depend
      on the leaf size, therefore use these limits to be able to send xattrs with
      sizes larger than PATH_MAX.
      
      A test case for xfstests follows.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      7e3ae33e
    • F
      Btrfs: send, don't error in the presence of subvols/snapshots · 1af56070
      Filipe Manana 提交于
      If we are doing an incremental send and the base snapshot has a
      directory with name X that doesn't exist anymore in the second
      snapshot and a new subvolume/snapshot exists in the second snapshot
      that has the same name as the directory (name X), the incremental
      send would fail with -ENOENT error. This is because it attempts
      to lookup for an inode with a number matching the objectid of a
      root, which doesn't exist.
      
      Steps to reproduce:
      
          mkfs.btrfs -f /dev/sdd
          mount /dev/sdd /mnt
      
          mkdir /mnt/testdir
          btrfs subvolume snapshot -r /mnt /mnt/mysnap1
      
          rmdir /mnt/testdir
          btrfs subvolume create /mnt/testdir
          btrfs subvolume snapshot -r /mnt /mnt/mysnap2
      
          btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data
      
      A test case for xfstests follows.
      Reported-by: NRobert White <rwhite@pobox.com>
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      1af56070
    • D
      btrfs: remove stale newlines from log messages · 351fd353
      David Sterba 提交于
      I've noticed an extra line after "use no compression", but search
      revealed much more in messages of more critical levels and rare errors.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      351fd353
    • F
      Btrfs: send, fix more issues related to directory renames · f959492f
      Filipe Manana 提交于
      This is a continuation of the previous changes titled:
      
         Btrfs: fix incremental send's decision to delay a dir move/rename
         Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
      
      There's a few more cases where a directory rename/move must be delayed which was
      previously overlooked. If our immediate ancestor has a lower inode number than
      ours and it doesn't have a delayed rename/move operation associated to it, it
      doesn't mean there isn't any non-direct ancestor of our current inode that needs
      to be renamed/moved before our current inode (i.e. with a higher inode number
      than ours).
      
      So we can't stop the search if our immediate ancestor has a lower inode number than
      ours, we need to navigate the directory hierarchy upwards until we hit the root or:
      
      1) find an ancestor with an higher inode number that was renamed/moved in the send
         root too (or already has a pending rename/move registered);
      2) find an ancestor that is a new directory (higher inode number than ours and
         exists only in the send root).
      
      Reproducer for case 1)
      
          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt
      
          $ mkdir -p /mnt/a/b
          $ mkdir -p /mnt/a/c/d
          $ mkdir /mnt/a/b/e
          $ mkdir /mnt/a/c/d/f
          $ mv /mnt/a/b /mnt/a/c/d/2b
          $ mkdir /mnt/a/x
          $ mkdir /mnt/a/y
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap1
          $ btrfs send /mnt/snap1 -f /tmp/base.send
      
          $ mv /mnt/a/x /mnt/a/y
          $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
          $ mv /mnt/a/c/d /mnt/a/h/2d
          $ mv /mnt/a/c /mnt/a/h/2d/2b/2c
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap2
          $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send
      
      Simple reproducer for case 2)
      
          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt
      
          $ mkdir -p /mnt/a/b
          $ mkdir /mnt/a/c
          $ mv /mnt/a/b /mnt/a/c/b2
          $ mkdir /mnt/a/e
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap1
          $ btrfs send /mnt/snap1 -f /tmp/base.send
      
          $ mv /mnt/a/c/b2 /mnt/a/e/b3
          $ mkdir /mnt/a/e/b3/f
          $ mkdir /mnt/a/h
          $ mv /mnt/a/c /mnt/a/e/b3/f/c2
          $ mv /mnt/a/e /mnt/a/h/e2
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap2
          $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send
      
      Another simple reproducer for case 2)
      
          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt
      
          $ mkdir -p /mnt/a/b
          $ mkdir /mnt/a/c
          $ mkdir /mnt/a/b/d
          $ mkdir /mnt/a/c/e
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap1
          $ btrfs send /mnt/snap1 -f /tmp/base.send
      
          $ mkdir /mnt/a/b/d/f
          $ mkdir /mnt/a/b/g
          $ mv /mnt/a/c/e /mnt/a/b/g/e2
          $ mv /mnt/a/c /mnt/a/b/d/f/c2
          $ mv /mnt/a/b/d/f /mnt/a/b/g/e2/f2
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap2
          $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send
      
      More complex reproducer for case 2)
      
          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt
      
          $ mkdir -p /mnt/a/b
          $ mkdir -p /mnt/a/c/d
          $ mkdir /mnt/a/b/e
          $ mkdir /mnt/a/c/d/f
          $ mv /mnt/a/b /mnt/a/c/d/2b
          $ mkdir /mnt/a/x
          $ mkdir /mnt/a/y
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap1
          $ btrfs send /mnt/snap1 -f /tmp/base.send
      
          $ mv /mnt/a/x /mnt/a/y
          $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e
          $ mv /mnt/a/c/d /mnt/a/h/2d
          $ mv /mnt/a/c /mnt/a/h/2d/2b/2c
      
          $ btrfs subvolume snapshot -r /mnt /mnt/snap2
          $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send
      
      For both cases the incremental send would enter an infinite loop when building
      path strings.
      
      While solving these cases, this change also re-implements the code to detect
      when directory moves/renames should be delayed. Instead of dealing with several
      specific cases separately, it's now more generic handling all cases with a simple
      detection algorithm and if when applying a delayed move/rename there's a path loop
      detected, it further delays the move/rename registering a new ancestor inode as
      the dependency inode (so our rename happens after that ancestor is renamed).
      
      Tests for these cases is being added to xfstests too.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      f959492f
    • F
    • F
      Btrfs: send, account for orphan directories when building path strings · c992ec94
      Filipe Manana 提交于
      If we have directories with a pending move/rename operation, we must take into
      account any orphan directories that got created before executing the pending
      move/rename. Those orphan directories are directories with an inode number higher
      then the current send progress and that don't exist in the parent snapshot, they
      are created before current progress reaches their inode number, with a generated
      name of the form oN-M-I and at the root of the filesystem tree, and later when
      progress matches their inode number, moved/renamed to their final location.
      
      Reproducer:
      
                $ mkfs.btrfs -f /dev/sdd
                $ mount /dev/sdd /mnt
      
                $ mkdir -p /mnt/a/b/c/d
                $ mkdir /mnt/a/b/e
                $ mv /mnt/a/b/c /mnt/a/b/e/CC
                $ mkdir /mnt/a/b/e/CC/d/f
      	  $ mkdir /mnt/a/g
      
                $ btrfs subvolume snapshot -r /mnt /mnt/snap1
                $ btrfs send /mnt/snap1 -f /tmp/base.send
      
                $ mkdir /mnt/a/g/h
      	  $ mv /mnt/a/b/e /mnt/a/g/h/EE
                $ mv /mnt/a/g/h/EE/CC/d /mnt/a/g/h/EE/DD
      
                $ btrfs subvolume snapshot -r /mnt /mnt/snap2
                $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send
      
      The second receive command failed with the following error:
      
          ERROR: rename a/b/e/CC/d -> o264-7-0/EE/DD failed. No such file or directory
      
      A test case for xfstests follows soon.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c992ec94
    • F
      Btrfs: send, avoid unnecessary inode item lookup in the btree · b46ab97b
      Filipe Manana 提交于
      Regardless of whether the caller is interested or not in knowing the inode's
      generation (dir_gen != NULL), get_first_ref always does a btree lookup to get
      the inode item. Avoid this useless lookup if dir_gen parameter is NULL (which
      is in some cases).
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      b46ab97b
    • D
      btrfs: protect snapshots from deleting during send · 521e0546
      David Sterba 提交于
      The patch "Btrfs: fix protection between send and root deletion"
      (18f687d5) does not actually prevent to delete the snapshot
      and just takes care during background cleaning, but this seems rather
      user unfriendly, this patch implements the idea presented in
      
      http://www.spinics.net/lists/linux-btrfs/msg30813.html
      
      - add an internal root_item flag to denote a dead root
      - check if the send_in_progress is set and refuse to delete, otherwise
        set the flag and proceed
      - check the flag in send similar to the btrfs_root_readonly checks, for
        all involved roots
      
      The root lookup in send via btrfs_read_fs_root_no_name will check if the
      root is really dead or not. If it is, ENOENT, aborted send. If it's
      alive, it's protected by send_in_progress, send can continue.
      
      CC: Miao Xie <miaox@cn.fujitsu.com>
      CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      521e0546
  19. 07 6月, 2014 1 次提交
    • F
      Btrfs: send, fix corrupted path strings for long paths · 01a9a8a9
      Filipe Manana 提交于
      If a path has more than 230 characters, we allocate a new buffer to
      use for the path, but we were forgotting to copy the contents of the
      previous buffer into the new one, which has random content from the
      kmalloc call.
      
      Test:
      
          mkfs.btrfs -f /dev/sdd
          mount /dev/sdd /mnt
      
          TEST_PATH="/mnt/fdmanana/.config/google-chrome-mysetup/Default/Pepper_Data/Shockwave_Flash/WritableRoot/#SharedObjects/JSHJ4ZKN/s.wsj.net/[[IMPORT]]/players.edgesuite.net/flash/plugins/osmf/advanced-streaming-plugin/v2.7/osmf1.6/Ak#"
          mkdir -p $TEST_PATH
          echo "hello world" > $TEST_PATH/amaiAdvancedStreamingPlugin.txt
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap1
          btrfs send /mnt/mysnap1 -f /tmp/1.snap
      
      A test for xfstests follows.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Cc: Marc Merlin <marc@merlins.org>
      Tested-by: NMarc MERLIN <marc@merlins.org>
      Signed-off-by: NChris Mason <clm@fb.com>
      01a9a8a9
  20. 21 5月, 2014 1 次提交
    • F
      Btrfs: send, fix incorrect ref access when using extrefs · 51a60253
      Filipe Manana 提交于
      When running send, if an inode only has extended reference items
      associated to it and no regular references, send.c:get_first_ref()
      was incorrectly assuming the reference it found was of type
      BTRFS_INODE_REF_KEY due to use of the wrong key variable.
      This caused weird behaviour when using the found item has a regular
      reference, such as weird path string, and occasionally (when lucky)
      a crash:
      
      [  190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy
      [  190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1
      [  190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  190.602868] task: ffff8800d447c920 ti: ffff8801fa79e000 task.ti: ffff8801fa79e000
      [  190.603030] RIP: 0010:[<ffffffff813266b4>]  [<ffffffff813266b4>] memcpy+0x54/0x110
      [  190.603262] RSP: 0018:ffff8801fa79f880  EFLAGS: 00010202
      [  190.603395] RAX: ffff8800d4326e3f RBX: 000000000000036a RCX: ffff880000000000
      [  190.603553] RDX: 000000000000032a RSI: ffe708844042936a RDI: ffff8800d43271a9
      [  190.603710] RBP: ffff8801fa79f8c8 R08: 00000000003a4ef0 R09: 0000000000000000
      [  190.603867] R10: 793a4ef09f000000 R11: 9f0000000053726f R12: ffff8800d43271a9
      [  190.604020] R13: 0000160000000000 R14: ffff8802110134f0 R15: 000000000000036a
      [  190.604020] FS:  00007fb423d09b80(0000) GS:ffff880216200000(0000) knlGS:0000000000000000
      [  190.604020] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  190.604020] CR2: 00007fb4229d4b78 CR3: 00000001f5d76000 CR4: 00000000000006e0
      [  190.604020] Stack:
      [  190.604020]  ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8
      [  190.604020]  00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8
      [  190.604020]  00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928
      [  190.604020] Call Trace:
      [  190.604020]  [<ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs]
      [  190.604020]  [<ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs]
      [  190.604020]  [<ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs]
      [  190.604020]  [<ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0
      [  190.604020]  [<ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs]
      (...)
      
      Steps to reproduce (either crash or some weirdness like an odd path string):
      
          mkfs.btrfs -f -O extref /dev/sdd
          mount /dev/sdd /mnt
      
          mkdir /mnt/testdir
          touch /mnt/testdir/foobar
      
          for i in `seq 1 2550`; do
              ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name
      
          rm -f /mnt/testdir/foobar
          for i in `seq 1 2550`; do
              rm -f /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap
          btrfs send /mnt/mysnap -f /tmp/mysnap.send
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
      51a60253
  21. 26 4月, 2014 1 次提交
    • C
      Btrfs: limit the path size in send to PATH_MAX · cfd4a535
      Chris Mason 提交于
      fs_path_ensure_buf is used to make sure our path buffers for
      send are big enough for the path names as we construct them.
      The buffer size is limited to 32K by the length field in
      the struct.
      
      But bugs in the path construction can end up trying to build
      a huge buffer, and we'll do invalid memmmoves when the
      buffer length field wraps.
      
      This patch is step one, preventing the overflows.
      Signed-off-by: NChris Mason <clm@fb.com>
      cfd4a535
  22. 08 4月, 2014 3 次提交
    • F
      Btrfs: send, build path string only once in send_hole · c715e155
      Filipe Manana 提交于
      There's no point building the path string in each iteration of the
      send_hole loop, as it produces always the same string.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c715e155
    • F
      Btrfs: send, fix data corruption due to incorrect hole detection · 766b5e5a
      Filipe Manana 提交于
      During an incremental send, when we finish processing an inode (corresponding to
      a regular file) we would assume the gap between the end of the last processed file
      extent and the file's size corresponded to a file hole, and therefore incorrectly
      send a bunch of zero bytes to overwrite that region in the file.
      
      This affects only kernel 3.14.
      
      Reproducer:
      
          mkfs.btrfs -f /dev/sdc
          mount /dev/sdc /mnt
      
          xfs_io -f -c "falloc -k 0 268435456" /mnt/foo
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap0
      
          xfs_io -c "pwrite -S 0x01 -b 9216 16190218 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x02 -b 1121 198720104 1121" /mnt/foo
          xfs_io -c "pwrite -S 0x05 -b 9216 107887439 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x06 -b 9216 225520207 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x07 -b 67584 102138300 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x08 -b 7000 94897484 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x09 -b 113664 245083212 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x10 -b 123 17937788 123" /mnt/foo
          xfs_io -c "pwrite -S 0x11 -b 39936 229573311 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x12 -b 67584 174792222 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x13 -b 9216 249253213 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x16 -b 67584 150046083 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x17 -b 39936 118246040 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x18 -b 67584 215965442 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x19 -b 33792 97096725 33792" /mnt/foo
          xfs_io -c "pwrite -S 0x20 -b 125952 166300596 125952" /mnt/foo
          xfs_io -c "pwrite -S 0x21 -b 123 1078957 123" /mnt/foo
          xfs_io -c "pwrite -S 0x25 -b 9216 212044492 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x26 -b 7000 265037146 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x27 -b 42757 215922685 42757" /mnt/foo
          xfs_io -c "pwrite -S 0x28 -b 7000 69865411 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x29 -b 67584 67948958 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x30 -b 39936 266967019 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x31 -b 1121 19582453 1121" /mnt/foo
          xfs_io -c "pwrite -S 0x32 -b 17408 257710255 17408" /mnt/foo
          xfs_io -c "pwrite -S 0x33 -b 39936 3895518 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x34 -b 125952 12045847 125952" /mnt/foo
          xfs_io -c "pwrite -S 0x35 -b 17408 19156379 17408" /mnt/foo
          xfs_io -c "pwrite -S 0x36 -b 39936 50160066 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x37 -b 113664 9549793 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x38 -b 105472 94391506 105472" /mnt/foo
          xfs_io -c "pwrite -S 0x39 -b 23552 143632863 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x40 -b 39936 241283845 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x41 -b 113664 199937606 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x42 -b 67584 67380093 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x43 -b 67584 26793129 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x44 -b 39936 14421913 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x45 -b 123 253097405 123" /mnt/foo
          xfs_io -c "pwrite -S 0x46 -b 1121 128233424 1121" /mnt/foo
          xfs_io -c "pwrite -S 0x47 -b 105472 91577959 105472" /mnt/foo
          xfs_io -c "pwrite -S 0x48 -b 1121 7245381 1121" /mnt/foo
          xfs_io -c "pwrite -S 0x49 -b 113664 182414694 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x50 -b 9216 32750608 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x51 -b 67584 266546049 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x52 -b 67584 87969398 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x53 -b 9216 260848797 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x54 -b 39936 119461243 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x55 -b 7000 200178693 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x56 -b 9216 243316029 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x57 -b 7000 209658229 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x58 -b 101376 179745192 101376" /mnt/foo
          xfs_io -c "pwrite -S 0x59 -b 9216 64012300 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x60 -b 125952 181705139 125952" /mnt/foo
          xfs_io -c "pwrite -S 0x61 -b 23552 235737348 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x62 -b 113664 106021355 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x63 -b 67584 135753552 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x64 -b 23552 95730888 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x65 -b 11 17311415 11" /mnt/foo
          xfs_io -c "pwrite -S 0x66 -b 33792 120695553 33792" /mnt/foo
          xfs_io -c "pwrite -S 0x67 -b 9216 17164631 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x68 -b 9216 136065853 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x69 -b 67584 37752198 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x70 -b 101376 189717473 101376" /mnt/foo
          xfs_io -c "pwrite -S 0x71 -b 7000 227463698 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x72 -b 9216 12655137 9216" /mnt/foo
          xfs_io -c "pwrite -S 0x73 -b 7000 7488866 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x74 -b 113664 87813649 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x75 -b 33792 25802183 33792" /mnt/foo
          xfs_io -c "pwrite -S 0x76 -b 39936 93524024 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x77 -b 33792 113336388 33792" /mnt/foo
          xfs_io -c "pwrite -S 0x78 -b 105472 184955320 105472" /mnt/foo
          xfs_io -c "pwrite -S 0x79 -b 101376 225691598 101376" /mnt/foo
          xfs_io -c "pwrite -S 0x80 -b 23552 77023155 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x81 -b 11 201888192 11" /mnt/foo
          xfs_io -c "pwrite -S 0x82 -b 11 115332492 11" /mnt/foo
          xfs_io -c "pwrite -S 0x83 -b 67584 230278015 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x84 -b 11 120589073 11" /mnt/foo
          xfs_io -c "pwrite -S 0x85 -b 125952 202207819 125952" /mnt/foo
          xfs_io -c "pwrite -S 0x86 -b 113664 86672080 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x87 -b 17408 208459603 17408" /mnt/foo
          xfs_io -c "pwrite -S 0x88 -b 7000 73372211 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x89 -b 7000 42252122 7000" /mnt/foo
          xfs_io -c "pwrite -S 0x90 -b 23552 46784881 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x91 -b 101376 63172351 101376" /mnt/foo
          xfs_io -c "pwrite -S 0x92 -b 23552 59341931 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x93 -b 39936 239599283 39936" /mnt/foo
          xfs_io -c "pwrite -S 0x94 -b 67584 175643105 67584" /mnt/foo
          xfs_io -c "pwrite -S 0x97 -b 23552 105534880 23552" /mnt/foo
          xfs_io -c "pwrite -S 0x98 -b 113664 8236844 113664" /mnt/foo
          xfs_io -c "pwrite -S 0x99 -b 125952 144489686 125952" /mnt/foo
          xfs_io -c "pwrite -S 0xa0 -b 7000 73273112 7000" /mnt/foo
          xfs_io -c "pwrite -S 0xa1 -b 125952 194580243 125952" /mnt/foo
          xfs_io -c "pwrite -S 0xa2 -b 123 56296779 123" /mnt/foo
          xfs_io -c "pwrite -S 0xa3 -b 11 233066845 11" /mnt/foo
          xfs_io -c "pwrite -S 0xa4 -b 39936 197727090 39936" /mnt/foo
          xfs_io -c "pwrite -S 0xa5 -b 101376 53579812 101376" /mnt/foo
          xfs_io -c "pwrite -S 0xa6 -b 9216 85669738 9216" /mnt/foo
          xfs_io -c "pwrite -S 0xa7 -b 125952 21266322 125952" /mnt/foo
          xfs_io -c "pwrite -S 0xa8 -b 23552 125726568 23552" /mnt/foo
          xfs_io -c "pwrite -S 0xa9 -b 9216 18423680 9216" /mnt/foo
          xfs_io -c "pwrite -S 0xb0 -b 1121 165901483 1121" /mnt/foo
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap1
      
          xfs_io -c "pwrite -S 0xff -b 10 16190218 10" /mnt/foo
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap2
      
          md5sum /mnt/foo          # returns 79e53f1466bfc09fd82b450689e6119e
          md5sum /mnt/mysnap2/foo  # returns 79e53f1466bfc09fd82b450689e6119e too
      
          btrfs send /mnt/mysnap1 -f /tmp/1.snap
          btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap
      
          mkfs.btrfs -f /dev/sdc
          mount /dev/sdc /mnt
      
          btrfs receive /mnt -f /tmp/1.snap
          btrfs receive /mnt -f /tmp/2.snap
      
          md5sum /mnt/mysnap2/foo  # returns 2bb414c5155767cedccd7063e51beabd !!
      
      A testcase for xfstests follows soon too.
      Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      766b5e5a
    • J
      Btrfs: hold the commit_root_sem when getting the commit root during send · 3f8a18cc
      Josef Bacik 提交于
      We currently rely too heavily on roots being read-only to save us from just
      accessing root->commit_root.  We can easily balance blocks out from underneath a
      read only root, so to save us from getting screwed make sure we only access
      root->commit_root under the commit root sem.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      3f8a18cc
  23. 07 4月, 2014 2 次提交
    • J
      Btrfs: remove transaction from send · 9e351cc8
      Josef Bacik 提交于
      Lets try this again.  We can deadlock the box if we send on a box and try to
      write onto the same fs with the app that is trying to listen to the send pipe.
      This is because the writer could get stuck waiting for a transaction commit
      which is being blocked by the send.  So fix this by making sure looking at the
      commit roots is always going to be consistent.  We do this by keeping track of
      which roots need to have their commit roots swapped during commit, and then
      taking the commit_root_sem and swapping them all at once.  Then make sure we
      take a read lock on the commit_root_sem in cases where we search the commit root
      to make sure we're always looking at a consistent view of the commit roots.
      Previously we had problems with this because we would swap a fs tree commit root
      and then swap the extent tree commit root independently which would cause the
      backref walking code to screw up sometimes.  With this patch we no longer
      deadlock and pass all the weird send/receive corner cases.  Thanks,
      Reportedy-by: NHugo Mills <hugo@carfax.org.uk>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      9e351cc8
    • J
      Btrfs: don't clear uptodate if the eb is under IO · a26e8c9f
      Josef Bacik 提交于
      So I have an awful exercise script that will run snapshot, balance and
      send/receive in parallel.  This sometimes would crash spectacularly and when it
      came back up the fs would be completely hosed.  Turns out this is because of a
      bad interaction of balance and send/receive.  Send will hold onto its entire
      path for the whole send, but its blocks could get relocated out from underneath
      it, and because it doesn't old tree locks theres nothing to keep this from
      happening.  So it will go to read in a slot with an old transid, and we could
      have re-allocated this block for something else and it could have a completely
      different transid.  But because we think it is invalid we clear uptodate and
      re-read in the block.  If we do this before we actually write out the new block
      we could write back stale data to the fs, and boom we're screwed.
      
      Now we definitely need to fix this disconnect between send and balance, but we
      really really need to not allow ourselves to accidently read in stale data over
      new data.  So make sure we check if the extent buffer is not under io before
      clearing uptodate, this will kick back EIO to the caller instead of reading in
      stale data and keep us from corrupting the fs.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      a26e8c9f
  24. 22 3月, 2014 1 次提交