1. 21 Jan, 2016 1 commit
  2. 12 Jan, 2016 1 commit
  3. 11 Jan, 2016 2 commits
  4. 08 Jan, 2016 1 commit
    • Btrfs: fix fitrim discarding device area reserved for boot loader's use · 8cdc7c5b
      Filipe Manana committed
      As of the 4.3 kernel release, the fitrim ioctl can now discard any region
      of a disk that is not allocated to any chunk/block group, including the
      first megabyte which is used for our primary superblock and by the boot
      loader (grub for example).
      
      Fix this by not allowing to trim/discard any region in the device starting
      with an offset not greater than min(alloc_start_mount_option, 1Mb), just
      as it was not possible before 4.3.
      
      A reproducer test case for xfstests follows.
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            cd /
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
      
        rm -f $seqres.full
      
        _scratch_mkfs >>$seqres.full 2>&1
      
        # Write to the [0, 64Kb[ and [68Kb, 1Mb[ ranges of the device. These ranges are
        # reserved for a boot loader to use (GRUB for example) and btrfs should never
        # use them - neither for allocating metadata/data nor should trim/discard them.
        # The range [64Kb, 68Kb[ is used for the primary superblock of the filesystem.
        $XFS_IO_PROG -c "pwrite -S 0xfd 0 64K" $SCRATCH_DEV | _filter_xfs_io
        $XFS_IO_PROG -c "pwrite -S 0xfd 68K 956K" $SCRATCH_DEV | _filter_xfs_io
      
        # Now mount the filesystem and perform a fitrim against it.
        _scratch_mount
        _require_batched_discard $SCRATCH_MNT
        $FSTRIM_PROG $SCRATCH_MNT
      
        # Now unmount the filesystem and verify the content of the ranges was not
        # modified (no trim/discard happened on them).
        _scratch_unmount
        echo "Content of the ranges [0, 64Kb] and [68Kb, 1Mb[ after fitrim:"
        od -t x1 -N $((64 * 1024)) $SCRATCH_DEV
        od -t x1 -j $((68 * 1024)) -N $((956 * 1024)) $SCRATCH_DEV
      
        status=0
        exit
      Reported-by: Vincent Petry <PVince81@yahoo.fr>
      Reported-by: Andrei Borzenkov <arvidjaar@gmail.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109341
      Fixes: 499f377f (btrfs: iterate over unused chunk space in FITRIM)
      Cc: stable@vger.kernel.org # 4.3+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
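The reproducer above detects an unwanted discard by comparing `od` dumps against the test's expected output. As a rough illustration of the same check, the sketch below tests whether a byte range still holds the 0xfd fill pattern. It runs against a scratch image file rather than a real device, the helper name `range_is_filled` is hypothetical (it is not part of xfstests), and the "discard" is simulated by zeroing the range, which is an assumption about how a trimmed region reads back.

```shell
#!/bin/sh
# Hypothetical helper (not part of xfstests): succeeds only if LENGTH bytes
# starting at OFFSET in FILE all equal BYTE (two lowercase hex digits).
# Usage: range_is_filled FILE OFFSET LENGTH BYTE
range_is_filled() {
    # -An drops the offset column, -v disables "*" folding of repeated lines.
    # One value per line, de-duplicated, leaves the distinct bytes in the range.
    distinct=$(od -An -v -t x1 -j "$2" -N "$3" "$1" |
        tr -s ' ' '\n' | grep -v '^$' | sort -u)
    [ "$distinct" = "$4" ]
}

# Scratch image standing in for $SCRATCH_DEV: 1 MiB filled with 0xfd (octal 375).
img=$(mktemp)
dd if=/dev/zero bs=1024 count=1024 2>/dev/null | LC_ALL=C tr '\000' '\375' > "$img"

range_is_filled "$img" 0 $((64 * 1024)) fd && low_before=intact || low_before=modified
range_is_filled "$img" $((68 * 1024)) $((956 * 1024)) fd && high_before=intact || high_before=modified

# Simulate an unwanted discard of the first 64 KiB by zeroing it in place.
dd if=/dev/zero of="$img" bs=1024 count=64 conv=notrunc 2>/dev/null

range_is_filled "$img" 0 $((64 * 1024)) fd && low_after=intact || low_after=modified
range_is_filled "$img" $((68 * 1024)) $((956 * 1024)) fd && high_after=intact || high_after=modified

echo "before: low=$low_before high=$high_before"
echo "after:  low=$low_after high=$high_after"
rm -f "$img"
```

Only the zeroed low range should be reported as modified; the range above the superblock is left alone in this simulation.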
  5. 07 Jan, 2016 31 commits
  6. 01 Jan, 2016 3 commits
    • Btrfs: fix number of transaction units required to create symlink · 9269d12b
      Filipe Manana committed
      We weren't accounting for the insertion of an inline extent item for the
      symlink inode nor that we need to update the parent inode item (through
      the call to btrfs_add_nondir()). So fix this by including two more
      transaction units.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
    • Btrfs: don't leave dangling dentry if symlink creation failed · d50866d0
      Filipe Manana committed
      When we are creating a symlink we might fail with an error after we
      created its inode and added the corresponding directory indexes to its
      parent inode. In this case we end up never removing the directory indexes
      because the inode eviction handler, called for our symlink inode on the
      final iput(), only removes items associated with the symlink inode and
      not with the parent inode.
      
      Example:
      
        $ mkfs.btrfs -f /dev/sdi
        $ mount /dev/sdi /mnt
        $ touch /mnt/foo
        $ ln -s /mnt/foo /mnt/bar
        ln: failed to create symbolic link ‘bar’: Cannot allocate memory
        $ umount /mnt
        $ btrfsck /dev/sdi
        Checking filesystem on /dev/sdi
        UUID: d5acb5ba-31bd-42da-b456-89dca2e716e1
        checking extents
        checking free space cache
        checking fs roots
        root 5 inode 258 errors 2001, no inode item, link count wrong
      	unresolved ref dir 256 index 3 namelen 3 name bar filetype 7 errors 4, no inode ref
        found 131073 bytes used err is 1
        total csum bytes: 0
        total tree bytes: 131072
        total fs tree bytes: 32768
        total extent tree bytes: 16384
        btree space waste bytes: 124305
        file data blocks allocated: 262144
         referenced 262144
        btrfs-progs v4.2.3
      
      So fix this by adding the directory index entries as the very last
      step of symlink creation.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
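The btrfsck output in the example above reports the dangling entry on a line beginning with "unresolved ref dir". The sketch below picks such lines out of a saved fsck log and prints the parent directory inode, the directory index and the entry name. The field layout is assumed from that single sample line, the function name `list_unresolved_refs` is hypothetical, and names containing whitespace are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read btrfsck output on stdin and summarize every
# "unresolved ref" line. Field positions are inferred from one sample line.
list_unresolved_refs() {
    awk '
        /unresolved ref dir/ {
            dir = ""; idx = ""; name = ""
            for (i = 1; i < NF; i++) {
                # Take the value following the first occurrence of each keyword.
                if ($i == "dir"   && dir  == "") dir  = $(i + 1)
                if ($i == "index" && idx  == "") idx  = $(i + 1)
                if ($i == "name"  && name == "") name = $(i + 1)
            }
            print "dir=" dir " index=" idx " name=" name
        }'
}

# Two lines copied from the example output above.
report=$(list_unresolved_refs <<'EOF'
root 5 inode 258 errors 2001, no inode item, link count wrong
	unresolved ref dir 256 index 3 namelen 3 name bar filetype 7 errors 4, no inode ref
EOF
)
echo "$report"
```

For the sample input this prints `dir=256 index=3 name=bar`, matching the leftover "bar" entry in directory inode 256.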
    • Btrfs: send, don't BUG_ON() when an empty symlink is found · a879719b
      Filipe Manana committed
      When a symlink is successfully created it always has an inline extent
      containing the source path. However if an error happens when creating
      the symlink, we can leave in the subvolume's tree a symlink inode without
      any such inline extent item - this happens if after btrfs_symlink() calls
      btrfs_end_transaction() and before it calls the inode eviction handler
      (through the final iput() call), the transaction gets committed and a
      crash happens before the eviction handler gets called, or if a snapshot
      of the subvolume is made before the eviction handler gets called. Sadly
      we can't just avoid this by making btrfs_symlink() call
      btrfs_end_transaction() after it calls the eviction handler, because the
      latter can commit the current transaction before it removes any items from
      the subvolume tree (if it encounters ENOSPC errors while reserving space
      for removing all the items).
      
      So make send fail more gracefully, with an -EIO error, and print a
      message to dmesg/syslog informing that there's an empty symlink inode,
      so that the user can delete the empty symlink or do something else
      about it.
      Reported-by: Stephen R. van den Berg <srb@cuci.nl>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
  7. 31 Dec, 2015 1 commit
    • Btrfs: fix race between free space endio workers and space cache writeout · 2bc0bb5f
      Filipe Manana committed
      While running a stress test I ran into the following trace/transaction
      abort:
      
      [471626.672243] ------------[ cut here ]------------
      [471626.673322] WARNING: CPU: 9 PID: 19107 at fs/btrfs/extent-tree.c:3740 btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]()
      [471626.675492] BTRFS: Transaction aborted (error -2)
      [471626.676748] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc i2c_piix
      [471626.688802] CPU: 14 PID: 19107 Comm: fsstress Tainted: G        W       4.3.0-rc5-btrfs-next-17+ #1
      [471626.690148] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
      [471626.691901]  0000000000000000 ffff880016037cf0 ffffffff812566f4 ffff880016037d38
      [471626.695009]  ffff880016037d28 ffffffff8104d0a6 ffffffffa040c84e 00000000fffffffe
      [471626.697490]  ffff88011fe855f8 ffff88000c484cb0 ffff88000d195000 ffff880016037d90
      [471626.699201] Call Trace:
      [471626.699804]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
      [471626.701049]  [<ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
      [471626.702542]  [<ffffffffa040c84e>] ? btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
      [471626.704326]  [<ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
      [471626.705636]  [<ffffffffa0403717>] ? write_one_cache_group.isra.32+0x77/0x82 [btrfs]
      [471626.707048]  [<ffffffffa040c84e>] btrfs_write_dirty_block_groups+0x17c/0x214 [btrfs]
      [471626.708616]  [<ffffffffa048a50a>] commit_cowonly_roots+0x1d7/0x25a [btrfs]
      [471626.709950]  [<ffffffffa041e34a>] btrfs_commit_transaction+0x4c4/0x991 [btrfs]
      [471626.711286]  [<ffffffff81081c61>] ? signal_pending_state+0x31/0x31
      [471626.712611]  [<ffffffffa03f6df4>] btrfs_sync_fs+0x145/0x1ad [btrfs]
      [471626.715610]  [<ffffffff811962a2>] ? SyS_tee+0x226/0x226
      [471626.716718]  [<ffffffff811962c2>] sync_fs_one_sb+0x20/0x22
      [471626.717672]  [<ffffffff8116fc01>] iterate_supers+0x75/0xc2
      [471626.718800]  [<ffffffff8119669a>] sys_sync+0x52/0x80
      [471626.719990]  [<ffffffff8147cd97>] entry_SYSCALL_64_fastpath+0x12/0x6f
      [471626.721835] ---[ end trace baf57f43d76693f4 ]---
      [471626.722954] BTRFS: error (device sdc) in btrfs_write_dirty_block_groups:3740: errno=-2 No such entry
      
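In the trace above, the line "BTRFS: Transaction aborted (error -2)" carries the error code, and -2 corresponds to -ENOENT ("No such entry" in the final line). The sketch below extracts that code from a saved kernel log. The `sed` pattern is assumed from this one sample, and the function name `aborted_errors` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the error code of every btrfs transaction abort
# found in kernel log text supplied on stdin.
aborted_errors() {
    sed -n 's/.*BTRFS: Transaction aborted (error \(-\{0,1\}[0-9][0-9]*\)).*/\1/p'
}

# Two lines copied from the trace above.
codes=$(aborted_errors <<'EOF'
[471626.675492] BTRFS: Transaction aborted (error -2)
[471626.676748] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq
EOF
)
echo "$codes"
```

For this sample the function prints `-2`.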
      This is a very rare situation and it happened due to a race between a free
      space endio worker and writing the space caches for dirty block groups at
      a transaction's commit critical section. The steps leading to this are:
      
      1) A task calls btrfs_commit_transaction() and starts the writeout of the
         space caches for all currently dirty block groups (i.e. it calls
         btrfs_start_dirty_block_groups());
      
      2) The previous step starts writeback for space caches;
      
      3) When the writeback finishes it queues jobs for free space endio work
         queue (fs_info->endio_freespace_worker) that execute
         btrfs_finish_ordered_io();
      
      4) The task committing the transaction sets the transaction's state
         to TRANS_STATE_COMMIT_DOING and shortly after calls
         btrfs_write_dirty_block_groups();
      
      5) A free space endio job joins the transaction, through
         btrfs_join_transaction_nolock(), and updates a free space inode item
         in the root tree through btrfs_update_inode_fallback();
      
      6) Updating the free space inode item resulted in COWing one or more
         nodes/leaves of the root tree, and that resulted in creating a new
         metadata block group, which gets added to the transaction's list
         of dirty block groups (this is a very rare case);
      
      7) The free space endio job has not yet released its transaction handle
         at this point, so the new metadata block group was not yet fully
         created (didn't go through btrfs_create_pending_block_groups() yet);
      
      8) The transaction commit task sees the new metadata block group in
         the transaction's list of dirty block groups and processes it.
         When it attempts to update the block group's block group item in
         the extent tree, through write_one_cache_group(), it isn't able
         to find it and aborts the transaction with error -ENOENT - this
         is because the free space endio job hasn't yet released its
         transaction handle (which calls btrfs_create_pending_block_groups())
         and therefore the block group item was not yet added to the extent
         tree.
      
      Fix this by waiting for free space endio jobs if we fail to find a block
      group item in the extent tree, and then retrying the update of the block
      group item once.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
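The fix described above follows a "wait for in-flight workers, then retry once" pattern. The sketch below is a userspace analogy in shell, not kernel code: a background job stands in for the free space endio job that has not yet inserted the block group item, a file stands in for that item, and the shell's `wait` stands in for waiting on the endio workers. All names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
item="$workdir/block_group_item"

# Stand-in for the endio job: the item only appears once the job finishes.
( sleep 1; : > "$item" ) &

# Stand-in for updating the block group item: fails if the item is missing.
update_item() {
    [ -e "$item" ] || return 2    # 2 mirrors ENOENT
    echo updated > "$item"
}

attempts=1
status=0
update_item || status=$?
if [ "$status" -eq 2 ]; then
    # Lookup failed: wait for outstanding background jobs, then retry exactly once.
    wait
    attempts=2
    status=0
    update_item || status=$?
fi

if [ "$status" -eq 0 ]; then result=ok; else result=aborted; fi
content=$(cat "$item" 2>/dev/null)
echo "result=$result attempts=$attempts"
rm -rf "$workdir"
```

The retry is limited to a single attempt and is taken only for the "not found" status, so any other failure is still reported immediately.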