1. 06 Dec, 2022 2 commits
  2. 24 Oct, 2022 1 commit
    • btrfs: make thaw time super block check to also verify checksum · 3d17adea
      Qu Wenruo committed
      Previous commit a05d3c91 ("btrfs: check superblock to ensure the fs
      was not modified at thaw time") only checks the content of the super
      block, but it doesn't really check if the on-disk super block has a
      matching checksum.
      
      This patch will add the checksum verification to thaw time superblock
      verification.
      
      This involves the following extra changes:
      
      - Export btrfs_check_super_csum()
        As we need to call it in super.c.
      
      - Change the argument list of btrfs_check_super_csum()
        Instead of passing a char *, directly pass struct btrfs_super_block *
        pointer.
      
      - Verify that our checksum type didn't change before checking the
        checksum value, like it's done at mount time
      
      Fixes: a05d3c91 ("btrfs: check superblock to ensure the fs was not modified at thaw time")
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
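As a rough user-space illustration of the check order described in this commit (checksum type first, then checksum value), the following Python sketch models a 4096-byte superblock whose first 32 bytes hold the checksum of the remaining bytes. It is not the kernel code: `make_super` and `check_super_at_thaw` are hypothetical helpers, the `csum_type` offset and the SHA256 type value are assumptions, and SHA256 is used only because it is available in the Python standard library.

```python
import hashlib
import struct

SUPER_INFO_SIZE = 4096    # BTRFS_SUPER_INFO_SIZE
CSUM_SIZE = 32            # bytes reserved for the checksum at the start
CSUM_TYPE_OFFSET = 0xC4   # assumed offset of the little-endian u16 csum_type
CSUM_TYPE_SHA256 = 2      # assumed numeric value of the SHA256 checksum type


def make_super(payload: bytes = b"") -> bytes:
    """Build a toy superblock (hypothetical helper) with a valid checksum."""
    block = bytearray(SUPER_INFO_SIZE)
    block[CSUM_SIZE:CSUM_SIZE + len(payload)] = payload
    struct.pack_into("<H", block, CSUM_TYPE_OFFSET, CSUM_TYPE_SHA256)
    # The checksum covers everything after the checksum field itself.
    block[:CSUM_SIZE] = hashlib.sha256(bytes(block[CSUM_SIZE:])).digest()
    return bytes(block)


def check_super_at_thaw(on_disk: bytes, expected_csum_type: int) -> bool:
    """Return True when the on-disk copy passes both checks."""
    if len(on_disk) != SUPER_INFO_SIZE:
        return False
    (csum_type,) = struct.unpack_from("<H", on_disk, CSUM_TYPE_OFFSET)
    # Check the type first: with a different type the stored checksum was
    # produced by a different algorithm, so comparing values is meaningless.
    if csum_type != expected_csum_type:
        return False
    return hashlib.sha256(on_disk[CSUM_SIZE:]).digest() == on_disk[:CSUM_SIZE]
```

A caller would read the superblock back from disk at thaw time and treat a `False` result the same way as any other unexpected superblock change.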
  3. 26 Sep, 2022 3 commits
    • btrfs: relax block-group-tree feature dependency checks · d7f67ac9
      Qu Wenruo committed
      [BUG]
      When a user mistakenly tried to clear the block group tree (which can
      not be done through mount options) by using
      "-o clear_cache,space_cache=v2", it caused the following error on a fs
      with the block-group-tree feature:
      
        BTRFS info (device dm-1): force clearing of disk cache
        BTRFS info (device dm-1): using free space tree
        BTRFS info (device dm-1): clearing free space tree
        BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
        BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
        BTRFS error (device dm-1): block-group-tree feature requires fres-space-tree and no-holes
        BTRFS error (device dm-1): super block corruption detected before writing it to disk
        BTRFS: error (device dm-1) in write_all_supers:4318: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
        BTRFS warning (device dm-1: state E): Skipping commit of aborted transaction.
      
      [CAUSE]
      Although the dependency for block-group-tree feature is just an
      artificial one (to reduce test matrix), we put the dependency check into
      btrfs_validate_super().
      
      This is too strict, and during space cache clearing, we will have a
      window where free space tree is cleared, and we need to commit the super
      block.
      
      In that window, we had block group tree without v2 cache, and triggered
      the artificial dependency check.
      
      This is not necessary at all, especially for such a soft dependency.
      
      [FIX]
      Introduce a new helper, btrfs_check_features(), to do all the runtime
      limitation checks, including:
      
      - Unsupported incompat flags check
      
      - Unsupported compat RO flags check
      
      - Setting missing incompat flags
      
      - Artificial feature dependency checks
        Currently only block group tree will rely on this.
      
      - Subpage runtime check for v1 cache
      
      With this helper, we can move quite a few checks from
      open_ctree()/btrfs_remount() into it, and just call it after
      btrfs_parse_options().
      
      Now "-o clear_cache,space_cache=v2" will not trigger the above error
      anymore.
      
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ edit messages ]
      Signed-off-by: David Sterba <dsterba@suse.com>
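The effect of moving the soft dependency check out of superblock validation can be sketched in a few lines of Python. This is only a model under stated assumptions: `check_features` and `mount_with_clear_cache` are hypothetical stand-ins, the two free space tree flag values come from the log above, and the `BLOCK_GROUP_TREE` and `NO_HOLES` bit values are assumed for illustration.

```python
FREE_SPACE_TREE = 1 << 0        # compat RO, 0x1 in the log above
FREE_SPACE_TREE_VALID = 1 << 1  # compat RO, 0x2 in the log above
BLOCK_GROUP_TREE = 1 << 3       # compat RO, value assumed for illustration
NO_HOLES = 1 << 9               # incompat, value assumed for illustration


def check_features(compat_ro, incompat):
    """Runtime check, meant to run once after the mount options are parsed."""
    errors = []
    needs_deps = compat_ro & BLOCK_GROUP_TREE
    has_deps = (compat_ro & FREE_SPACE_TREE_VALID) and (incompat & NO_HOLES)
    if needs_deps and not has_deps:
        errors.append("block-group-tree requires free-space-tree and no-holes")
    return errors


def mount_with_clear_cache(compat_ro, incompat):
    """Model of "-o clear_cache,space_cache=v2" with the relaxed check."""
    errors = check_features(compat_ro, incompat)
    if errors:
        raise ValueError("; ".join(errors))
    # Window: the free space tree is cleared and the superblock is written.
    # The dependency is not re-checked here, so the write is not rejected.
    in_window = compat_ro & ~(FREE_SPACE_TREE | FREE_SPACE_TREE_VALID)
    # Afterwards the v2 cache is rebuilt and both flags are set again.
    rebuilt = in_window | FREE_SPACE_TREE | FREE_SPACE_TREE_VALID
    return in_window, rebuilt
```

In this model the flags seen inside the window would fail `check_features`, which is exactly why running that check on every superblock write was too strict.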
    • btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 · 1c56ab99
      Qu Wenruo committed
      The problem of long mount time caused by block group item search has
      been known for some time, and the block group tree has been proposed
      as the solution.
      
      There is really no need to bind this feature to extent tree v2; just
      introduce a compat RO flag, BLOCK_GROUP_TREE, to correctly solve the
      problem.
      
      All the code handling block group root is already in the upstream
      kernel, thus this patch really only needs to introduce the new compat RO
      flag.
      
      This patch introduces one extra artificial limitation on block group
      tree feature, that free space cache v2 and no-holes feature must be
      enabled to use this new compat RO feature.
      
      This artificial requirement is mostly to reduce the test combinations,
      and can be a guideline for future features, to mostly rely on the latest
      default features.
      
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: check superblock to ensure the fs was not modified at thaw time · a05d3c91
      Qu Wenruo committed
      [BACKGROUND]
      There is an incident report where a user hibernated the system with
      a btrfs on a removable device still mounted.
      
      Then, by accident, the btrfs got mounted and modified by another
      system/OS, and was then attached back to the hibernated system.
      
      After resuming from hibernation, new writes went to the victim btrfs.
      
      Now the fs is completely broken, since the underlying btrfs is no longer
      the same one as before the hibernation, and the user lost their data due
      to various transid mismatches.
      
      [REPRODUCER]
      We can emulate the situation using the following small script:
      
        truncate -s 1G $dev
        mkfs.btrfs -f $dev
        mount $dev $mnt
        fsstress -w -d $mnt -n 500
        sync
        xfs_freeze -f $mnt
        cp $dev $dev.backup
      
        # There is no way to mount the same cloned fs on the same system,
        # as the conflicting fsid will be rejected by btrfs.
        # Thus here we have to wipe the fs using a different btrfs.
        mkfs.btrfs -f $dev.backup
      
        dd if=$dev.backup of=$dev bs=1M
        xfs_freeze -u $mnt
        fsstress -w -d $mnt -n 20
        umount $mnt
        btrfs check $dev
      
      The final fsck will fail because some tree blocks have an incorrect fsid.
      
      This is enough to emulate the problem hit by the unfortunate user.
      
      [ENHANCEMENT]
      Although such a case should not be that common, it can still happen from
      time to time.
      
      From the view of btrfs, we can detect any unexpected super block change,
      and if there is any unexpected change, we just mark the fs read-only,
      and thaw the fs.
      
      This way we can keep the damage to a minimum, and I hope no one will
      lose their data this way anymore.
      
      Suggested-by: Goffredo Baroncelli <kreijack@libero.it>
      Link: https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@bluemole.com/
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
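The thaw-time behaviour described under [ENHANCEMENT] can be modelled as a comparison between what the frozen filesystem remembers and what is on disk at thaw time. The sketch below is illustrative only: the commit message does not list which fields are compared, so using `fsid` and `generation` is an assumption, and `SuperInfo`, `MountedFs` and `thaw` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuperInfo:
    fsid: bytes       # filesystem UUID (assumed to be part of the comparison)
    generation: int   # last committed transaction id (also an assumption)


@dataclass
class MountedFs:
    super_copy: SuperInfo   # what the frozen fs believes is on disk
    read_only: bool = False


def thaw(fs: MountedFs, on_disk: SuperInfo) -> bool:
    """Thaw the fs; return False if it had to be forced read-only."""
    if on_disk != fs.super_copy:
        # The device changed underneath us: refuse further writes instead
        # of mixing metadata from two different filesystems.
        fs.read_only = True
        return False
    return True
```

In the reproducer above, the `dd` step would leave a different fsid on `$dev`, so a check of this kind would force the filesystem read-only before the second `fsstress` run could write anything.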
  4. 17 Aug, 2022 1 commit
  5. 25 Jul, 2022 3 commits
  6. 16 May, 2022 4 commits
  7. 14 Mar, 2022 1 commit
  8. 03 Jan, 2022 4 commits
  9. 27 Oct, 2021 2 commits
    • btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE · 38732474
      Qu Wenruo committed
      It's a common practice to avoid using sizeof(struct btrfs_super_block)
      (3531), and to use BTRFS_SUPER_INFO_SIZE (4096) instead.
      
      The problem is that sizeof(struct btrfs_super_block) has never matched
      BTRFS_SUPER_INFO_SIZE.
      
      Furthermore, for all call sites except selftests, we always allocate
      BTRFS_SUPER_INFO_SIZE space for super block, there isn't any real reason
      to use the smaller value, and it doesn't really save any space.
      
      So let's get rid of such confusing behavior, and unify those two values.
      
      This modification also adds a new static_assert() to verify the size,
      and moves the BTRFS_SUPER_INFO_* macros to the definition of
      btrfs_super_block for the static_assert().
      
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
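The size arithmetic behind this change can be checked with a small `ctypes` model. This is a sketch, not the kernel definition: the real structure has many named members, which are collapsed here into one opaque 3531-byte array, and `SuperBlockModel` is a hypothetical name. The final `assert` plays the role that `static_assert()` plays in the C code.

```python
import ctypes

BTRFS_SUPER_INFO_SIZE = 4096
OLD_STRUCT_SIZE = 3531  # sizeof(struct btrfs_super_block) quoted above


class SuperBlockModel(ctypes.Structure):
    """Toy stand-in for the superblock: existing fields plus explicit padding."""
    _fields_ = [
        # All existing members, collapsed into one opaque byte array.
        ("fields", ctypes.c_uint8 * OLD_STRUCT_SIZE),
        # Padding that brings the structure up to the full on-disk size.
        ("padding", ctypes.c_uint8 * (BTRFS_SUPER_INFO_SIZE - OLD_STRUCT_SIZE)),
    ]


# Fails at import time if the two sizes ever drift apart again.
assert ctypes.sizeof(SuperBlockModel) == BTRFS_SUPER_INFO_SIZE
```

With the sizes unified, code can use either the structure size or the macro without the two silently disagreeing.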
    • btrfs: rename struct btrfs_io_bio to btrfs_bio · c3a3b19b
      Qu Wenruo committed
      Previously we had "struct btrfs_bio", which records IO context for
      mirrored IO and RAID56, and "struct btrfs_io_bio", which records extra
      btrfs specific info for logical bytenr bio.
      
      With "btrfs_bio" renamed to "btrfs_io_context", we are safe to rename
      "btrfs_io_bio" to "btrfs_bio" which is a more suitable name now.
      
      The struct btrfs_bio changes meaning by this commit. There was a
      suggested name like btrfs_logical_bio but it's a bit long and we'd
      prefer to use a shorter name.
      
      This could be a concern for backports to older kernels where the
      different meaning could possibly cause confusion or bugs. Comparing the
      new and old structures, there's no overlap among the struct members so a
      build would break in case of incorrect backport.
      
      We haven't had many backports to bio code anyway so this is more of a
      theoretical cause of bugs and a matter of precaution but we'll need to
      keep the semantic change in mind.
      
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  10. 09 Feb, 2021 3 commits
  11. 10 Dec, 2020 4 commits
  12. 08 Dec, 2020 6 commits
  13. 26 Oct, 2020 1 commit
    • btrfs: add a helper to read the tree_root commit root for backref lookup · 49d11bea
      Josef Bacik committed
      I got the following lockdep splat with tree locks converted to rwsem
      patches on btrfs/104:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.9.0+ #102 Not tainted
        ------------------------------------------------------
        btrfs-cleaner/903 is trying to acquire lock:
        ffff8e7fab6ffe30 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x32/0x170
      
        but task is already holding lock:
        ffff8e7fab628a88 (&fs_info->commit_root_sem){++++}-{3:3}, at: btrfs_find_all_roots+0x41/0x80
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #3 (&fs_info->commit_root_sem){++++}-{3:3}:
      	 down_read+0x40/0x130
      	 caching_thread+0x53/0x5a0
      	 btrfs_work_helper+0xfa/0x520
      	 process_one_work+0x238/0x540
      	 worker_thread+0x55/0x3c0
      	 kthread+0x13a/0x150
      	 ret_from_fork+0x1f/0x30
      
        -> #2 (&caching_ctl->mutex){+.+.}-{3:3}:
      	 __mutex_lock+0x7e/0x7b0
      	 btrfs_cache_block_group+0x1e0/0x510
      	 find_free_extent+0xb6e/0x12f0
      	 btrfs_reserve_extent+0xb3/0x1b0
      	 btrfs_alloc_tree_block+0xb1/0x330
      	 alloc_tree_block_no_bg_flush+0x4f/0x60
      	 __btrfs_cow_block+0x11d/0x580
      	 btrfs_cow_block+0x10c/0x220
      	 commit_cowonly_roots+0x47/0x2e0
      	 btrfs_commit_transaction+0x595/0xbd0
      	 sync_filesystem+0x74/0x90
      	 generic_shutdown_super+0x22/0x100
      	 kill_anon_super+0x14/0x30
      	 btrfs_kill_super+0x12/0x20
      	 deactivate_locked_super+0x36/0xa0
      	 cleanup_mnt+0x12d/0x190
      	 task_work_run+0x5c/0xa0
      	 exit_to_user_mode_prepare+0x1df/0x200
      	 syscall_exit_to_user_mode+0x54/0x280
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        -> #1 (&space_info->groups_sem){++++}-{3:3}:
      	 down_read+0x40/0x130
      	 find_free_extent+0x2ed/0x12f0
      	 btrfs_reserve_extent+0xb3/0x1b0
      	 btrfs_alloc_tree_block+0xb1/0x330
      	 alloc_tree_block_no_bg_flush+0x4f/0x60
      	 __btrfs_cow_block+0x11d/0x580
      	 btrfs_cow_block+0x10c/0x220
      	 commit_cowonly_roots+0x47/0x2e0
      	 btrfs_commit_transaction+0x595/0xbd0
      	 sync_filesystem+0x74/0x90
      	 generic_shutdown_super+0x22/0x100
      	 kill_anon_super+0x14/0x30
      	 btrfs_kill_super+0x12/0x20
      	 deactivate_locked_super+0x36/0xa0
      	 cleanup_mnt+0x12d/0x190
      	 task_work_run+0x5c/0xa0
      	 exit_to_user_mode_prepare+0x1df/0x200
      	 syscall_exit_to_user_mode+0x54/0x280
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        -> #0 (btrfs-root-00){++++}-{3:3}:
      	 __lock_acquire+0x1167/0x2150
      	 lock_acquire+0xb9/0x3d0
      	 down_read_nested+0x43/0x130
      	 __btrfs_tree_read_lock+0x32/0x170
      	 __btrfs_read_lock_root_node+0x3a/0x50
      	 btrfs_search_slot+0x614/0x9d0
      	 btrfs_find_root+0x35/0x1b0
      	 btrfs_read_tree_root+0x61/0x120
      	 btrfs_get_root_ref+0x14b/0x600
      	 find_parent_nodes+0x3e6/0x1b30
      	 btrfs_find_all_roots_safe+0xb4/0x130
      	 btrfs_find_all_roots+0x60/0x80
      	 btrfs_qgroup_trace_extent_post+0x27/0x40
      	 btrfs_add_delayed_data_ref+0x3fd/0x460
      	 btrfs_free_extent+0x42/0x100
      	 __btrfs_mod_ref+0x1d7/0x2f0
      	 walk_up_proc+0x11c/0x400
      	 walk_up_tree+0xf0/0x180
      	 btrfs_drop_snapshot+0x1c7/0x780
      	 btrfs_clean_one_deleted_snapshot+0xfb/0x110
      	 cleaner_kthread+0xd4/0x140
      	 kthread+0x13a/0x150
      	 ret_from_fork+0x1f/0x30
      
        other info that might help us debug this:
      
        Chain exists of:
          btrfs-root-00 --> &caching_ctl->mutex --> &fs_info->commit_root_sem
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(&fs_info->commit_root_sem);
      				 lock(&caching_ctl->mutex);
      				 lock(&fs_info->commit_root_sem);
          lock(btrfs-root-00);
      
         *** DEADLOCK ***
      
        3 locks held by btrfs-cleaner/903:
         #0: ffff8e7fab628838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: cleaner_kthread+0x6e/0x140
         #1: ffff8e7faadac640 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x40b/0x5c0
         #2: ffff8e7fab628a88 (&fs_info->commit_root_sem){++++}-{3:3}, at: btrfs_find_all_roots+0x41/0x80
      
        stack backtrace:
        CPU: 0 PID: 903 Comm: btrfs-cleaner Not tainted 5.9.0+ #102
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
        Call Trace:
         dump_stack+0x8b/0xb0
         check_noncircular+0xcf/0xf0
         __lock_acquire+0x1167/0x2150
         ? __bfs+0x42/0x210
         lock_acquire+0xb9/0x3d0
         ? __btrfs_tree_read_lock+0x32/0x170
         down_read_nested+0x43/0x130
         ? __btrfs_tree_read_lock+0x32/0x170
         __btrfs_tree_read_lock+0x32/0x170
         __btrfs_read_lock_root_node+0x3a/0x50
         btrfs_search_slot+0x614/0x9d0
         ? find_held_lock+0x2b/0x80
         btrfs_find_root+0x35/0x1b0
         ? do_raw_spin_unlock+0x4b/0xa0
         btrfs_read_tree_root+0x61/0x120
         btrfs_get_root_ref+0x14b/0x600
         find_parent_nodes+0x3e6/0x1b30
         btrfs_find_all_roots_safe+0xb4/0x130
         btrfs_find_all_roots+0x60/0x80
         btrfs_qgroup_trace_extent_post+0x27/0x40
         btrfs_add_delayed_data_ref+0x3fd/0x460
         btrfs_free_extent+0x42/0x100
         __btrfs_mod_ref+0x1d7/0x2f0
         walk_up_proc+0x11c/0x400
         walk_up_tree+0xf0/0x180
         btrfs_drop_snapshot+0x1c7/0x780
         ? btrfs_clean_one_deleted_snapshot+0x73/0x110
         btrfs_clean_one_deleted_snapshot+0xfb/0x110
         cleaner_kthread+0xd4/0x140
         ? btrfs_alloc_root+0x50/0x50
         kthread+0x13a/0x150
         ? kthread_create_worker_on_cpu+0x40/0x40
         ret_from_fork+0x1f/0x30
        BTRFS info (device sdb): disk space caching is enabled
        BTRFS info (device sdb): has skinny extents
      
      This happens because qgroups does a backref lookup when we create a
      delayed ref.  From here it may have to look up a root from an indirect
      ref, which does a normal lookup on the tree_root, which takes the read
      lock on the tree_root nodes.
      
      To fix this we need to add a variant for looking up roots that searches
      the commit root of the tree_root.  Then when we do the backref search
      using the commit root we are sure to not take any locks on the tree_root
      nodes.  This gets rid of the lockdep splat when running btrfs/104.
      
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
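The dependency chain in the splat above can be read as a directed graph in which an edge means "this lock was held while that lock was acquired". The Python sketch below looks for a cycle in such a graph. It is a simplified illustration, not how lockdep is implemented; `find_cycle` is a hypothetical helper and the lock names are shortened from the report.

```python
def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if there is none."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    done = set()  # nodes whose outgoing paths are known to be cycle-free

    def visit(node, path):
        if node in path:
            # Reached a lock that is already on the current path: a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in graph[node]:
            cycle = visit(nxt, path)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Existing dependencies, entries #1 to #3 of the chain above.
EXISTING = [
    ("btrfs-root-00", "groups_sem"),
    ("groups_sem", "caching_ctl->mutex"),
    ("caching_ctl->mutex", "commit_root_sem"),
]
# Entry #0: the backref lookup locks tree_root nodes under commit_root_sem.
NEW_EDGE = ("commit_root_sem", "btrfs-root-00")
```

`find_cycle(EXISTING + [NEW_EDGE])` reports a loop through all four locks, while `find_cycle(EXISTING)` finds none. Searching the commit root without taking tree locks corresponds to never adding `NEW_EDGE`.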
  14. 07 Oct, 2020 3 commits
  15. 27 Jul, 2020 1 commit
    • btrfs: preallocate anon block device at first phase of snapshot creation · 2dfb1e43
      Qu Wenruo committed
      [BUG]
      When the anonymous block device pool is exhausted, subvolume/snapshot
      creation fails with EMFILE (Too many open files). This has been reported
      by a user. The allocation happens in the second phase during transaction
      commit, where the only way out is to abort the transaction:
      
        BTRFS: Transaction aborted (error -24)
        WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
        RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
        Call Trace:
         create_pending_snapshots+0x82/0xa0 [btrfs]
         btrfs_commit_transaction+0x275/0x8c0 [btrfs]
         btrfs_mksubvol+0x4b9/0x500 [btrfs]
         btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
         btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
         btrfs_ioctl+0x11a4/0x2da0 [btrfs]
         do_vfs_ioctl+0xa9/0x640
         ksys_ioctl+0x67/0x90
         __x64_sys_ioctl+0x1a/0x20
         do_syscall_64+0x5a/0x110
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        ---[ end trace 33f2f83f3d5250e9 ]---
        BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
        BTRFS info (device sda1): forced readonly
        BTRFS warning (device sda1): Skipping commit of aborted transaction.
        BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
      
      [CAUSE]
      When the global anonymous block device pool is exhausted, the following
      call chain will fail, and lead to transaction abort:
      
       btrfs_ioctl_snap_create_v2()
       |- btrfs_ioctl_snap_create_transid()
          |- btrfs_mksubvol()
             |- btrfs_commit_transaction()
                |- create_pending_snapshot()
                   |- btrfs_get_fs_root()
                      |- btrfs_init_fs_root()
                         |- get_anon_bdev()
      
      [FIX]
      Although we can't enlarge the anonymous block device pool, at least we
      can preallocate anon_dev for subvolume/snapshot in the first phase,
      outside of transaction context and exactly at the moment the user calls
      the creation ioctl.
      
      Reported-by: Greed Rong <greedrong@gmail.com>
      Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
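The idea of the fix, acquiring the scarce resource before entering the phase that can only fail by aborting, can be sketched in Python as follows. This is a toy model under stated assumptions: `AnonDevPool` and `Fs` are hypothetical classes, and the "transaction" is reduced to a single method call.

```python
import errno


class AnonDevPool:
    """Toy stand-in for the global anonymous block device pool."""

    def __init__(self, size):
        self.next_id = 0
        self.size = size

    def get(self):
        if self.next_id >= self.size:
            raise OSError(errno.EMFILE, "anonymous block device pool exhausted")
        self.next_id += 1
        return self.next_id


class Fs:
    def __init__(self, pool):
        self.pool = pool
        self.aborted = False   # would be set by a transaction abort
        self.snapshots = {}

    def create_snapshot(self, name):
        # Phase 1: outside any transaction. Exhaustion is reported to the
        # caller as an ordinary error and the filesystem stays writable.
        anon_dev = self.pool.get()
        # Phase 2: transaction commit. It only consumes the preallocated
        # device, so pool exhaustion can no longer force an abort here.
        self._commit(name, anon_dev)

    def _commit(self, name, anon_dev):
        self.snapshots[name] = anon_dev
```

With a pool of size one, the second `create_snapshot` call raises `OSError` with `errno.EMFILE` while `aborted` stays `False`, which mirrors the intended user-visible behaviour.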
  16. 25 May, 2020 1 commit
    • btrfs: simplify root lookup by id · 56e9357a
      David Sterba committed
      The main function to look up a root by its id, btrfs_get_fs_root, takes
      the whole key while only using the objectid. The value of offset is
      preset to (u64)-1 but not actually used until btrfs_find_root, which
      does the actual search.
      
      Switch btrfs_get_fs_root to use only objectid and remove all local
      variables that existed just for the lookup. The actual key for search is
      set up in btrfs_get_fs_root, reusing another key variable.
      
      Signed-off-by: David Sterba <dsterba@suse.com>
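The shape of the simplified interface can be illustrated with a small Python model in which callers pass only the objectid and the full search key is built in one place. This is a sketch, not the kernel API: the functions merely borrow the names from the message, the tree is a plain dictionary, the numeric value of the root item key type is an assumption, and the "largest offset not above the requested one" search rule is a simplification.

```python
from typing import NamedTuple

U64_MAX = (1 << 64) - 1   # (u64)-1
ROOT_ITEM_KEY = 132       # assumed numeric value of the root item key type


class Key(NamedTuple):
    objectid: int
    type: int
    offset: int


def find_root(tree, key):
    """Return the root with the same objectid and type whose offset is the
    largest one not above key.offset, or None if there is no such item."""
    candidates = [
        k for k in tree
        if k.objectid == key.objectid
        and k.type == key.type
        and k.offset <= key.offset
    ]
    return tree[max(candidates)] if candidates else None


def get_fs_root(tree, objectid):
    """Callers pass only the objectid; the search key is set up here."""
    return find_root(tree, Key(objectid, ROOT_ITEM_KEY, U64_MAX))
```

Call sites then shrink from building a three-field key to a single `get_fs_root(tree, objectid)` call, which is the simplification the commit describes.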