1. 12 Jan, 2023 1 commit
  2. 03 Jan, 2023 2 commits
    • btrfs: fix compat_ro checks against remount · 2ba48b20
      Qu Wenruo committed
      [BUG]
      Even with commit 81d5d614 ("btrfs: enhance unsupported compat RO
      flags handling"), btrfs can still mount a fs with unsupported compat_ro
      flags read-only, then remount it RW:
      
        # btrfs ins dump-super /dev/loop0 | grep compat_ro_flags -A 3
        compat_ro_flags		0x403
      			( FREE_SPACE_TREE |
      			  FREE_SPACE_TREE_VALID |
      			  unknown flag: 0x400 )
      
        # mount /dev/loop0 /mnt/btrfs
        mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
               dmesg(1) may have more information after failed mount system call.
        ^^^ RW mount failed as expected ^^^
      
        # dmesg -t | tail -n5
        loop0: detected capacity change from 0 to 1048576
        BTRFS: device fsid cb5b82f5-0fdd-4d81-9b4b-78533c324afa devid 1 transid 7 /dev/loop0 scanned by mount (1146)
        BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm
        BTRFS info (device loop0): using free space tree
        BTRFS error (device loop0): cannot mount read-write because of unknown compat_ro features (0x403)
        BTRFS error (device loop0): open_ctree failed
      
        # mount /dev/loop0 -o ro /mnt/btrfs
        # mount -o remount,rw /mnt/btrfs
        ^^^ RW remount succeeded unexpectedly ^^^
      
      [CAUSE]
      Currently we use btrfs_check_features() to check compat_ro flags against
      our current mount flags.
      
      That function gets reused between open_ctree() and btrfs_remount().
      
      But for btrfs_remount(), the super block we passed in still has the old
      mount flags, thus btrfs_check_features() still believes we're mounting
      read-only.
      
      [FIX]
      Replace the existing @sb argument with @is_rw_mount, since we
      originally only used @sb to determine whether the mount is RW.

      Now it is the callers' responsibility to determine whether the mount
      is RW, and since there are only two callers, the check is pretty simple:
      
      - caller in open_ctree()
        Just pass !sb_rdonly().
      
      - caller in btrfs_remount()
        Pass !(*flags & SB_RDONLY), as our check should be against the new
        flags.
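
The difference between the two callers can be illustrated with a minimal, self-contained C sketch. This is not the kernel code: every `sketch_` name, the struct layout, and the flag values are assumptions made for illustration. Only the idea of passing an explicit is_rw_mount argument comes from the commit message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in kernel headers. */
#define SKETCH_SB_RDONLY           (1UL << 0)
#define SKETCH_COMPAT_RO_SUPPORTED 0x3ULL

/* Hypothetical stand-in for the state the check needs. */
struct sketch_sb {
	unsigned long s_flags;    /* mount flags currently in effect */
	uint64_t compat_ro_flags; /* compat_ro flags read from disk */
};

/* Returns 0 if the mount may proceed, -1 if it must be rejected. */
static int sketch_check_features(const struct sketch_sb *sb, bool is_rw_mount)
{
	assert(sb != NULL);

	uint64_t unknown = sb->compat_ro_flags & ~SKETCH_COMPAT_RO_SUPPORTED;

	/* Unknown compat_ro flags are tolerated only for read-only mounts. */
	if (unknown != 0 && is_rw_mount)
		return -1;
	return 0;
}

/* Initial mount: the super block flags already describe the mount. */
static int sketch_open_ctree(const struct sketch_sb *sb)
{
	return sketch_check_features(sb, !(sb->s_flags & SKETCH_SB_RDONLY));
}

/* Remount: sb->s_flags is still the OLD state, so use the NEW flags. */
static int sketch_remount(const struct sketch_sb *sb, unsigned long new_flags)
{
	return sketch_check_features(sb, !(new_flags & SKETCH_SB_RDONLY));
}
```

With a super block mounted read-only and compat_ro_flags set to 0x403, sketch_open_ctree() accepts the mount, while sketch_remount() with new flags of 0 (read-write) rejects it, which is the behavior the fix is aiming for.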
      
      Now we can correctly reject the RW remount:
      
        # mount /dev/loop0 -o ro /mnt/btrfs
        # mount -o remount,rw /mnt/btrfs
        mount: /mnt/btrfs: mount point not mounted or bad option.
               dmesg(1) may have more information after failed mount system call.
        # dmesg -t | tail -n 1
        BTRFS error (device loop0: state M): cannot mount read-write because of unknown compat_ro features (0x403)
      Reported-by: Chung-Chiang Cheng <shepjeng@gmail.com>
      Fixes: 81d5d614 ("btrfs: enhance unsupported compat RO flags handling")
      CC: stable@vger.kernel.org # 5.15+
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: add error message for metadata level mismatch · 77177ed1
      Qu Wenruo committed
      From a recent regression report, we found that after commit 947a6299
      ("btrfs: move tree block parentness check into
      validate_extent_buffer()") if we have a level mismatch (false alert
      though), there is no error message at all.
      
      This makes later debugging harder.  This patch adds a proper error
      message for such a case.
      
      Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  3. 06 Dec, 2022 27 commits
  4. 07 Nov, 2022 1 commit
  5. 24 Oct, 2022 1 commit
    • btrfs: make thaw time super block check to also verify checksum · 3d17adea
      Qu Wenruo committed
      Previous commit a05d3c91 ("btrfs: check superblock to ensure the fs
      was not modified at thaw time") only checks the content of the super
      block, but it doesn't really check if the on-disk super block has a
      matching checksum.
      
      This patch will add the checksum verification to thaw time superblock
      verification.
      
      This involves the following extra changes:
      
      - Export btrfs_check_super_csum()
        As we need to call it in super.c.
      
      - Change the argument list of btrfs_check_super_csum()
        Instead of passing a char *, directly pass a
        struct btrfs_super_block * pointer.
      
      - Verify that our checksum type didn't change before checking the
        checksum value, like it's done at mount time
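
The order of the two checks (checksum type first, then checksum value) can be sketched in standalone C as follows. This is only an illustration under stated assumptions: the `sketch_` names and the 40-byte struct are hypothetical and far simpler than the real struct btrfs_super_block, and the bitwise CRC32C here stands in for whatever checksum implementation the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CSUM_TYPE_CRC32C 0u

/* Hypothetical, padding-free stand-in for the on-disk super block. */
struct sketch_super {
	uint32_t csum;        /* checksum of every byte after this field */
	uint32_t csum_type;   /* which algorithm produced csum */
	uint8_t payload[32];  /* the rest of the super block */
};

/* Plain bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t sketch_crc32c(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Checksum of the bytes that follow the csum field. */
static uint32_t sketch_super_csum(const struct sketch_super *s)
{
	const uint8_t *body = (const uint8_t *)s + sizeof(s->csum);

	return sketch_crc32c(body, sizeof(*s) - sizeof(s->csum));
}

/*
 * Thaw-time check: 0 on success, -1 if the checksum type changed while
 * frozen, -2 if the stored checksum does not match the content.
 */
static int sketch_check_super_at_thaw(const struct sketch_super *on_disk,
				      uint32_t mounted_csum_type)
{
	assert(on_disk != NULL);

	/* The type must be verified first, otherwise the wrong algorithm
	 * would be used to interpret the checksum value. */
	if (on_disk->csum_type != mounted_csum_type)
		return -1;
	if (sketch_super_csum(on_disk) != on_disk->csum)
		return -2;
	return 0;
}
```

A caller would fill in a struct sketch_super, store sketch_super_csum() into its csum field, and later pass the re-read copy to sketch_check_super_at_thaw() together with the checksum type recorded at mount time.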
      
      Fixes: a05d3c91 ("btrfs: check superblock to ensure the fs was not modified at thaw time")
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  6. 26 Sep, 2022 8 commits
    • btrfs: relax block-group-tree feature dependency checks · d7f67ac9
      Qu Wenruo committed
      [BUG]
      When a user mistakenly tried to clear the block group tree (which
      cannot be done through a mount option) by using
      "-o clear_cache,space_cache=v2", it caused the following error on a fs
      with the block-group-tree feature:
      
        BTRFS info (device dm-1): force clearing of disk cache
        BTRFS info (device dm-1): using free space tree
        BTRFS info (device dm-1): clearing free space tree
        BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1)
        BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2)
        BTRFS error (device dm-1): block-group-tree feature requires fres-space-tree and no-holes
        BTRFS error (device dm-1): super block corruption detected before writing it to disk
        BTRFS: error (device dm-1) in write_all_supers:4318: errno=-117 Filesystem corrupted (unexpected superblock corruption detected)
        BTRFS warning (device dm-1: state E): Skipping commit of aborted transaction.
      
      [CAUSE]
      Although the dependency for block-group-tree feature is just an
      artificial one (to reduce test matrix), we put the dependency check into
      btrfs_validate_super().
      
      This is too strict, and during space cache clearing, we will have a
      window where free space tree is cleared, and we need to commit the super
      block.
      
      In that window, we had block group tree without v2 cache, and triggered
      the artificial dependency check.
      
      This is not necessary at all, especially for such a soft dependency.
      
      [FIX]
      Introduce a new helper, btrfs_check_features(), to do all the runtime
      limitation checks, including:
      
      - Unsupported incompat flags check
      
      - Unsupported compat RO flags check
      
      - Setting missing incompat flags
      
      - Artificial feature dependency checks
        Currently only block group tree will rely on this.
      
      - Subpage runtime check for v1 cache
      
      With this helper, we can move quite a few checks from
      open_ctree()/btrfs_remount() into it, and just call it after
      btrfs_parse_options().
      
      Now "-o clear_cache,space_cache=v2" will not trigger the above error
      anymore.
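
As a rough illustration of why the dependency check belongs at mount time rather than in super block validation, here is a minimal C sketch. All `sketch_` names are hypothetical; the 0x1 and 0x2 values match the log output above, while the BLOCK_GROUP_TREE and NO_HOLES bit positions are placeholders chosen for the example. The real helper also looks at mount options, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* compat_ro bits; 0x1 and 0x2 follow the log above, the rest is assumed. */
#define SKETCH_FREE_SPACE_TREE       (1ULL << 0)
#define SKETCH_FREE_SPACE_TREE_VALID (1ULL << 1)
#define SKETCH_BLOCK_GROUP_TREE      (1ULL << 3)
/* incompat bit, placeholder position. */
#define SKETCH_NO_HOLES              (1ULL << 9)

/* Hypothetical container for the two feature flag words. */
struct sketch_features {
	uint64_t compat_ro;
	uint64_t incompat;
};

/*
 * Mount/remount-time check only. It is deliberately NOT part of the
 * validation that runs before every super block write, so a transient
 * state (free space tree cleared, block group tree present) can still
 * be committed.
 *
 * Returns 0 if the artificial dependency holds, -1 otherwise.
 */
static int sketch_check_feature_deps(const struct sketch_features *f)
{
	assert(f != NULL);

	bool bgt = (f->compat_ro & SKETCH_BLOCK_GROUP_TREE) != 0;
	bool fst = (f->compat_ro & SKETCH_FREE_SPACE_TREE_VALID) != 0;
	bool no_holes = (f->incompat & SKETCH_NO_HOLES) != 0;

	if (bgt && (!fst || !no_holes))
		return -1;
	return 0;
}
```

In this model the window described under [CAUSE] corresponds to a struct sketch_features with SKETCH_BLOCK_GROUP_TREE set and the free space tree bits cleared: the mount-time helper would reject it, but nothing on the write path calls the helper.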
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ edit messages ]
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: open code and remove btrfs_insert_inode_hash helper · e256927b
      Josef Bacik committed
      This exists to insert the btree_inode in the super block's inode hash
      table.  Since it's only used for the btree inode, move the code to where
      we use it in disk-io.c and remove the helper.
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: don't init io tree with private data for non-inodes · efb0645b
      Josef Bacik committed
      We only use this for normal inodes, so don't set it if we're not a
      normal inode.
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: remove extent_io_tree::track_uptodate · 4374d03d
      Josef Bacik committed
      Since commit 78361f64ff42 ("btrfs: remove unnecessary EXTENT_UPTODATE
      state in buffered I/O path") we no longer check ->track_uptodate, remove
      it.
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: unify the lock/unlock extent variants · 570eb97b
      Josef Bacik committed
      We have two variants of lock/unlock extent, one set that takes a cached
      state, another that does not.  This is slightly annoying, and generally
      speaking there are only a few places where we don't have a cached state.
      Simplify this by making lock_extent/unlock_extent the only variant and
      make it take a cached state, then convert all the callers appropriately.
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction() · 011b46c3
      Qu Wenruo committed
      Btrfs qgroup has a long history of bringing a performance penalty to
      btrfs_commit_transaction().

      Although we tried our best to mitigate such impact, there is still an
      unsolved call site, btrfs_drop_snapshot().
      
      This function will find the highest shared tree block and modify its
      extent ownership to do a subvolume/snapshot dropping.
      
      Such change will affect the whole subtree, and cause tons of qgroup
      dirty extents and stall btrfs_commit_transaction().
      
      To avoid this problem, here we introduce a new sysfs interface,
      /sys/fs/btrfs/<uuid>/qgroups/drop_subtree_threshold, to determine
      whether and at which level we should skip qgroup accounting for subtree
      dropping.
      
      The default value is BTRFS_MAX_LEVEL, thus every subtree drop will go
      through qgroup accounting, to ensure qgroup numbers are kept as
      consistent as possible.
      
      For performance-sensitive cases, the value can be lowered to something
      more reasonable such as 3, so that dropping any subtree at or above
      level 3 marks qgroup inconsistent and skips the accounting.
      
      The cost is obvious, the qgroup number is no longer consistent, but at
      least performance is more reasonable, and users have the control.
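
The threshold decision described above can be modeled in a few lines of C. This is a hedged sketch, not the kernel implementation: the `sketch_` names and the struct are hypothetical, and SKETCH_MAX_LEVEL is set to 8 on the assumption that it mirrors BTRFS_MAX_LEVEL, so that valid tree levels run from 0 to 7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror BTRFS_MAX_LEVEL; valid levels are 0..7. */
#define SKETCH_MAX_LEVEL 8

/* Hypothetical per-filesystem qgroup state. */
struct sketch_qgroup_state {
	uint8_t drop_subtree_threshold; /* tunable, defaults to max level */
	bool inconsistent;              /* set once accounting is skipped */
};

/*
 * Decide what to do when dropping a subtree whose root sits at
 * root_level. Returns true if the subtree should go through qgroup
 * accounting, false if accounting is skipped and qgroup is marked
 * inconsistent instead.
 */
static bool sketch_should_account_subtree(struct sketch_qgroup_state *q,
					  uint8_t root_level)
{
	assert(q != NULL);

	if (root_level >= q->drop_subtree_threshold) {
		/* Too high: trade consistency for commit latency. */
		q->inconsistent = true;
		return false;
	}
	return true;
}
```

With the default threshold no valid level reaches it, so every drop is accounted. After lowering the threshold to 3, a level-2 subtree is still accounted, while a level-3 subtree is skipped and the inconsistent flag is set.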
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 · 1c56ab99
      Qu Wenruo committed
      The problem of long mount time caused by block group item search is
      already known for some time, and the solution of block group tree has
      been proposed.
      
      There is really no need to bind this feature to extent tree v2, just
      introduce a compat RO flag, BLOCK_GROUP_TREE, to correctly solve the
      problem.
      
      All the code handling block group root is already in the upstream
      kernel, thus this patch really only needs to introduce the new compat RO
      flag.
      
      This patch introduces one extra artificial limitation on block group
      tree feature, that free space cache v2 and no-holes feature must be
      enabled to use this new compat RO feature.
      
      This artificial requirement is mostly to reduce the test combinations,
      and can be a guideline for future features, to mostly rely on the latest
      default features.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: don't save block group root into super block · 14033b08
      Qu Wenruo committed
      The extent tree v2 needs a new root for storing all block group items,
      the whole feature hasn't been finished yet so we can afford to do some
      changes.
      
      My initial proposal years ago just added a new tree rootid, and loaded it
      from tree root, just like what we did for quota/free space tree/uuid/extent
      roots.
      
      But the extent tree v2 patches introduced a completely new way to store
      block group tree root into super block which is arguably wasteful.
      
      Currently there are only 3 trees stored in super blocks, and they all
      have their valid reasons:
      
      - Chunk root
        Needed for bootstrap.
      
      - Tree root
        Really the entry point for all trees.
      
      - Log root
        This is special as log root has to be updated out of existing
        transaction mechanism.
      
      There is not even any reason to put block group root into super blocks,
      the block group tree is updated at the same time as the old extent tree,
      no need for extra bootstrap/out-of-transaction update.
      
      So just move block group root from super block into tree root.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>