1. 25 7月, 2022 13 次提交
  2. 16 5月, 2022 10 次提交
  3. 18 4月, 2022 2 次提交
  4. 06 4月, 2022 1 次提交
  5. 25 3月, 2022 1 次提交
    • Q
      btrfs: remove device item and update super block in the same transaction · bbac5869
      Qu Wenruo 提交于
      [BUG]
      There is a report that a btrfs has a bad super block num devices.
      
      This makes btrfs to reject the fs completely.
      
        BTRFS error (device sdd3): super_num_devices 3 mismatch with num_devices 2 found here
        BTRFS error (device sdd3): failed to read chunk tree: -22
        BTRFS error (device sdd3): open_ctree failed
      
      [CAUSE]
      During btrfs device removal, chunk tree and super block num devs are
      updated in two different transactions:
      
        btrfs_rm_device()
        |- btrfs_rm_dev_item(device)
        |  |- trans = btrfs_start_transaction()
        |  |  Now we got transaction X
        |  |
        |  |- btrfs_del_item()
        |  |  Now device item is removed from chunk tree
        |  |
        |  |- btrfs_commit_transaction()
        |     Transaction X got committed, super num devs untouched,
        |     but device item removed from chunk tree.
        |     (AKA, super num devs is already incorrect)
        |
        |- cur_devices->num_devices--;
        |- cur_devices->total_devices--;
        |- btrfs_set_super_num_devices()
           All those operations are not in transaction X, thus it will
           only be written back to disk in next transaction.
      
      So after the transaction X in btrfs_rm_dev_item() committed, but before
      transaction X+1 (which can be minutes away), a power loss happen, then
      we got the super num mismatch.
      
      [FIX]
      Instead of starting and committing a transaction inside
      btrfs_rm_dev_item(), start a transaction in side btrfs_rm_device() and
      pass it to btrfs_rm_dev_item().
      
      And only commit the transaction after everything is done.
      Reported-by: NLuca Béla Palkovics <luca.bela.palkovics@gmail.com>
      Link: https://lore.kernel.org/linux-btrfs/CA+8xDSpvdm_U0QLBAnrH=zqDq_cWCOH5TiV46CKmp3igr44okQ@mail.gmail.com/
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      bbac5869
  6. 14 3月, 2022 10 次提交
  7. 07 1月, 2022 3 次提交
    • Q
      btrfs: remove reada infrastructure · f26c9238
      Qu Wenruo 提交于
      Currently there is only one user for btrfs metadata readahead, and
      that's scrub.
      
      But even for the single user, it's not providing the correct
      functionality it needs, as scrub needs reada for commit root, which
      current readahead can't provide. (Although it's pretty easy to add such
      feature).
      
      Despite this, there are some extra problems related to metadata
      readahead:
      
      - Duplicated feature with btrfs_path::reada
      
      - Partly duplicated feature of btrfs_fs_info::buffer_radix
        Btrfs already caches its metadata in buffer_radix, while readahead
        tries to read the tree block no matter if it's already cached.
      
      - Poor layer separation
        Metadata readahead works kinda at device level.
        This is definitely not the correct layer it should be, since metadata
        is at btrfs logical address space, it should not bother device at all.
      
        This brings extra chance for bugs to sneak in, while brings
        unnecessary complexity.
      
      - Dead code
        In the very beginning of scrub.c we have #undef DEBUG, rendering all
        the debug related code useless and unable to test.
      
      Thus here I purpose to remove the metadata readahead mechanism
      completely.
      
      [BENCHMARK]
      There is a full benchmark for the scrub performance difference using the
      old btrfs_reada_add() and btrfs_path::reada.
      
      For the worst case (no dirty metadata, slow HDD), there could be a 5%
      performance drop for scrub.
      For other cases (even SATA SSD), there is no distinguishable performance
      difference.
      
      The number is reported scrub speed, in MiB/s.
      The resolution is limited by the reported duration, which only has a
      resolution of 1 second.
      
      	Old		New		Diff
      SSD	455.3		466.332		+2.42%
      HDD	103.927 	98.012		-5.69%
      
      Comprehensive test methodology is in the cover letter of the patch.
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f26c9238
    • J
      btrfs: zoned: sink zone check into btrfs_repair_one_zone · 554aed7d
      Johannes Thumshirn 提交于
      Sink zone check into btrfs_repair_one_zone() so we don't need to do it
      in all callers.
      
      Also as btrfs_repair_one_zone() doesn't return a sensible error, make it
      a boolean function and return false in case it got called on a non-zoned
      filesystem and true on a zoned filesystem.
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      554aed7d
    • N
      btrfs: introduce exclusive operation BALANCE_PAUSED state · efc0e69c
      Nikolay Borisov 提交于
      Current set of exclusive operation states is not sufficient to handle
      all practical use cases. In particular there is a need to be able to add
      a device to a filesystem that have paused balance. Currently there is no
      way to distinguish between a running and a paused balance. Fix this by
      introducing BTRFS_EXCLOP_BALANCE_PAUSED which is going to be set in 2
      occasions:
      
      1. When a filesystem is mounted with skip_balance and there is an
         unfinished balance it will now be into BALANCE_PAUSED instead of
         simply BALANCE state.
      
      2. When a running balance is paused.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      efc0e69c