  1. 25 Jul, 2022 31 commits
  2. 16 Jul, 2022 2 commits
  3. 09 Jul, 2022 1 commit
  4. 21 Jun, 2022 2 commits
    • btrfs: zoned: prevent allocation from previous data relocation BG · 343d8a30
      Committed by Naohiro Aota
      After commit 5f0addf7 ("btrfs: zoned: use dedicated lock for data
      relocation"), we observe IO errors on tests such as btrfs/232, as shown
      below.
      
        [09.0][T4038707] WARNING: CPU: 3 PID: 4038707 at fs/btrfs/extent-tree.c:2381 btrfs_cross_ref_exist+0xfc/0x120 [btrfs]
        <snip>
        [09.9][T4038707] Call Trace:
        [09.5][T4038707]  <TASK>
        [09.3][T4038707]  run_delalloc_nocow+0x7f1/0x11a0 [btrfs]
        [09.6][T4038707]  ? test_range_bit+0x174/0x320 [btrfs]
        [09.2][T4038707]  ? fallback_to_cow+0x980/0x980 [btrfs]
        [09.3][T4038707]  ? find_lock_delalloc_range+0x33e/0x3e0 [btrfs]
        [09.5][T4038707]  btrfs_run_delalloc_range+0x445/0x1320 [btrfs]
        [09.2][T4038707]  ? test_range_bit+0x320/0x320 [btrfs]
        [09.4][T4038707]  ? lock_downgrade+0x6a0/0x6a0
        [09.2][T4038707]  ? orc_find.part.0+0x1ed/0x300
        [09.5][T4038707]  ? __module_address.part.0+0x25/0x300
        [09.0][T4038707]  writepage_delalloc+0x159/0x310 [btrfs]
        <snip>
        [09.4][    C3] sd 10:0:1:0: [sde] tag#2620 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
        [09.5][    C3] sd 10:0:1:0: [sde] tag#2620 Sense Key : Illegal Request [current]
        [09.9][    C3] sd 10:0:1:0: [sde] tag#2620 Add. Sense: Unaligned write command
        [09.5][    C3] sd 10:0:1:0: [sde] tag#2620 CDB: Write(16) 8a 00 00 00 00 00 02 f3 63 87 00 00 00 2c 00 00
        [09.4][    C3] critical target error, dev sde, sector 396041272 op 0x1:(WRITE) flags 0x800 phys_seg 3 prio class 0
        [09.9][    C3] BTRFS error (device dm-1): bdev /dev/mapper/dml_102_2 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
      
      The IO errors occur when we allocate a regular extent in the previous
      data relocation block group.
      
      On zoned btrfs, we use a dedicated block group to relocate data
      extents. Thus, we allocate relocating data extents (pre-alloc) only from
      the dedicated block group, and that block group serves no other
      allocations. Once the free space in the dedicated block group gets
      tight, a relocating extent may not fit into the block group. In that
      case, we need to switch the dedicated block group to the next one. The
      previous one is then freed up for allocating regular extents. That
      block group no longer has room for the relocating extent, but it can
      still fit a smaller extent. This is where the problem occurs. If we
      allocate a regular extent while nocow IOs for the relocation are still
      ongoing, we issue WRITE IOs (for relocation) and ZONE APPEND IOs (for
      the regular writes) at the same time. These mixed IOs confuse the write
      pointer and cause the unaligned write errors.
      
      This commit introduces a new bit 'zoned_data_reloc_ongoing' to the
      btrfs_block_group. We set this bit before releasing the dedicated block
      group, and no extents are allocated from a block group that has this bit
      set. This bit is similar to setting block_group->ro, but differs in that
      it allows nocow writes to start.
      
      Once all the nocow IO for relocation is done (hooked from
      btrfs_finish_ordered_io), we reset the bit to release the block group for
      further allocation.
      
      Fixes: c2707a25 ("btrfs: zoned: add a dedicated data relocation block group")
      CC: stable@vger.kernel.org # 5.16+
      Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
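The flag semantics described in the commit message above can be sketched as a small user-space model. This is not the kernel code: `struct bg_model`, the helper names and the `reloc_nocow_in_flight` counter are hypothetical and exist only for illustration. Only the flag name `zoned_data_reloc_ongoing` and the `ro` field come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct btrfs_block_group. */
struct bg_model {
	uint64_t free_bytes;
	bool ro;
	bool zoned_data_reloc_ongoing;      /* flag name taken from the commit message */
	unsigned int reloc_nocow_in_flight; /* hypothetical counter of pending relocation IOs */
};

/* A regular extent may not be allocated while the flag (or ro) is set. */
static inline bool bg_can_allocate(const struct bg_model *bg, uint64_t len)
{
	if (bg->ro || bg->zoned_data_reloc_ongoing)
		return false;
	return len <= bg->free_bytes;
}

/* Unlike ro, the flag does not prevent nocow writes from starting. */
static inline bool bg_can_start_nocow(const struct bg_model *bg)
{
	return !bg->ro;
}

/* Set the flag before the block group stops being the relocation target. */
static inline void bg_release_as_reloc_target(struct bg_model *bg)
{
	bg->zoned_data_reloc_ongoing = true;
}

/*
 * Models the completion hook (btrfs_finish_ordered_io in the commit message):
 * once the last relocation nocow IO finishes, clear the flag.
 */
static inline void bg_reloc_io_done(struct bg_model *bg)
{
	assert(bg->reloc_nocow_in_flight > 0);
	if (--bg->reloc_nocow_in_flight == 0)
		bg->zoned_data_reloc_ongoing = false;
}
```

In this model, a block group with the flag set rejects regular allocations even when it has free space, yet still lets nocow writes start, and becomes allocatable again only after the last pending relocation IO completes.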
    • btrfs: add missing inode updates on each iteration when replacing extents · 983d8209
      Committed by Filipe Manana
      When replacing file extents, called during fallocate, hole punching,
      clone and deduplication, we may not be able to replace/drop all the
      target file extent items with a single transaction handle. We may get
      -ENOSPC while doing it, in which case we release the transaction handle,
      balance the dirty pages of the btree inode, flush delayed items and get
      a new transaction handle to operate on what's left of the target range.
      
      By dropping and replacing file extent items we have effectively modified
      the inode, so we should bump its iversion and update its mtime/ctime
      before we update the inode item. Otherwise, if the transaction we used
      for partially modifying the inode gets committed by someone after we
      release it and before we finish the rest of the range, and a power
      failure then happens, the inode will have an outdated iversion and
      mtime/ctime after the filesystem is mounted again, corresponding to the
      values it had before we changed it.
      
      So add the missing iversion and mtime/ctime updates.
      
      Reviewed-by: Boris Burkov <boris@bur.io>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
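The per-iteration update described in the commit message above can be illustrated with a toy user-space model. Everything here is an assumption made for illustration: `struct inode_model`, `replace_range_model` and its parameters are hypothetical, and the "on-disk" copy merely stands in for the inode item written within each transaction handle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the inode fields the commit message mentions. */
struct inode_model {
	uint64_t iversion;
	uint64_t mtime;
	uint64_t ctime;
};

/*
 * Replace `len` bytes in chunks of at most `chunk` bytes, one chunk per
 * (modelled) transaction handle. On every iteration the in-memory inode is
 * bumped before it is copied to the on-disk inode item, so a commit of any
 * partial transaction already carries up-to-date values.
 * `crash_after` stops the loop after that many iterations to model a power
 * failure; 0 means no failure. Returns the number of bytes replaced.
 */
static inline uint64_t replace_range_model(struct inode_model *mem,
					   struct inode_model *disk,
					   uint64_t len, uint64_t chunk,
					   unsigned int crash_after,
					   uint64_t now)
{
	uint64_t done = 0;
	unsigned int iters = 0;

	assert(chunk > 0);
	while (done < len) {
		uint64_t left = len - done;

		done += left < chunk ? left : chunk;

		/* Bump iversion and timestamps on each iteration. */
		mem->iversion++;
		mem->mtime = now;
		mem->ctime = now;

		/* Models updating the inode item in this transaction. */
		*disk = *mem;

		if (++iters == crash_after)
			break;
	}
	return done;
}
```

With a failure after the first iteration, the on-disk copy in this model already reflects the bumped iversion and the new timestamps, even though only part of the range was replaced.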
  5. 18 May, 2022 1 commit
  6. 16 May, 2022 3 commits