1. 24 Jan, 2020 1 commit
    • btrfs: scrub: Require mandatory block group RO for dev-replace · 1bbb97b8
      Qu Wenruo committed
      [BUG]
      For dev-replace test cases with fsstress, like btrfs/06[45] btrfs/071,
      looped runs can lead to random failure, where scrub finds csum error.
      
      The probability is not high, around 1/20 to 1/100, but it causes data
      corruption.
      
      The bug is observable after commit b12de528 ("btrfs: scrub: Don't
      check free space before marking a block group RO")
      
      [CAUSE]
      Dev-replace has two sources of writes:
      
      - Write duplication
        All writes to source device will also be duplicated to target device.
      
        Content:	Not yet persisted data/meta
      
      - Scrub copy
        Dev-replace reused scrub code to iterate through existing extents, and
        copy the verified data to target device.
      
        Content:	Previously persisted data and metadata
      
      The difference in contents makes the following race possible:
      	Regular Writer		|	Dev-replace
      -----------------------------------------------------------------
        ^                             |
        | Preallocate one data extent |
        | at bytenr X, len 1M		|
        v				|
        ^ Commit transaction		|
        | Now extent [X, X+1M) is in  |
        v commit root			|
       ================== Dev replace starts =========================
        				| ^
      				| | Scrub extent [X, X+1M)
      				| | Read [X, X+1M)
      				| | (The content is mostly garbage
      				| |  since it's preallocated)
        ^				| v
        | Write back happens for	|
        | extent [X, X+512K)		|
        | New data writes to both	|
        | source and target dev.	|
        v				|
      				| ^
      				| | Scrub writes back extent [X, X+1M)
      				| | to target device.
      				| | This will overwrite the new data in
      				| | [X, X+512K)
      				| v
      
      This race can only happen for nocow writes. Thus metadata and data cow
      writes are safe, as COW will never overwrite extents of previous
      transaction (in commit root).
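
The ordering in the diagram can be replayed with a small userspace model. This is only a sketch under stated assumptions: the two devices are plain byte buffers, and every name here (`replace_model`, `scrub_read`, `nocow_write`, and so on) is hypothetical, not a kernel symbol. It shows why a nocow write landing between the scrub read and the scrub write leaves the target stale, and why the problem disappears when the write completes before scrub touches the extent.

```c
#include <string.h>

#define EXTENT_LEN 8   /* stand-in for the 1M preallocated extent */
#define WRITE_LEN  4   /* stand-in for the 512K nocow write-back */

/* Hypothetical model: one buffer per device, not kernel structures. */
struct replace_model {
	char source[EXTENT_LEN];
	char target[EXTENT_LEN];
	char scrub_buf[EXTENT_LEN];
};

static void model_init(struct replace_model *m)
{
	memset(m->source, 'g', EXTENT_LEN);  /* preallocated: garbage */
	memset(m->target, 0, EXTENT_LEN);
	memset(m->scrub_buf, 0, EXTENT_LEN);
}

/* Scrub copy, step 1: read the extent from the source device. */
static void scrub_read(struct replace_model *m)
{
	memcpy(m->scrub_buf, m->source, EXTENT_LEN);
}

/* Scrub copy, step 2: write what was read to the target device. */
static void scrub_write(struct replace_model *m)
{
	memcpy(m->target, m->scrub_buf, EXTENT_LEN);
}

/* Write duplication: a nocow write lands on both devices. */
static void nocow_write(struct replace_model *m)
{
	memset(m->source, 'N', WRITE_LEN);
	memset(m->target, 'N', WRITE_LEN);
}

/* Returns 1 if the target ends up identical to the source. */
static int devices_match(const struct replace_model *m)
{
	return memcmp(m->source, m->target, EXTENT_LEN) == 0;
}

/* Racy ordering from the diagram: the write lands between read and write. */
int run_racy(void)
{
	struct replace_model m;

	model_init(&m);
	scrub_read(&m);
	nocow_write(&m);
	scrub_write(&m);   /* overwrites the new data with stale content */
	return devices_match(&m);
}

/* Ordering when the nocow write finishes before scrub touches the extent. */
int run_serialized(void)
{
	struct replace_model m;

	model_init(&m);
	nocow_write(&m);
	scrub_read(&m);
	scrub_write(&m);
	return devices_match(&m);
}
```

In this model `run_racy()` reports a mismatch between source and target, while `run_serialized()` reports a match.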
      
      This behavior can be confirmed by disabling all fallocate related calls
      in fsstress (*), then all related tests can pass a 2000 run loop.
      
      *: FSSTRESS_AVOID="-f fallocate=0 -f allocsp=0 -f zero=0 -f insert=0 \
      		   -f collapse=0 -f punch=0 -f resvsp=0"
         I didn't expect the resvsp ioctl to fall back to fallocate in VFS...
      
      [FIX]
      Make dev-replace require mandatory block group RO, and wait for current
      nocow writes before calling scrub_chunk().
      
      This patch will mostly revert commit 76a8efa1 ("btrfs: Continue replace
      when set_block_ro failed") for dev-replace path.
      
      The side effect is that dev-replace can be stricter about available space,
      but that is definitely worth it to avoid data corruption.
      Reported-by: Filipe Manana <fdmanana@suse.com>
      Fixes: 76a8efa1 ("btrfs: Continue replace when set_block_ro failed")
      Fixes: b12de528 ("btrfs: scrub: Don't check free space before marking a block group RO")
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  2. 19 Nov, 2019 6 commits
    • Btrfs: fix block group remaining RO forever after error during device replace · 042528f8
      Filipe Manana committed
      When doing a device replace, while at scrub.c:scrub_enumerate_chunks(), we
      set the block group to RO mode and then wait for any ongoing writes into
      extents of the block group to complete. While doing that wait we overwrite
      the value of the variable 'ret' and can break out of the loop if an error
      happens without turning the block group back into RW mode. So what happens
      is the following:
      
      1) btrfs_inc_block_group_ro() returns 0, meaning it set the block group
         to RO mode (its ->ro field set to 1 or incremented to some value > 1);
      
      2) Then btrfs_wait_ordered_roots() returns a value > 0;
      
      3) Then if either joining or committing the transaction fails, we break
         out of the loop without calling btrfs_dec_block_group_ro(), leaving
         the block group in RO mode forever.
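
The three steps can be sketched as a userspace model. This is an illustration only, assuming a block group reduced to a single RO counter; `bg_model`, `process_bg_buggy()` and `process_bg_fixed()` are hypothetical names, and the wait and commit results are passed in as parameters rather than computed.

```c
#include <errno.h>

/* Hypothetical stand-in for a block group; only the RO counter matters. */
struct bg_model {
	int ro;
};

static int inc_ro(struct bg_model *bg) { bg->ro++; return 0; }
static void dec_ro(struct bg_model *bg) { bg->ro--; }

/*
 * Buggy shape: 'ret' from inc_ro() is overwritten by the wait, and the
 * error branch leaves without undoing inc_ro().
 */
int process_bg_buggy(struct bg_model *bg, int wait_ret, int commit_ret)
{
	int ret = inc_ro(bg);              /* step 1: returns 0, bg is RO */

	if (ret == 0) {
		ret = wait_ret;            /* step 2: wait returns > 0 */
		if (ret > 0) {
			ret = commit_ret;  /* step 3: join/commit may fail */
			if (ret)
				return ret; /* bg->ro is never decremented */
		}
	}
	/* ... copy extents ... */
	dec_ro(bg);
	return 0;
}

/* Shape after the fix: no wait, so nothing can fail between inc and dec. */
int process_bg_fixed(struct bg_model *bg)
{
	int ret = inc_ro(bg);

	if (ret)
		return ret;
	/* ... copy extents ... */
	dec_ro(bg);
	return 0;
}
```

Calling `process_bg_buggy()` with a positive wait result and a failing commit leaves `ro` at 1, which models the block group staying read-only.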
      
      To fix this, just remove the code that waits for ongoing writes to extents
      of the block group, since it's not needed because in the initial setup
      phase of a device replace operation, before starting to find all chunks
      and their extents, we set the target device for replace while holding
      fs_info->dev_replace->rwsem, which ensures that after releasing that
      semaphore, any writes into the source device are made to the target device
      as well (__btrfs_map_block() guarantees that). So while at
      scrub_enumerate_chunks() we only need to worry about finding and copying
      extents (from the source device to the target device) that were written
      before we started the device replace operation.
      
      Fixes: f0e9b7d6 ("Btrfs: fix race setting block group readonly during device replace")
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: Don't check free space before marking a block group RO · b12de528
      Qu Wenruo committed
      [BUG]
      When running btrfs/072 with only one online CPU, it has a pretty high
      chance to fail:
      
        btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg)
        - output mismatch (see xfstests-dev/results//btrfs/072.out.bad)
            --- tests/btrfs/072.out     2019-10-22 15:18:14.008965340 +0800
            +++ /xfstests-dev/results//btrfs/072.out.bad      2019-11-14 15:56:45.877152240 +0800
            @@ -1,2 +1,3 @@
             QA output created by 072
             Silence is golden
            +Scrub find errors in "-m dup -d single" test
            ...
      
      And with the following call trace:
      
        BTRFS info (device dm-5): scrub: started on devid 1
        ------------[ cut here ]------------
        BTRFS: Transaction aborted (error -27)
        WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
        CPU: 0 PID: 55087 Comm: btrfs Tainted: G        W  O      5.4.0-rc1-custom+ #13
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
        Call Trace:
         __btrfs_end_transaction+0xdb/0x310 [btrfs]
         btrfs_end_transaction+0x10/0x20 [btrfs]
         btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs]
         scrub_enumerate_chunks+0x264/0x940 [btrfs]
         btrfs_scrub_dev+0x45c/0x8f0 [btrfs]
         btrfs_ioctl+0x31a1/0x3fb0 [btrfs]
         do_vfs_ioctl+0x636/0xaa0
         ksys_ioctl+0x67/0x90
         __x64_sys_ioctl+0x43/0x50
         do_syscall_64+0x79/0xe0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
        ---[ end trace 166c865cec7688e7 ]---
      
      [CAUSE]
      The error number -27 is -EFBIG, returned from the following call chain:
      btrfs_end_transaction()
      |- __btrfs_end_transaction()
         |- btrfs_create_pending_block_groups()
            |- btrfs_finish_chunk_alloc()
               |- btrfs_add_system_chunk()
      
      This happens because we have used up all space of
      btrfs_super_block::sys_chunk_array.
      
      The root cause is, we have the following bad loop of creating tons of
      system chunks:
      
      1. The only SYSTEM chunk is being scrubbed
         It's very common to have only one SYSTEM chunk.
      2. New SYSTEM bg will be allocated
         As btrfs_inc_block_group_ro() will check if we have enough space
         after marking current bg RO. If not, then allocate a new chunk.
      3. New SYSTEM bg is still empty, will be reclaimed
         During the reclaim, we will mark it RO again.
      4. That newly allocated empty SYSTEM bg gets scrubbed
         We go back to step 2, as the bg is already marked RO but still not
         cleaned up yet.
      
      If the cleaner kthread doesn't get executed fast enough (e.g. only one
      CPU), then we will get more and more empty SYSTEM chunks, using up all
      the space of btrfs_super_block::sys_chunk_array.
      
      [FIX]
      Scrub/dev-replace doesn't always need to allocate new extents,
      especially chunk tree extents, so we don't really need to do chunk
      pre-allocation.
      
      To break the above spiral, introduce a new parameter to
      btrfs_inc_block_group_ro(), @do_chunk_alloc, which indicates whether we
      need extra chunk pre-allocation.
      
      For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass
      @do_chunk_alloc=false.
      This should keep unnecessary empty chunks from popping up for scrub.
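
The effect of the flag can be illustrated with a small model. This is a sketch, not the kernel code: `SYS_ARRAY_SLOTS` is a made-up capacity rather than the real size of sys_chunk_array, and `fs_model`, `add_system_chunk()` and `scrub_system_bgs()` are hypothetical names. It assumes the cleaner never runs, so empty chunks are never reclaimed.

```c
#include <errno.h>
#include <stdbool.h>

#define SYS_ARRAY_SLOTS 16  /* hypothetical capacity, not the on-disk size */

struct fs_model {
	int sys_chunks;   /* entries currently held in the system chunk array */
};

/* Adding a system chunk fails with -EFBIG once the array is full. */
static int add_system_chunk(struct fs_model *fs)
{
	if (fs->sys_chunks >= SYS_ARRAY_SLOTS)
		return -EFBIG;
	fs->sys_chunks++;
	return 0;
}

/*
 * Mark one SYSTEM block group RO. With do_chunk_alloc set, a new chunk
 * is allocated as well (the relocation case); scrub passes false.
 */
static int inc_block_group_ro(struct fs_model *fs, bool do_chunk_alloc)
{
	if (do_chunk_alloc)
		return add_system_chunk(fs);
	return 0;
}

/*
 * Scrub 'passes' SYSTEM block groups while the cleaner never runs.
 * Returns 0 on success or the first error.
 */
int scrub_system_bgs(struct fs_model *fs, int passes, bool do_chunk_alloc)
{
	for (int i = 0; i < passes; i++) {
		int ret = inc_block_group_ro(fs, do_chunk_alloc);

		if (ret)
			return ret;
	}
	return 0;
}
```

With `do_chunk_alloc` set, enough passes exhaust the array and the model returns -EFBIG; with it cleared, the chunk count never grows.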
      
      Also, since there are two parameters for btrfs_inc_block_group_ro(),
      add more comments for it.
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: rename btrfs_block_group_cache · 32da5386
      David Sterba committed
      The type name is misleading, a single entry is named 'cache' while this
      normally means a collection of objects. Rename that everywhere. Also the
      identifier was quite long, making function prototypes harder to format.
      Suggested-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: clean up locking name in scrub_enumerate_chunks() · 3ec17a67
      Dan Carpenter committed
      The "&fs_info->dev_replace.rwsem" and "&dev_replace->rwsem" refer to
      the same lock but Smatch is not clever enough to figure that out so it
      leads to static checker warnings.  It's better to use it consistently
      anyway.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: add dedicated members for start and length of a block group · b3470b5d
      David Sterba committed
      The on-disk format of block group item makes use of the key that stores
      the offset and length. This is further used in the code, although this
      makes things harder to understand. The key is also packed so the
      offset/length is not properly aligned as u64.
      
      Add start (key.objectid) and length (key.offset) members to block group
      and remove the embedded key.  When the item is searched or written, a
      local variable for key is used.
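
The alignment point can be checked with a short sketch. It assumes the packed key has the field order objectid, type, offset (a one-byte type between the two 64-bit fields), which is how I understand struct btrfs_key; the struct and function names below are hypothetical, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed key; the field order is an assumption (objectid, type, offset). */
struct packed_key {
	uint64_t objectid;
	uint8_t  type;
	uint64_t offset;
} __attribute__((packed));   /* GCC/Clang extension */

/* Hypothetical block group before: start/length hidden inside the key. */
struct bg_with_key {
	struct packed_key key;
};

/* Hypothetical block group after: naturally aligned, self-describing. */
struct bg_with_members {
	uint64_t start;    /* was key.objectid */
	uint64_t length;   /* was key.offset */
};

/* Convert from the old representation to the new one. */
struct bg_with_members bg_from_key(const struct bg_with_key *old)
{
	struct bg_with_members bg = {
		.start = old->key.objectid,
		.length = old->key.offset,
	};
	return bg;
}
```

In the packed layout `offset` sits at byte 9, so it is not aligned as a u64; in the second struct both members are naturally aligned.
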
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: move block_group_item::used to block group · bf38be65
      David Sterba committed
      For unknown reasons, the member 'used' in the block group struct is
      stored in the b-tree item and accessed everywhere using the special
      accessor helper. Let's unify it and make it a regular member and only
      update the item before writing it to the tree.
      
      The item is still being used for flags and chunk_objectid, there's some
      duplication until the item is removed in following patches.
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  3. 18 Nov, 2019 2 commits
  4. 09 Sep, 2019 1 commit
  5. 02 Jul, 2019 1 commit
  6. 01 Jul, 2019 4 commits
  7. 30 Apr, 2019 2 commits
  8. 25 Feb, 2019 8 commits
    • btrfs: init csum_list before possible free · e49be14b
      Dan Robertson committed
      The scrub_ctx csum_list member must be initialized before scrub_free_ctx
      is called. If the csum_list is not initialized beforehand, the
      list_empty call in scrub_free_csums will result in a null deref if the
      allocation fails in the for loop.
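
A userspace sketch of the pattern, assuming a minimal circular list modelled on the kernel's struct list_head. The names `ctx_model`, `ctx_setup()` and `ctx_free()` are hypothetical, and the `fail_at` parameter only simulates an allocation failure inside the loop.

```c
#include <stdlib.h>

/* Minimal circular list head, modelled on the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

#define NBUFS 4

/* Hypothetical context: a few buffers plus a list that cleanup walks. */
struct ctx_model {
	void *bufs[NBUFS];
	struct list_head csum_list;
};

void ctx_free(struct ctx_model *c)
{
	/* Walks the list; a zeroed (uninitialized) head would deref NULL. */
	while (!list_is_empty(&c->csum_list)) {
		struct list_head *e = c->csum_list.next;

		c->csum_list.next = e->next;
		e->next->prev = &c->csum_list;
		free(e);
	}
	for (int i = 0; i < NBUFS; i++)
		free(c->bufs[i]);
	free(c);
}

/*
 * Set up a context; 'fail_at' simulates an allocation failure at that
 * loop index (use -1 for no failure). Returns NULL on failure.
 */
struct ctx_model *ctx_setup(int fail_at)
{
	struct ctx_model *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	init_list_head(&c->csum_list);   /* before any path that can free */
	for (int i = 0; i < NBUFS; i++) {
		c->bufs[i] = (i == fail_at) ? NULL : malloc(16);
		if (!c->bufs[i]) {
			ctx_free(c);     /* safe: the list is already empty */
			return NULL;
		}
	}
	return c;
}
```

If `init_list_head()` were moved after the loop, the failure path would call `ctx_free()` on a head whose `next` is still NULL, and the list walk would dereference it.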
      
      Fixes: a2de733c ("btrfs: scrub")
      CC: stable@vger.kernel.org # 3.0+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Dan Robertson <dan@dlrobertson.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: add assertions for worker pointers · c8352942
      David Sterba committed
      The scrub worker pointers are not NULL iff the scrub is running, so
      reset them back once the last reference is dropped. Add assertions to
      the initial phase of scrub to verify that.
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: convert scrub_workers_refcnt to refcount_t · ff09c4ca
      Anand Jain committed
      Use the refcount_t for fs_info::scrub_workers_refcnt instead of int so
      we get the extra checks. All reference changes are still done under
      scrub_lock.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get · eb4318e5
      Anand Jain committed
      scrub_workers_refcnt is protected by scrub_lock, add lockdep_assert_held()
      in scrub_workers_get().
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Suggested-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: fix circular locking dependency warning · 1cec3f27
      Anand Jain committed
      This fixes a longstanding lockdep warning triggered by
      fstests/btrfs/011.
      
      The circular locking dependency check reports warning[1] because
      btrfs_scrub_dev() calls the stack #0 below with fs_info::scrub_lock
      held. The test case leading to this warning:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /btrfs
        $ btrfs scrub start -B /btrfs
      
      In fact we have fs_info::scrub_workers_refcnt to track if the init and destroy
      of the scrub workers are needed. So once we have incremented and decremented
      the fs_info::scrub_workers_refcnt value in the thread, it's OK to drop the
      scrub_lock, and then actually do the btrfs_destroy_workqueue() part. So this
      patch drops the scrub_lock before calling btrfs_destroy_workqueue().
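
The "detach under the lock, destroy after unlocking" shape described here can be sketched in userspace. This is a model under stated assumptions: the lock is a plain flag rather than a mutex, and `scrub_model`, `model_lock()`, `destroy_wq()` and `scrub_workers_put_model()` are hypothetical names, not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; a real implementation would use a mutex. */
struct lock_model { bool held; };
struct wq_model { bool destroyed; };

struct scrub_model {
	struct lock_model scrub_lock;
	int workers_refcnt;          /* protected by scrub_lock */
	struct wq_model *workers;    /* protected by scrub_lock */
};

static void model_lock(struct lock_model *l)
{
	assert(!l->held);
	l->held = true;
}

static void model_unlock(struct lock_model *l)
{
	assert(l->held);
	l->held = false;
}

/* Flushing may wait on work that takes scrub_lock, so it must not be held. */
static void destroy_wq(struct scrub_model *s, struct wq_model *wq)
{
	assert(!s->scrub_lock.held);
	wq->destroyed = true;
}

/* Drop one reference; the last one tears the workqueue down unlocked. */
void scrub_workers_put_model(struct scrub_model *s)
{
	struct wq_model *to_destroy = NULL;

	model_lock(&s->scrub_lock);
	if (--s->workers_refcnt == 0) {
		to_destroy = s->workers;   /* detach under the lock */
		s->workers = NULL;
	}
	model_unlock(&s->scrub_lock);

	if (to_destroy)
		destroy_wq(s, to_destroy); /* lock already released */
}
```

The reference count and the pointer are only touched while the lock is held; the potentially blocking teardown runs on a local copy of the pointer afterwards.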
      
        [359.258534] ======================================================
        [359.260305] WARNING: possible circular locking dependency detected
        [359.261938] 5.0.0-rc6-default #461 Not tainted
        [359.263135] ------------------------------------------------------
        [359.264672] btrfs/20975 is trying to acquire lock:
        [359.265927] 00000000d4d32bea ((wq_completion)"%s-%s""btrfs", name){+.+.}, at: flush_workqueue+0x87/0x540
        [359.268416]
        [359.268416] but task is already holding lock:
        [359.270061] 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
        [359.272418]
        [359.272418] which lock already depends on the new lock.
        [359.272418]
        [359.274692]
        [359.274692] the existing dependency chain (in reverse order) is:
        [359.276671]
        [359.276671] -> #3 (&fs_info->scrub_lock){+.+.}:
        [359.278187]        __mutex_lock+0x86/0x9c0
        [359.279086]        btrfs_scrub_pause+0x31/0x100 [btrfs]
        [359.280421]        btrfs_commit_transaction+0x1e4/0x9e0 [btrfs]
        [359.281931]        close_ctree+0x30b/0x350 [btrfs]
        [359.283208]        generic_shutdown_super+0x64/0x100
        [359.284516]        kill_anon_super+0x14/0x30
        [359.285658]        btrfs_kill_super+0x12/0xa0 [btrfs]
        [359.286964]        deactivate_locked_super+0x29/0x60
        [359.288242]        cleanup_mnt+0x3b/0x70
        [359.289310]        task_work_run+0x98/0xc0
        [359.290428]        exit_to_usermode_loop+0x83/0x90
        [359.291445]        do_syscall_64+0x15b/0x180
        [359.292598]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.294011]
        [359.294011] -> #2 (sb_internal#2){.+.+}:
        [359.295432]        __sb_start_write+0x113/0x1d0
        [359.296394]        start_transaction+0x369/0x500 [btrfs]
        [359.297471]        btrfs_finish_ordered_io+0x2aa/0x7c0 [btrfs]
        [359.298629]        normal_work_helper+0xcd/0x530 [btrfs]
        [359.299698]        process_one_work+0x246/0x610
        [359.300898]        worker_thread+0x3c/0x390
        [359.302020]        kthread+0x116/0x130
        [359.303053]        ret_from_fork+0x24/0x30
        [359.304152]
        [359.304152] -> #1 ((work_completion)(&work->normal_work)){+.+.}:
        [359.306100]        process_one_work+0x21f/0x610
        [359.307302]        worker_thread+0x3c/0x390
        [359.308465]        kthread+0x116/0x130
        [359.309357]        ret_from_fork+0x24/0x30
        [359.310229]
        [359.310229] -> #0 ((wq_completion)"%s-%s""btrfs", name){+.+.}:
        [359.311812]        lock_acquire+0x90/0x180
        [359.312929]        flush_workqueue+0xaa/0x540
        [359.313845]        drain_workqueue+0xa1/0x180
        [359.314761]        destroy_workqueue+0x17/0x240
        [359.315754]        btrfs_destroy_workqueue+0x57/0x200 [btrfs]
        [359.317245]        scrub_workers_put+0x2c/0x60 [btrfs]
        [359.318585]        btrfs_scrub_dev+0x336/0x590 [btrfs]
        [359.319944]        btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
        [359.321622]        btrfs_ioctl+0x28a4/0x2e40 [btrfs]
        [359.322908]        do_vfs_ioctl+0xa2/0x6d0
        [359.324021]        ksys_ioctl+0x3a/0x70
        [359.325066]        __x64_sys_ioctl+0x16/0x20
        [359.326236]        do_syscall_64+0x54/0x180
        [359.327379]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.328772]
        [359.328772] other info that might help us debug this:
        [359.328772]
        [359.330990] Chain exists of:
        [359.330990]   (wq_completion)"%s-%s""btrfs", name --> sb_internal#2 --> &fs_info->scrub_lock
        [359.330990]
        [359.334376]  Possible unsafe locking scenario:
        [359.334376]
        [359.336020]        CPU0                    CPU1
        [359.337070]        ----                    ----
        [359.337821]   lock(&fs_info->scrub_lock);
        [359.338506]                                lock(sb_internal#2);
        [359.339506]                                lock(&fs_info->scrub_lock);
        [359.341461]   lock((wq_completion)"%s-%s""btrfs", name);
        [359.342437]
        [359.342437]  *** DEADLOCK ***
        [359.342437]
        [359.343745] 1 lock held by btrfs/20975:
        [359.344788]  #0: 0000000053ea26a6 (&fs_info->scrub_lock){+.+.}, at: btrfs_scrub_dev+0x322/0x590 [btrfs]
        [359.346778]
        [359.346778] stack backtrace:
        [359.347897] CPU: 0 PID: 20975 Comm: btrfs Not tainted 5.0.0-rc6-default #461
        [359.348983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
        [359.350501] Call Trace:
        [359.350931]  dump_stack+0x67/0x90
        [359.351676]  print_circular_bug.isra.37.cold.56+0x15c/0x195
        [359.353569]  check_prev_add.constprop.44+0x4f9/0x750
        [359.354849]  ? check_prev_add.constprop.44+0x286/0x750
        [359.356505]  __lock_acquire+0xb84/0xf10
        [359.357505]  lock_acquire+0x90/0x180
        [359.358271]  ? flush_workqueue+0x87/0x540
        [359.359098]  flush_workqueue+0xaa/0x540
        [359.359912]  ? flush_workqueue+0x87/0x540
        [359.360740]  ? drain_workqueue+0x1e/0x180
        [359.361565]  ? drain_workqueue+0xa1/0x180
        [359.362391]  drain_workqueue+0xa1/0x180
        [359.363193]  destroy_workqueue+0x17/0x240
        [359.364539]  btrfs_destroy_workqueue+0x57/0x200 [btrfs]
        [359.365673]  scrub_workers_put+0x2c/0x60 [btrfs]
        [359.366618]  btrfs_scrub_dev+0x336/0x590 [btrfs]
        [359.367594]  ? start_transaction+0xa1/0x500 [btrfs]
        [359.368679]  btrfs_dev_replace_by_ioctl.cold.19+0x179/0x1bb [btrfs]
        [359.369545]  btrfs_ioctl+0x28a4/0x2e40 [btrfs]
        [359.370186]  ? __lock_acquire+0x263/0xf10
        [359.370777]  ? kvm_clock_read+0x14/0x30
        [359.371392]  ? kvm_sched_clock_read+0x5/0x10
        [359.372248]  ? sched_clock+0x5/0x10
        [359.372786]  ? sched_clock_cpu+0xc/0xc0
        [359.373662]  ? do_vfs_ioctl+0xa2/0x6d0
        [359.374552]  do_vfs_ioctl+0xa2/0x6d0
        [359.375378]  ? do_sigaction+0xff/0x250
        [359.376233]  ksys_ioctl+0x3a/0x70
        [359.376954]  __x64_sys_ioctl+0x16/0x20
        [359.377772]  do_syscall_64+0x54/0x180
        [359.378841]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [359.380422] RIP: 0033:0x7f5429296a97
      
      Backporting to older kernels: scrub_nocow_workers must be freed the same
      way as the others.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      [ update changelog ]
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: scrub: print messages when started or finished · d1e14420
      Anand Jain committed
      The kernel log messages help debugging and auditing; add them for scrub.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: merge btrfs_find_device and find_device · 09ba3bc9
      Anand Jain committed
      Both btrfs_find_device() and find_device() do the same thing except
      that the latter does not take the seed device into account in the device
      scanning context. We can merge them.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: refactor btrfs_find_device() take fs_devices as argument · e4319cd9
      Anand Jain committed
      btrfs_find_device() accepts fs_info as an argument and retrieves
      fs_devices from fs_info.
      
      Instead use fs_devices, so that this function can be used in non-mount
      (during device scanning) context as well.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  9. 17 Dec, 2018 6 commits
  10. 15 Oct, 2018 2 commits
  11. 06 Aug, 2018 6 commits
  12. 17 Jul, 2018 1 commit
    • btrfs: scrub: Don't use inode page cache in scrub_handle_errored_block() · 665d4953
      Qu Wenruo committed
      In commit ac0b4145 ("btrfs: scrub: Don't use inode pages for device
      replace") we removed the branch of copy_nocow_pages() to avoid
      corruption for compressed nodatasum extents.
      
      However, the above commit only solves the problem in scrub_extent(). If
      we fail to read some pages during scrub_pages(),
      sctx->no_io_error_seen will be non-zero and we go to the fixup function
      scrub_handle_errored_block().
      
      In scrub_handle_errored_block(), for sctx without csum (no matter if
      we're doing replace or scrub) we go to the scrub_fixup_nodatasum() routine,
      which does a similar thing to copy_nocow_pages(), but does it
      without the extra check in the copy_nocow_pages() routine.
      
      So for test cases like btrfs/100, where we emulate read errors during
      replace/scrub, we could corrupt compressed extent data again.
      
      This patch fixes it by avoiding any "optimization" for
      nodatasum and falling back to the normal fixup routine, which tries to
      read from any good copy.
      
      This also solves the WARN_ON() or deadlock caused by lame backref iteration
      in the scrub_fixup_nodatasum() routine.
      
      The deadlock or WARN_ON() couldn't be triggered before commit ac0b4145
      ("btrfs: scrub: Don't use inode pages for device replace") since
      copy_nocow_pages() has better locking and an extra check for data extents,
      and it already does the fixup work by trying to read data from any good
      copy, so it never reaches scrub_fixup_nodatasum() anyway.
      
      This patch disables the faulty code, which will be removed completely in a
      followup patch.
      
      Fixes: ac0b4145 ("btrfs: scrub: Don't use inode pages for device replace")
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>