1. 19 Jul, 2022 5 commits
  2. 18 Jul, 2022 2 commits
  3. 06 Jul, 2022 1 commit
  4. 28 May, 2022 3 commits
    • btrfs: add missing run of delayed items after unlink during log replay · 5bed171a
      Committed by Filipe Manana
      stable inclusion
      from stable-v5.10.104
      commit 292e1c88b8a5616ada179f1f4f14c799571217af
      bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=292e1c88b8a5616ada179f1f4f14c799571217af
      
      --------------------------------
      
      commit 4751dc99 upstream.
      
      During log replay, whenever we need to check if a name (dentry) exists in
      a directory we do searches on the subvolume tree for inode references or
      directory entries (BTRFS_DIR_INDEX_KEY keys, and BTRFS_DIR_ITEM_KEY
      keys as well, before kernel 5.17). However, when we unlink a name during
      log replay, through btrfs_unlink_inode(), we may not delete inode
      references and dir index keys from a subvolume tree and instead just add
      the deletions to the delayed inode's delayed items, which will only be
      run when we commit the transaction used for log replay. This means that
      after an unlink operation during log replay, if we attempt to search for
      the same name during log replay, we will not see that the name was already
      deleted, since the deletion is recorded only on the delayed items.
      
      We run delayed items after every unlink operation during log replay,
      except at unlink_old_inode_refs() and at add_inode_ref(). This was an
      oversight, as delayed items should be run after every unlink, for
      the reasons stated above.
      
      So fix those two cases.
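
The effect described above can be illustrated with a toy model. This is a minimal sketch and not btrfs code: `ToyDirectory`, `replay_unlink` and the `flush` parameter are hypothetical names invented for illustration, and a Python set stands in for the subvolume tree.

```python
class ToyDirectory:
    """Toy stand-in for a subvolume tree plus an inode's delayed items."""

    def __init__(self, names):
        self.tree = set(names)      # entries visible to tree searches
        self.delayed_deletes = []   # deletions queued but not yet applied

    def unlink(self, name):
        # Only record the deletion; the tree itself is left untouched.
        self.delayed_deletes.append(name)

    def run_delayed_items(self):
        # Apply every queued deletion to the tree.
        for name in self.delayed_deletes:
            self.tree.discard(name)
        self.delayed_deletes.clear()

    def name_exists(self, name):
        # Searches look at the tree only, never at the delayed items.
        return name in self.tree


def replay_unlink(directory, name, flush):
    """Unlink a name, optionally run delayed items, then search for it."""
    directory.unlink(name)
    if flush:
        directory.run_delayed_items()
    return directory.name_exists(name)


# Without the flush the search still sees the deleted name.
stale = replay_unlink(ToyDirectory({"foo"}), "foo", flush=False)
# With the flush the deletion is visible to the next search.
fresh = replay_unlink(ToyDirectory({"foo"}), "foo", flush=True)
print(stale, fresh)
```

In this model, skipping the flush is the analogue of the two call sites named above: a later search for the same name gets a stale answer.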
      
      Fixes: 0d836392 ("Btrfs: fix mount failure after fsync due to hard link recreation")
      Fixes: 1f250e92 ("Btrfs: fix log replay failure after unlink and link combination")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • btrfs: qgroup: fix deadlock between rescan worker and remove qgroup · 778f4dcc
      Committed by Sidong Yang
      stable inclusion
      from stable-v5.10.104
      commit 41712c5fa51887252b349700a286ae151d55e460
      bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=41712c5fa51887252b349700a286ae151d55e460
      
      --------------------------------
      
      commit d4aef1e1 upstream.
      
      Commit e804861b ("btrfs: fix deadlock between quota disable and
      qgroup rescan worker") by Kawasaki resolves a deadlock between quota
      disable and the qgroup rescan worker. However, a similar deadlock
      remains, involving enabling or disabling quota while creating or
      removing a qgroup. It can be reproduced with the simple script below.
      
      for i in {1..100}
      do
          btrfs quota enable /mnt &
          btrfs qgroup create 1/0 /mnt &
          btrfs qgroup destroy 1/0 /mnt &
          btrfs quota disable /mnt &
      done
      
      Here's why the deadlock happens:
      
      1) The quota rescan task is running.
      
      2) Task A calls btrfs_quota_disable(), locks the qgroup_ioctl_lock
         mutex, and then calls btrfs_qgroup_wait_for_completion(), to wait for
         the quota rescan task to complete.
      
      3) Task B calls btrfs_remove_qgroup() and it blocks when trying to lock
         the qgroup_ioctl_lock mutex, because it's being held by task A. At that
         point task B is holding a transaction handle for the current transaction.
      
      4) The quota rescan task calls btrfs_commit_transaction(). This results
         in it waiting for all other tasks to release their handles on the
         transaction, but task B is blocked on the qgroup_ioctl_lock mutex
         while holding a handle on the transaction, and that mutex is being held
         by task A, which is waiting for the quota rescan task to complete,
         resulting in a deadlock between these 3 tasks.
      
      To resolve this issue, the thread disabling quota should unlock
      qgroup_ioctl_lock before waiting for rescan completion. Move
      btrfs_qgroup_wait_for_completion() after the unlock of qgroup_ioctl_lock.
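
The three-task dependency above can be checked mechanically as a cycle in a wait-for graph. The following is a minimal sketch under stated assumptions: `find_cycle` and the node names are hypothetical, and each edge simply encodes "this task is blocked until that task makes progress" as described in steps 1 to 4.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""

    def visit(node, path):
        if node in path:
            # We came back to a node already on the current path.
            return path[path.index(node):]
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Before the fix: quota disable waits for the rescan worker while holding
# the mutex, remove qgroup waits for that mutex while holding a transaction
# handle, and the rescan worker's commit waits for that handle.
before = {
    "quota_disable": ["rescan_worker"],
    "remove_qgroup": ["quota_disable"],
    "rescan_worker": ["remove_qgroup"],
}

# After the fix: the mutex is released before the wait, so remove qgroup
# no longer depends on quota disable.
after = {
    "quota_disable": ["rescan_worker"],
    "remove_qgroup": [],
    "rescan_worker": ["remove_qgroup"],
}

before_cycle = find_cycle(before)
after_cycle = find_cycle(after)
print(before_cycle, after_cycle)
```

Dropping the single edge from remove qgroup to quota disable is enough to break the cycle in this model, which mirrors moving the wait outside the locked region.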
      
      Fixes: e804861b ("btrfs: fix deadlock between quota disable and qgroup rescan worker")
      CC: stable@vger.kernel.org # 5.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Sidong Yang <realwakka@gmail.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • btrfs: fix lost prealloc extents beyond eof after full fsync · 0b1b18af
      Committed by Filipe Manana
      stable inclusion
      from stable-v5.10.104
      commit 6e0319e770839ab9aaee10e0e2b34edb92491831
      bugzilla: https://gitee.com/openeuler/kernel/issues/I56XAC
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6e0319e770839ab9aaee10e0e2b34edb92491831
      
      --------------------------------
      
      commit d9947887 upstream.
      
      When doing a full fsync, if we have prealloc extents beyond (or at) eof,
      and the leaves that contain them were not modified in the current
      transaction, we end up not logging them. This results in losing those
      extents when we replay the log after a power failure, since the inode is
      truncated to the current value of the logged i_size.
      
      Just like for the fast fsync path, we need to always log all prealloc
      extents starting at or beyond i_size. The fast fsync case was fixed in
      commit 471d557a ("Btrfs: fix loss of prealloc extents past i_size
      after fsync log replay") but it missed the full fsync path. The problem
      exists since the very early days, when the log tree was added by
      commit e02119d5 ("Btrfs: Add a write ahead tree log to optimize
      synchronous operations").
      
      Example reproducer:
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt
      
        # Create our test file with many file extent items, so that they span
        # several leaves of metadata, even if the node/page size is 64K. Use
        # direct IO and not fsync/O_SYNC because it's both faster and it avoids
        # clearing the full sync flag from the inode - we want the fsync below
        # to trigger the slow full sync code path.
        $ xfs_io -f -d -c "pwrite -b 4K 0 16M" /mnt/foo
      
        # Now add two preallocated extents to our file without extending the
        # file's size. One right at i_size, and another further beyond, leaving
        # a gap between the two prealloc extents.
        $ xfs_io -c "falloc -k 16M 1M" /mnt/foo
        $ xfs_io -c "falloc -k 20M 1M" /mnt/foo
      
        # Make sure everything is durably persisted and the transaction is
        # committed. This makes all created extents have a generation lower
        # than the generation of the transaction used by the next write and
        # fsync.
        $ sync
      
        # Now overwrite only the first extent, which will result in modifying
        # only the first leaf of metadata for our inode. Then fsync it. This
        # fsync will use the slow code path (inode full sync bit is set) because
        # it's the first fsync since the inode was created/loaded.
        $ xfs_io -c "pwrite 0 4K" -c "fsync" /mnt/foo
      
        # Extent list before power failure.
        $ xfs_io -c "fiemap -v" /mnt/foo
        /mnt/foo:
         EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
           0: [0..7]:          2178048..2178055     8   0x0
           1: [8..16383]:      26632..43007     16376   0x0
           2: [16384..32767]:  2156544..2172927 16384   0x0
           3: [32768..34815]:  2172928..2174975  2048 0x800
           4: [34816..40959]:  hole              6144
           5: [40960..43007]:  2174976..2177023  2048 0x801
      
        <power fail>
      
        # Mount fs again, trigger log replay.
        $ mount /dev/sdc /mnt
      
        # Extent list after power failure and log replay.
        $ xfs_io -c "fiemap -v" /mnt/foo
        /mnt/foo:
         EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
           0: [0..7]:          2178048..2178055     8   0x0
           1: [8..16383]:      26632..43007     16376   0x0
           2: [16384..32767]:  2156544..2172927 16384   0x1
      
        # The prealloc extents at file offsets 16M and 20M are missing.
      
      So fix this by calling btrfs_log_prealloc_extents() when we are doing a
      full fsync, so that we always log all prealloc extents beyond eof.
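
The loss and the fix can be pictured with a toy model of logging and replay. This is a rough sketch under simplifying assumptions, not the kernel's logic: `Extent`, `full_fsync`, `replay` and the `log_prealloc_beyond_eof` switch are hypothetical names, and "leaf modified in the current transaction" is reduced to a boolean per extent.

```python
from dataclasses import dataclass

KiB = 1024
MiB = 1024 * KiB


@dataclass(frozen=True)
class Extent:
    start: int           # file offset in bytes
    length: int
    prealloc: bool
    leaf_modified: bool  # was its metadata leaf changed in this transaction?


def full_fsync(extents, i_size, log_prealloc_beyond_eof):
    """Return the extents that end up in the log."""
    logged = [e for e in extents if e.leaf_modified]
    if log_prealloc_beyond_eof:
        # The fix: always log prealloc extents starting at or beyond i_size.
        logged += [e for e in extents
                   if e.prealloc and e.start >= i_size and e not in logged]
    return logged


def replay(extents, logged, i_size):
    """Truncate to the logged i_size, then re-add whatever was logged."""
    kept = [e for e in extents if e.start < i_size]
    for e in logged:
        if e not in kept:
            kept.append(e)
    return sorted(kept, key=lambda e: e.start)


i_size = 16 * MiB
extents = [
    Extent(0, 4 * KiB, prealloc=False, leaf_modified=True),
    Extent(4 * KiB, 16 * MiB - 4 * KiB, prealloc=False, leaf_modified=False),
    Extent(16 * MiB, 1 * MiB, prealloc=True, leaf_modified=False),
    Extent(20 * MiB, 1 * MiB, prealloc=True, leaf_modified=False),
]

lost = replay(extents, full_fsync(extents, i_size, False), i_size)
fixed = replay(extents, full_fsync(extents, i_size, True), i_size)
print([e.start for e in lost], [e.start for e in fixed])
```

The extent layout loosely follows the reproducer above (a 16M file with prealloc extents at 16M and 20M); the exact extent boundaries in the real fiemap output differ.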
      
      A test case for fstests will follow soon.
      
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  5. 26 May, 2022 2 commits
    • btrfs: tree-checker: check item_size for dev_item · 7b855f4b
      Committed by Su Yue
      stable inclusion
      from stable-v5.10.103
      commit 72a5b01875b279196b30af9cca737318fbf3f634
      bugzilla: https://gitee.com/openeuler/kernel/issues/I56NE7
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=72a5b01875b279196b30af9cca737318fbf3f634
      
      --------------------------------
      
      commit ea1d1ca4 upstream.
      
      Check the item size before accessing the device item to avoid an
      out-of-bounds access, similar to the inode_item check.

      Signed-off-by: Su Yue <l@damenly.su>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • btrfs: tree-checker: check item_size for inode_item · 5e18954c
      Committed by Su Yue
      stable inclusion
      from stable-v5.10.103
      commit 5c967dd07311da972a68eb318e9b43bb4b0f0c3a
      bugzilla: https://gitee.com/openeuler/kernel/issues/I56NE7
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5c967dd07311da972a68eb318e9b43bb4b0f0c3a
      
      --------------------------------
      
      commit 0c982944 upstream.
      
      While mounting a crafted image, an out-of-bounds access happens:
      
        [350.429619] UBSAN: array-index-out-of-bounds in fs/btrfs/struct-funcs.c:161:1
        [350.429636] index 1048096 is out of range for type 'page *[16]'
        [350.429650] CPU: 0 PID: 9 Comm: kworker/u8:1 Not tainted 5.16.0-rc4 #1
        [350.429652] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
        [350.429653] Workqueue: btrfs-endio-meta btrfs_work_helper [btrfs]
        [350.429772] Call Trace:
        [350.429774]  <TASK>
        [350.429776]  dump_stack_lvl+0x47/0x5c
        [350.429780]  ubsan_epilogue+0x5/0x50
        [350.429786]  __ubsan_handle_out_of_bounds+0x66/0x70
        [350.429791]  btrfs_get_16+0xfd/0x120 [btrfs]
        [350.429832]  check_leaf+0x754/0x1a40 [btrfs]
        [350.429874]  ? filemap_read+0x34a/0x390
        [350.429878]  ? load_balance+0x175/0xfc0
        [350.429881]  validate_extent_buffer+0x244/0x310 [btrfs]
        [350.429911]  btrfs_validate_metadata_buffer+0xf8/0x100 [btrfs]
        [350.429935]  end_bio_extent_readpage+0x3af/0x850 [btrfs]
        [350.429969]  ? newidle_balance+0x259/0x480
        [350.429972]  end_workqueue_fn+0x29/0x40 [btrfs]
        [350.429995]  btrfs_work_helper+0x71/0x330 [btrfs]
        [350.430030]  ? __schedule+0x2fb/0xa40
        [350.430033]  process_one_work+0x1f6/0x400
        [350.430035]  ? process_one_work+0x400/0x400
        [350.430036]  worker_thread+0x2d/0x3d0
        [350.430037]  ? process_one_work+0x400/0x400
        [350.430038]  kthread+0x165/0x190
        [350.430041]  ? set_kthread_struct+0x40/0x40
        [350.430043]  ret_from_fork+0x1f/0x30
        [350.430047]  </TASK>
        [350.430077] BTRFS warning (device loop0): bad eb member start: ptr 0xffe20f4e start 20975616 member offset 4293005178 size 2
      
      check_leaf() is checking the leaf:
      
        corrupt leaf: root=4 block=29396992 slot=1, bad key order, prev (16140901064495857664 1 0) current (1 204 12582912)
        leaf 29396992 items 6 free space 3565 generation 6 owner DEV_TREE
        leaf 29396992 flags 0x1(WRITTEN) backref revision 1
        fs uuid a62e00e8-e94e-4200-8217-12444de93c2e
        chunk uuid cecbd0f7-9ca0-441e-ae9f-f782f9732bd8
      	  item 0 key (16140901064495857664 INODE_ITEM 0) itemoff 3955 itemsize 40
      		  generation 0 transid 0 size 0 nbytes 17592186044416
      		  block group 0 mode 52667 links 33 uid 0 gid 2104132511 rdev 94223634821136
      		  sequence 100305 flags 0x2409000a(none)
      		  atime 0.0 (1970-01-01 08:00:00)
      		  ctime 2973280098083405823.4294967295 (-269783007-01-01 21:37:03)
      		  mtime 18446744071572723616.4026825121 (1902-04-16 12:40:00)
      		  otime 9249929404488876031.4294967295 (622322949-04-16 04:25:58)
      	  item 1 key (1 DEV_EXTENT 12582912) itemoff 3907 itemsize 48
      		  dev extent chunk_tree 3
      		  chunk_objectid 256 chunk_offset 12582912 length 8388608
      		  chunk_tree_uuid cecbd0f7-9ca0-441e-ae9f-f782f9732bd8
      
      The corrupted leaf of the device tree has an inode item. The leaf passed
      the checksum and other checks in validate_extent_buffer() until
      check_leaf_item(). Because of the key type BTRFS_INODE_ITEM,
      check_inode_item() is called even though we are in the device tree.
      Since item offset + sizeof(struct btrfs_inode_item) > eb->len, an
      out-of-bounds access is triggered.
      
      The item end vs leaf boundary check has been done before
      check_leaf_item(), so fix it by checking the item size in
      check_inode_item() before accessing the inode item in the extent buffer.
      
      Other check functions in check_leaf_item(), except check_dev_item(),
      have their item size checks.
      The commit for check_dev_item() follows.
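
The idea of validating the recorded item size before decoding any field can be sketched outside the kernel. This is an illustrative model only: `ITEM_FORMAT`, `ITEM_SIZE` and `decode_item` are hypothetical, and a two-field layout stands in for the real on-disk structure.

```python
import struct

# Hypothetical fixed layout standing in for an on-disk item: two
# little-endian 64-bit fields.
ITEM_FORMAT = "<QQ"
ITEM_SIZE = struct.calcsize(ITEM_FORMAT)


def decode_item(leaf, offset, item_size):
    """Reject an item whose recorded size does not match the structure."""
    if item_size != ITEM_SIZE:
        raise ValueError(
            f"invalid item size: has {item_size} expect {ITEM_SIZE}")
    # Only now is it safe to read ITEM_SIZE bytes starting at offset.
    return struct.unpack_from(ITEM_FORMAT, leaf, offset)


# A leaf holding one well-formed item followed by 8 unrelated bytes.
leaf = struct.pack(ITEM_FORMAT, 6, 4096) + b"\x00" * 8

good = decode_item(leaf, 0, ITEM_SIZE)

try:
    # An item that claims only 8 bytes must not be decoded as a full item.
    decode_item(leaf, ITEM_SIZE, 8)
    rejected = False
except ValueError:
    rejected = True

print(good, rejected)
```

The check is an exact-size comparison rather than a lower bound because the structure in this sketch is fixed-size; whether a given item type allows trailing data is a separate question that this model does not address.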
      
      No regression observed during running fstests.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215299
      CC: stable@vger.kernel.org # 5.10+
      CC: Wenqing Liu <wenqingliu0120@gmail.com>
      Signed-off-by: Su Yue <l@damenly.su>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  6. 23 May, 2022 1 commit
  7. 17 May, 2022 1 commit
    • btrfs: fix deadlock between quota disable and qgroup rescan worker · 1620a4fb
      Committed by Shin'ichiro Kawasaki
      stable inclusion
      from stable-v5.10.99
      commit 32747e01436aac8ef93fe85b5b523b4f3b52f040
      bugzilla: https://gitee.com/openeuler/kernel/issues/I55O7H
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=32747e01436aac8ef93fe85b5b523b4f3b52f040
      
      --------------------------------
      
      commit e804861b upstream.
      
      The quota disable ioctl starts a transaction before waiting for the
      qgroup rescan worker to complete. However, this wait can be infinite and
      result in a deadlock because of a circular dependency among the quota
      disable ioctl, the qgroup rescan worker and another task with a
      transaction, such as the block group relocation task.
      
      The deadlock happens with the following steps:
      
      1) Task A calls ioctl to disable quota. It starts a transaction and
         waits for the qgroup rescan worker to complete.
      2) Task B, such as the block group relocation task, starts a transaction
         and joins the transaction that task A started. Then task B commits
         the transaction. In this commit, task B waits for a commit by task A.
      3) Task C, the qgroup rescan worker, starts its job and starts a
         transaction. In this transaction start, task C waits for completion
         of the transaction that task A started and task B committed.
      
      This deadlock was found with fstests test case btrfs/115 and a zoned
      null_blk device. The test case enables and disables quota, and the
      block group reclaim was triggered during the quota disable by chance.
      The deadlock was also observed by running quota enable and disable in
      parallel with 'btrfs balance' command on regular null_blk devices.
      
      An example report of the deadlock:
      
        [372.469894] INFO: task kworker/u16:6:103 blocked for more than 122 seconds.
        [372.479944]       Not tainted 5.16.0-rc8 #7
        [372.485067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [372.493898] task:kworker/u16:6   state:D stack:    0 pid:  103 ppid:     2 flags:0x00004000
        [372.503285] Workqueue: btrfs-qgroup-rescan btrfs_work_helper [btrfs]
        [372.510782] Call Trace:
        [372.514092]  <TASK>
        [372.521684]  __schedule+0xb56/0x4850
        [372.530104]  ? io_schedule_timeout+0x190/0x190
        [372.538842]  ? lockdep_hardirqs_on+0x7e/0x100
        [372.547092]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
        [372.555591]  schedule+0xe0/0x270
        [372.561894]  btrfs_commit_transaction+0x18bb/0x2610 [btrfs]
        [372.570506]  ? btrfs_apply_pending_changes+0x50/0x50 [btrfs]
        [372.578875]  ? free_unref_page+0x3f2/0x650
        [372.585484]  ? finish_wait+0x270/0x270
        [372.591594]  ? release_extent_buffer+0x224/0x420 [btrfs]
        [372.599264]  btrfs_qgroup_rescan_worker+0xc13/0x10c0 [btrfs]
        [372.607157]  ? lock_release+0x3a9/0x6d0
        [372.613054]  ? btrfs_qgroup_account_extent+0xda0/0xda0 [btrfs]
        [372.620960]  ? do_raw_spin_lock+0x11e/0x250
        [372.627137]  ? rwlock_bug.part.0+0x90/0x90
        [372.633215]  ? lock_is_held_type+0xe4/0x140
        [372.639404]  btrfs_work_helper+0x1ae/0xa90 [btrfs]
        [372.646268]  process_one_work+0x7e9/0x1320
        [372.652321]  ? lock_release+0x6d0/0x6d0
        [372.658081]  ? pwq_dec_nr_in_flight+0x230/0x230
        [372.664513]  ? rwlock_bug.part.0+0x90/0x90
        [372.670529]  worker_thread+0x59e/0xf90
        [372.676172]  ? process_one_work+0x1320/0x1320
        [372.682440]  kthread+0x3b9/0x490
        [372.687550]  ? _raw_spin_unlock_irq+0x24/0x50
        [372.693811]  ? set_kthread_struct+0x100/0x100
        [372.700052]  ret_from_fork+0x22/0x30
        [372.705517]  </TASK>
        [372.709747] INFO: task btrfs-transacti:2347 blocked for more than 123 seconds.
        [372.729827]       Not tainted 5.16.0-rc8 #7
        [372.745907] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [372.767106] task:btrfs-transacti state:D stack:    0 pid: 2347 ppid:     2 flags:0x00004000
        [372.787776] Call Trace:
        [372.801652]  <TASK>
        [372.812961]  __schedule+0xb56/0x4850
        [372.830011]  ? io_schedule_timeout+0x190/0x190
        [372.852547]  ? lockdep_hardirqs_on+0x7e/0x100
        [372.871761]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
        [372.886792]  schedule+0xe0/0x270
        [372.901685]  wait_current_trans+0x22c/0x310 [btrfs]
        [372.919743]  ? btrfs_put_transaction+0x3d0/0x3d0 [btrfs]
        [372.938923]  ? finish_wait+0x270/0x270
        [372.959085]  ? join_transaction+0xc75/0xe30 [btrfs]
        [372.977706]  start_transaction+0x938/0x10a0 [btrfs]
        [372.997168]  transaction_kthread+0x19d/0x3c0 [btrfs]
        [373.013021]  ? btrfs_cleanup_transaction.isra.0+0xfc0/0xfc0 [btrfs]
        [373.031678]  kthread+0x3b9/0x490
        [373.047420]  ? _raw_spin_unlock_irq+0x24/0x50
        [373.064645]  ? set_kthread_struct+0x100/0x100
        [373.078571]  ret_from_fork+0x22/0x30
        [373.091197]  </TASK>
        [373.105611] INFO: task btrfs:3145 blocked for more than 123 seconds.
        [373.114147]       Not tainted 5.16.0-rc8 #7
        [373.120401] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [373.130393] task:btrfs           state:D stack:    0 pid: 3145 ppid:  3141 flags:0x00004000
        [373.140998] Call Trace:
        [373.145501]  <TASK>
        [373.149654]  __schedule+0xb56/0x4850
        [373.155306]  ? io_schedule_timeout+0x190/0x190
        [373.161965]  ? lockdep_hardirqs_on+0x7e/0x100
        [373.168469]  ? _raw_spin_unlock_irqrestore+0x3e/0x60
        [373.175468]  schedule+0xe0/0x270
        [373.180814]  wait_for_commit+0x104/0x150 [btrfs]
        [373.187643]  ? test_and_set_bit+0x20/0x20 [btrfs]
        [373.194772]  ? kmem_cache_free+0x124/0x550
        [373.201191]  ? btrfs_put_transaction+0x69/0x3d0 [btrfs]
        [373.208738]  ? finish_wait+0x270/0x270
        [373.214704]  ? __btrfs_end_transaction+0x347/0x7b0 [btrfs]
        [373.222342]  btrfs_commit_transaction+0x44d/0x2610 [btrfs]
        [373.230233]  ? join_transaction+0x255/0xe30 [btrfs]
        [373.237334]  ? btrfs_record_root_in_trans+0x4d/0x170 [btrfs]
        [373.245251]  ? btrfs_apply_pending_changes+0x50/0x50 [btrfs]
        [373.253296]  relocate_block_group+0x105/0xc20 [btrfs]
        [373.260533]  ? mutex_lock_io_nested+0x1270/0x1270
        [373.267516]  ? btrfs_wait_nocow_writers+0x85/0x180 [btrfs]
        [373.275155]  ? merge_reloc_roots+0x710/0x710 [btrfs]
        [373.283602]  ? btrfs_wait_ordered_extents+0xd30/0xd30 [btrfs]
        [373.291934]  ? kmem_cache_free+0x124/0x550
        [373.298180]  btrfs_relocate_block_group+0x35c/0x930 [btrfs]
        [373.306047]  btrfs_relocate_chunk+0x85/0x210 [btrfs]
        [373.313229]  btrfs_balance+0x12f4/0x2d20 [btrfs]
        [373.320227]  ? lock_release+0x3a9/0x6d0
        [373.326206]  ? btrfs_relocate_chunk+0x210/0x210 [btrfs]
        [373.333591]  ? lock_is_held_type+0xe4/0x140
        [373.340031]  ? rcu_read_lock_sched_held+0x3f/0x70
        [373.346910]  btrfs_ioctl_balance+0x548/0x700 [btrfs]
        [373.354207]  btrfs_ioctl+0x7f2/0x71b0 [btrfs]
        [373.360774]  ? lockdep_hardirqs_on_prepare+0x410/0x410
        [373.367957]  ? lockdep_hardirqs_on_prepare+0x410/0x410
        [373.375327]  ? btrfs_ioctl_get_supported_features+0x20/0x20 [btrfs]
        [373.383841]  ? find_held_lock+0x2c/0x110
        [373.389993]  ? lock_release+0x3a9/0x6d0
        [373.395828]  ? mntput_no_expire+0xf7/0xad0
        [373.402083]  ? lock_is_held_type+0xe4/0x140
        [373.408249]  ? vfs_fileattr_set+0x9f0/0x9f0
        [373.414486]  ? selinux_file_ioctl+0x349/0x4e0
        [373.420938]  ? trace_raw_output_lock+0xb4/0xe0
        [373.427442]  ? selinux_inode_getsecctx+0x80/0x80
        [373.434224]  ? lockdep_hardirqs_on+0x7e/0x100
        [373.440660]  ? force_qs_rnp+0x2a0/0x6b0
        [373.446534]  ? lock_is_held_type+0x9b/0x140
        [373.452763]  ? __blkcg_punt_bio_submit+0x1b0/0x1b0
        [373.459732]  ? security_file_ioctl+0x50/0x90
        [373.466089]  __x64_sys_ioctl+0x127/0x190
        [373.472022]  do_syscall_64+0x3b/0x90
        [373.477513]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [373.484823] RIP: 0033:0x7f8f4af7e2bb
        [373.490493] RSP: 002b:00007ffcbf936178 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [373.500197] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8f4af7e2bb
        [373.509451] RDX: 00007ffcbf936220 RSI: 00000000c4009420 RDI: 0000000000000003
        [373.518659] RBP: 00007ffcbf93774a R08: 0000000000000013 R09: 00007f8f4b02d4e0
        [373.527872] R10: 00007f8f4ae87740 R11: 0000000000000246 R12: 0000000000000001
        [373.537222] R13: 00007ffcbf936220 R14: 0000000000000000 R15: 0000000000000002
        [373.546506]  </TASK>
        [373.550878] INFO: task btrfs:3146 blocked for more than 123 seconds.
        [373.559383]       Not tainted 5.16.0-rc8 #7
        [373.565748] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [373.575748] task:btrfs           state:D stack:    0 pid: 3146 ppid:  2168 flags:0x00000000
        [373.586314] Call Trace:
        [373.590846]  <TASK>
        [373.595121]  __schedule+0xb56/0x4850
        [373.600901]  ? __lock_acquire+0x23db/0x5030
        [373.607176]  ? io_schedule_timeout+0x190/0x190
        [373.613954]  schedule+0xe0/0x270
        [373.619157]  schedule_timeout+0x168/0x220
        [373.625170]  ? usleep_range_state+0x150/0x150
        [373.631653]  ? mark_held_locks+0x9e/0xe0
        [373.637767]  ? do_raw_spin_lock+0x11e/0x250
        [373.643993]  ? lockdep_hardirqs_on_prepare+0x17b/0x410
        [373.651267]  ? _raw_spin_unlock_irq+0x24/0x50
        [373.657677]  ? lockdep_hardirqs_on+0x7e/0x100
        [373.664103]  wait_for_completion+0x163/0x250
        [373.670437]  ? bit_wait_timeout+0x160/0x160
        [373.676585]  btrfs_quota_disable+0x176/0x9a0 [btrfs]
        [373.683979]  ? btrfs_quota_enable+0x12f0/0x12f0 [btrfs]
        [373.691340]  ? down_write+0xd0/0x130
        [373.696880]  ? down_write_killable+0x150/0x150
        [373.703352]  btrfs_ioctl+0x3945/0x71b0 [btrfs]
        [373.710061]  ? find_held_lock+0x2c/0x110
        [373.716192]  ? lock_release+0x3a9/0x6d0
        [373.722047]  ? __handle_mm_fault+0x23cd/0x3050
        [373.728486]  ? btrfs_ioctl_get_supported_features+0x20/0x20 [btrfs]
        [373.737032]  ? set_pte+0x6a/0x90
        [373.742271]  ? do_raw_spin_unlock+0x55/0x1f0
        [373.748506]  ? lock_is_held_type+0xe4/0x140
        [373.754792]  ? vfs_fileattr_set+0x9f0/0x9f0
        [373.761083]  ? selinux_file_ioctl+0x349/0x4e0
        [373.767521]  ? selinux_inode_getsecctx+0x80/0x80
        [373.774247]  ? __up_read+0x182/0x6e0
        [373.780026]  ? count_memcg_events.constprop.0+0x46/0x60
        [373.787281]  ? up_write+0x460/0x460
        [373.792932]  ? security_file_ioctl+0x50/0x90
        [373.799232]  __x64_sys_ioctl+0x127/0x190
        [373.805237]  do_syscall_64+0x3b/0x90
        [373.810947]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [373.818102] RIP: 0033:0x7f1383ea02bb
        [373.823847] RSP: 002b:00007fffeb4d71f8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
        [373.833641] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1383ea02bb
        [373.842961] RDX: 00007fffeb4d7210 RSI: 00000000c0109428 RDI: 0000000000000003
        [373.852179] RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000078
        [373.861408] R10: 00007f1383daec78 R11: 0000000000000202 R12: 00007fffeb4d874a
        [373.870647] R13: 0000000000493099 R14: 0000000000000001 R15: 0000000000000000
        [373.879838]  </TASK>
        [373.884018]
                     Showing all locks held in the system:
        [373.894250] 3 locks held by kworker/4:1/58:
        [373.900356] 1 lock held by khungtaskd/63:
        [373.906333]  #0: ffffffff8945ff60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260
        [373.917307] 3 locks held by kworker/u16:6/103:
        [373.923938]  #0: ffff888127b4f138 ((wq_completion)btrfs-qgroup-rescan){+.+.}-{0:0}, at: process_one_work+0x712/0x1320
        [373.936555]  #1: ffff88810b817dd8 ((work_completion)(&work->normal_work)){+.+.}-{0:0}, at: process_one_work+0x73f/0x1320
        [373.951109]  #2: ffff888102dd4650 (sb_internal#2){.+.+}-{0:0}, at: btrfs_qgroup_rescan_worker+0x1f6/0x10c0 [btrfs]
        [373.964027] 2 locks held by less/1803:
        [373.969982]  #0: ffff88813ed56098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80
        [373.981295]  #1: ffffc90000b3b2e8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x9e2/0x1060
        [373.992969] 1 lock held by btrfs-transacti/2347:
        [373.999893]  #0: ffff88813d4887a8 (&fs_info->transaction_kthread_mutex){+.+.}-{3:3}, at: transaction_kthread+0xe3/0x3c0 [btrfs]
        [374.015872] 3 locks held by btrfs/3145:
        [374.022298]  #0: ffff888102dd4460 (sb_writers#18){.+.+}-{0:0}, at: btrfs_ioctl_balance+0xc3/0x700 [btrfs]
        [374.034456]  #1: ffff88813d48a0a0 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: btrfs_balance+0xfe5/0x2d20 [btrfs]
        [374.047646]  #2: ffff88813d488838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x354/0x930 [btrfs]
        [374.063295] 4 locks held by btrfs/3146:
        [374.069647]  #0: ffff888102dd4460 (sb_writers#18){.+.+}-{0:0}, at: btrfs_ioctl+0x38b1/0x71b0 [btrfs]
        [374.081601]  #1: ffff88813d488bb8 (&fs_info->subvol_sem){+.+.}-{3:3}, at: btrfs_ioctl+0x38fd/0x71b0 [btrfs]
        [374.094283]  #2: ffff888102dd4650 (sb_internal#2){.+.+}-{0:0}, at: btrfs_quota_disable+0xc8/0x9a0 [btrfs]
        [374.106885]  #3: ffff88813d489800 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_disable+0xd5/0x9a0 [btrfs]
      
        [374.126780] =============================================
      
      To avoid the deadlock, wait for the qgroup rescan worker to complete
      before starting the transaction for the quota disable ioctl. Clear
      BTRFS_FS_QUOTA_ENABLE flag before the wait and the transaction to
      request the worker to complete. On transaction start failure, set the
      BTRFS_FS_QUOTA_ENABLE flag again. These BTRFS_FS_QUOTA_ENABLE flag
      changes can be done safely since the function btrfs_quota_disable is not
      called concurrently because of fs_info->subvol_sem.
      
      Also check the BTRFS_FS_QUOTA_ENABLE flag in qgroup_rescan_init to
      prevent another qgroup rescan worker from starting after the previous
      qgroup worker completed.
      
      CC: stable@vger.kernel.org # 5.4+
      Suggested-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yu Liao <liaoyu15@huawei.com>
      Reviewed-by: Wei Li <liwei391@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  8. 10 May, 2022 1 commit
  9. 27 Apr, 2022 5 commits
  10. 14 Jan, 2022 7 commits
  11. 06 Dec, 2021 5 commits
    • btrfs: do not take the uuid_mutex in btrfs_rm_device · 3c911db8
      Committed by Josef Bacik
      stable inclusion
      from stable-5.10.80
      commit b917f9b94633bde8982f98965aa1fa534b9e8f46
      bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b917f9b94633bde8982f98965aa1fa534b9e8f46
      
      --------------------------------
      
      [ Upstream commit 8ef9dc0f ]
      
      We got the following lockdep splat while running fstests (specifically
      btrfs/003 and btrfs/020 in a row) with the new rc.  This was uncovered
      by 87579e9b ("loop: use worker per cgroup instead of kworker") which
      converted loop to using workqueues, which comes with lockdep
      annotations that don't exist with kworkers.  The lockdep splat is as
      follows:
      
        WARNING: possible circular locking dependency detected
        5.14.0-rc2-custom+ #34 Not tainted
        ------------------------------------------------------
        losetup/156417 is trying to acquire lock:
        ffff9c7645b02d38 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x84/0x600
      
        but task is already holding lock:
        ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #5 (&lo->lo_mutex){+.+.}-{3:3}:
      	 __mutex_lock+0xba/0x7c0
      	 lo_open+0x28/0x60 [loop]
      	 blkdev_get_whole+0x28/0xf0
      	 blkdev_get_by_dev.part.0+0x168/0x3c0
      	 blkdev_open+0xd2/0xe0
      	 do_dentry_open+0x163/0x3a0
      	 path_openat+0x74d/0xa40
      	 do_filp_open+0x9c/0x140
      	 do_sys_openat2+0xb1/0x170
      	 __x64_sys_openat+0x54/0x90
      	 do_syscall_64+0x3b/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        -> #4 (&disk->open_mutex){+.+.}-{3:3}:
      	 __mutex_lock+0xba/0x7c0
      	 blkdev_get_by_dev.part.0+0xd1/0x3c0
      	 blkdev_get_by_path+0xc0/0xd0
      	 btrfs_scan_one_device+0x52/0x1f0 [btrfs]
      	 btrfs_control_ioctl+0xac/0x170 [btrfs]
      	 __x64_sys_ioctl+0x83/0xb0
      	 do_syscall_64+0x3b/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        -> #3 (uuid_mutex){+.+.}-{3:3}:
      	 __mutex_lock+0xba/0x7c0
      	 btrfs_rm_device+0x48/0x6a0 [btrfs]
      	 btrfs_ioctl+0x2d1c/0x3110 [btrfs]
      	 __x64_sys_ioctl+0x83/0xb0
      	 do_syscall_64+0x3b/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        -> #2 (sb_writers#11){.+.+}-{0:0}:
      	 lo_write_bvec+0x112/0x290 [loop]
      	 loop_process_work+0x25f/0xcb0 [loop]
      	 process_one_work+0x28f/0x5d0
      	 worker_thread+0x55/0x3c0
      	 kthread+0x140/0x170
      	 ret_from_fork+0x22/0x30
      
        -> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
      	 process_one_work+0x266/0x5d0
      	 worker_thread+0x55/0x3c0
      	 kthread+0x140/0x170
      	 ret_from_fork+0x22/0x30
      
        -> #0 ((wq_completion)loop0){+.+.}-{0:0}:
      	 __lock_acquire+0x1130/0x1dc0
      	 lock_acquire+0xf5/0x320
      	 flush_workqueue+0xae/0x600
      	 drain_workqueue+0xa0/0x110
      	 destroy_workqueue+0x36/0x250
      	 __loop_clr_fd+0x9a/0x650 [loop]
      	 lo_ioctl+0x29d/0x780 [loop]
      	 block_ioctl+0x3f/0x50
      	 __x64_sys_ioctl+0x83/0xb0
      	 do_syscall_64+0x3b/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
        other info that might help us debug this:
        Chain exists of:
          (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
         Possible unsafe locking scenario:
      	 CPU0                    CPU1
      	 ----                    ----
          lock(&lo->lo_mutex);
      				 lock(&disk->open_mutex);
      				 lock(&lo->lo_mutex);
          lock((wq_completion)loop0);
      
         *** DEADLOCK ***
        1 lock held by losetup/156417:
         #0: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
      
        stack backtrace:
        CPU: 8 PID: 156417 Comm: losetup Not tainted 5.14.0-rc2-custom+ #34
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        Call Trace:
         dump_stack_lvl+0x57/0x72
         check_noncircular+0x10a/0x120
         __lock_acquire+0x1130/0x1dc0
         lock_acquire+0xf5/0x320
         ? flush_workqueue+0x84/0x600
         flush_workqueue+0xae/0x600
         ? flush_workqueue+0x84/0x600
         drain_workqueue+0xa0/0x110
         destroy_workqueue+0x36/0x250
         __loop_clr_fd+0x9a/0x650 [loop]
         lo_ioctl+0x29d/0x780 [loop]
         ? __lock_acquire+0x3a0/0x1dc0
         ? update_dl_rq_load_avg+0x152/0x360
         ? lock_is_held_type+0xa5/0x120
         ? find_held_lock.constprop.0+0x2b/0x80
         block_ioctl+0x3f/0x50
         __x64_sys_ioctl+0x83/0xb0
         do_syscall_64+0x3b/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xae
        RIP: 0033:0x7f645884de6b
      
      Usually the uuid_mutex exists to protect the fs_devices that map
      together all of the devices that match a specific uuid.  In rm_device
      we're messing with the uuid of a device, so it makes sense to protect
      that here.
      
      However in doing that it pulls in a whole host of lockdep dependencies,
      as we call mnt_may_write() on the sb before we grab the uuid_mutex, thus
      we end up with the dependency chain under the uuid_mutex being added
      under the normal sb write dependency chain, which causes problems with
      loop devices.
      
      We don't need the uuid mutex here however.  If we call
      btrfs_scan_one_device() before we scratch the super block we will find
      the fs_devices and not find the device itself and return EBUSY because
      the fs_devices is open.  If we call it after the scratch happens it will
      not appear to be a valid btrfs file system.
      
      We do not need to worry about other fs_devices modifying operations here
      because we're protected by the exclusive operations locking.
      
      So drop the uuid_mutex here in order to fix the lockdep splat.
      
      A more detailed explanation from the discussion:
      
      We are worried about rm and scan racing with each other.  Before this
      change we'd zero the device out under the UUID mutex, so when scan does
      run it can go through the whole device scan thing without rm messing
      with us.
      
      We aren't worried if the scratch happens first, because the result is we
      don't think this is a btrfs device and we bail out.
      
      The only case we are concerned with is when we scratch _after_ scan is
      able to read the superblock and gets a seemingly valid super block, so
      let's consider this case.
      
      Scan will call device_list_add() with the device we're removing.  We'll
      call find_fsid_with_metadata_uuid() and get our fs_devices for this
      UUID.  At this point we lock the fs_devices->device_list_mutex.  This is
      what protects us in this case, but we have two cases here.
      
      1. We haven't reached the device removal part of the rm yet.  We find
         our device, and the device name matches our path, so we go down and
         set total_devices to our super number of devices, which doesn't
         affect anything because we haven't done the remove yet.
      
      2. We are past the device removal part, which is protected by the
         device_list_mutex.  Scan doesn't find the device, it goes down and
         does the
      
         if (fs_devices->opened)
      	   return -EBUSY;
      
         check and we bail out.
      
      Nothing about this situation is ideal, but the lockdep splat is real,
      and the fix is safe, though admittedly a bit scary looking.
      Reviewed-by: Anand Jain <anand.jain@oracle.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ copy more from the discussion ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Reviewed-by: Weilong Chen <chenweilong@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • S
      btrfs: reflink: initialize return value to 0 in btrfs_extent_same() · 4566911d
      Sidong Yang committed
      stable inclusion
      from stable-5.10.80
      commit 428bb3d71e35b41f6ee325d680c56a6d6e9594f8
      bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=428bb3d71e35b41f6ee325d680c56a6d6e9594f8
      
      --------------------------------
      
      [ Upstream commit 44bee215 ]
      
      Fix a warning reported by smatch that ret could be returned without
      being initialized.  The dedupe operations are supposed to return 0 for
      a 0 length range but the caller does not pass olen == 0. To keep this
      behaviour and also fix the warning, initialize ret to 0.
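A minimal sketch of why the initializer matters, assuming a simplified loop shape: extent_same(), same_range() and MAX_DEDUPE_LEN are illustrative names and values, not the kernel code. With a zero length neither the chunk loop nor the tail branch assigns ret, so only the initializer determines the return value.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEDUPE_LEN (16ULL * 1024 * 1024) /* assumed chunk size for the sketch */

int ranges_processed; /* counts calls to the hypothetical worker below */

/* Hypothetical per-range worker; always succeeds in this sketch. */
int same_range(uint64_t off, uint64_t len)
{
	(void)off;
	(void)len;
	ranges_processed++;
	return 0;
}

int extent_same(uint64_t olen)
{
	int ret = 0; /* without this, olen == 0 would return an indeterminate value */
	uint64_t chunk_count = olen / MAX_DEDUPE_LEN;
	uint64_t tail_len = olen % MAX_DEDUPE_LEN;
	uint64_t off = 0;

	/* Process the full-sized chunks first. */
	for (uint64_t i = 0; i < chunk_count; i++) {
		ret = same_range(off, MAX_DEDUPE_LEN);
		if (ret)
			return ret;
		off += MAX_DEDUPE_LEN;
	}

	/* Then the remainder, if any. */
	assert(off + tail_len == olen);
	if (tail_len > 0)
		ret = same_range(off, tail_len);

	return ret;
}
```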
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Sidong Yang <realwakka@gmail.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Reviewed-by: Weilong Chen <chenweilong@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • A
      btrfs: call btrfs_check_rw_degradable only if there is a missing device · 2cee87e3
      Anand Jain committed
      stable inclusion
      from stable-5.10.80
      commit b4a4c9dc4407ae77ebf4ad8bdf9ecf57e29658b4
      bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b4a4c9dc4407ae77ebf4ad8bdf9ecf57e29658b4
      
      --------------------------------
      
      commit 5c78a5e7 upstream.
      
      In open_ctree(), btrfs_check_rw_degradable() [1] checks, for each block
      group individually, whether at least the minimum number of devices is
      available for that profile. If all the devices are available, then we
      don't have to check degradable.
      
      [1]
      open_ctree()
      ::
      3559 if (!sb_rdonly(sb) && !btrfs_check_rw_degradable(fs_info, NULL)) {
      
      Also before calling btrfs_check_rw_degradable() in open_ctree() at the
      line number shown below [2] we call btrfs_read_chunk_tree() and down to
      add_missing_dev() to record the number of missing devices.
      
      [2]
      open_ctree()
      ::
      3454         ret = btrfs_read_chunk_tree(fs_info);
      
      btrfs_read_chunk_tree()
        read_one_chunk() / read_one_dev()
          add_missing_dev()
      
      So, check if there is any missing device before btrfs_check_rw_degradable()
      in open_ctree().
      
      Also, with this the mount command could save ~16 ms [3] in the most
      common case, that is, when no device is missing.
      
      [3]
       1) * 16934.96 us | btrfs_check_rw_degradable [btrfs]();
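The short-circuit described above can be sketched as follows. This is an illustration under assumptions: rw_mount_allowed() and the simplified check_rw_degradable() are hypothetical stand-ins, and the tolerable parameter simulates the outcome of the per-chunk walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

int degradable_checks; /* counts how often the expensive check runs */

/* Hypothetical stand-in for the expensive per-chunk degradable check. */
bool check_rw_degradable(uint64_t missing_devices, bool tolerable)
{
	assert(missing_devices > 0); /* the caller filters out the common case */
	degradable_checks++;
	return tolerable;
}

/* Models the mount-time decision: may the filesystem be mounted read-write? */
bool rw_mount_allowed(bool rdonly, uint64_t missing_devices, bool tolerable)
{
	/* && short-circuits: with no missing device the check is never called. */
	if (!rdonly && missing_devices &&
	    !check_rw_degradable(missing_devices, tolerable))
		return false;
	return true;
}
```

Putting the cheap counter test before the call keeps the behaviour for degraded filesystems unchanged while skipping the walk in the common case.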
      
      CC: stable@vger.kernel.org # 4.19+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Reviewed-by: Weilong Chen <chenweilong@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • F
      btrfs: fix lost error handling when replaying directory deletes · 6f0c1a03
      Filipe Manana committed
      stable inclusion
      from stable-5.10.80
      commit b406439afe734da8bbd15b7dcaaa297c77a98905
      bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b406439afe734da8bbd15b7dcaaa297c77a98905
      
      --------------------------------
      
      commit 10adb115 upstream.
      
      At replay_dir_deletes(), if find_dir_range() returns an error we break out
      of the main while loop and then assign a value of 0 (success) to the 'ret'
      variable, resulting in completely ignoring that an error happened. Fix
      that by jumping to the 'out' label when find_dir_range() returns an error
      (negative value).
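The control-flow problem can be reproduced in a small sketch. Everything here is hypothetical scaffolding: find_range() replays a scripted list of results (0 while ranges remain, 1 when done, negative on error), and the two replay functions only mirror the loop shape described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: returns the next scripted result and advances the cursor. */
int find_range(const int *results, int *pos)
{
	assert(results != NULL);
	return results[(*pos)++];
}

/* Before the fix: errors and "no more ranges" leave the loop the same way. */
int replay_deletes_buggy(const int *results)
{
	int pos = 0;
	int ret;

	while (1) {
		ret = find_range(results, &pos);
		if (ret != 0)
			break;
		/* ... process the range ... */
	}
	ret = 0; /* overwrites a negative error from find_range() */
	return ret;
}

/* After the fix: a negative value jumps straight to the exit label. */
int replay_deletes_fixed(const int *results)
{
	int pos = 0;
	int ret;

	while (1) {
		ret = find_range(results, &pos);
		if (ret < 0)
			goto out;
		else if (ret > 0)
			break;
		/* ... process the range ... */
	}
	ret = 0;
out:
	return ret;
}
```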
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Reviewed-by: Weilong Chen <chenweilong@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
    • L
      btrfs: clear MISSING device status bit in btrfs_close_one_device · 08a4f14e
      Li Zhang committed
      stable inclusion
      from stable-5.10.80
      commit 8992aab294cb7c70d430e7ad6671ccb0002de5b7
      bugzilla: 185821 https://gitee.com/openeuler/kernel/issues/I4L7CG
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8992aab294cb7c70d430e7ad6671ccb0002de5b7
      
      --------------------------------
      
      commit 5d03dbeb upstream.
      
      Reported bug: https://github.com/kdave/btrfs-progs/issues/389
      
      There's a problem with scrub reporting aborted status but returning
      error code 0, on a filesystem with a missing and re-added device.
      
      Roughly these steps:
      
      - mkfs -d raid1 dev1 dev2
      - fill with data
      - unmount
      - make dev1 disappear
      - mount -o degraded
      - copy more data
      - make dev1 appear again
      
      Running scrub afterwards reports that the command was aborted, but the
      system log message says the exit code was 0.
      
      It seems that the cause of the error is decrementing
      fs_devices->missing_devices but not clearing device->dev_state.  Every
      time we umount the filesystem, close_ctree is called, which eventually
      calls btrfs_close_one_device to close the device, but that only
      decrements fs_devices->missing_devices and does not clear the device's
      BTRFS_DEV_STATE_MISSING bit. Worse, this bug causes an integer
      overflow (unsigned wrap-around), because fs_devices->missing_devices
      is decremented on every umount. Once fs_devices->missing_devices
      reaches 0, the next decrement wraps around.
      
      With added debugging:
      
         loop1: detected capacity change from 0 to 20971520
         BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 1 transid 21 /dev/loop1 scanned by systemd-udevd (2311)
         loop2: detected capacity change from 0 to 20971520
         BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 2 transid 17 /dev/loop2 scanned by systemd-udevd (2313)
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): using free space tree
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 0
         BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): using free space tree
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 0
         BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 0
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): using free space tree
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000f706684d /dev/loop1 18446744073709551615
         BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 18446744073709551615
      
      If fs_devices->missing_devices is 0, next time it would be
      18446744073709551615.
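The wrap-around seen in the log (18446744073709551615 is 2^64 - 1) can be modelled in a few lines. This is a simplified sketch, not the kernel code: the structs and functions are hypothetical, and mount_degraded() assumes that a device already flagged as missing is not counted again on the next mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct device { bool missing; };                 /* models BTRFS_DEV_STATE_MISSING */
struct fs_devices { uint64_t missing_devices; }; /* unsigned 64-bit counter */

/* Simplified degraded mount: count the device only when newly flagged. */
void mount_degraded(struct fs_devices *fs, struct device *dev)
{
	if (!dev->missing) {
		dev->missing = true;
		fs->missing_devices++;
	}
}

/* Before the fix: the counter drops but the flag stays set. */
void close_device_buggy(struct fs_devices *fs, struct device *dev)
{
	if (dev->missing)
		fs->missing_devices--; /* wraps to UINT64_MAX when already 0 */
}

/* After the fix: flag and counter are updated together. */
void close_device_fixed(struct fs_devices *fs, struct device *dev)
{
	if (dev->missing) {
		assert(fs->missing_devices > 0);
		dev->missing = false;
		fs->missing_devices--;
	}
}
```

Under this model the buggy variant reports 1, then 0, then 2^64 - 1 across three mount cycles, while the fixed variant reports 1 on every mount.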
      
      After applying this patch, fs_devices->missing_devices seems to be
      right:
      
        $ truncate -s 10g test1
        $ truncate -s 10g test2
        $ losetup /dev/loop1 test1
        $ losetup /dev/loop2 test2
        $ mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
        $ losetup -d /dev/loop2
        $ mount -o degraded /dev/loop1 /mnt/1
        $ umount /mnt/1
        $ mount -o degraded /dev/loop1 /mnt/1
        $ umount /mnt/1
        $ mount -o degraded /dev/loop1 /mnt/1
        $ umount /mnt/1
        $ dmesg
      
         loop1: detected capacity change from 0 to 20971520
         loop2: detected capacity change from 0 to 20971520
         BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 1 transid 5 /dev/loop1 scanned by mkfs.btrfs (1863)
         BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 2 transid 5 /dev/loop2 scanned by mkfs.btrfs (1863)
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): disk space caching is enabled
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
         BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
         BTRFS info (device loop1): checking UUID tree
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): disk space caching is enabled
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
         BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
         BTRFS info (device loop1): flagging fs with big metadata feature
         BTRFS info (device loop1): allowing degraded mounts
         BTRFS info (device loop1): disk space caching is enabled
         BTRFS info (device loop1): has skinny extents
         BTRFS info (device loop1):  before clear_missing.00000000975bd577 /dev/loop1 0
         BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
         BTRFS info (device loop1):  before clear_missing.0000000000000000 /dev/loop2 1
      
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Reviewed-by: Weilong Chen <chenweilong@huawei.com>
      Acked-by: Weilong Chen <chenweilong@huawei.com>
      Signed-off-by: Chen Jun <chenjun102@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
  12. 15 Nov, 2021 7 commits