1. 25 5月, 2020 34 次提交
  2. 23 4月, 2020 1 次提交
    • X
      btrfs: fix transaction leak in btrfs_recover_relocation · 1402d17d
      Xiyu Yang 提交于
      btrfs_recover_relocation() invokes btrfs_join_transaction(), which joins
      a btrfs_trans_handle object into transactions and returns a reference of
      it with increased refcount to "trans".
      
      When btrfs_recover_relocation() returns, "trans" becomes invalid, so the
      refcount should be decreased to keep refcount balanced.
      
      The reference counting issue happens in one exception handling path of
      btrfs_recover_relocation(). When read_fs_root() failed, the refcnt
      increased by btrfs_join_transaction() is not decreased, causing a refcnt
      leak.
      
      Fix this issue by calling btrfs_end_transaction() on this error path
      when read_fs_root() failed.
      
      Fixes: 79787eaa ("btrfs: replace many BUG_ONs with proper error handling")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NXiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: NXin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1402d17d
  3. 17 4月, 2020 1 次提交
    • J
      btrfs: fix setting last_trans for reloc roots · aec7db3b
      Josef Bacik 提交于
      I made a mistake with my previous fix, I assumed that we didn't need to
      mess with the reloc roots once we were out of the part of relocation where
      we are actually moving the extents.
      
      The subtle thing that I missed is that btrfs_init_reloc_root() also
      updates the last_trans for the reloc root when we do
      btrfs_record_root_in_trans() for the corresponding fs_root.  I've added a
      comment to make sure future me doesn't make this mistake again.
      
      This showed up as a WARN_ON() in btrfs_copy_root() because our
      last_trans didn't == the current transid.  This could happen if we
      snapshotted a fs root with a reloc root after we set
      rc->create_reloc_tree = 0, but before we actually merge the reloc root.
      
      Worth mentioning that the regression produced the following warning
      when running snapshot creation and balance in parallel:
      
        BTRFS info (device sdc): relocating block group 30408704 flags metadata|dup
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 12823 at fs/btrfs/ctree.c:191 btrfs_copy_root+0x26f/0x430 [btrfs]
        CPU: 0 PID: 12823 Comm: btrfs Tainted: G        W 5.6.0-rc7-btrfs-next-58 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
        RIP: 0010:btrfs_copy_root+0x26f/0x430 [btrfs]
        RSP: 0018:ffffb96e044279b8 EFLAGS: 00010202
        RAX: 0000000000000009 RBX: ffff9da70bf61000 RCX: ffffb96e04427a48
        RDX: ffff9da733a770c8 RSI: ffff9da70bf61000 RDI: ffff9da694163818
        RBP: ffff9da733a770c8 R08: fffffffffffffff8 R09: 0000000000000002
        R10: ffffb96e044279a0 R11: 0000000000000000 R12: ffff9da694163818
        R13: fffffffffffffff8 R14: ffff9da6d2512000 R15: ffff9da714cdac00
        FS:  00007fdeacf328c0(0000) GS:ffff9da735e00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000055a2a5b8a118 CR3: 00000001eed78002 CR4: 00000000003606f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         ? create_reloc_root+0x49/0x2b0 [btrfs]
         ? kmem_cache_alloc_trace+0xe5/0x200
         create_reloc_root+0x8b/0x2b0 [btrfs]
         btrfs_reloc_post_snapshot+0x96/0x5b0 [btrfs]
         create_pending_snapshot+0x610/0x1010 [btrfs]
         create_pending_snapshots+0xa8/0xd0 [btrfs]
         btrfs_commit_transaction+0x4c7/0xc50 [btrfs]
         ? btrfs_mksubvol+0x3cd/0x560 [btrfs]
         btrfs_mksubvol+0x455/0x560 [btrfs]
         __btrfs_ioctl_snap_create+0x15f/0x190 [btrfs]
         btrfs_ioctl_snap_create_v2+0xa4/0xf0 [btrfs]
         ? mem_cgroup_commit_charge+0x6e/0x540
         btrfs_ioctl+0x12d8/0x3760 [btrfs]
         ? do_raw_spin_unlock+0x49/0xc0
         ? _raw_spin_unlock+0x29/0x40
         ? __handle_mm_fault+0x11b3/0x14b0
         ? ksys_ioctl+0x92/0xb0
         ksys_ioctl+0x92/0xb0
         ? trace_hardirqs_off_thunk+0x1a/0x1c
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x5c/0x280
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
        RIP: 0033:0x7fdeabd3bdd7
      
      Fixes: 2abc726a ("btrfs: do not init a reloc root if we aren't relocating")
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      aec7db3b
  4. 09 4月, 2020 1 次提交
    • J
      btrfs: check commit root generation in should_ignore_root · 4d4225fc
      Josef Bacik 提交于
      Previously we would set the reloc root's last snapshot to transid - 1.
      However there was a problem with doing this, and we changed it to
      setting the last snapshot to the generation of the commit node of the fs
      root.
      
      This however broke should_ignore_root().  The assumption is that if we
      are in a generation newer than when the reloc root was created, then we
      would find the reloc root through normal backref lookups, and thus can
      ignore any fs roots we find with an old enough reloc root.
      
      Now that the last snapshot could be considerably further in the past
      than before, we'd end up incorrectly ignoring an fs root.  Thus we'd
      find no nodes for the bytenr we were searching for, and we'd fail to
      relocate anything.  We'd loop through the relocate code again and see
      that there were still used space in that block group, attempt to
      relocate those bytenr's again, fail in the same way, and just loop like
      this forever.  This is tricky in that we have to not modify the fs root
      at all during this time, so we need to have a block group that has data
      in this fs root that is not shared by any other root, which is why this
      has been difficult to reproduce.
      
      Fixes: 054570a1 ("Btrfs: fix relocation incorrectly dropping data references")
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4d4225fc
  5. 24 3月, 2020 3 次提交
    • J
      btrfs: track reloc roots based on their commit root bytenr · ea287ab1
      Josef Bacik 提交于
      We always search the commit root of the extent tree for looking up back
      references, however we track the reloc roots based on their current
      bytenr.
      
      This is wrong, if we commit the transaction between relocating tree
      blocks we could end up in this code in build_backref_tree
      
        if (key.objectid == key.offset) {
      	  /*
      	   * Only root blocks of reloc trees use backref
      	   * pointing to itself.
      	   */
      	  root = find_reloc_root(rc, cur->bytenr);
      	  ASSERT(root);
      	  cur->root = root;
      	  break;
        }
      
      find_reloc_root() is looking based on the bytenr we had in the commit
      root, but if we've COWed this reloc root we will not find that bytenr,
      and we will trip over the ASSERT(root).
      
      Fix this by using the commit_root->start bytenr for indexing the commit
      root.  Then we change the __update_reloc_root() caller to be used when
      we switch the commit root for the reloc root during commit.
      
      This fixes the panic I was seeing when we started throttling relocation
      for delayed refs.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ea287ab1
    • J
      btrfs: restart relocate_tree_blocks properly · 50dbbb71
      Josef Bacik 提交于
      There are two bugs here, but fixing them independently would just result
      in pain if you happened to bisect between the two patches.
      
      First is how we handle the -EAGAIN from relocate_tree_block().  We don't
      set error, unless we happen to be the first node, which makes no sense,
      I have no idea what the code was trying to accomplish here.
      
      We in fact _do_ want err set here so that we know we need to restart in
      relocate_block_group().  Also we need finish_pending_nodes() to not
      actually call link_to_upper(), because we didn't actually relocate the
      block.
      
      And then if we do get -EAGAIN we do not want to set our backref cache
      last_trans to the one before ours.  This would force us to update our
      backref cache if we didn't cross transaction ids, which would mean we'd
      have some nodes updated to their new_bytenr, but still able to find
      their old bytenr because we're searching the same commit root as the
      last time we went through relocate_tree_blocks.
      
      Fixing these two things keeps us from panicing when we start breaking
      out of relocate_tree_blocks() either for delayed ref flushing or enospc.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      50dbbb71
    • J
      btrfs: reloc: reorder reservation before root selection · 5f6b2e5c
      Josef Bacik 提交于
      Since we're not only checking for metadata reservations but also if we
      need to throttle our delayed ref generation, reorder
      reserve_metadata_space() above the select_one_root() call in
      relocate_tree_block().
      
      The reason we want this is because select_reloc_root() will mess with
      the backref cache, and if we're going to bail we want to be able to
      cleanly remove this node from the backref cache and come back along to
      regenerate it.  Move it up so this is the first thing we do to make
      restarting cleaner.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      5f6b2e5c