1. 02 2月, 2018 9 次提交
    • F
      Btrfs: fix null pointer dereference when replacing missing device · 627e0873
      Filipe Manana 提交于
      When we are replacing a missing device we mount the filesystem with the
      degraded mode option in which case we are allowed to have a btrfs device
      structure without a backing device member (its bdev member is NULL) and
      therefore we can't dereference that member. Commit 38b5f68e
      ("btrfs: drop btrfs_device::can_discard to query directly") started to
      dereference that member when discarding extents, resulting in a null
      pointer dereference:
      
       [ 3145.322257] BTRFS warning (device sdf): devid 2 uuid 4d922414-58eb-4880-8fed-9c3840f6c5d5 is missing
       [ 3145.364116] BTRFS info (device sdf): dev_replace from <missing disk> (devid 2) to /dev/sdg started
       [ 3145.413489] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
       [ 3145.415085] IP: btrfs_discard_extent+0x6a/0xf8 [btrfs]
       [ 3145.415085] PGD 0 P4D 0
       [ 3145.415085] Oops: 0000 [#1] PREEMPT SMP PTI
       [ 3145.415085] Modules linked in: ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper evdev psmouse parport_pc serio_raw i2c_piix4 i2
       [ 3145.415085] CPU: 0 PID: 11989 Comm: btrfs Tainted: G        W        4.15.0-rc9-btrfs-next-55+ #1
       [ 3145.415085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [ 3145.415085] RIP: 0010:btrfs_discard_extent+0x6a/0xf8 [btrfs]
       [ 3145.415085] RSP: 0018:ffffc90004813c60 EFLAGS: 00010293
       [ 3145.415085] RAX: ffff88020d39cc00 RBX: ffff88020c4ea2a0 RCX: 0000000000000002
       [ 3145.415085] RDX: 0000000000000000 RSI: ffff88020c4ea240 RDI: 0000000000000000
       [ 3145.415085] RBP: 0000000000000000 R08: 0000000000004000 R09: 0000000000000000
       [ 3145.415085] R10: ffffc90004813ae8 R11: 0000000000000000 R12: 0000000000000000
       [ 3145.415085] R13: ffff88020c418000 R14: 0000000000000000 R15: 0000000000000000
       [ 3145.415085] FS:  00007f565681f8c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [ 3145.415085] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [ 3145.415085] CR2: 00000000000000e0 CR3: 000000020d208006 CR4: 00000000001606f0
       [ 3145.415085] Call Trace:
       [ 3145.415085]  btrfs_finish_extent_commit+0x9a/0x1be [btrfs]
       [ 3145.415085]  btrfs_commit_transaction+0x649/0x7a0 [btrfs]
       [ 3145.415085]  ? start_transaction+0x2b0/0x3b3 [btrfs]
       [ 3145.415085]  btrfs_dev_replace_start+0x274/0x30c [btrfs]
       [ 3145.415085]  btrfs_dev_replace_by_ioctl+0x45/0x59 [btrfs]
       [ 3145.415085]  btrfs_ioctl+0x1a91/0x1d62 [btrfs]
       [ 3145.415085]  ? lock_acquire+0x16a/0x1af
       [ 3145.415085]  ? vfs_ioctl+0x1b/0x28
       [ 3145.415085]  ? trace_hardirqs_on_caller+0x14c/0x1a6
       [ 3145.415085]  vfs_ioctl+0x1b/0x28
       [ 3145.415085]  do_vfs_ioctl+0x5a9/0x5e0
       [ 3145.415085]  ? _raw_spin_unlock_irq+0x34/0x46
       [ 3145.415085]  ? entry_SYSCALL_64_fastpath+0x5/0x8b
       [ 3145.415085]  ? trace_hardirqs_on_caller+0x14c/0x1a6
       [ 3145.415085]  SyS_ioctl+0x52/0x76
       [ 3145.415085]  entry_SYSCALL_64_fastpath+0x1e/0x8b
       [ 3145.415085] RIP: 0033:0x7f56558b3c47
       [ 3145.415085] RSP: 002b:00007ffdcfac4c58 EFLAGS: 00000202
       [ 3145.415085] Code: be 02 00 00 00 4c 89 ef e8 b9 e7 03 00 85 c0 89 c5 75 75 48 8b 44 24 08 45 31 f6 48 8d 58 60 eb 52 48 8b 03 48 8b b8 a0 00 00 00 <48> 8b 87 e0 00
       [ 3145.415085] RIP: btrfs_discard_extent+0x6a/0xf8 [btrfs] RSP: ffffc90004813c60
       [ 3145.415085] CR2: 00000000000000e0
       [ 3145.458185] ---[ end trace 06302e7ac31902bf ]---
      
      This is trivially reproduced by running the test btrfs/027 from fstests
      like this:
      
        $ MOUNT_OPTIONS="-o discard" ./check btrfs/027
      
      Fix this by skipping devices without a backing device before attempting
      to discard.
      
      Fixes: 38b5f68e ("btrfs: drop btrfs_device::can_discard to query directly")
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      627e0873
    • Z
      btrfs: remove spurious WARN_ON(ref->count < 0) in find_parent_nodes · c8195a7b
      Zygo Blaxell 提交于
      Until v4.14, this warning was very infrequent:
      
      	WARNING: CPU: 3 PID: 18172 at fs/btrfs/backref.c:1391 find_parent_nodes+0xc41/0x14e0
      	Modules linked in: [...]
      	CPU: 3 PID: 18172 Comm: bees Tainted: G      D W    L  4.11.9-zb64+ #1
      	Hardware name: System manufacturer System Product Name/M5A78L-M/USB3, BIOS 2101    12/02/2014
      	Call Trace:
      	 dump_stack+0x85/0xc2
      	 __warn+0xd1/0xf0
      	 warn_slowpath_null+0x1d/0x20
      	 find_parent_nodes+0xc41/0x14e0
      	 __btrfs_find_all_roots+0xad/0x120
      	 ? extent_same_check_offsets+0x70/0x70
      	 iterate_extent_inodes+0x168/0x300
      	 iterate_inodes_from_logical+0x87/0xb0
      	 ? iterate_inodes_from_logical+0x87/0xb0
      	 ? extent_same_check_offsets+0x70/0x70
      	 btrfs_ioctl+0x8ac/0x2820
      	 ? lock_acquire+0xc2/0x200
      	 do_vfs_ioctl+0x91/0x700
      	 ? __fget+0x112/0x200
      	 SyS_ioctl+0x79/0x90
      	 entry_SYSCALL_64_fastpath+0x23/0xc6
      	 ? trace_hardirqs_off_caller+0x1f/0x140
      
      Starting with v4.14 (specifically 86d5f994 ("btrfs: convert prelimary
      reference tracking to use rbtrees")) the WARN_ON occurs three orders of
      magnitude more frequently--almost once per second while running workloads
      like bees.
      
      Replace the WARN_ON() with a comment rationale for its removal.
      The rationale is paraphrased from an explanation by Edmund Nadolski
      <enadolski@suse.de> on the linux-btrfs mailing list.
      
      Fixes: 8da6d581 ("Btrfs: added btrfs_find_all_roots()")
      Signed-off-by: NZygo Blaxell <ce3g8jdj@umail.furryterror.org>
      Reviewed-by: NLu Fengqi <lufq.fnst@cn.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      c8195a7b
    • N
      btrfs: Ignore errors from btrfs_qgroup_trace_extent_post · 952bd3db
      Nikolay Borisov 提交于
      Running generic/019 with qgroups on the scratch device enabled is almost
      guaranteed to trigger the BUG_ON in btrfs_free_tree_block. It's supposed
      to trigger only on -ENOMEM, in reality, however, it's possible to get
      -EIO from btrfs_qgroup_trace_extent_post. This function just finds the
      roots of the extent being tracked and sets the qrecord->old_roots list.
      If this operation fails nothing critical happens except the quota
      accounting can be considered wrong. In such case just set the
      INCONSISTENT flag for the quota and print a warning, rather than killing
      off the system. Additionally, it's possible to trigger a BUG_ON in
      btrfs_truncate_inode_items as well.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NQu Wenruo <wqu@suse.com>
      [ error message adjustments ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      952bd3db
    • L
      Btrfs: fix unexpected -EEXIST when creating new inode · 900c9981
      Liu Bo 提交于
      The highest objectid, which is assigned to new inode, is decided at
      the time of initializing fs roots.  However, in cases where log replay
      gets processed, the btree which fs root owns might be changed, so we
      have to search it again for the highest objectid, otherwise creating
      new inode would end up with -EEXIST.
      
      cc: <stable@vger.kernel.org> v4.4-rc6+
      Fixes: f32e48e9 ("Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots")
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      900c9981
    • L
      Btrfs: fix use-after-free on root->orphan_block_rsv · 1a932ef4
      Liu Bo 提交于
      I got these from running generic/475,
      
      WARNING: CPU: 0 PID: 26384 at fs/btrfs/inode.c:3326 btrfs_orphan_commit_root+0x1ac/0x2b0 [btrfs]
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: btrfs_block_rsv_release+0x1c/0x70 [btrfs]
      Call Trace:
        btrfs_orphan_release_metadata+0x9f/0x200 [btrfs]
        btrfs_orphan_del+0x10d/0x170 [btrfs]
        btrfs_setattr+0x500/0x640 [btrfs]
        notify_change+0x7ae/0x870
        do_truncate+0xca/0x130
        vfs_truncate+0x2ee/0x3d0
        do_sys_truncate+0xaf/0xf0
        SyS_truncate+0xe/0x10
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      The race is between btrfs_orphan_commit_root and btrfs_orphan_del,
              t1                                        t2
      btrfs_orphan_commit_root                     btrfs_orphan_del
         spin_lock
         check (&root->orphan_inodes)
         root->orphan_block_rsv = NULL;
         spin_unlock
                                                   atomic_dec(&root->orphan_inodes);
                                                   access root->orphan_block_rsv
      
      Accessing root->orphan_block_rsv must be done before decreasing
      root->orphan_inodes.
      
      cc: <stable@vger.kernel.org> v3.12+
      Fixes: 703c88e0 ("Btrfs: fix tracking of orphan inode count")
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1a932ef4
    • L
      Btrfs: fix btrfs_evict_inode to handle abnormal inodes correctly · e8f1bc14
      Liu Bo 提交于
      This regression is introduced in
      commit 3d48d981 ("btrfs: Handle uninitialised inode eviction").
      
      There are two problems,
      
      a) it is ->destroy_inode() that does the final free on inode, not
         ->evict_inode(),
      b) clear_inode() must be called before ->evict_inode() returns.
      
      This could end up hitting BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
      in evict() because I_CLEAR is set in clear_inode().
      
      Fixes: commit 3d48d981 ("btrfs: Handle uninitialised inode eviction")
      Cc: <stable@vger.kernel.org> # v4.7-rc6+
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e8f1bc14
    • L
      Btrfs: fix extent state leak from tree log · 55237a5f
      Liu Bo 提交于
      It's possible that btrfs_sync_log() bails out after one of the two
      btrfs_write_marked_extents() which convert extent state's state bit into
      EXTENT_NEED_WAIT from EXTENT_DIRTY/EXTENT_NEW, however only EXTENT_DIRTY
      and EXTENT_NEW are searched by free_log_tree() so that those extent states
      with EXTENT_NEED_WAIT lead to memory leak.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      55237a5f
    • L
      Btrfs: fix crash due to not cleaning up tree log block's dirty bits · 1846430c
      Liu Bo 提交于
      In cases that the whole fs flips into readonly status due to failures in
      critical sections, then log tree's blocks are still dirty, and this leads
      to a crash during umount time, the crash is about use-after-free,
      
      umount
       -> close_ctree
          -> stop workers
          -> iput(btree_inode)
             -> iput_final
                -> write_inode_now
      	     -> ...
      	       -> queue job on stop'd workers
      
      cc: <stable@vger.kernel.org> v3.12+
      Fixes: 681ae509 ("Btrfs: cleanup reserved space when freeing tree log on error")
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1846430c
    • L
      Btrfs: fix deadlock in run_delalloc_nocow · e8916699
      Liu Bo 提交于
      @cur_offset is not set back to what it should be (@cow_start) if
      btrfs_next_leaf() returns something wrong, and the range [cow_start,
      cur_offset) remains locked forever.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: NLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e8916699
  2. 30 1月, 2018 2 次提交
  3. 23 1月, 2018 7 次提交
  4. 22 1月, 2018 22 次提交