1. 22 Jan, 2018 36 commits
  2. 03 Jan, 2018 2 commits
    • btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes · ec35e48b
      Chris Mason committed
      refcounts have a generic implementation and an asm optimized one.  The
      generic version has extra debugging to make sure that once a refcount
      goes to zero, refcount_inc won't increase it.
      
      The btrfs delayed inode code wasn't expecting this, and we're tripping
      over the warnings when the generic refcounts are used.  We ended up with
      this race:
      
      Process A                                         Process B
                                                        btrfs_get_delayed_node()
                                                        spin_lock(root->inode_lock)
                                                        radix_tree_lookup()
      __btrfs_release_delayed_node()
      refcount_dec_and_test(&delayed_node->refs)
      our refcount is now zero
                                                        refcount_add(2) <---
                                                        warning here, refcount
                                                        unchanged

      spin_lock(root->inode_lock)
      radix_tree_delete()
      
      With the generic refcounts, we actually warn again when process B above
      tries to release his refcount because refcount_add() turned into a
      no-op.
      
      We saw this in production on older kernels without the asm optimized
      refcounts.
      
      The fix used here is to use refcount_inc_not_zero() to detect when the
      object is in the middle of being freed and return NULL.  This is almost
      always the right answer anyway, since we usually end up pitching the
      delayed_node if it didn't have fresh data in it.
      
      This also changes __btrfs_release_delayed_node() to remove the extra
      check for zero refcounts before radix tree deletion.
      btrfs_get_delayed_node() was the only path that was allowing refcounts
      to go from zero to one.
      
      Fixes: 6de5f18e ("btrfs: fix refcount_t usage when deleting btrfs_delayed_node")
      CC: <stable@vger.kernel.org> # 4.12+
      Signed-off-by: Chris Mason <clm@fb.com>
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
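
The "increment only if not zero" rule described in this commit can be sketched in userspace C with C11 atomics. This is a minimal illustration under stated assumptions, not the kernel's refcount_t implementation: `toy_node`, `toy_ref_inc_not_zero`, `toy_ref_dec_and_test`, and `toy_get` are hypothetical names, and the sketch takes one reference per lookup rather than two.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a delayed node. */
struct toy_node {
    atomic_uint refs;
};

/* Take a reference only if the count is still non-zero. */
static bool toy_ref_inc_not_zero(struct toy_node *n)
{
    unsigned int old = atomic_load(&n->refs);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&n->refs, &old, old + 1))
            return true;
    }
    return false; /* object is already on its way to being freed */
}

/* Drop a reference; true means the caller dropped the last one. */
static bool toy_ref_dec_and_test(struct toy_node *n)
{
    return atomic_fetch_sub(&n->refs, 1) == 1;
}

/* Lookup path: return NULL instead of reviving a dying object. */
static struct toy_node *toy_get(struct toy_node *found)
{
    if (found != NULL && toy_ref_inc_not_zero(found))
        return found;
    return NULL;
}
```

With this shape, a lookup that races with the final release sees a count of zero and gets NULL back, so the count never moves from zero to one.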
    • btrfs: Fix flush bio leak · beed9263
      Nikolay Borisov committed
      Commit e0ae9994 ("btrfs: preallocate device flush bio") reworked
      the way the flush bio is allocated and used. Concretely it allocates
      the bio in __alloc_device and then re-uses it multiple times with a
      very simple endio routine that just calls complete() without consuming
      a reference. Allocated bios by default come with a ref count of 1,
      which is then consumed by the endio routine (or not, in which case they
      should be bio_put by the caller). The way the implementation works now
      is that the flush bio has a refcount of 2 and we only ever bio_put it
      once, leaving it to hang indefinitely. Fix this by removing the extra
      bio_get in __alloc_device.
      
      Fixes: e0ae9994 ("btrfs: preallocate device flush bio")
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
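
The reference accounting behind this leak can be modelled with a small userspace sketch. It is only an illustration under assumptions: `toy_bio` and every function below are hypothetical and do not reproduce the block layer API. The object starts with one reference, the completion handler consumes none, and teardown drops exactly one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a preallocated, reusable flush request. */
struct toy_bio {
    int refs;
    int uses;
    bool freed;
};

/* A fresh object comes with a reference count of 1. */
static void toy_bio_init(struct toy_bio *b)
{
    b->refs = 1;
    b->uses = 0;
    b->freed = false;
}

static void toy_bio_get(struct toy_bio *b)
{
    b->refs++;
}

/* The object is released only when the last reference is dropped. */
static void toy_bio_put(struct toy_bio *b)
{
    if (--b->refs == 0)
        b->freed = true;
}

/* Submit and complete one flush; completion consumes no reference. */
static void toy_flush(struct toy_bio *b)
{
    b->uses++;
}

/* Simulate one device lifetime; returns true if the bio was released. */
static bool toy_device_lifetime(bool extra_get)
{
    struct toy_bio flush;

    toy_bio_init(&flush);
    if (extra_get)
        toy_bio_get(&flush); /* the extra reference the fix removes */
    for (int i = 0; i < 3; i++)
        toy_flush(&flush);
    toy_bio_put(&flush);     /* teardown drops a single reference */
    return flush.freed;
}
```

Calling `toy_device_lifetime(true)` leaves the count at 1 after teardown, which corresponds to the leak; `toy_device_lifetime(false)` releases the object.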
  3. 07 Dec, 2017 2 commits
    • btrfs: Fix possible off-by-one in btrfs_search_path_in_tree · c8bcbfbd
      Nikolay Borisov committed
      The name char array passed to btrfs_search_path_in_tree is of size
      BTRFS_INO_LOOKUP_PATH_MAX (4080). So the actual accessible char indexes
      are in the range of [0, 4079]. Currently the code uses the define itself
      as an index, which is an off-by-one.
      
      Implications:
      
      Size of btrfs_ioctl_ino_lookup_args is 4096, so the new byte will be
      written to extra space, not some padding that could be provided by the
      allocator.
      
      btrfs-progs stores the arguments on the stack, but the kernel makes its
      own copy of the ioctl buffer, so the off-by-one overwrite does not affect
      userspace, although the terminating 0 might be lost.
      
      Kernel ioctl buffer is allocated dynamically so we're overwriting
      somebody else's memory, and the ioctl is privileged if args.objectid is
      not 256. Which is in most cases, but resolving a subvolume stored in
      another directory will trigger that path.
      
      Before this patch the buffer was one byte larger, but then the -1 was
      not added.
      
      Fixes: ac8e9819 ("Btrfs: add search and inode lookup ioctls")
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ added implications ]
      Signed-off-by: David Sterba <dsterba@suse.com>
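
The index arithmetic can be checked with a small userspace sketch. It assumes a layout of two 64-bit fields followed by the 4080-byte name array, which matches the 4096-byte total quoted above; `toy_lookup_args`, `toy_path_start`, and `toy_prepend` are hypothetical names, and building the path from right to left is an assumption about how the buffer is filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PATH_MAX 4080 /* mirrors BTRFS_INO_LOOKUP_PATH_MAX from the text */

/* Assumed layout: two 64-bit ids followed by the name buffer. */
struct toy_lookup_args {
    uint64_t treeid;
    uint64_t objectid;
    char name[TOY_PATH_MAX];
};

/* 8 + 8 + 4080 bytes: no slack after name[], so name[4080] is out of bounds. */
_Static_assert(sizeof(struct toy_lookup_args) == 4096, "no padding after name");

/* Place the terminator at the last valid index, size - 1, not at size. */
static char *toy_path_start(char *buf, size_t size)
{
    char *end = &buf[size - 1];

    *end = '\0';
    return end;
}

/* Prepend "component/" in front of *pos; returns -1 when out of room. */
static int toy_prepend(char *buf, char **pos, const char *component)
{
    size_t len = strlen(component);

    if ((size_t)(*pos - buf) < len + 1)
        return -1;
    *pos -= len + 1;
    memcpy(*pos, component, len);
    (*pos)[len] = '/';
    return 0;
}
```

Starting from `&buf[size]` instead would write the terminator one byte past the array, which is the overwrite the commit describes.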
    • Btrfs: disable FUA if mounted with nobarrier · 1b9e619c
      Omar Sandoval committed
      I was seeing disk flushes still happening when I mounted a Btrfs
      filesystem with nobarrier for testing. This is because we use FUA to
      write out the first super block, and on devices without FUA support, the
      block layer translates FUA to a flush. Even on devices supporting true
      FUA, using FUA when we asked for no barriers is surprising.
      
      Fixes: 387125fc ("Btrfs: fix barrier flushes")
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
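
The decision this commit describes can be sketched as a small flag-selection helper. The flag values and the function name are hypothetical and are not the kernel's request flags; the sketch only shows FUA being requested for the first super block copy when barriers are enabled, and the `TOY_REQ_SYNC` base flag is an assumption added for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, not the kernel's REQ_* values. */
#define TOY_REQ_SYNC (1u << 0)
#define TOY_REQ_FUA  (1u << 1)

/*
 * Choose write flags for super block copy number 'copy_index'.
 * Only the first copy asks for FUA, and only when the filesystem
 * was not mounted with nobarrier.
 */
static unsigned int toy_super_write_flags(int copy_index, bool nobarrier)
{
    unsigned int flags = TOY_REQ_SYNC;

    if (copy_index == 0 && !nobarrier)
        flags |= TOY_REQ_FUA;
    return flags;
}
```

With `nobarrier` set, no write carries the FUA bit, so a device without native FUA support has nothing for the block layer to translate into a flush.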