1. 07 Oct, 2020 1 commit
  2. 14 Sep, 2020 1 commit
  3. 27 Aug, 2020 1 commit
    • J
      btrfs: fix potential deadlock in the search ioctl · a48b73ec
      Committed by Josef Bacik
      With the conversion of the tree locks to rwsem I got the following
      lockdep splat:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
        ------------------------------------------------------
        compsize/11122 is trying to acquire lock:
        ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90
      
        but task is already holding lock:
        ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #2 (btrfs-fs-00){++++}-{3:3}:
      	 down_write_nested+0x3b/0x70
      	 __btrfs_tree_lock+0x24/0x120
      	 btrfs_search_slot+0x756/0x990
      	 btrfs_lookup_inode+0x3a/0xb4
      	 __btrfs_update_delayed_inode+0x93/0x270
      	 btrfs_async_run_delayed_root+0x168/0x230
      	 btrfs_work_helper+0xd4/0x570
      	 process_one_work+0x2ad/0x5f0
      	 worker_thread+0x3a/0x3d0
      	 kthread+0x133/0x150
      	 ret_from_fork+0x1f/0x30
      
        -> #1 (&delayed_node->mutex){+.+.}-{3:3}:
      	 __mutex_lock+0x9f/0x930
      	 btrfs_delayed_update_inode+0x50/0x440
      	 btrfs_update_inode+0x8a/0xf0
      	 btrfs_dirty_inode+0x5b/0xd0
      	 touch_atime+0xa1/0xd0
      	 btrfs_file_mmap+0x3f/0x60
      	 mmap_region+0x3a4/0x640
      	 do_mmap+0x376/0x580
      	 vm_mmap_pgoff+0xd5/0x120
      	 ksys_mmap_pgoff+0x193/0x230
      	 do_syscall_64+0x50/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        -> #0 (&mm->mmap_lock#2){++++}-{3:3}:
      	 __lock_acquire+0x1272/0x2310
      	 lock_acquire+0x9e/0x360
      	 __might_fault+0x68/0x90
      	 _copy_to_user+0x1e/0x80
      	 copy_to_sk.isra.32+0x121/0x300
      	 search_ioctl+0x106/0x200
      	 btrfs_ioctl_tree_search_v2+0x7b/0xf0
      	 btrfs_ioctl+0x106f/0x30a0
      	 ksys_ioctl+0x83/0xc0
      	 __x64_sys_ioctl+0x16/0x20
      	 do_syscall_64+0x50/0x90
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
        other info that might help us debug this:
      
        Chain exists of:
          &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(btrfs-fs-00);
      				 lock(&delayed_node->mutex);
      				 lock(btrfs-fs-00);
          lock(&mm->mmap_lock#2);
      
         *** DEADLOCK ***
      
        1 lock held by compsize/11122:
         #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
      
        stack backtrace:
        CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
        Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
        Call Trace:
         dump_stack+0x78/0xa0
         check_noncircular+0x165/0x180
         __lock_acquire+0x1272/0x2310
         lock_acquire+0x9e/0x360
         ? __might_fault+0x3e/0x90
         ? find_held_lock+0x72/0x90
         __might_fault+0x68/0x90
         ? __might_fault+0x3e/0x90
         _copy_to_user+0x1e/0x80
         copy_to_sk.isra.32+0x121/0x300
         ? btrfs_search_forward+0x2a6/0x360
         search_ioctl+0x106/0x200
         btrfs_ioctl_tree_search_v2+0x7b/0xf0
         btrfs_ioctl+0x106f/0x30a0
         ? __do_sys_newfstat+0x5a/0x70
         ? ksys_ioctl+0x83/0xc0
         ksys_ioctl+0x83/0xc0
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x50/0x90
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The problem is we're doing a copy_to_user() while holding tree locks,
      which can deadlock if we have to do a page fault for the copy_to_user().
      This exists even without my locking changes, so it needs to be fixed.
      Rework the search ioctl to do the pre-fault and then
      copy_to_user_nofault for the copying.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
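The pattern described in the commit message above (pre-fault the destination without the lock, then copy under the lock with a routine that fails instead of faulting) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel change itself: `struct user_buf`, `prefault`, `copy_nofault`, `search_and_copy`, the `tree_lock_held` flag, and the `evictions_left` counter that simulates a page being reclaimed are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user buffer whose pages may not be resident. */
struct user_buf {
    char data[64];
    bool resident;       /* false: touching it would need a page fault */
    int evictions_left;  /* simulated reclaims between prefault and copy */
};

/* Hypothetical stand-in for the tree lock; a flag keeps the sketch simple. */
static bool tree_lock_held;

/* Fault the destination in. This may block, so the lock must not be held. */
static void prefault(struct user_buf *dst)
{
    assert(!tree_lock_held);
    dst->resident = true;
}

/* Copy that never faults: it reports failure instead of blocking. */
static bool copy_nofault(struct user_buf *dst, const char *src, size_t len)
{
    if (dst->evictions_left > 0) {   /* simulate reclaim after the prefault */
        dst->evictions_left--;
        dst->resident = false;
    }
    if (!dst->resident)
        return false;
    memcpy(dst->data, src, len);
    return true;
}

/* Retry loop: returns how many attempts were needed to complete the copy. */
static int search_and_copy(struct user_buf *dst, const char *src, size_t len)
{
    assert(len <= sizeof(dst->data));
    int attempts = 0;

    for (;;) {
        attempts++;
        prefault(dst);               /* outside the locked section */
        tree_lock_held = true;
        bool ok = copy_nofault(dst, src, len);
        tree_lock_held = false;
        if (ok)
            return attempts;
        /* The page went away in between: drop the lock and try again. */
    }
}
```

The point of the loop is that the only operation that can sleep on a fault runs while no lock is held, so the lock ordering reported by lockdep cannot occur in this shape.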
  4. 27 Jul, 2020 8 commits
  5. 17 Jun, 2020 1 commit
  6. 25 May, 2020 4 commits
    • D
      btrfs: simplify iget helpers · 0202e83f
      Committed by David Sterba
      The inode lookup starting at btrfs_iget takes the full location key,
      while only the objectid is used to match the inode, because the lookup
      happens inside the given root thus the inode number is unique.
      The entire location key is properly set up in btrfs_init_locked_inode.
      
      Simplify the helpers and pass only the inode number, renaming it to 'ino'
      instead of 'objectid'. This allows removing the temporary key variables,
      saving some stack space.
      Signed-off-by: David Sterba <dsterba@suse.com>
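A minimal sketch of the idea in the commit above: within one root the inode number alone identifies the inode, so the lookup can take a plain `ino` while the full key is filled in at initialization. The types and functions here (`struct key`, `struct inode`, `struct root`, `init_location`, `iget`, `SKETCH_INODE_ITEM`) are simplified, hypothetical stand-ins and not the btrfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the on-disk key and the inode. */
struct key {
    uint64_t objectid;
    uint8_t type;
    uint64_t offset;
};

struct inode {
    uint64_t ino;
    struct key location;
};

struct root {
    struct inode *inodes;
    size_t count;
};

#define SKETCH_INODE_ITEM 1  /* illustrative item type value */

/* The full location key is built in one place, from the inode number. */
static void init_location(struct inode *inode, uint64_t ino)
{
    inode->ino = ino;
    inode->location.objectid = ino;
    inode->location.type = SKETCH_INODE_ITEM;
    inode->location.offset = 0;
}

/* Lookup inside a single root: the inode number is unique there, so
 * callers do not need to build a temporary key. */
static struct inode *iget(struct root *root, uint64_t ino)
{
    assert(root != NULL);
    for (size_t i = 0; i < root->count; i++) {
        if (root->inodes[i].ino == ino)
            return &root->inodes[i];
    }
    return NULL;
}
```

Callers that previously filled a three-field key just to pass one number now pass the number directly, which is where the stack saving mentioned in the message comes from.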
    • D
      btrfs: simplify root lookup by id · 56e9357a
      Committed by David Sterba
      The main function to look up a root by its id, btrfs_get_fs_root, takes
      the whole key while only using the objectid. The value of offset is
      preset to (u64)-1 but not actually used until btrfs_find_root, which
      does the actual search.
      
      Switch btrfs_get_fs_root to use only objectid and remove all local
      variables that existed just for the lookup. The actual key for search is
      set up in btrfs_get_fs_root, reusing another key variable.
      Signed-off-by: David Sterba <dsterba@suse.com>
    • R
      btrfs: reduce lock contention when creating snapshot · c11fbb6e
      Committed by Robbie Ko
      When creating a snapshot, ordered extents need to be flushed and this
      can take a long time.
      
      In create_snapshot there are two locks held when this happens:
      
        1. Destination directory inode lock
        2. Global subvolume semaphore
      
      This will unnecessarily block other operations like subvolume destroy,
      create, or setflag until the snapshot is created.
      
      We can fix that by moving the flush outside the locked section as this
      does not depend on the aforementioned locks.  The code factors out the
      snapshot related work from create_snapshot to btrfs_mksnapshot.
      
      __btrfs_ioctl_snap_create
        btrfs_mksubvol
          create_subvol
        btrfs_mksnapshot
          <flush>
          btrfs_mksubvol
            create_snapshot
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Robbie Ko <robbieko@synology.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
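The reordering described above can be illustrated with a small model that counts how often the slow flush runs while a lock is held. It is a sketch under assumptions: `struct subvol_state`, its boolean "locks", and the functions `flush_ordered_extents`, `create_snapshot_locked`, `snapshot_old`, and `snapshot_new` are hypothetical and only mimic the shape of the change, not the btrfs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: two flags stand in for the directory inode lock and
 * the global subvolume semaphore named in the commit message. */
struct subvol_state {
    bool dir_locked;
    bool subvol_sem_held;
    int flushes_total;
    int flushes_under_lock;  /* flushes that blocked other operations */
    int snapshots;
};

static void lock_all(struct subvol_state *s)
{
    s->dir_locked = true;
    s->subvol_sem_held = true;
}

static void unlock_all(struct subvol_state *s)
{
    s->subvol_sem_held = false;
    s->dir_locked = false;
}

/* Stand-in for the slow flush of ordered extents. */
static void flush_ordered_extents(struct subvol_state *s)
{
    s->flushes_total++;
    if (s->dir_locked || s->subvol_sem_held)
        s->flushes_under_lock++;
}

/* The part that really needs both locks. */
static void create_snapshot_locked(struct subvol_state *s)
{
    assert(s->dir_locked && s->subvol_sem_held);
    s->snapshots++;
}

/* Before: the flush happens inside the locked section. */
static void snapshot_old(struct subvol_state *s)
{
    lock_all(s);
    flush_ordered_extents(s);
    create_snapshot_locked(s);
    unlock_all(s);
}

/* After: the flush does not depend on the locks, so it runs first. */
static void snapshot_new(struct subvol_state *s)
{
    flush_ordered_extents(s);
    lock_all(s);
    create_snapshot_locked(s);
    unlock_all(s);
}
```

In the model both variants do the same work; only the time spent holding the locks changes, which is what lets other subvolume operations proceed during the flush.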
    • Q
      btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE · 92a7cc42
      Committed by Qu Wenruo
      The name BTRFS_ROOT_REF_COWS is not very clear about the meaning.
      
      In fact, that bit can only be set on the following trees:
      
      - Subvolume roots
      - Data reloc root
      - Reloc roots for above roots
      
      All other trees won't get this bit set.  So, judging by the result, it
      is clear that roots with this bit set can have tree blocks shared with
      other trees, either by snapshots or by reloc roots (a special
      snapshot created by relocation).
      
      This patch renames BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE to
      make it easier to understand, and updates all comments mentioning
      "reference counted" to follow the rename.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  7. 24 Mar, 2020 22 commits
  8. 24 Jan, 2020 1 commit
    • F
      Btrfs: make deduplication with range including the last block work · 831d2fa2
      Committed by Filipe Manana
      Since btrfs was migrated to use the generic VFS helpers for clone and
      deduplication, it stopped allowing for the last block of a file to be
      deduplicated when the source file size is not sector size aligned (when
      eof is somewhere in the middle of the last block). There are two reasons
      for that:
      
      1) The generic code always rounds down, to a multiple of the block size,
         the range's length for deduplications. This means we end up never
         deduplicating the last block when the eof is not block size aligned,
         even for the safe case where the destination range's end offset matches
         the destination file's size. That rounding down operation is done at
         generic_remap_check_len();
      
      2) Because of that, the btrfs specific code no longer expects any
         non-aligned range lengths for deduplication and therefore does not
         work if such a non-aligned length is given.
      
      This patch addresses that second part, and it depends on a patch that
      fixes generic_remap_check_len(), in the VFS, which was submitted earlier
      and has the following subject:
      
        "fs: allow deduplication of eof block into the end of the destination file"
      
      These two patches address reports from users that started seeing lower
      deduplication rates due to the last block never being deduplicated when
      the file size is not aligned to the filesystem's block size.
      
      Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/
      CC: stable@vger.kernel.org # 5.1+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
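The length arithmetic discussed in the commit above can be made concrete with a short sketch. It contrasts always rounding the range length down to a block multiple with keeping the unaligned tail in the safe case where the destination range ends exactly at the destination file's size. The helper names `round_down_len` and `dedupe_len` and their parameters are hypothetical; this is not the code of generic_remap_check_len() or of btrfs.

```c
#include <assert.h>
#include <stdint.h>

/* Round a length down to a multiple of the block size. */
static uint64_t round_down_len(uint64_t len, uint64_t block_size)
{
    assert(block_size > 0);
    return len - (len % block_size);
}

/* Sketch of the rule from the commit message: the unaligned tail may be
 * kept only when the destination range ends at the destination file's
 * size; otherwise the length is rounded down as before. */
static uint64_t dedupe_len(uint64_t len, uint64_t block_size,
                           uint64_t dst_off, uint64_t dst_size)
{
    if (dst_off + len == dst_size)
        return len;
    return round_down_len(len, block_size);
}
```

For a 10000-byte file with 4096-byte blocks, rounding down leaves 8192 bytes, so the final 1808 bytes in the last block are never deduplicated; with the safe-case rule the full 10000 bytes are eligible when the destination file is also 10000 bytes long.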
  9. 20 Jan, 2020 1 commit