1. 17 May, 2020 3 commits
    • io_uring: fix FORCE_ASYNC req preparation · bd2ab18a
      Pavel Begunkov authored
      As with other non-inlined requests, allocate req->io for FORCE_ASYNC reqs
      so they can be prepared properly.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: don't prepare DRAIN reqs twice · 650b5481
      Pavel Begunkov authored
      If req->io is not NULL, it's already prepared. Don't do it again,
      it's dangerous.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: initialize ctx->sqo_wait earlier · 583863ed
      Jens Axboe authored
      Ensure that ctx->sqo_wait is initialized as soon as the ctx is allocated,
      instead of deferring it to the offload setup. This fixes a syzbot
      reported lockdep complaint, which is really due to trying to wake_up
      on an uninitialized wait queue:
      
      RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441319
      RDX: 0000000000000001 RSI: 0000000020000140 RDI: 000000000000047b
      RBP: 0000000000010475 R08: 0000000000000001 R09: 00000000004002c8
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402260
      R13: 00000000004022f0 R14: 0000000000000000 R15: 0000000000000000
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 1 PID: 7090 Comm: syz-executor222 Not tainted 5.7.0-rc1-next-20200415-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       assign_lock_key kernel/locking/lockdep.c:913 [inline]
       register_lock_class+0x1664/0x1760 kernel/locking/lockdep.c:1225
       __lock_acquire+0x104/0x4c50 kernel/locking/lockdep.c:4234
       lock_acquire+0x1f2/0x8f0 kernel/locking/lockdep.c:4934
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x8c/0xbf kernel/locking/spinlock.c:159
       __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122
       io_cqring_ev_posted+0xa5/0x1e0 fs/io_uring.c:1160
       io_poll_remove_all fs/io_uring.c:4357 [inline]
       io_ring_ctx_wait_and_kill+0x2bc/0x5a0 fs/io_uring.c:7305
       io_uring_create fs/io_uring.c:7843 [inline]
       io_uring_setup+0x115e/0x22b0 fs/io_uring.c:7870
       do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x441319
      Code: e8 5c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb1fb9aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      
      Reported-by: syzbot+8c91f5d054e998721c57@syzkaller.appspotmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
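
The idea behind the sqo_wait commit above, initializing the wait queue at allocation time so that every teardown path can wake it safely, can be sketched in user space. This is only an illustration under stated assumptions: `struct waitq`, `struct ring_ctx`, `ctx_alloc()`, `ctx_offload_setup()` and `ctx_teardown()` are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel wait queue head. */
struct waitq {
    bool initialized;   /* set once by waitq_init() */
    int  sleepers;      /* number of tasks currently waiting */
};

/* Hypothetical stand-in for the ring context. */
struct ring_ctx {
    struct waitq sqo_wait;
    bool offload_ready;
};

void waitq_init(struct waitq *wq)
{
    wq->initialized = true;
    wq->sleepers = 0;
}

/* Wake every waiter; only legal on an initialized queue. */
int waitq_wake_all(struct waitq *wq)
{
    assert(wq->initialized);
    int woken = wq->sleepers;
    wq->sleepers = 0;
    return woken;
}

/* Allocate the context and initialize the wait queue immediately,
 * instead of deferring it to the offload setup step. */
struct ring_ctx *ctx_alloc(void)
{
    struct ring_ctx *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    waitq_init(&ctx->sqo_wait);
    return ctx;
}

/* Offload setup may fail; the wait queue is already usable either way. */
int ctx_offload_setup(struct ring_ctx *ctx, bool fail)
{
    if (fail)
        return -1;
    ctx->offload_ready = true;
    return 0;
}

/* Teardown wakes waiters unconditionally, so it must never see an
 * uninitialized queue. Returns the number of waiters woken. */
int ctx_teardown(struct ring_ctx *ctx)
{
    int woken = waitq_wake_all(&ctx->sqo_wait);
    free(ctx);
    return woken;
}
```

With this ordering, a caller whose `ctx_offload_setup()` fails can still call `ctx_teardown()` without tripping the initialization check.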
  2. 14 May, 2020 1 commit
    • io_uring: polled fixed file must go through free iteration · 9d9e88a2
      Jens Axboe authored
      When we changed the file registration handling, it became important to
      iterate the bulk request freeing list for fixed files as well, or we
      miss dropping the fixed file reference. If not, we're leaking references,
      and we'll get a kworker stuck waiting for file references to disappear.
      
      This also means we can remove the special casing of fixed vs non-fixed
      files, we need to iterate for both and we can just rely on
      __io_req_aux_free() doing io_put_file() instead of doing it manually.
      
      Fixes: 05589553 ("io_uring: refactor file register/unregister/update handling")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 10 May, 2020 1 commit
  4. 08 May, 2020 1 commit
    • io_uring: don't use 'fd' for openat/openat2/statx · 63ff8223
      Jens Axboe authored
      We currently make some guesses as to when to open this fd, but in reality
      we have no business (or need) to do so at all. In fact, it makes certain
      things fail, like O_PATH.
      
      Remove the fd lookup from these opcodes, we're just passing the 'fd' to
      generic helpers anyway. With that, we can also remove the special casing
      of fd values in io_req_needs_file(), and the 'fd_non_neg' check that
      we have. And we can ensure that we only read sqe->fd once.
      
      This fixes O_PATH usage with openat/openat2, and ditto statx path side
      oddities.
      
      Cc: stable@vger.kernel.org # v5.6
      Reported-by: Max Kellermann <mk@cm4all.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 07 May, 2020 1 commit
  6. 06 May, 2020 1 commit
  7. 04 May, 2020 1 commit
  8. 01 May, 2020 8 commits
  9. 30 April, 2020 2 commits
    • fibmap: Warn and return an error in case of block > INT_MAX · b75dfde1
      Ritesh Harjani authored
      Warn the fibmap user and return an error, rather than a truncated and
      therefore incorrect block map address, if the block address returned by
      bmap() is greater than INT_MAX (the user supplies an integer pointer).

      It's better to pr_warn() all users of ioctl_fibmap() and return a proper
      error code than to silently let FS corruption happen if the user tries
      to fiddle around with the returned block map address.

      We fix this by returning an error code of -ERANGE and returning 0 as the
      block mapping address if it is > INT_MAX.

      iomap_bmap() can now be called from either of two paths: when a user
      calls the ioctl_fibmap() interface to get the block mapping address, or
      when a filesystem uses the bmap() internal kernel API.
      The bmap() kernel API is well equipped to handle u64 addresses.

      The WARN condition in iomap_bmap_actor() was mainly added to warn all
      fibmap users. Now that this warning is issued directly for all fibmap
      users, and 0 is returned as the block map address when addr > INT_MAX,
      we can remove this logic from iomap_bmap_actor().
      Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • btrfs: fix gcc-4.8 build warning for struct initializer · 9c6c723f
      Arnd Bergmann authored
      Some older compilers like gcc-4.8 warn about mismatched curly braces in
      an initializer:
      
      fs/btrfs/backref.c: In function 'is_shared_data_backref':
      fs/btrfs/backref.c:394:9: error: missing braces around
      initializer [-Werror=missing-braces]
        struct prelim_ref target = {0};
               ^
      fs/btrfs/backref.c:394:9: error: (near initialization for
      'target.rbnode') [-Werror=missing-braces]
      
      Use the GNU empty initializer extension to avoid this.
      
      Fixes: ed58f2e6 ("btrfs: backref, don't add refs from shared block when resolving normal backref")
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
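
The initializer change in the commit above can be illustrated with a struct whose first member is itself a struct, which is the shape that makes `= {0}` trigger -Wmissing-braces on older compilers. The types below (`struct node_demo`, `struct prelim_ref_demo`) are hypothetical simplifications, not the btrfs definitions, and the empty initializer relies on the GNU extension the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical nested member, standing in for an rb-tree node. */
struct node_demo {
    void *left;
    void *right;
};

/* Hypothetical simplified reference record; the first member is a struct. */
struct prelim_ref_demo {
    struct node_demo rbnode;
    unsigned long long root_id;
    int level;
};

struct prelim_ref_demo make_zeroed_ref(void)
{
    /*
     * "= {0}" would initialize rbnode.left with 0 and rely on brace
     * elision, which gcc-4.8 flags under -Wmissing-braces. The empty
     * initializer (a GNU extension) zeroes every member without naming
     * the first one.
     */
    struct prelim_ref_demo target = {};

    assert(target.rbnode.left == NULL);   /* nested member is zeroed too */
    return target;
}
```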
  10. 29 April, 2020 2 commits
    • Fix use after free in get_tree_bdev() · dd7bc815
      David Howells authored
      Commit 6fcf0c72, a fix to get_tree_bdev(), put a missing blkdev_put() in
      the wrong place: before a warnf() that displays the bdev under
      consideration rather than after it.
      
      This results in a silent lockup in printk("%pg") called via warnf() from
      get_tree_bdev() under some circumstances when there's a race with the
      blockdev being frozen.  This can be caused by xfstests/tests/generic/085 in
      combination with Lukas Czerner's ext4 mount API conversion patchset.  It
      looks like it ought to occur with other users of get_tree_bdev() such as
      XFS, but apparently doesn't.
      
      Fix this by switching the order of the lines.
      
      Fixes: 6fcf0c72 ("vfs: add missing blkdev_put() in get_tree_bdev()")
      Reported-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Ian Kent <raven@themaw.net>
      cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
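
The ordering fix in the commit above, using the object in the warning first and dropping the reference afterwards, can be sketched in user space. Everything here is a hypothetical stand-in: `struct bdev_demo`, `bdev_demo_get()`, `bdev_demo_put()`, `warn_and_release()` and the message text are invented for illustration and do not mirror the VFS code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a block device. */
struct bdev_demo {
    int  refs;
    char name[16];
};

int bdev_demo_frees;   /* counts how many objects have been freed */

struct bdev_demo *bdev_demo_get(const char *name)
{
    struct bdev_demo *b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->refs = 1;
    snprintf(b->name, sizeof(b->name), "%s", name);
    return b;
}

/* Drop one reference; the object is freed when the count reaches zero. */
void bdev_demo_put(struct bdev_demo *b)
{
    assert(b->refs > 0);
    if (--b->refs == 0) {
        bdev_demo_frees++;
        free(b);
    }
}

/*
 * Error path: format the warning while the reference is still held, and
 * only then release it. Swapping the two lines would read freed memory.
 * Returns the length of the formatted message.
 */
int warn_and_release(struct bdev_demo *b, char *msg, size_t len)
{
    int n = snprintf(msg, len, "%s: cannot mount", b->name);
    bdev_demo_put(b);   /* b must not be touched after this line */
    return n;
}
```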
    • NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION · dff58530
      Olga Kornievskaia authored
      Currently, if the client sends BIND_CONN_TO_SESSION with
      NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores
      that it wasn't able to enable a backchannel.
      
      To make sure, the client sends BIND_CONN_TO_SESSION as the first
      operation on the connections (i.e., no other session compounds have
      been sent before), and if the client's request to bind the backchannel
      is not satisfied, then reset the connection and retry.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
  11. 28 April, 2020 1 commit
    • io_uring: statx must grab the file table for valid fd · 5b0bbee4
      Jens Axboe authored
      Clay reports that OP_STATX fails for a test case with a valid fd
      and empty path:
      
       -- Test 0: statx:fd 3: SUCCEED, file mode 100755
       -- Test 1: statx:path ./uring_statx: SUCCEED, file mode 100755
       -- Test 2: io_uring_statx:fd 3: FAIL, errno 9: Bad file descriptor
       -- Test 3: io_uring_statx:path ./uring_statx: SUCCEED, file mode 100755
      
      This is due to statx not grabbing the process file table, hence we can't
      look up the fd in async context. If the fd is valid, ensure that we grab
      the file table so we can grab the file from async context.
      
      Cc: stable@vger.kernel.org # v5.6
      Reported-by: Clay Harris <bugs@claycon.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 27 April, 2020 3 commits
    • btrfs: transaction: Avoid deadlock due to bad initialization timing of fs_info::journal_info · fcc99734
      Qu Wenruo authored
      [BUG]
      One run of btrfs/063 triggered the following lockdep warning:
        ============================================
        WARNING: possible recursive locking detected
        5.6.0-rc7-custom+ #48 Not tainted
        --------------------------------------------
        kworker/u24:0/7 is trying to acquire lock:
        ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
      
        but task is already holding lock:
        ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
      
        other info that might help us debug this:
         Possible unsafe locking scenario:
      
               CPU0
               ----
          lock(sb_internal#2);
          lock(sb_internal#2);
      
         *** DEADLOCK ***
      
         May be due to missing lock nesting notation
      
        4 locks held by kworker/u24:0/7:
         #0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80
         #1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80
         #2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
         #3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs]
      
        stack backtrace:
        CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
        Call Trace:
         dump_stack+0xc2/0x11a
         __lock_acquire.cold+0xce/0x214
         lock_acquire+0xe6/0x210
         __sb_start_write+0x14e/0x290
         start_transaction+0x66c/0x890 [btrfs]
         btrfs_join_transaction+0x1d/0x20 [btrfs]
         find_free_extent+0x1504/0x1a50 [btrfs]
         btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
         btrfs_alloc_tree_block+0x1ac/0x570 [btrfs]
         btrfs_copy_root+0x213/0x580 [btrfs]
         create_reloc_root+0x3bd/0x470 [btrfs]
         btrfs_init_reloc_root+0x2d2/0x310 [btrfs]
         record_root_in_trans+0x191/0x1d0 [btrfs]
         btrfs_record_root_in_trans+0x90/0xd0 [btrfs]
         start_transaction+0x16e/0x890 [btrfs]
         btrfs_join_transaction+0x1d/0x20 [btrfs]
         btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs]
         finish_ordered_fn+0x15/0x20 [btrfs]
         btrfs_work_helper+0x116/0x9a0 [btrfs]
         process_one_work+0x632/0xb80
         worker_thread+0x80/0x690
         kthread+0x1a3/0x1f0
         ret_from_fork+0x27/0x50
      
      It's pretty hard to reproduce, only one hit so far.
      
      [CAUSE]
      This is because we're calling btrfs_join_transaction() without re-using
      the current running one:
      
      btrfs_finish_ordered_io()
      |- btrfs_join_transaction()		<<< Call #1
         |- btrfs_record_root_in_trans()
            |- btrfs_reserve_extent()
      	 |- btrfs_join_transaction()	<<< Call #2
      
      Normally such btrfs_join_transaction() call should re-use the existing
      one, without trying to re-start a transaction.
      
      But the problem is, in btrfs_join_transaction() call #1, we call
      btrfs_record_root_in_trans() before initializing current::journal_info.
      
      And in btrfs_join_transaction() call #2, we're relying on
      current::journal_info to avoid such deadlock.
      
      [FIX]
      Call btrfs_record_root_in_trans() after we have initialized
      current::journal_info.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
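
The [CAUSE] and [FIX] sections above come down to publishing the per-task transaction handle before making any call that may join a transaction again. A minimal user-space sketch of that ordering follows; `struct trans_handle`, `current_task`, `join_transaction()`, `record_root()` and `end_transaction()` are hypothetical simplifications and not the btrfs implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transaction handle with a nesting counter. */
struct trans_handle {
    int use_count;
};

/* Hypothetical stand-in for the kernel's per-task "current". */
struct task_demo {
    struct trans_handle *journal_info;
};

struct task_demo current_task;
int transactions_started;          /* how often the "lock" was really taken */
static struct trans_handle handle_storage;

struct trans_handle *join_transaction(int record_root_needed);

void end_transaction(struct trans_handle *h)
{
    assert(h->use_count > 0);
    if (--h->use_count == 0)
        current_task.journal_info = NULL;
}

/* Stand-in for recording a root: it may need to join a transaction itself. */
void record_root(void)
{
    struct trans_handle *nested = join_transaction(0);
    end_transaction(nested);
}

struct trans_handle *join_transaction(int record_root_needed)
{
    struct trans_handle *h = current_task.journal_info;

    if (h) {                       /* nested call: re-use, take no lock */
        h->use_count++;
        return h;
    }

    transactions_started++;        /* the non-recursive lock would be taken here */
    h = &handle_storage;
    h->use_count = 1;

    /* Publish the handle BEFORE any call that may join again. */
    current_task.journal_info = h;

    if (record_root_needed)
        record_root();
    return h;
}
```

If `record_root()` were called before `journal_info` was set, the nested `join_transaction()` would start a second transaction, which corresponds to the recursive locking in the lockdep report.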
    • btrfs: fix partial loss of prealloc extent past i_size after fsync · f135cea3
      Filipe Manana authored
      When we have an inode with a prealloc extent that starts at an offset
      lower than the i_size and there is another prealloc extent that starts at
      an offset beyond i_size, we can end up losing part of the first prealloc
      extent (the part that starts at i_size) and have an implicit hole if we
      fsync the file and then have a power failure.
      
      Consider the following example with comments explaining how and why it
      happens.
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
      
        # Create our test file with 2 consecutive prealloc extents, each with a
        # size of 128Kb, and covering the range from 0 to 256Kb, with a file
        # size of 0.
        $ xfs_io -f -c "falloc -k 0 128K" /mnt/foo
        $ xfs_io -c "falloc -k 128K 128K" /mnt/foo
      
        # Fsync the file to record both extents in the log tree.
        $ xfs_io -c "fsync" /mnt/foo
      
        # Now do a redundant extent allocation for the range from 0 to 64Kb.
        # This will merely increase the file size from 0 to 64Kb. Instead we
        # could also do a truncate to set the file size to 64Kb.
        $ xfs_io -c "falloc 0 64K" /mnt/foo
      
        # Fsync the file, so we update the inode item in the log tree with the
        # new file size (64Kb). This also ends up setting the number of bytes
        # for the first prealloc extent to 64Kb. This is done by the truncation
        # at btrfs_log_prealloc_extents().
        # This means that if a power failure happens after this, a write into
        # the file range 64Kb to 128Kb will not use the prealloc extent and
        # will result in allocation of a new extent.
        $ xfs_io -c "fsync" /mnt/foo
      
        # Now set the file size to 256K with a truncate and then fsync the file.
        # Since no changes happened to the extents, the fsync only updates the
        # i_size in the inode item at the log tree. This results in an implicit
        # hole for the file range from 64Kb to 128Kb, something which fsck will
        # complain about when not using the NO_HOLES feature if we replay the log
        # after a power failure.
        $ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo
      
      So instead of always truncating the log to the inode's current i_size at
      btrfs_log_prealloc_extents(), check first if there's a prealloc extent
      that starts at an offset lower than the i_size and with a length that
      crosses the i_size - if there is one, just make sure we truncate to a
      size that corresponds to the end offset of that prealloc extent, so
      that we don't lose the part of that extent that starts at i_size if a
      power failure happens.
      
      A test case for fstests follows soon.
      
      Fixes: 31d11b83 ("Btrfs: fix duplicate extents after fsync of file with prealloc extents")
      CC: stable@vger.kernel.org # 4.14+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
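
The rule stated in the commit above (truncate the log to the end of a prealloc extent that crosses i_size, otherwise to i_size) can be written as a small function. This is a sketch under stated assumptions: `struct prealloc_extent` and `log_truncate_offset()` are hypothetical names, and the real code walks extent items in the log tree rather than receiving a single candidate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one prealloc extent: [start, start + len). */
struct prealloc_extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Pick the offset to truncate the logged extents to.
 * If the extent starts below i_size and ends beyond it, keep the whole
 * extent by truncating at its end offset; otherwise truncate at i_size.
 * Pass NULL when there is no candidate extent.
 */
uint64_t log_truncate_offset(uint64_t i_size, const struct prealloc_extent *ext)
{
    assert(ext == NULL || ext->len > 0);

    if (ext && ext->start < i_size && ext->start + ext->len > i_size)
        return ext->start + ext->len;

    return i_size;
}
```

For the scenario in the commit message, an i_size of 64K with a first prealloc extent covering 0 to 128K gives a truncation offset of 128K, so the 64K to 128K part of that extent is kept.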
    • propagate_one(): mnt_set_mountpoint() needs mount_lock · b0d3869c
      Al Viro authored
      ... to protect the modification of mp->m_count done by it.  Most of
      the places that modify that thing also have namespace_lock held,
      but not all of them can do so, so we really need mount_lock here.
      Kudos to Piotr Krysiuk <piotras@gmail.com>, who'd spotted a related
      bug in pivot_root(2) (fixed unnoticed in 5.3); search for other
      similar turds has caught out this one.
      
      Cc: stable@kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  13. 25 April, 2020 2 commits
  14. 24 April, 2020 4 commits
  15. 23 April, 2020 7 commits
    • btrfs: fix transaction leak in btrfs_recover_relocation · 1402d17d
      Xiyu Yang authored
      btrfs_recover_relocation() invokes btrfs_join_transaction(), which joins
      a btrfs_trans_handle object into transactions and returns a reference of
      it with increased refcount to "trans".
      
      When btrfs_recover_relocation() returns, "trans" becomes invalid, so the
      refcount should be decreased to keep refcount balanced.
      
      The reference counting issue happens in one exception handling path of
      btrfs_recover_relocation(). When read_fs_root() failed, the refcnt
      increased by btrfs_join_transaction() is not decreased, causing a refcnt
      leak.
      
      Fix this issue by calling btrfs_end_transaction() on this error path
      when read_fs_root() failed.
      
      Fixes: 79787eaa ("btrfs: replace many BUG_ONs with proper error handling")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: fix block group leak when removing fails · f6033c5e
      Xiyu Yang authored
      btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which
      returns a local reference of the block group that contains the given
      bytenr to "block_group" with increased refcount.
      
      When btrfs_remove_block_group() returns, "block_group" becomes invalid,
      so the refcount should be decreased to keep refcount balanced.
      
      The reference counting issue happens in several exception handling paths
      of btrfs_remove_block_group(). When those error scenarios occur such as
      btrfs_alloc_path() returns NULL, the function forgets to decrease its
      refcnt increased by btrfs_lookup_block_group() and will cause a refcnt
      leak.
      
      Fix this issue by jumping to "out_put_group" label and calling
      btrfs_put_block_group() when those error scenarios occur.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
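
The fix pattern described above, routing every error path through one label that drops the reference taken at the top of the function, can be sketched as follows. Only the label name `out_put_group` comes from the commit message; `struct group_demo`, `group_lookup()`, `group_put()` and `remove_group()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical refcounted block group. */
struct group_demo {
    int refs;
};

/* Look up the group and return it with an extra reference held. */
struct group_demo *group_lookup(struct group_demo *g)
{
    g->refs++;
    return g;
}

void group_put(struct group_demo *g)
{
    assert(g->refs > 0);
    g->refs--;
}

/*
 * Every exit, including early errors, goes through out_put_group so the
 * reference taken by group_lookup() is always dropped exactly once.
 */
int remove_group(struct group_demo *g, int fail_alloc)
{
    struct group_demo *block_group = group_lookup(g);
    int ret = 0;

    if (fail_alloc) {            /* e.g. an allocation returned NULL */
        ret = -ENOMEM;
        goto out_put_group;
    }

    /* ... the actual removal work would go here ... */

out_put_group:
    group_put(block_group);
    return ret;
}
```

Returning directly from the error branch instead of jumping to the label would leave the count one higher after each failure, which is the leak the commit describes.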
    • btrfs: drop logs when we've aborted a transaction · ef67963d
      Josef Bacik authored
      Dave reported a problem where we were panicking with generic/475 with
      misc-5.7.  This is because we were doing IO after we had stopped all of
      the worker threads, because we do the log tree cleanup on roots at drop
      time.  Cleaning up the log tree will always need to do reads if we
      happened to have evicted the blocks from memory.
      
      Because of this simply add a helper to btrfs_cleanup_transaction() that
      will go through and drop all of the log roots.  This gets run before we
      do the close_ctree() work, and thus we are allowed to do any reads that
      we would need.  I ran this through many iterations of generic/475 with
      constrained memory and I did not see the issue.
      
        general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
        CPU: 2 PID: 12359 Comm: umount Tainted: G        W 5.6.0-rc7-btrfs-next-58 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
        RIP: 0010:btrfs_queue_work+0x33/0x1c0 [btrfs]
        RSP: 0018:ffff9cfb015937d8 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff8eb5e339ed80 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: ffff8eb5eb33b770 RDI: ffff8eb5e37a0460
        RBP: ffff8eb5eb33b770 R08: 000000000000020c R09: ffffffff9fc09ac0
        R10: 0000000000000007 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b6b
        R13: ffff9cfb00229040 R14: 0000000000000008 R15: ffff8eb5d3868000
        FS:  00007f167ea022c0(0000) GS:ffff8eb5fae00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f167e5e0cb1 CR3: 0000000138c18004 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         btrfs_end_bio+0x81/0x130 [btrfs]
         __split_and_process_bio+0xaf/0x4e0 [dm_mod]
         ? percpu_counter_add_batch+0xa3/0x120
         dm_process_bio+0x98/0x290 [dm_mod]
         ? generic_make_request+0xfb/0x410
         dm_make_request+0x4d/0x120 [dm_mod]
         ? generic_make_request+0xfb/0x410
         generic_make_request+0x12a/0x410
         ? submit_bio+0x38/0x160
         submit_bio+0x38/0x160
         ? percpu_counter_add_batch+0xa3/0x120
         btrfs_map_bio+0x289/0x570 [btrfs]
         ? kmem_cache_alloc+0x24d/0x300
         btree_submit_bio_hook+0x79/0xc0 [btrfs]
         submit_one_bio+0x31/0x50 [btrfs]
         read_extent_buffer_pages+0x2fe/0x450 [btrfs]
         btree_read_extent_buffer_pages+0x7e/0x170 [btrfs]
         walk_down_log_tree+0x343/0x690 [btrfs]
         ? walk_log_tree+0x3d/0x380 [btrfs]
         walk_log_tree+0xf7/0x380 [btrfs]
         ? plist_requeue+0xf0/0xf0
         ? delete_node+0x4b/0x230
         free_log_tree+0x4c/0x130 [btrfs]
         ? wait_log_commit+0x140/0x140 [btrfs]
         btrfs_free_log+0x17/0x30 [btrfs]
         btrfs_drop_and_free_fs_root+0xb0/0xd0 [btrfs]
         btrfs_free_fs_roots+0x10c/0x190 [btrfs]
         ? do_raw_spin_unlock+0x49/0xc0
         ? _raw_spin_unlock+0x29/0x40
         ? release_extent_buffer+0x121/0x170 [btrfs]
         close_ctree+0x289/0x2e6 [btrfs]
         generic_shutdown_super+0x6c/0x110
         kill_anon_super+0xe/0x30
         btrfs_kill_super+0x12/0x20 [btrfs]
         deactivate_locked_super+0x3a/0x70
      Reported-by: David Sterba <dsterba@suse.com>
      Fixes: 8c38938c ("btrfs: move the root freeing stuff into btrfs_put_root")
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: fix memory leak of transaction when deleting unused block group · 5150bf19
      Filipe Manana authored
      When cleaning pinned extents right before deleting an unused block group,
      we check if there's still a previous transaction running and if so we
      increment its reference count before using it for cleaning pinned ranges
      in its pinned extents iotree. However we ended up never decrementing the
      reference count after using the transaction, resulting in a memory leak.
      
      Fix it by decrementing the reference count.
      
      Fixes: fe119a6e ("btrfs: switch to per-transaction pinned extents")
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • cifs: fix uninitialised lease_key in open_shroot() · 0fe0781f
      Paulo Alcantara authored
      SMB2_open_init() expects a pre-initialised lease_key when opening a
      file with a lease, so set pfid->lease_key prior to calling it in
      open_shroot().
      
      This issue was observed when performing some DFS failover tests and
      the lease key was never randomly generated.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      CC: Stable <stable@vger.kernel.org>
    • cifs: ensure correct super block for DFS reconnect · 3786f4bd
      Paulo Alcantara authored
      This patch is basically fixing the lookup of tcons (DFS specific) during
      reconnect (smb2pdu.c:__smb2_reconnect) to update their prefix paths.
      
      Previously, we relied on the TCP_Server_Info pointer
      (misc.c:tcp_super_cb) to determine which tcon's prefix path to update.

      We could not rely on the TCP server pointer to determine which super
      block's prefix path to update when reconnecting tcons, since it might map
      to different tcons that share the same TCP connection.

      Instead, walk through all cifs super blocks and compare their DFS full
      paths with that of the tcon being updated.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
    • cifs: do not share tcons with DFS · 65303de8
      Paulo Alcantara authored
      This disables tcon re-use for DFS shares.
      
      tcon->dfs_path stores the path that the tcon should connect to when
      failing over.
      
      If that tcon is used multiple times e.g. 2 mounts using it with
      different prefixpath, each will need a different dfs_path but there is
      only one tcon. The other solution would be to split the tcon in 2
      tcons during failover but that is much harder.
      
      tcons could not be shared with DFS in cifs.ko because in a
      DFS namespace like:
      
                //domain/dfsroot -> /serverA/dfsroot, /serverB/dfsroot
      
                //serverA/dfsroot/link -> /serverA/target1/aa/bb
      
                //serverA/dfsroot/link2 -> /serverA/target1/cc/dd
      
      you can see that link and link2 are two DFS links that both resolve to
      the same target share (/serverA/target1), so cifs.ko will only contain a
      single tcon for both link and link2.
      
      The problem with that is, if we (auto)mount "link" and "link2", cifs.ko
      will only contain a single tcon for both DFS links so we couldn't
      perform failover or refresh the DFS cache for both links because
      tcon->dfs_path was set to either "link" or "link2", but not both --
      which is wrong.
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
  16. 22 April, 2020 2 commits