1. 19 11月, 2019 1 次提交
  2. 18 11月, 2019 7 次提交
  3. 09 9月, 2019 8 次提交
    • N
      btrfs: Don't assign retval of btrfs_try_tree_write_lock/btrfs_tree_read_lock_atomic · 65e99c43
      Nikolay Borisov 提交于
      Those function are simple boolean predicates there is no need to assign
      their return values to interim variables. Use them directly as
      predicates. No functional changes.
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      65e99c43
    • J
      btrfs: create structure to encode checksum type and length · af024ed2
      Johannes Thumshirn 提交于
      Create a structure to encode the type and length for the known on-disk
      checksums.  This makes it easier to add new checksums later.
      
      The structure and helpers are moved from ctree.h so they don't occupy
      space in all headers including ctree.h. This save some space in the
      final object.
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      af024ed2
    • D
      btrfs: tie extent buffer and it's token together · c82f823c
      David Sterba 提交于
      Further simplifaction of the get/set helpers is possible when the token
      is uniquely tied to an extent buffer. A condition and an assignment can
      be avoided.
      
      The initializations are moved closer to the first use when the extent
      buffer is valid. There's one exception in __push_leaf_left where the
      token is reused.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      c82f823c
    • D
      btrfs: move functions for tree compare to send.c · 18d0f5c6
      David Sterba 提交于
      Send is the only user of tree_compare, we can move it there along with
      the other helpers and definitions.
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      18d0f5c6
    • D
      btrfs: rename and export read_node_slot · 4b231ae4
      David Sterba 提交于
      Preparatory work for code that will be moved out of ctree and uses this
      function.
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4b231ae4
    • F
      Btrfs: fix use-after-free when using the tree modification log · efad8a85
      Filipe Manana 提交于
      At ctree.c:get_old_root(), we are accessing a root's header owner field
      after we have freed the respective extent buffer. This results in an
      use-after-free that can lead to crashes, and when CONFIG_DEBUG_PAGEALLOC
      is set, results in a stack trace like the following:
      
        [ 3876.799331] stack segment: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
        [ 3876.799363] CPU: 0 PID: 15436 Comm: pool Not tainted 5.3.0-rc3-btrfs-next-54 #1
        [ 3876.799385] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        [ 3876.799433] RIP: 0010:btrfs_search_old_slot+0x652/0xd80 [btrfs]
        (...)
        [ 3876.799502] RSP: 0018:ffff9f08c1a2f9f0 EFLAGS: 00010286
        [ 3876.799518] RAX: ffff8dd300000000 RBX: ffff8dd85a7a9348 RCX: 000000038da26000
        [ 3876.799538] RDX: 0000000000000000 RSI: ffffe522ce368980 RDI: 0000000000000246
        [ 3876.799559] RBP: dae1922adadad000 R08: 0000000008020000 R09: ffffe522c0000000
        [ 3876.799579] R10: ffff8dd57fd788c8 R11: 000000007511b030 R12: ffff8dd781ddc000
        [ 3876.799599] R13: ffff8dd9e6240578 R14: ffff8dd6896f7a88 R15: ffff8dd688cf90b8
        [ 3876.799620] FS:  00007f23ddd97700(0000) GS:ffff8dda20200000(0000) knlGS:0000000000000000
        [ 3876.799643] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 3876.799660] CR2: 00007f23d4024000 CR3: 0000000710bb0005 CR4: 00000000003606f0
        [ 3876.799682] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 3876.799703] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 3876.799723] Call Trace:
        [ 3876.799735]  ? do_raw_spin_unlock+0x49/0xc0
        [ 3876.799749]  ? _raw_spin_unlock+0x24/0x30
        [ 3876.799779]  resolve_indirect_refs+0x1eb/0xc80 [btrfs]
        [ 3876.799810]  find_parent_nodes+0x38d/0x1180 [btrfs]
        [ 3876.799841]  btrfs_check_shared+0x11a/0x1d0 [btrfs]
        [ 3876.799870]  ? extent_fiemap+0x598/0x6e0 [btrfs]
        [ 3876.799895]  extent_fiemap+0x598/0x6e0 [btrfs]
        [ 3876.799913]  do_vfs_ioctl+0x45a/0x700
        [ 3876.799926]  ksys_ioctl+0x70/0x80
        [ 3876.799938]  ? trace_hardirqs_off_thunk+0x1a/0x20
        [ 3876.799953]  __x64_sys_ioctl+0x16/0x20
        [ 3876.799965]  do_syscall_64+0x62/0x220
        [ 3876.799977]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [ 3876.799993] RIP: 0033:0x7f23e0013dd7
        (...)
        [ 3876.800056] RSP: 002b:00007f23ddd96ca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [ 3876.800078] RAX: ffffffffffffffda RBX: 00007f23d80210f8 RCX: 00007f23e0013dd7
        [ 3876.800099] RDX: 00007f23d80210f8 RSI: 00000000c020660b RDI: 0000000000000003
        [ 3876.800626] RBP: 000055fa2a2a2440 R08: 0000000000000000 R09: 00007f23ddd96d7c
        [ 3876.801143] R10: 00007f23d8022000 R11: 0000000000000246 R12: 00007f23ddd96d80
        [ 3876.801662] R13: 00007f23ddd96d78 R14: 00007f23d80210f0 R15: 00007f23ddd96d80
        (...)
        [ 3876.805107] ---[ end trace e53161e179ef04f9 ]---
      
      Fix that by saving the root's header owner field into a local variable
      before freeing the root's extent buffer, and then use that local variable
      when needed.
      
      Fixes: 30b0463a ("Btrfs: fix accessing the root pointer in tree mod log functions")
      CC: stable@vger.kernel.org # 3.10+
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      efad8a85
    • J
      btrfs: make caching_thread use btrfs_find_next_key · 6a9fb468
      Josef Bacik 提交于
      extent-tree.c has a find_next_key that just walks up the path to find
      the next key, but it is used for both the caching stuff and the snapshot
      delete stuff.  The snapshot deletion stuff is special so it can't really
      use btrfs_find_next_key, but the caching thread stuff can.  We just need
      to fix btrfs_find_next_key to deal with ->skip_locking and then it works
      exactly the same as the private find_next_key helper.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6a9fb468
    • D
      btrfs: assert tree mod log lock in __tree_mod_log_insert · 73e82fe4
      David Sterba 提交于
      The tree is going to be modified so it must be the exclusive lock.
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      73e82fe4
  4. 30 4月, 2019 21 次提交
    • Q
      btrfs: ctree: Dump the leaf before BUG_ON in btrfs_set_item_key_safe · 7c15d410
      Qu Wenruo 提交于
      We have a long standing problem with reversed keys that's detected by
      btrfs_set_item_key_safe. This is hard to reproduce so we'd like to
      capture more information for later analysis.
      
      Let's dump the leaf content before triggering BUG_ON() so that we can
      have some clue on what's going wrong.  The output of tree locks should
      help us to debug such problem.
      
      Sample stacktrace:
      
       generic/522             [00:07:05]
       [26946.113381] run fstests generic/522 at 2019-04-16 00:07:05
       [27161.474720] kernel BUG at fs/btrfs/ctree.c:3192!
       [27161.475923] invalid opcode: 0000 [#1] PREEMPT SMP
       [27161.477167] CPU: 0 PID: 15676 Comm: fsx Tainted: G        W         5.1.0-rc5-default+ #562
       [27161.478932] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014
       [27161.481099] RIP: 0010:btrfs_set_item_key_safe+0x146/0x1c0 [btrfs]
       [27161.485369] RSP: 0018:ffffb087499e39b0 EFLAGS: 00010286
       [27161.486464] RAX: 00000000ffffffff RBX: ffff941534d80e70 RCX: 0000000000024000
       [27161.487929] RDX: 0000000000013039 RSI: ffffb087499e3aa5 RDI: ffffb087499e39c7
       [27161.489289] RBP: 000000000000000e R08: ffff9414e0f49008 R09: 0000000000001000
       [27161.490807] R10: 0000000000000000 R11: 0000000000000003 R12: ffff9414e0f48e70
       [27161.492305] R13: ffffb087499e3aa5 R14: 0000000000000000 R15: 0000000000071000
       [27161.493845] FS:  00007f8ea58d0b80(0000) GS:ffff94153d400000(0000) knlGS:0000000000000000
       [27161.495608] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [27161.496717] CR2: 00007f8ea57a9000 CR3: 0000000016a33000 CR4: 00000000000006f0
       [27161.498100] Call Trace:
       [27161.498771]  __btrfs_drop_extents+0x6ec/0xdf0 [btrfs]
       [27161.499872]  btrfs_log_changed_extents.isra.26+0x3a2/0x9e0 [btrfs]
       [27161.501114]  btrfs_log_inode+0x7ff/0xdc0 [btrfs]
       [27161.502114]  ? __mutex_unlock_slowpath+0x4b/0x2b0
       [27161.503172]  btrfs_log_inode_parent+0x237/0x9c0 [btrfs]
       [27161.504348]  btrfs_log_dentry_safe+0x4a/0x70 [btrfs]
       [27161.505374]  btrfs_sync_file+0x1b7/0x480 [btrfs]
       [27161.506371]  __x64_sys_msync+0x180/0x210
       [27161.507208]  do_syscall_64+0x54/0x180
       [27161.507932]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
       [27161.508839] RIP: 0033:0x7f8ea5aa9c61
       [27161.512616] RSP: 002b:00007ffea2a06498 EFLAGS: 00000246 ORIG_RAX: 000000000000001a
       [27161.514161] RAX: ffffffffffffffda RBX: 000000000002a938 RCX: 00007f8ea5aa9c61
       [27161.515376] RDX: 0000000000000004 RSI: 000000000001c9b2 RDI: 00007f8ea578d000
       [27161.516572] RBP: 000000000001c07a R08: fffffffffffffff8 R09: 000000000002a000
       [27161.517883] R10: 00007f8ea57a99b2 R11: 0000000000000246 R12: 0000000000000938
       [27161.519080] R13: 00007f8ea578d000 R14: 000000000001c9b2 R15: 0000000000000000
       [27161.520281] Modules linked in: btrfs libcrc32c xor zstd_decompress zstd_compress xxhash raid6_pq loop [last unloaded: scsi_debug]
       [27161.522272] ---[ end trace d5afec7ccac6a252 ]---
       [27161.523111] RIP: 0010:btrfs_set_item_key_safe+0x146/0x1c0 [btrfs]
       [27161.527253] RSP: 0018:ffffb087499e39b0 EFLAGS: 00010286
       [27161.528192] RAX: 00000000ffffffff RBX: ffff941534d80e70 RCX: 0000000000024000
       [27161.529392] RDX: 0000000000013039 RSI: ffffb087499e3aa5 RDI: ffffb087499e39c7
       [27161.530607] RBP: 000000000000000e R08: ffff9414e0f49008 R09: 0000000000001000
       [27161.531802] R10: 0000000000000000 R11: 0000000000000003 R12: ffff9414e0f48e70
       [27161.533018] R13: ffffb087499e3aa5 R14: 0000000000000000 R15: 0000000000071000
       [27161.534405] FS:  00007f8ea58d0b80(0000) GS:ffff94153d400000(0000) knlGS:0000000000000000
       [27161.536048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [27161.537210] CR2: 00007f8ea57a9000 CR3: 0000000016a33000 CR4: 00000000000006f0
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      7c15d410
    • D
    • D
      179d1e6a
    • D
      c7da9597
    • D
      c71dd880
    • D
      78ac4f9e
    • D
      25263cd7
    • D
      btrfs: get fs_info from eb in __push_leaf_left · 8087c193
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      8087c193
    • D
      btrfs: get fs_info from eb in __push_leaf_right · f72f0010
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f72f0010
    • D
      btrfs: get fs_info from trans in copy_for_split · 94f94ad9
      David Sterba 提交于
      We can read fs_info from the transaction and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      94f94ad9
    • D
      btrfs: get fs_info from trans in insert_ptr · 6ad3cf6d
      David Sterba 提交于
      We can read fs_info from the transaction and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6ad3cf6d
    • D
      btrfs: get fs_info from trans in balance_node_right · 55d32ed8
      David Sterba 提交于
      We can read fs_info from the transaction and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      55d32ed8
    • D
      btrfs: get fs_info from trans in push_node_left · d30a668f
      David Sterba 提交于
      We can read fs_info from the transaction and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      d30a668f
    • D
      btrfs: get fs_info from eb in btrfs_verify_level_key · e064d5e9
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e064d5e9
    • D
      btrfs: get fs_info from eb in read_node_slot · d0d20b0f
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      d0d20b0f
    • D
      btrfs: get fs_info from eb in btrfs_leaf_free_space · e902baac
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e902baac
    • D
      btrfs: get fs_info from eb in clean_tree_block · 6a884d7d
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6a884d7d
    • D
      btrfs: get fs_info from eb in tree_mod_log_eb_copy · ed874f0d
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ed874f0d
    • D
      btrfs: get fs_info from eb in leaf_data_end · 8f881e8c
      David Sterba 提交于
      We can read fs_info from extent buffer and can drop it from the
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      8f881e8c
    • A
      btrfs: use BUG() instead of BUG_ON(1) · 290342f6
      Arnd Bergmann 提交于
      BUG_ON(1) leads to bogus warnings from clang when
      CONFIG_PROFILE_ANNOTATED_BRANCHES is set:
      
      fs/btrfs/volumes.c:5041:3: error: variable 'max_chunk_size' is used uninitialized whenever 'if' condition is false
            [-Werror,-Wsometimes-uninitialized]
                      BUG_ON(1);
                      ^~~~~~~~~
      include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON'
       #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                         ^~~~~~~~~~~~~~~~~~~
      include/linux/compiler.h:48:23: note: expanded from macro 'unlikely'
       #  define unlikely(x)   (__branch_check__(x, 0, __builtin_constant_p(x)))
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      fs/btrfs/volumes.c:5046:9: note: uninitialized use occurs here
                                   max_chunk_size);
                                   ^~~~~~~~~~~~~~
      include/linux/kernel.h:860:36: note: expanded from macro 'min'
       #define min(x, y)       __careful_cmp(x, y, <)
                                               ^
      include/linux/kernel.h:853:17: note: expanded from macro '__careful_cmp'
                      __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op))
                                    ^
      include/linux/kernel.h:847:25: note: expanded from macro '__cmp_once'
                      typeof(y) unique_y = (y);               \
                                            ^
      fs/btrfs/volumes.c:5041:3: note: remove the 'if' if its condition is always true
                      BUG_ON(1);
                      ^
      include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON'
       #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                     ^
      fs/btrfs/volumes.c:4993:20: note: initialize the variable 'max_chunk_size' to silence this warning
              u64 max_chunk_size;
                                ^
                                 = 0
      
      Change it to BUG() so clang can see that this code path can never
      continue.
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      290342f6
    • Q
      btrfs: Check the first key and level for cached extent buffer · 448de471
      Qu Wenruo 提交于
      [BUG]
      When reading a file from a fuzzed image, kernel can panic like:
      
        BTRFS warning (device loop0): csum failed root 5 ino 270 off 0 csum 0x98f94189 expected csum 0x00000000 mirror 1
        assertion failed: !memcmp_extent_buffer(b, &disk_key, offsetof(struct btrfs_leaf, items[0].key), sizeof(disk_key)), file: fs/btrfs/ctree.c, line: 2544
        ------------[ cut here ]------------
        kernel BUG at fs/btrfs/ctree.h:3500!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
        RIP: 0010:btrfs_search_slot.cold.24+0x61/0x63 [btrfs]
        Call Trace:
         btrfs_lookup_csum+0x52/0x150 [btrfs]
         __btrfs_lookup_bio_sums+0x209/0x640 [btrfs]
         btrfs_submit_bio_hook+0x103/0x170 [btrfs]
         submit_one_bio+0x59/0x80 [btrfs]
         extent_read_full_page+0x58/0x80 [btrfs]
         generic_file_read_iter+0x2f6/0x9d0
         __vfs_read+0x14d/0x1a0
         vfs_read+0x8d/0x140
         ksys_read+0x52/0xc0
         do_syscall_64+0x60/0x210
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [CAUSE]
      The fuzzed image has a corrupted leaf whose first key doesn't match its
      parent:
      
        checksum tree key (CSUM_TREE ROOT_ITEM 0)
        node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE
        fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
        chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
        	...
                key (EXTENT_CSUM EXTENT_CSUM 79691776) block 29761536 gen 19
      
        leaf 29761536 items 1 free space 1726 generation 19 owner CSUM_TREE
        leaf 29761536 flags 0x1(WRITTEN) backref revision 1
        fs uuid 3381d111-94a3-4ac7-8f39-611bbbdab7e6
        chunk uuid 9af1c3c7-2af5-488b-8553-530bd515f14c
                item 0 key (EXTENT_CSUM EXTENT_CSUM 8798638964736) itemoff 1751 itemsize 2244
                        range start 8798638964736 end 8798641262592 length 2297856
      
      When reading the above tree block, we have extent_buffer->refs = 2 in
      the context:
      
      - initial one from __alloc_extent_buffer()
        alloc_extent_buffer()
        |- __alloc_extent_buffer()
           |- atomic_set(&eb->refs, 1)
      
      - one being added to fs_info->buffer_radix
        alloc_extent_buffer()
        |- check_buffer_tree_ref()
           |- atomic_inc(&eb->refs)
      
      So if even we call free_extent_buffer() in read_tree_block or other
      similar situation, we only decrease the refs by 1, it doesn't reach 0
      and won't be freed right now.
      
      The staled eb and its corrupted content will still be kept cached.
      
      Furthermore, we have several extra cases where we either don't do first
      key check or the check is not proper for all callers:
      
      - scrub
        We just don't have first key in this context.
      
      - shared tree block
        One tree block can be shared by several snapshot/subvolume trees.
        In that case, the first key check for one subvolume doesn't apply to
        another.
      
      So for the above reasons, a corrupted extent buffer can sneak into the
      buffer cache.
      
      [FIX]
      Call verify_level_key in read_block_for_search to do another
      verification. For that purpose the function is exported.
      
      Due to above reasons, although we can free corrupted extent buffer from
      cache, we still need the check in read_block_for_search(), for scrub and
      shared tree blocks.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202755
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202757
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202759
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202761
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202767
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202769Reported-by: NYoon Jungyeon <jungyeon@gatech.edu>
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      448de471
  5. 25 2月, 2019 3 次提交