1. 07 10月, 2020 14 次提交
  2. 27 8月, 2020 1 次提交
  3. 20 8月, 2020 1 次提交
  4. 27 7月, 2020 1 次提交
    • D
      btrfs: add little-endian optimized key helpers · ce6ef5ab
      David Sterba 提交于
      The CPU and on-disk keys are mapped to two different structures because
      of the endianness. There's an intermediate buffer used to do the
      conversion, but this is not necessary when CPU and on-disk endianness
      match.
      
      Add optimized versions of helpers that take disk_key and use the buffer
      directly for CPU keys or drop the intermediate buffer and conversion.
      
      This saves a lot of stack space accross many functions and removes about
      6K of generated binary code:
      
         text    data     bss     dec     hex filename
      1090439   17468   14912 1122819  112203 pre/btrfs.ko
      1084613   17456   14912 1116981  110b35 post/btrfs.ko
      
      Delta: -5826
      Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ce6ef5ab
  5. 02 7月, 2020 1 次提交
  6. 28 5月, 2020 2 次提交
  7. 25 5月, 2020 5 次提交
  8. 24 3月, 2020 4 次提交
  9. 31 1月, 2020 1 次提交
    • F
      Btrfs: fix race between adding and putting tree mod seq elements and nodes · 7227ff4d
      Filipe Manana 提交于
      There is a race between adding and removing elements to the tree mod log
      list and rbtree that can lead to use-after-free problems.
      
      Consider the following example that explains how/why the problems happens:
      
      1) Task A has mod log element with sequence number 200. It currently is
         the only element in the mod log list;
      
      2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to
         access the tree mod log. When it enters the function, it initializes
         'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock'
         before checking if there are other elements in the mod seq list.
         Since the list it empty, 'min_seq' remains set to (u64)-1. Then it
         unlocks the lock 'tree_mod_seq_lock';
      
      3) Before task A acquires the lock 'tree_mod_log_lock', task B adds
         itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a
         sequence number of 201;
      
      4) Some other task, name it task C, modifies a btree and because there
         elements in the mod seq list, it adds a tree mod elem to the tree
         mod log rbtree. That node added to the mod log rbtree is assigned
         a sequence number of 202;
      
      5) Task B, which is doing fiemap and resolving indirect back references,
         calls btrfs get_old_root(), with 'time_seq' == 201, which in turn
         calls tree_mod_log_search() - the search returns the mod log node
         from the rbtree with sequence number 202, created by task C;
      
      6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating
         the mod log rbtree and finds the node with sequence number 202. Since
         202 is less than the previously computed 'min_seq', (u64)-1, it
         removes the node and frees it;
      
      7) Task B still has a pointer to the node with sequence number 202, and
         it dereferences the pointer itself and through the call to
         __tree_mod_log_rewind(), resulting in a use-after-free problem.
      
      This issue can be triggered sporadically with the test case generic/561
      from fstests, and it happens more frequently with a higher number of
      duperemove processes. When it happens to me, it either freezes the VM or
      it produces a trace like the following before crashing:
      
        [ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
        [ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1
        [ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        [ 1245.321287] RIP: 0010:rb_next+0x16/0x50
        [ 1245.321307] Code: ....
        [ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202
        [ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b
        [ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80
        [ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000
        [ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038
        [ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8
        [ 1245.321539] FS:  00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000
        [ 1245.321591] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0
        [ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 1245.321706] Call Trace:
        [ 1245.321798]  __tree_mod_log_rewind+0xbf/0x280 [btrfs]
        [ 1245.321841]  btrfs_search_old_slot+0x105/0xd00 [btrfs]
        [ 1245.321877]  resolve_indirect_refs+0x1eb/0xc60 [btrfs]
        [ 1245.321912]  find_parent_nodes+0x3dc/0x11b0 [btrfs]
        [ 1245.321947]  btrfs_check_shared+0x115/0x1c0 [btrfs]
        [ 1245.321980]  ? extent_fiemap+0x59d/0x6d0 [btrfs]
        [ 1245.322029]  extent_fiemap+0x59d/0x6d0 [btrfs]
        [ 1245.322066]  do_vfs_ioctl+0x45a/0x750
        [ 1245.322081]  ksys_ioctl+0x70/0x80
        [ 1245.322092]  ? trace_hardirqs_off_thunk+0x1a/0x1c
        [ 1245.322113]  __x64_sys_ioctl+0x16/0x20
        [ 1245.322126]  do_syscall_64+0x5c/0x280
        [ 1245.322139]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [ 1245.322155] RIP: 0033:0x7fdee3942dd7
        [ 1245.322177] Code: ....
        [ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7
        [ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004
        [ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44
        [ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48
        [ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50
        [ 1245.322423] Modules linked in: ....
        [ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]---
      
      Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum
      sequence number and iterates the rbtree while holding the lock
      'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock'
      lock, since it is now redundant.
      
      Fixes: bd989ba3 ("Btrfs: add tree modification log functions")
      Fixes: 097b8a7c ("Btrfs: join tree mod log code with the code holding back delayed refs")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      7227ff4d
  10. 13 12月, 2019 1 次提交
    • F
      Btrfs: fix removal logic of the tree mod log that leads to use-after-free issues · 6609fee8
      Filipe Manana 提交于
      When a tree mod log user no longer needs to use the tree it calls
      btrfs_put_tree_mod_seq() to remove itself from the list of users and
      delete all no longer used elements of the tree's red black tree, which
      should be all elements with a sequence number less then our equals to
      the caller's sequence number. However the logic is broken because it
      can delete and free elements from the red black tree that have a
      sequence number greater then the caller's sequence number:
      
      1) At a point in time we have sequence numbers 1, 2, 3 and 4 in the
         tree mod log;
      
      2) The task which got assigned the sequence number 1 calls
         btrfs_put_tree_mod_seq();
      
      3) Sequence number 1 is deleted from the list of sequence numbers;
      
      4) The current minimum sequence number is computed to be the sequence
         number 2;
      
      5) A task using sequence number 2 is at tree_mod_log_rewind() and gets
         a pointer to one of its elements from the red black tree through
         a call to tree_mod_log_search();
      
      6) The task with sequence number 1 iterates the red black tree of tree
         modification elements and deletes (and frees) all elements with a
         sequence number less then or equals to 2 (the computed minimum sequence
         number) - it ends up only leaving elements with sequence numbers of 3
         and 4;
      
      7) The task with sequence number 2 now uses the pointer to its element,
         already freed by the other task, at __tree_mod_log_rewind(), resulting
         in a use-after-free issue. When CONFIG_DEBUG_PAGEALLOC=y it produces
         a trace like the following:
      
        [16804.546854] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
        [16804.547451] CPU: 0 PID: 28257 Comm: pool Tainted: G        W         5.4.0-rc8-btrfs-next-51 #1
        [16804.548059] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        [16804.548666] RIP: 0010:rb_next+0x16/0x50
        (...)
        [16804.550581] RSP: 0018:ffffb948418ef9b0 EFLAGS: 00010202
        [16804.551227] RAX: 6b6b6b6b6b6b6b6b RBX: ffff90e0247f6600 RCX: 6b6b6b6b6b6b6b6b
        [16804.551873] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff90e0247f6600
        [16804.552504] RBP: ffff90dffe0d4688 R08: 0000000000000001 R09: 0000000000000000
        [16804.553136] R10: ffff90dffa4a0040 R11: 0000000000000000 R12: 000000000000002e
        [16804.553768] R13: ffff90e0247f6600 R14: 0000000000001663 R15: ffff90dff77862b8
        [16804.554399] FS:  00007f4b197ae700(0000) GS:ffff90e036a00000(0000) knlGS:0000000000000000
        [16804.555039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [16804.555683] CR2: 00007f4b10022000 CR3: 00000002060e2004 CR4: 00000000003606f0
        [16804.556336] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [16804.556968] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [16804.557583] Call Trace:
        [16804.558207]  __tree_mod_log_rewind+0xbf/0x280 [btrfs]
        [16804.558835]  btrfs_search_old_slot+0x105/0xd00 [btrfs]
        [16804.559468]  resolve_indirect_refs+0x1eb/0xc70 [btrfs]
        [16804.560087]  ? free_extent_buffer.part.19+0x5a/0xc0 [btrfs]
        [16804.560700]  find_parent_nodes+0x388/0x1120 [btrfs]
        [16804.561310]  btrfs_check_shared+0x115/0x1c0 [btrfs]
        [16804.561916]  ? extent_fiemap+0x59d/0x6d0 [btrfs]
        [16804.562518]  extent_fiemap+0x59d/0x6d0 [btrfs]
        [16804.563112]  ? __might_fault+0x11/0x90
        [16804.563706]  do_vfs_ioctl+0x45a/0x700
        [16804.564299]  ksys_ioctl+0x70/0x80
        [16804.564885]  ? trace_hardirqs_off_thunk+0x1a/0x20
        [16804.565461]  __x64_sys_ioctl+0x16/0x20
        [16804.566020]  do_syscall_64+0x5c/0x250
        [16804.566580]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [16804.567153] RIP: 0033:0x7f4b1ba2add7
        (...)
        [16804.568907] RSP: 002b:00007f4b197adc88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [16804.569513] RAX: ffffffffffffffda RBX: 00007f4b100210d8 RCX: 00007f4b1ba2add7
        [16804.570133] RDX: 00007f4b100210d8 RSI: 00000000c020660b RDI: 0000000000000003
        [16804.570726] RBP: 000055de05a6cfe0 R08: 0000000000000000 R09: 00007f4b197add44
        [16804.571314] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4b197add48
        [16804.571905] R13: 00007f4b197add40 R14: 00007f4b100210d0 R15: 00007f4b197add50
        (...)
        [16804.575623] ---[ end trace 87317359aad4ba50 ]---
      
      Fix this by making btrfs_put_tree_mod_seq() skip deletion of elements that
      have a sequence number equals to the computed minimum sequence number, and
      not just elements with a sequence number greater then that minimum.
      
      Fixes: bd989ba3 ("Btrfs: add tree modification log functions")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6609fee8
  11. 19 11月, 2019 5 次提交
  12. 18 11月, 2019 4 次提交