1. 20 1月, 2020 1 次提交
    • D
      btrfs: keep track of which extents have been discarded · a7ccb255
      Dennis Zhou 提交于
      Async discard will use the free space cache as backing knowledge for
      which extents to discard. This patch plumbs knowledge about which
      extents need to be discarded into the free space cache from
      unpin_extent_range().
      
      An untrimmed extent can merge with everything as this is a new region.
      Absorbing trimmed extents is a tradeoff to for greater coalescing which
      makes life better for find_free_extent(). Additionally, it seems the
      size of a trim isn't as problematic as the trim io itself.
      
      When reading in the free space cache from disk, if sync is set, mark all
      extents as trimmed. The current code ensures at transaction commit that
      all free space is trimmed when sync is set, so this reflects that.
      Signed-off-by: NDennis Zhou <dennis@kernel.org>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      a7ccb255
  2. 19 11月, 2019 3 次提交
  3. 18 11月, 2019 2 次提交
    • J
      btrfs: check page->mapping when loading free space cache · 3797136b
      Josef Bacik 提交于
      While testing 5.2 we ran into the following panic
      
      [52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001
      [52238.105608] RIP: 0010:drop_buffers+0x3d/0x150
      [52238.304051] Call Trace:
      [52238.308958]  try_to_free_buffers+0x15b/0x1b0
      [52238.317503]  shrink_page_list+0x1164/0x1780
      [52238.325877]  shrink_inactive_list+0x18f/0x3b0
      [52238.334596]  shrink_node_memcg+0x23e/0x7d0
      [52238.342790]  ? do_shrink_slab+0x4f/0x290
      [52238.350648]  shrink_node+0xce/0x4a0
      [52238.357628]  balance_pgdat+0x2c7/0x510
      [52238.365135]  kswapd+0x216/0x3e0
      [52238.371425]  ? wait_woken+0x80/0x80
      [52238.378412]  ? balance_pgdat+0x510/0x510
      [52238.386265]  kthread+0x111/0x130
      [52238.392727]  ? kthread_create_on_node+0x60/0x60
      [52238.401782]  ret_from_fork+0x1f/0x30
      
      The page we were trying to drop had a page->private, but had no
      page->mapping and so called drop_buffers, assuming that we had a
      buffer_head on the page, and then panic'ed trying to deref 1, which is
      our page->private for data pages.
      
      This is happening because we're truncating the free space cache while
      we're trying to load the free space cache.  This isn't supposed to
      happen, and I'll fix that in a followup patch.  However we still
      shouldn't allow those sort of mistakes to result in messing with pages
      that do not belong to us.  So add the page->mapping check to verify that
      we still own this page after dropping and re-acquiring the page lock.
      
      This page being unlocked as:
      btrfs_readpage
        extent_read_full_page
          __extent_read_full_page
            __do_readpage
              if (!nr)
      	   unlock_page  <-- nr can be 0 only if submit_extent_page
      			    returns an error
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      [ add callchain ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      3797136b
    • D
      btrfs: drop unused parameter is_new from btrfs_iget · 4c66e0d4
      David Sterba 提交于
      The parameter is now always set to NULL and could be dropped. The last
      user was get_default_root but that got reworked in 05dbe683 ("Btrfs:
      unify subvol= and subvolid= mounting") and the parameter became unused.
      Reviewed-by: NAnand Jain <anand.jain@oracle.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4c66e0d4
  4. 09 9月, 2019 5 次提交
    • O
      btrfs: stop clearing EXTENT_DIRTY in inode I/O tree · e182163d
      Omar Sandoval 提交于
      Since commit fee187d9 ("Btrfs: do not set EXTENT_DIRTY along with
      EXTENT_DELALLOC"), we never set EXTENT_DIRTY in inode->io_tree, so we
      can simplify and stop trying to clear it.
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NOmar Sandoval <osandov@fb.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      e182163d
    • C
      btrfs: fix allocation of free space cache v1 bitmap pages · 3acd4850
      Christophe Leroy 提交于
      Various notifications of type "BUG kmalloc-4096 () : Redzone
      overwritten" have been observed recently in various parts of the kernel.
      After some time, it has been made a relation with the use of BTRFS
      filesystem and with SLUB_DEBUG turned on.
      
      [   22.809700] BUG kmalloc-4096 (Tainted: G        W        ): Redzone overwritten
      
      [   22.810286] INFO: 0xbe1a5921-0xfbfc06cd. First byte 0x0 instead of 0xcc
      [   22.810866] INFO: Allocated in __load_free_space_cache+0x588/0x780 [btrfs] age=22 cpu=0 pid=224
      [   22.811193] 	__slab_alloc.constprop.26+0x44/0x70
      [   22.811345] 	kmem_cache_alloc_trace+0xf0/0x2ec
      [   22.811588] 	__load_free_space_cache+0x588/0x780 [btrfs]
      [   22.811848] 	load_free_space_cache+0xf4/0x1b0 [btrfs]
      [   22.812090] 	cache_block_group+0x1d0/0x3d0 [btrfs]
      [   22.812321] 	find_free_extent+0x680/0x12a4 [btrfs]
      [   22.812549] 	btrfs_reserve_extent+0xec/0x220 [btrfs]
      [   22.812785] 	btrfs_alloc_tree_block+0x178/0x5f4 [btrfs]
      [   22.813032] 	__btrfs_cow_block+0x150/0x5d4 [btrfs]
      [   22.813262] 	btrfs_cow_block+0x194/0x298 [btrfs]
      [   22.813484] 	commit_cowonly_roots+0x44/0x294 [btrfs]
      [   22.813718] 	btrfs_commit_transaction+0x63c/0xc0c [btrfs]
      [   22.813973] 	close_ctree+0xf8/0x2a4 [btrfs]
      [   22.814107] 	generic_shutdown_super+0x80/0x110
      [   22.814250] 	kill_anon_super+0x18/0x30
      [   22.814437] 	btrfs_kill_super+0x18/0x90 [btrfs]
      [   22.814590] INFO: Freed in proc_cgroup_show+0xc0/0x248 age=41 cpu=0 pid=83
      [   22.814841] 	proc_cgroup_show+0xc0/0x248
      [   22.814967] 	proc_single_show+0x54/0x98
      [   22.815086] 	seq_read+0x278/0x45c
      [   22.815190] 	__vfs_read+0x28/0x17c
      [   22.815289] 	vfs_read+0xa8/0x14c
      [   22.815381] 	ksys_read+0x50/0x94
      [   22.815475] 	ret_from_syscall+0x0/0x38
      
      Commit 69d24804 ("btrfs: use copy_page for copying pages instead of
      memcpy") changed the way bitmap blocks are copied. But allthough bitmaps
      have the size of a page, they were allocated with kzalloc().
      
      Most of the time, kzalloc() allocates aligned blocks of memory, so
      copy_page() can be used. But when some debug options like SLAB_DEBUG are
      activated, kzalloc() may return unaligned pointer.
      
      On powerpc, memcpy(), copy_page() and other copying functions use
      'dcbz' instruction which provides an entire zeroed cacheline to avoid
      memory read when the intention is to overwrite a full line. Functions
      like memcpy() are writen to care about partial cachelines at the start
      and end of the destination, but copy_page() assumes it gets pages. As
      pages are naturally cache aligned, copy_page() doesn't care about
      partial lines. This means that when copy_page() is called with a
      misaligned pointer, a few leading bytes are zeroed.
      
      To fix it, allocate bitmaps through kmem_cache instead of using kzalloc()
      The cache pool is created with PAGE_SIZE alignment constraint.
      Reported-by: NErhard F. <erhard_f@mailbox.org>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204371
      Fixes: 69d24804 ("btrfs: use copy_page for copying pages instead of memcpy")
      Cc: stable@vger.kernel.org # 4.19+
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      [ rename to btrfs_free_space_bitmap ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      3acd4850
    • J
      btrfs: rename the btrfs_calc_*_metadata_size helpers · 2bd36e7b
      Josef Bacik 提交于
      btrfs_calc_trunc_metadata_size differs from trans_metadata_size in that
      it doesn't take into account any splitting at the levels, because
      truncate will never split nodes.  However truncate _and_ changing will
      never split nodes, so rename btrfs_calc_trunc_metadata_size to
      btrfs_calc_metadata_size.  Also btrfs_calc_trans_metadata_size is purely
      for inserting items, so rename this to btrfs_calc_insert_metadata_size.
      Making these clearer will help when I start using them differently in
      upcoming patches.
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      2bd36e7b
    • J
      btrfs: move basic block_group definitions to their own header · aac0023c
      Josef Bacik 提交于
      This is prep work for moving all of the block group cache code into its
      own file.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      [ minor comment updates ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      aac0023c
    • J
      btrfs: move btrfs_add_free_space out of a header file · 478b4d9f
      Josef Bacik 提交于
      This is prep work for moving block_group_cache around.  Having this in
      the header file makes the header file include need to be in a certain
      order, which is awkward, so just move it into free-space-cache.c and
      then we can re-arrange later.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      478b4d9f
  5. 04 7月, 2019 1 次提交
  6. 02 7月, 2019 1 次提交
  7. 01 7月, 2019 3 次提交
  8. 30 4月, 2019 7 次提交
  9. 06 11月, 2018 1 次提交
    • F
      Btrfs: fix deadlock on tree root leaf when finding free extent · 4222ea71
      Filipe Manana 提交于
      When we are writing out a free space cache, during the transaction commit
      phase, we can end up in a deadlock which results in a stack trace like the
      following:
      
       schedule+0x28/0x80
       btrfs_tree_read_lock+0x8e/0x120 [btrfs]
       ? finish_wait+0x80/0x80
       btrfs_read_lock_root_node+0x2f/0x40 [btrfs]
       btrfs_search_slot+0xf6/0x9f0 [btrfs]
       ? evict_refill_and_join+0xd0/0xd0 [btrfs]
       ? inode_insert5+0x119/0x190
       btrfs_lookup_inode+0x3a/0xc0 [btrfs]
       ? kmem_cache_alloc+0x166/0x1d0
       btrfs_iget+0x113/0x690 [btrfs]
       __lookup_free_space_inode+0xd8/0x150 [btrfs]
       lookup_free_space_inode+0x5b/0xb0 [btrfs]
       load_free_space_cache+0x7c/0x170 [btrfs]
       ? cache_block_group+0x72/0x3b0 [btrfs]
       cache_block_group+0x1b3/0x3b0 [btrfs]
       ? finish_wait+0x80/0x80
       find_free_extent+0x799/0x1010 [btrfs]
       btrfs_reserve_extent+0x9b/0x180 [btrfs]
       btrfs_alloc_tree_block+0x1b3/0x4f0 [btrfs]
       __btrfs_cow_block+0x11d/0x500 [btrfs]
       btrfs_cow_block+0xdc/0x180 [btrfs]
       btrfs_search_slot+0x3bd/0x9f0 [btrfs]
       btrfs_lookup_inode+0x3a/0xc0 [btrfs]
       ? kmem_cache_alloc+0x166/0x1d0
       btrfs_update_inode_item+0x46/0x100 [btrfs]
       cache_save_setup+0xe4/0x3a0 [btrfs]
       btrfs_start_dirty_block_groups+0x1be/0x480 [btrfs]
       btrfs_commit_transaction+0xcb/0x8b0 [btrfs]
      
      At cache_save_setup() we need to update the inode item of a block group's
      cache which is located in the tree root (fs_info->tree_root), which means
      that it may result in COWing a leaf from that tree. If that happens we
      need to find a free metadata extent and while looking for one, if we find
      a block group which was not cached yet we attempt to load its cache by
      calling cache_block_group(). However this function will try to load the
      inode of the free space cache, which requires finding the matching inode
      item in the tree root - if that inode item is located in the same leaf as
      the inode item of the space cache we are updating at cache_save_setup(),
      we end up in a deadlock, since we try to obtain a read lock on the same
      extent buffer that we previously write locked.
      
      So fix this by using the tree root's commit root when searching for a
      block group's free space cache inode item when we are attempting to load
      a free space cache. This is safe since block groups once loaded stay in
      memory forever, as well as their caches, so after they are first loaded
      we will never need to read their inode items again. For new block groups,
      once they are created they get their ->cached field set to
      BTRFS_CACHE_FINISHED meaning we will not need to read their inode item.
      Reported-by: NAndrew Nelson <andrew.s.nelson@gmail.com>
      Link: https://lore.kernel.org/linux-btrfs/CAPTELenq9x5KOWuQ+fa7h1r3nsJG8vyiTH8+ifjURc_duHh2Wg@mail.gmail.com/
      Fixes: 9d66e233 ("Btrfs: load free space cache if it exists")
      Tested-by: NAndrew Nelson <andrew.s.nelson@gmail.com>
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4222ea71
  10. 23 10月, 2018 1 次提交
    • F
      Btrfs: fix use-after-free when dumping free space · 9084cb6a
      Filipe Manana 提交于
      We were iterating a block group's free space cache rbtree without locking
      first the lock that protects it (the free_space_ctl->free_space_offset
      rbtree is protected by the free_space_ctl->tree_lock spinlock).
      
      KASAN reported an use-after-free problem when iterating such a rbtree due
      to a concurrent rbtree delete:
      
      [ 9520.359168] ==================================================================
      [ 9520.359656] BUG: KASAN: use-after-free in rb_next+0x13/0x90
      [ 9520.359949] Read of size 8 at addr ffff8800b7ada500 by task btrfs-transacti/1721
      [ 9520.360357]
      [ 9520.360530] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G             L    4.19.0-rc8-nbor #555
      [ 9520.360990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.362682] Call Trace:
      [ 9520.362887]  dump_stack+0xa4/0xf5
      [ 9520.363146]  print_address_description+0x78/0x280
      [ 9520.363412]  kasan_report+0x263/0x390
      [ 9520.363650]  ? rb_next+0x13/0x90
      [ 9520.363873]  __asan_load8+0x54/0x90
      [ 9520.364102]  rb_next+0x13/0x90
      [ 9520.364380]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.364697]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.364997]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.365310]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.365646]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.365923]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.366204]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.366549]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.366880]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.367220]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.367518]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.367799]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.368104]  ? kasan_check_read+0x11/0x20
      [ 9520.368349]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.368638]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.368978]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.369282]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.369534]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.369811]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.370137]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.370560]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.370926]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.371285]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.371612]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.371943]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.372257]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.372537]  kthread+0x1d2/0x1f0
      [ 9520.372793]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.373090]  ? kthread_park+0xb0/0xb0
      [ 9520.373329]  ret_from_fork+0x3a/0x50
      [ 9520.373567]
      [ 9520.373738] Allocated by task 1804:
      [ 9520.373974]  kasan_kmalloc+0xff/0x180
      [ 9520.374208]  kasan_slab_alloc+0x11/0x20
      [ 9520.374447]  kmem_cache_alloc+0xfc/0x2d0
      [ 9520.374731]  __btrfs_add_free_space+0x40/0x580 [btrfs]
      [ 9520.375044]  unpin_extent_range+0x4f7/0x7a0 [btrfs]
      [ 9520.375383]  btrfs_finish_extent_commit+0x15f/0x4d0 [btrfs]
      [ 9520.375707]  btrfs_commit_transaction+0xb06/0x10e0 [btrfs]
      [ 9520.376027]  btrfs_alloc_data_chunk_ondemand+0x237/0x5c0 [btrfs]
      [ 9520.376365]  btrfs_check_data_free_space+0x81/0xd0 [btrfs]
      [ 9520.376689]  btrfs_delalloc_reserve_space+0x25/0x80 [btrfs]
      [ 9520.377018]  btrfs_direct_IO+0x42e/0x6d0 [btrfs]
      [ 9520.377284]  generic_file_direct_write+0x11e/0x220
      [ 9520.377587]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.377875]  aio_write+0x25c/0x360
      [ 9520.378106]  io_submit_one+0xaa0/0xdc0
      [ 9520.378343]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.378589]  __x64_sys_io_submit+0x43/0x50
      [ 9520.378840]  do_syscall_64+0x7d/0x240
      [ 9520.379081]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.379387]
      [ 9520.379557] Freed by task 1802:
      [ 9520.379782]  __kasan_slab_free+0x173/0x260
      [ 9520.380028]  kasan_slab_free+0xe/0x10
      [ 9520.380262]  kmem_cache_free+0xc1/0x2c0
      [ 9520.380544]  btrfs_find_space_for_alloc+0x4cd/0x4e0 [btrfs]
      [ 9520.380866]  find_free_extent+0xa99/0x17e0 [btrfs]
      [ 9520.381166]  btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
      [ 9520.381474]  btrfs_get_blocks_direct+0x60b/0xbd0 [btrfs]
      [ 9520.381761]  __blockdev_direct_IO+0x10ee/0x58a1
      [ 9520.382059]  btrfs_direct_IO+0x25a/0x6d0 [btrfs]
      [ 9520.382321]  generic_file_direct_write+0x11e/0x220
      [ 9520.382623]  btrfs_file_write_iter+0x472/0xac0 [btrfs]
      [ 9520.382904]  aio_write+0x25c/0x360
      [ 9520.383172]  io_submit_one+0xaa0/0xdc0
      [ 9520.383416]  __se_sys_io_submit+0xfa/0x2f0
      [ 9520.383678]  __x64_sys_io_submit+0x43/0x50
      [ 9520.383927]  do_syscall_64+0x7d/0x240
      [ 9520.384165]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 9520.384439]
      [ 9520.384610] The buggy address belongs to the object at ffff8800b7ada500
                      which belongs to the cache btrfs_free_space of size 72
      [ 9520.385175] The buggy address is located 0 bytes inside of
                      72-byte region [ffff8800b7ada500, ffff8800b7ada548)
      [ 9520.385691] The buggy address belongs to the page:
      [ 9520.385957] page:ffffea0002deb680 count:1 mapcount:0 mapping:ffff880108a1d700 index:0x0 compound_mapcount: 0
      [ 9520.388030] flags: 0x8100(slab|head)
      [ 9520.388281] raw: 0000000000008100 ffffea0002deb608 ffffea0002728808 ffff880108a1d700
      [ 9520.388722] raw: 0000000000000000 0000000000130013 00000001ffffffff 0000000000000000
      [ 9520.389169] page dumped because: kasan: bad access detected
      [ 9520.389473]
      [ 9520.389658] Memory state around the buggy address:
      [ 9520.389943]  ffff8800b7ada400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390368]  ffff8800b7ada480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.390796] >ffff8800b7ada500: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
      [ 9520.391223]                    ^
      [ 9520.391461]  ffff8800b7ada580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.391885]  ffff8800b7ada600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 9520.392313] ==================================================================
      [ 9520.392772] BTRFS critical (device vdc): entry offset 2258497536, bytes 131072, bitmap no
      [ 9520.393247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
      [ 9520.393705] PGD 800000010dbab067 P4D 800000010dbab067 PUD 107551067 PMD 0
      [ 9520.394059] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [ 9520.394378] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G    B        L    4.19.0-rc8-nbor #555
      [ 9520.394858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [ 9520.395350] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.396461] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.396762] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.397115] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.397468] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.397821] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.398188] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.398555] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.399007] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.399335] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.399679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.400023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.400400] Call Trace:
      [ 9520.400648]  btrfs_dump_free_space+0x146/0x160 [btrfs]
      [ 9520.400974]  dump_space_info+0x2cd/0x310 [btrfs]
      [ 9520.401287]  btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
      [ 9520.401609]  __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
      [ 9520.401952]  ? btrfs_update_time+0x180/0x180 [btrfs]
      [ 9520.402232]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.402522]  ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
      [ 9520.402882]  btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
      [ 9520.403261]  cache_save_setup+0x42e/0x580 [btrfs]
      [ 9520.403570]  ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
      [ 9520.403871]  ? lock_downgrade+0x2f0/0x2f0
      [ 9520.404161]  ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
      [ 9520.404481]  ? kasan_check_read+0x11/0x20
      [ 9520.404732]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405026]  btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
      [ 9520.405375]  ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
      [ 9520.405694]  ? do_raw_spin_unlock+0xa8/0x140
      [ 9520.405958]  ? _raw_spin_unlock+0x27/0x40
      [ 9520.406243]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.406574]  commit_cowonly_roots+0x4b9/0x610 [btrfs]
      [ 9520.406899]  ? commit_fs_roots+0x350/0x350 [btrfs]
      [ 9520.407253]  ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
      [ 9520.407589]  btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
      [ 9520.407925]  ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
      [ 9520.408262]  ? start_transaction+0x168/0x6c0 [btrfs]
      [ 9520.408582]  transaction_kthread+0x21c/0x240 [btrfs]
      [ 9520.408870]  kthread+0x1d2/0x1f0
      [ 9520.409138]  ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
      [ 9520.409440]  ? kthread_park+0xb0/0xb0
      [ 9520.409682]  ret_from_fork+0x3a/0x50
      [ 9520.410508] Dumping ftrace buffer:
      [ 9520.410764]    (ftrace buffer empty)
      [ 9520.411007] CR2: 0000000000000011
      [ 9520.411297] ---[ end trace 01a0863445cf360a ]---
      [ 9520.411568] RIP: 0010:rb_next+0x3c/0x90
      [ 9520.412644] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
      [ 9520.412932] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
      [ 9520.413274] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
      [ 9520.413616] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
      [ 9520.414007] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
      [ 9520.414349] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
      [ 9520.416074] FS:  0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
      [ 9520.416536] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9520.416848] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
      [ 9520.418477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 9520.418846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 9520.419204] Kernel panic - not syncing: Fatal exception
      [ 9520.419666] Dumping ftrace buffer:
      [ 9520.419930]    (ftrace buffer empty)
      [ 9520.420168] Kernel Offset: disabled
      [ 9520.420406] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Fix this by acquiring the respective lock before iterating the rbtree.
      Reported-by: NNikolay Borisov <nborisov@suse.com>
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      9084cb6a
  11. 19 10月, 2018 1 次提交
  12. 15 10月, 2018 3 次提交
  13. 06 8月, 2018 4 次提交
  14. 29 5月, 2018 1 次提交
  15. 12 4月, 2018 1 次提交
  16. 31 3月, 2018 1 次提交
    • Q
      btrfs: qgroup: Use separate meta reservation type for delalloc · 43b18595
      Qu Wenruo 提交于
      Before this patch, btrfs qgroup is mixing per-transcation meta rsv with
      preallocated meta rsv, making it quite easy to underflow qgroup meta
      reservation.
      
      Since we have the new qgroup meta rsv types, apply it to delalloc
      reservation.
      
      Now for delalloc, most of its reserved space will use META_PREALLOC qgroup
      rsv type.
      
      And for callers reducing outstanding extent like btrfs_finish_ordered_io(),
      they will convert corresponding META_PREALLOC reservation to
      META_PERTRANS.
      
      This is mainly due to the fact that current qgroup numbers will only be
      updated in btrfs_commit_transaction(), that's to say if we don't keep
      such placeholder reservation, we can exceed qgroup limitation.
      
      And for callers freeing outstanding extent in error handler, we will
      just free META_PREALLOC bytes.
      
      This behavior makes callers of btrfs_qgroup_release_meta() or
      btrfs_qgroup_convert_meta() to be aware of which type they are.
      So in this patch, btrfs_delalloc_release_metadata() and its callers get
      an extra parameter to info qgroup to do correct meta convert/release.
      
      The good news is, even we use the wrong type (convert or free), it won't
      cause obvious bug, as prealloc type is always in good shape, and the
      type only affects how per-trans meta is increased or not.
      
      So the worst case will be at most metadata limitation can be sometimes
      exceeded (no convert at all) or metadata limitation is reached too soon
      (no free at all).
      Signed-off-by: NQu Wenruo <wqu@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      43b18595
  17. 22 1月, 2018 2 次提交
  18. 13 1月, 2018 2 次提交
    • M
      error-injection: Add injectable error types · 663faf9f
      Masami Hiramatsu 提交于
      Add injectable error types for each error-injectable function.
      
      One motivation of error injection test is to find software flaws,
      mistakes or mis-handlings of expectable errors. If we find such
      flaws by the test, that is a program bug, so we need to fix it.
      
      But if the tester miss input the error (e.g. just return success
      code without processing anything), it causes unexpected behavior
      even if the caller is correctly programmed to handle any errors.
      That is not what we want to test by error injection.
      
      To clarify what type of errors the caller must expect for each
      injectable function, this introduces injectable error types:
      
       - EI_ETYPE_NULL : means the function will return NULL if it
      		    fails. No ERR_PTR, just a NULL.
       - EI_ETYPE_ERRNO : means the function will return -ERRNO
      		    if it fails.
       - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
      		       (ERR_PTR) or NULL.
      
      ALLOW_ERROR_INJECTION() macro is expanded to get one of
      NULL, ERRNO, ERRNO_NULL to record the error type for
      each function. e.g.
      
       ALLOW_ERROR_INJECTION(open_ctree, ERRNO)
      
      This error types are shown in debugfs as below.
      
        ====
        / # cat /sys/kernel/debug/error_injection/list
        open_ctree [btrfs]	ERRNO
        io_ctl_init [btrfs]	ERRNO
        ====
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      663faf9f
    • M
      error-injection: Separate error-injection from kprobe · 540adea3
      Masami Hiramatsu 提交于
      Since error-injection framework is not limited to be used
      by kprobes, nor bpf. Other kernel subsystems can use it
      freely for checking safeness of error-injection, e.g.
      livepatch, ftrace etc.
      So this separate error-injection framework from kprobes.
      
      Some differences has been made:
      
      - "kprobe" word is removed from any APIs/structures.
      - BPF_ALLOW_ERROR_INJECTION() is renamed to
        ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
      - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
        feature. It is automatically enabled if the arch supports
        error injection feature for kprobe or ftrace etc.
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      540adea3