1. 14 2月, 2017 19 次提交
  2. 03 1月, 2017 1 次提交
  3. 06 12月, 2016 6 次提交
  4. 01 12月, 2016 1 次提交
    • R
      Btrfs: fix tree search logic when replaying directory entry deletes · 2a7bf53f
      Robbie Ko 提交于
      If a log tree has a layout like the following:
      
      leaf N:
              ...
              item 240 key (282 DIR_LOG_ITEM 0) itemoff 8189 itemsize 8
                      dir log end 1275809046
      leaf N + 1:
              item 0 key (282 DIR_LOG_ITEM 3936149215) itemoff 16275 itemsize 8
                      dir log end 18446744073709551615
              ...
      
      When we pass the value 1275809046 + 1 as the parameter start_ret to the
      function tree-log.c:find_dir_range() (done by replay_dir_deletes()), we
      end up with path->slots[0] having the value 239 (points to the last item
      of leaf N, item 240). Because the dir log item in that position has an
      offset value smaller than *start_ret (1275809046 + 1) we need to move on
      to the next leaf, however the logic for that is wrong since it compares
      the current slot to the number of items in the leaf, which is smaller
      and therefore we don't lookup for the next leaf but instead we set the
      slot to point to an item that does not exist, at slot 240, and we later
      operate on that slot which has unexpected content or in the worst case
      can result in an invalid memory access (accessing beyond the last page
      of leaf N's extent buffer).
      
      So fix the logic that checks when we need to lookup at the next leaf
      by first incrementing the slot and only after to check if that slot
      is beyond the last item of the current leaf.
      Signed-off-by: NRobbie Ko <robbieko@synology.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Fixes: e02119d5 (Btrfs: Add a write ahead tree log to optimize synchronous operations)
      Cc: stable@vger.kernel.org  # 2.6.29+
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      [Modified changelog for clarity and correctness]
      2a7bf53f
  5. 30 11月, 2016 2 次提交
    • R
      Btrfs: fix deadlock caused by fsync when logging directory entries · ec125cfb
      Robbie Ko 提交于
      While logging new directory entries, at tree-log.c:log_new_dir_dentries(),
      after we call btrfs_search_forward() we get a leaf with a read lock on it,
      and without unlocking that leaf we can end up calling btrfs_iget() to get
      an inode pointer. The later (btrfs_iget()) can end up doing a read-only
      search on the same tree again, if the inode is not in memory already, which
      ends up causing a deadlock if some other task in the meanwhile started a
      write search on the tree and is attempting to write lock the same leaf
      that btrfs_search_forward() locked while holding write locks on upper
      levels of the tree blocking the read search from btrfs_iget(). In this
      scenario we get a deadlock.
      
      So fix this by releasing the search path before calling btrfs_iget() at
      tree-log.c:log_new_dir_dentries().
      
      Example trace of such deadlock:
      
      [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
      [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
      [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
      [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
      [ 4077.519202] Call Trace:
      [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
      [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
      [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
      [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
      [ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
      [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
      [ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
      [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
      [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
      [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
      [ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
      [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
      [ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
      [ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      
      [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
      [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
      [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
      [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
      [ 4078.398466] Call Trace:
      [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
      [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
      [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
      [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
      [ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
      [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
      [ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
      [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
      [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
      [ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
      [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
      [ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
      [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
      [ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
      [ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      
      [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
      [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
      [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
      [ 4079.388586] Call Trace:
      [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
      [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
      [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
      [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
      [ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
      [ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
      [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
      [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
      [ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
      [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
      [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
      [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
      [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
      [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      
      [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
      [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
      [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
      [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
      [ 4079.541092] Call Trace:
      [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
      [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
      [ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
      [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
      [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
      [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
      [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
      [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
      [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
      [ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
      [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
      [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      
      [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
      [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
      [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
      [ 4080.443353] Call Trace:
      [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
      [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
      [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
      [ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
      [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
      [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
      [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
      [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
      [ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
      [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
      [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
      [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
      [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
      [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      Signed-off-by: NRobbie Ko <robbieko@synology.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Fixes: 2f2ff0ee (Btrfs: fix metadata inconsistencies after directory fsync)
      Cc: stable@vger.kernel.org # 4.1+
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      [Modified changelog for clarity and correctness]
      ec125cfb
    • Q
      btrfs: qgroup: Rename functions to make it follow reserve,trace,account steps · 50b3e040
      Qu Wenruo 提交于
      Rename btrfs_qgroup_insert_dirty_extent(_nolock) to
      btrfs_qgroup_trace_extent(_nolock), according to the new
      reserve/trace/account naming schema.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      50b3e040
  6. 28 10月, 2016 1 次提交
    • C
      btrfs: fix races on root_log_ctx lists · 570dd450
      Chris Mason 提交于
      btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the
      list because it knows all of the waiters are patiently waiting for the
      commit to finish.
      
      But, there's a small race where btrfs_sync_log can remove itself from
      the list if it finds a log commit is already done.  Also, it uses
      list_del_init() to remove itself from the list, but there's no way to
      know if btrfs_remove_all_log_ctxs has already run, so we don't know for
      sure if it is safe to call list_del_init().
      
      This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and
      just calls it with the proper locking.
      
      This is part two of the corruption fixed by cbd60aa7.  I should have
      done this in the first place, but convinced myself the optimizations were
      safe.  A 12 hour run of dbench 2048 will eventually trigger a list debug
      WARN_ON for the list_del_init() in btrfs_sync_log().
      
      Fixes: d1433debReported-by: NDave Jones <davej@codemonkey.org.uk>
      cc: stable@vger.kernel.org # 3.15+
      Signed-off-by: NChris Mason <clm@fb.com>
      570dd450
  7. 27 9月, 2016 1 次提交
  8. 26 9月, 2016 1 次提交
    • J
      Btrfs: add a flags field to btrfs_fs_info · afcdd129
      Josef Bacik 提交于
      We have a lot of random ints in btrfs_fs_info that can be put into flags.  This
      is mostly equivalent with the exception of how we deal with quota going on or
      off, now instead we set a flag when we are turning it on or off and deal with
      that appropriately, rather than just having a pending state that the current
      quota_enabled gets set to.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      afcdd129
  9. 16 9月, 2016 1 次提交
  10. 06 9月, 2016 1 次提交
  11. 25 8月, 2016 2 次提交
    • F
      Btrfs: fix lockdep warning on deadlock against an inode's log mutex · 28a23593
      Filipe Manana 提交于
      Commit 44f714da ("Btrfs: improve performance on fsync against new
      inode after rename/unlink"), which landed in 4.8-rc2, introduced a
      possibility for a deadlock due to double locking of an inode's log mutex
      by the same task, which lockdep reports with:
      
      [23045.433975] =============================================
      [23045.434748] [ INFO: possible recursive locking detected ]
      [23045.435426] 4.7.0-rc6-btrfs-next-34+ #1 Not tainted
      [23045.436044] ---------------------------------------------
      [23045.436044] xfs_io/3688 is trying to acquire lock:
      [23045.436044]  (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]
                     but task is already holding lock:
      [23045.436044]  (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]
                     other info that might help us debug this:
      [23045.436044]  Possible unsafe locking scenario:
      
      [23045.436044]        CPU0
      [23045.436044]        ----
      [23045.436044]   lock(&ei->log_mutex);
      [23045.436044]   lock(&ei->log_mutex);
      [23045.436044]
                      *** DEADLOCK ***
      
      [23045.436044]  May be due to missing lock nesting notation
      
      [23045.436044] 3 locks held by xfs_io/3688:
      [23045.436044]  #0:  (&sb->s_type->i_mutex_key#15){+.+...}, at: [<ffffffffa035f2ae>] btrfs_sync_file+0x14e/0x425 [btrfs]
      [23045.436044]  #1:  (sb_internal#2){.+.+.+}, at: [<ffffffff8118446b>] __sb_start_write+0x5f/0xb0
      [23045.436044]  #2:  (&ei->log_mutex){+.+...}, at: [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]
                     stack backtrace:
      [23045.436044] CPU: 4 PID: 3688 Comm: xfs_io Not tainted 4.7.0-rc6-btrfs-next-34+ #1
      [23045.436044] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
      [23045.436044]  0000000000000000 ffff88022f5f7860 ffffffff8127074d ffffffff82a54b70
      [23045.436044]  ffffffff82a54b70 ffff88022f5f7920 ffffffff81092897 ffff880228015d68
      [23045.436044]  0000000000000000 ffffffff82a54b70 ffffffff829c3f00 ffff880228015d68
      [23045.436044] Call Trace:
      [23045.436044]  [<ffffffff8127074d>] dump_stack+0x67/0x90
      [23045.436044]  [<ffffffff81092897>] __lock_acquire+0xcbb/0xe4e
      [23045.436044]  [<ffffffff8109155f>] ? mark_lock+0x24/0x201
      [23045.436044]  [<ffffffff8109179a>] ? mark_held_locks+0x5e/0x74
      [23045.436044]  [<ffffffff81092de0>] lock_acquire+0x12f/0x1c3
      [23045.436044]  [<ffffffff81092de0>] ? lock_acquire+0x12f/0x1c3
      [23045.436044]  [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]  [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]  [<ffffffff814a51a4>] mutex_lock_nested+0x77/0x3a7
      [23045.436044]  [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]  [<ffffffffa039705e>] ? btrfs_release_delayed_node+0xb/0xd [btrfs]
      [23045.436044]  [<ffffffffa038552d>] btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]  [<ffffffffa038552d>] ? btrfs_log_inode+0x13a/0xc95 [btrfs]
      [23045.436044]  [<ffffffff810a0ed1>] ? vprintk_emit+0x453/0x465
      [23045.436044]  [<ffffffffa0385a61>] btrfs_log_inode+0x66e/0xc95 [btrfs]
      [23045.436044]  [<ffffffffa03c084d>] log_new_dir_dentries+0x26c/0x359 [btrfs]
      [23045.436044]  [<ffffffffa03865aa>] btrfs_log_inode_parent+0x4a6/0x628 [btrfs]
      [23045.436044]  [<ffffffffa0387552>] btrfs_log_dentry_safe+0x5a/0x75 [btrfs]
      [23045.436044]  [<ffffffffa035f464>] btrfs_sync_file+0x304/0x425 [btrfs]
      [23045.436044]  [<ffffffff811acaf4>] vfs_fsync_range+0x8c/0x9e
      [23045.436044]  [<ffffffff811acb22>] vfs_fsync+0x1c/0x1e
      [23045.436044]  [<ffffffff811acc79>] do_fsync+0x31/0x4a
      [23045.436044]  [<ffffffff811ace99>] SyS_fsync+0x10/0x14
      [23045.436044]  [<ffffffff814a88e5>] entry_SYSCALL_64_fastpath+0x18/0xa8
      [23045.436044]  [<ffffffff8108f039>] ? trace_hardirqs_off_caller+0x3f/0xaa
      
      An example reproducer for this is:
      
         $ mkfs.btrfs -f /dev/sdb
         $ mount /dev/sdb /mnt
         $ mkdir /mnt/dir
         $ touch /mnt/dir/foo
         $ sync
         $ mv /mnt/dir/foo /mnt/dir/bar
         $ touch /mnt/dir/foo
         $ xfs_io -c "fsync" /mnt/dir/bar
      
      This is because while logging the inode of file bar we end up logging its
      parent directory (since its inode has an unlink_trans field matching the
      current transaction id due to the rename operation), which in turn logs
      the inodes for all its new dentries, so that the new inode for the new
      file named foo gets logged which in turn triggered another logging attempt
      for the inode we are fsync'ing, since that inode had an old name that
      corresponds to the name of the new inode.
      
      So fix this by ensuring that when logging the inode for a new dentry that
      has a name matching an old name of some other inode, we don't log again
      the original inode that we are fsync'ing.
      
      Fixes: 44f714da ("Btrfs: improve performance on fsync against new inode after rename/unlink")
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      28a23593
    • Q
      btrfs: qgroup: Fix qgroup incorrectness caused by log replay · df2c95f3
      Qu Wenruo 提交于
      When doing log replay at mount time(after power loss), qgroup will leak
      numbers of replayed data extents.
      
      The cause is almost the same of balance.
      So fix it by manually informing qgroup for owner changed extents.
      
      The bug can be detected by btrfs/119 test case.
      
      Cc: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      df2c95f3
  12. 01 8月, 2016 1 次提交
    • F
      Btrfs: improve performance on fsync against new inode after rename/unlink · 44f714da
      Filipe Manana 提交于
      With commit 56f23fdb ("Btrfs: fix file/data loss caused by fsync after
      rename and new inode") we got simple fix for a functional issue when the
      following sequence of actions is done:
      
        at transaction N
        create file A at directory D
        at transaction N + M (where M >= 1)
        move/rename existing file A from directory D to directory E
        create a new file named A at directory D
        fsync the new file
        power fail
      
      The solution was to simply detect such scenario and fallback to a full
      transaction commit when we detect it. However this turned out to had a
      significant impact on throughput (and a bit on latency too) for benchmarks
      using the dbench tool, which simulates real workloads from smbd (Samba)
      servers. For example on a test vm (with a debug kernel):
      
      Unpatched:
      Throughput 19.1572 MB/sec  32 clients  32 procs  max_latency=1005.229 ms
      
      Patched:
      Throughput 23.7015 MB/sec  32 clients  32 procs  max_latency=809.206 ms
      
      The patched results (this patch is applied) are similar to the results of
      a kernel with the commit 56f23fdb ("Btrfs: fix file/data loss caused
      by fsync after rename and new inode") reverted.
      
      This change avoids the fallback to a transaction commit and instead makes
      sure all the names of the conflicting inode (the one that had a name in a
      past transaction that matches the name of the new file in the same parent
      directory) are logged so that at log replay time we don't lose neither the
      new file nor the old file, and the old file gets the name it was renamed
      to.
      
      This also ends up avoiding a full transaction commit for a similar case
      that involves an unlink instead of a rename of the old file:
      
        at transaction N
        create file A at directory D
        at transaction N + M (where M >= 1)
        remove file A
        create a new file named A at directory D
        fsync the new file
        power fail
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      44f714da
  13. 26 7月, 2016 3 次提交