1. 12 12月, 2016 1 次提交
  2. 09 12月, 2016 1 次提交
  3. 06 12月, 2016 20 次提交
  4. 01 12月, 2016 1 次提交
    • R
      Btrfs: fix tree search logic when replaying directory entry deletes · 2a7bf53f
      Robbie Ko 提交于
      If a log tree has a layout like the following:
      
      leaf N:
              ...
              item 240 key (282 DIR_LOG_ITEM 0) itemoff 8189 itemsize 8
                      dir log end 1275809046
      leaf N + 1:
              item 0 key (282 DIR_LOG_ITEM 3936149215) itemoff 16275 itemsize 8
                      dir log end 18446744073709551615
              ...
      
      When we pass the value 1275809046 + 1 as the parameter start_ret to the
      function tree-log.c:find_dir_range() (done by replay_dir_deletes()), we
      end up with path->slots[0] having the value 239 (points to the last item
      of leaf N, item 240). Because the dir log item in that position has an
      offset value smaller than *start_ret (1275809046 + 1) we need to move on
      to the next leaf, however the logic for that is wrong since it compares
      the current slot to the number of items in the leaf, which is smaller
      and therefore we don't lookup for the next leaf but instead we set the
      slot to point to an item that does not exist, at slot 240, and we later
      operate on that slot which has unexpected content or in the worst case
      can result in an invalid memory access (accessing beyond the last page
      of leaf N's extent buffer).
      
      So fix the logic that checks when we need to lookup at the next leaf
      by first incrementing the slot and only after to check if that slot
      is beyond the last item of the current leaf.
      Signed-off-by: NRobbie Ko <robbieko@synology.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Fixes: e02119d5 (Btrfs: Add a write ahead tree log to optimize synchronous operations)
      Cc: stable@vger.kernel.org  # 2.6.29+
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      [Modified changelog for clarity and correctness]
      2a7bf53f
  5. 30 11月, 2016 17 次提交
    • R
      Btrfs: fix deadlock caused by fsync when logging directory entries · ec125cfb
      Robbie Ko 提交于
      While logging new directory entries, at tree-log.c:log_new_dir_dentries(),
      after we call btrfs_search_forward() we get a leaf with a read lock on it,
      and without unlocking that leaf we can end up calling btrfs_iget() to get
      an inode pointer. The later (btrfs_iget()) can end up doing a read-only
      search on the same tree again, if the inode is not in memory already, which
      ends up causing a deadlock if some other task in the meanwhile started a
      write search on the tree and is attempting to write lock the same leaf
      that btrfs_search_forward() locked while holding write locks on upper
      levels of the tree blocking the read search from btrfs_iget(). In this
      scenario we get a deadlock.
      
      So fix this by releasing the search path before calling btrfs_iget() at
      tree-log.c:log_new_dir_dentries().
      
      Example trace of such deadlock:
      
      [ 4077.478852] kworker/u24:10  D ffff88107fc90640     0 14431      2 0x00000000
      [ 4077.486752] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
      [ 4077.494346]  ffff880ffa56bad0 0000000000000046 0000000000009000 ffff880ffa56bfd8
      [ 4077.502629]  ffff880ffa56bfd8 ffff881016ce21c0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4077.510915]  ffff880ebb5173b0 ffff880ffa56baf8 ffff880ebb517410 ffff881016ce21c0
      [ 4077.519202] Call Trace:
      [ 4077.528752]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
      [ 4077.536049]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4077.542574]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4077.550171]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
      [ 4077.558252]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
      [ 4077.566140]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
      [ 4077.573928]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
      [ 4077.582399]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
      [ 4077.589896]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
      [ 4077.599632]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4077.607134]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
      [ 4077.615329]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
      [ 4077.622043]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
      [ 4077.628459]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
      [ 4077.635759]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
      [ 4077.641404]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      [ 4077.648696]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
      [ 4077.654926]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      
      [ 4078.358087] kworker/u24:15  D ffff88107fcd0640     0 14436      2 0x00000000
      [ 4078.365981] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
      [ 4078.373574]  ffff880ffa57fad0 0000000000000046 0000000000009000 ffff880ffa57ffd8
      [ 4078.381864]  ffff880ffa57ffd8 ffff88103004d0a0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4078.390163]  ffff880fbeffc298 ffff880ffa57faf8 ffff880fbeffc2f8 ffff88103004d0a0
      [ 4078.398466] Call Trace:
      [ 4078.408019]  [<ffffffffa06ed5ed>] ? btrfs_tree_lock+0xdd/0x2f0 [btrfs]
      [ 4078.415322]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4078.421844]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4078.429438]  [<ffffffffa06a5073>] ? btrfs_lookup_file_extent+0x33/0x40 [btrfs]
      [ 4078.437518]  [<ffffffffa06c600b>] ? __btrfs_drop_extents+0x13b/0xdf0 [btrfs]
      [ 4078.445404]  [<ffffffffa06fc9e2>] ? add_delayed_data_ref+0xe2/0x150 [btrfs]
      [ 4078.453194]  [<ffffffffa06fd629>] ? btrfs_add_delayed_data_ref+0x149/0x1d0 [btrfs]
      [ 4078.461663]  [<ffffffffa06cf3c0>] ? __set_extent_bit+0x4c0/0x5c0 [btrfs]
      [ 4078.469161]  [<ffffffffa06b4a64>] ? insert_reserved_file_extent.constprop.75+0xa4/0x320 [btrfs]
      [ 4078.478893]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4078.486388]  [<ffffffffa06bab57>] ? btrfs_finish_ordered_io+0x2e7/0x600 [btrfs]
      [ 4078.494561]  [<ffffffff8104cbc2>] ? process_one_work+0x142/0x3d0
      [ 4078.501278]  [<ffffffff8104a507>] ? pwq_activate_delayed_work+0x27/0x40
      [ 4078.508673]  [<ffffffff8104d729>] ? worker_thread+0x109/0x3b0
      [ 4078.515098]  [<ffffffff8104d620>] ? manage_workers.isra.26+0x270/0x270
      [ 4078.522396]  [<ffffffff81052b0f>] ? kthread+0xaf/0xc0
      [ 4078.528032]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      [ 4078.535325]  [<ffffffff814a9ac8>] ? ret_from_fork+0x58/0x90
      [ 4078.541552]  [<ffffffff81052a60>] ? kthread_create_on_node+0x110/0x110
      
      [ 4079.355824] user-space-program D ffff88107fd30640     0 32020      1 0x00000000
      [ 4079.363716]  ffff880eae8eba10 0000000000000086 0000000000009000 ffff880eae8ebfd8
      [ 4079.372003]  ffff880eae8ebfd8 ffff881016c162c0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4079.380294]  ffff880fbed4b4c8 ffff880eae8eba38 ffff880fbed4b528 ffff881016c162c0
      [ 4079.388586] Call Trace:
      [ 4079.398134]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
      [ 4079.405431]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4079.411955]  [<ffffffffa06876fb>] ? btrfs_lock_root_node+0x2b/0x40 [btrfs]
      [ 4079.419644]  [<ffffffffa068ce83>] ? btrfs_search_slot+0xa03/0xb10 [btrfs]
      [ 4079.427237]  [<ffffffffa06aba52>] ? btrfs_buffer_uptodate+0x52/0x70 [btrfs]
      [ 4079.435041]  [<ffffffffa0689b60>] ? generic_bin_search.constprop.38+0x80/0x190 [btrfs]
      [ 4079.443897]  [<ffffffffa068ea44>] ? btrfs_insert_empty_items+0x74/0xd0 [btrfs]
      [ 4079.451975]  [<ffffffffa072c443>] ? copy_items+0x128/0x850 [btrfs]
      [ 4079.458890]  [<ffffffffa072da10>] ? btrfs_log_inode+0x629/0xbf3 [btrfs]
      [ 4079.466292]  [<ffffffffa06f34a1>] ? btrfs_log_inode_parent+0xc61/0xf30 [btrfs]
      [ 4079.474373]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
      [ 4079.482161]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
      [ 4079.489558]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
      [ 4079.495300]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
      [ 4079.501422]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      
      [ 4079.508334] user-space-program D ffff88107fc30640     0 32021      1 0x00000004
      [ 4079.516226]  ffff880eae8efbf8 0000000000000086 0000000000009000 ffff880eae8effd8
      [ 4079.524513]  ffff880eae8effd8 ffff881030279610 ffffffffa06ecb26 ffff88101a5d6138
      [ 4079.532802]  ffff880ebb671d88 ffff880eae8efc20 ffff880ebb671de8 ffff881030279610
      [ 4079.541092] Call Trace:
      [ 4079.550642]  [<ffffffffa06ed595>] ? btrfs_tree_lock+0x85/0x2f0 [btrfs]
      [ 4079.557941]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4079.564463]  [<ffffffffa068cc1f>] ? btrfs_search_slot+0x79f/0xb10 [btrfs]
      [ 4079.572058]  [<ffffffffa06bb7d8>] ? btrfs_truncate_inode_items+0x168/0xb90 [btrfs]
      [ 4079.580526]  [<ffffffffa06b04be>] ? join_transaction.isra.15+0x1e/0x3a0 [btrfs]
      [ 4079.588701]  [<ffffffffa06b206d>] ? start_transaction+0x8d/0x470 [btrfs]
      [ 4079.596196]  [<ffffffffa0690ac6>] ? block_rsv_add_bytes+0x16/0x50 [btrfs]
      [ 4079.603789]  [<ffffffffa06bc2e9>] ? btrfs_truncate+0xe9/0x2e0 [btrfs]
      [ 4079.610994]  [<ffffffffa06bd00b>] ? btrfs_setattr+0x30b/0x410 [btrfs]
      [ 4079.618197]  [<ffffffff81117c1c>] ? notify_change+0x1dc/0x680
      [ 4079.624625]  [<ffffffff8123c8a4>] ? aa_path_perm+0xd4/0x160
      [ 4079.630854]  [<ffffffff810f4fcb>] ? do_truncate+0x5b/0x90
      [ 4079.636889]  [<ffffffff810f59fa>] ? do_sys_ftruncate.constprop.15+0x10a/0x160
      [ 4079.644869]  [<ffffffff8110d87b>] ? SyS_fcntl+0x5b/0x570
      [ 4079.650805]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      
      [ 4080.410607] user-space-program D ffff88107fc70640     0 32028  12639 0x00000004
      [ 4080.418489]  ffff880eaeccbbe0 0000000000000086 0000000000009000 ffff880eaeccbfd8
      [ 4080.426778]  ffff880eaeccbfd8 ffff880f317ef1e0 ffffffffa06ecb26 ffff88101a5d6138
      [ 4080.435067]  ffff880ef7e93928 ffff880f317ef1e0 ffff880eaeccbc08 ffff880f317ef1e0
      [ 4080.443353] Call Trace:
      [ 4080.452920]  [<ffffffffa06ed15d>] ? btrfs_tree_read_lock+0xdd/0x190 [btrfs]
      [ 4080.460703]  [<ffffffff81053680>] ? wake_up_atomic_t+0x30/0x30
      [ 4080.467225]  [<ffffffffa06876bb>] ? btrfs_read_lock_root_node+0x2b/0x40 [btrfs]
      [ 4080.475400]  [<ffffffffa068cc81>] ? btrfs_search_slot+0x801/0xb10 [btrfs]
      [ 4080.482994]  [<ffffffffa06b2df0>] ? btrfs_clean_one_deleted_snapshot+0xe0/0xe0 [btrfs]
      [ 4080.491857]  [<ffffffffa06a70a6>] ? btrfs_lookup_inode+0x26/0x90 [btrfs]
      [ 4080.499353]  [<ffffffff810ec42f>] ? kmem_cache_alloc+0xaf/0xc0
      [ 4080.505879]  [<ffffffffa06bd905>] ? btrfs_iget+0xd5/0x5d0 [btrfs]
      [ 4080.512696]  [<ffffffffa06caf04>] ? btrfs_get_token_64+0x104/0x120 [btrfs]
      [ 4080.520387]  [<ffffffffa06f341f>] ? btrfs_log_inode_parent+0xbdf/0xf30 [btrfs]
      [ 4080.528469]  [<ffffffffa06f45a9>] ? btrfs_log_dentry_safe+0x59/0x80 [btrfs]
      [ 4080.536258]  [<ffffffffa06c298d>] ? btrfs_sync_file+0x20d/0x330 [btrfs]
      [ 4080.543657]  [<ffffffff8112777c>] ? do_fsync+0x4c/0x80
      [ 4080.549399]  [<ffffffff81127a0a>] ? SyS_fdatasync+0xa/0x10
      [ 4080.555534]  [<ffffffff814a9b72>] ? system_call_fastpath+0x16/0x1b
      Signed-off-by: NRobbie Ko <robbieko@synology.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Fixes: 2f2ff0ee (Btrfs: fix metadata inconsistencies after directory fsync)
      Cc: stable@vger.kernel.org # 4.1+
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      [Modified changelog for clarity and correctness]
      ec125cfb
    • R
      Btrfs: fix enospc in hole punching · 2cdaf447
      Robbie Ko 提交于
      The hole punching can result in adding new leafs (and as a consequence
      new nodes) to the tree because when we find file extent items that span
      beyond the hole range we may end up not deleting them (just adjusting
      them, reducing their range by reducing their length or increasing their
      offset field) and add new file extent items representing holes.
      
      So after splitting a leaf (therefore creating a new one) to insert a new
      file extent item representing a hole, a new node might be added to each
      level of the tree in the worst case scenario (since there's a new key
      and every parent node was full).
      
      For example if a file has an extent item representing the range 0 to 64Mb
      and we punch a hole in the range 1Mb to 20Mb, the existing extent item is
      duplicated and one of the copies is adjusted to represent the range 0 to
      1Mb, the other copy adjusted to represent the range 20Mb to 64Mb, and a
      new file extent item representing a hole in the range 1Mb to 20Mb is
      inserted.
      
      Fix this by using btrfs_calc_trans_metadata_size() instead of
      btrfs_calc_trunc_metadata_size(), so that enough metadata space is
      reserved for the worst possible case.
      Signed-off-by: NRobbie Ko <robbieko@synology.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      [Modified changelog for clarity and correctness]
      2cdaf447
    • W
      btrfs: improve delayed refs iterations · 1d57ee94
      Wang Xiaoguang 提交于
      This issue was found when I tried to delete a heavily reflinked file,
      when deleting such files, other transaction operation will not have a
      chance to make progress, for example, start_transaction() will blocked
      in wait_current_trans(root) for long time, sometimes it even triggers
      soft lockups, and the time taken to delete such heavily reflinked file
      is also very large, often hundreds of seconds. Using perf top, it reports
      that:
      
      PerfTop:    7416 irqs/sec  kernel:99.8%  exact:  0.0% [4000Hz cpu-clock],  (all, 4 CPUs)
      ---------------------------------------------------------------------------------------
          84.37%  [btrfs]             [k] __btrfs_run_delayed_refs.constprop.80
          11.02%  [kernel]            [k] delay_tsc
           0.79%  [kernel]            [k] _raw_spin_unlock_irq
           0.78%  [kernel]            [k] _raw_spin_unlock_irqrestore
           0.45%  [kernel]            [k] do_raw_spin_lock
           0.18%  [kernel]            [k] __slab_alloc
      It seems __btrfs_run_delayed_refs() took most cpu time, after some debug
      work, I found it's select_delayed_ref() causing this issue, for a delayed
      head, in our case, it'll be full of BTRFS_DROP_DELAYED_REF nodes, but
      select_delayed_ref() will firstly try to iterate node list to find
      BTRFS_ADD_DELAYED_REF nodes, obviously it's a disaster in this case, and
      waste much time.
      
      To fix this issue, we introduce a new ref_add_list in struct btrfs_delayed_ref_head,
      then in select_delayed_ref(), if this list is not empty, we can directly use
      nodes in this list. With this patch, it just took about 10~15 seconds to
      delte the same file. Now using perf top, it reports that:
      
      PerfTop:    2734 irqs/sec  kernel:99.5%  exact:  0.0% [4000Hz cpu-clock],  (all, 4 CPUs)
      ----------------------------------------------------------------------------------------
      
          20.74%  [kernel]          [k] _raw_spin_unlock_irqrestore
          16.33%  [kernel]          [k] __slab_alloc
           5.41%  [kernel]          [k] lock_acquired
           4.42%  [kernel]          [k] lock_acquire
           4.05%  [kernel]          [k] lock_release
           3.37%  [kernel]          [k] _raw_spin_unlock_irq
      
      For normal files, this patch also gives help, at least we do not need to
      iterate whole list to found BTRFS_ADD_DELAYED_REF nodes.
      Signed-off-by: NWang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
      Reviewed-by: NLiu Bo <bo.li.liu@oracle.com>
      Tested-by: NHolger Hoffstätte <holger@applied-asynchrony.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1d57ee94
    • Q
      btrfs: qgroup: Fix qgroup data leaking by using subtree tracing · 824d8dff
      Qu Wenruo 提交于
      Commit 62b99540 (btrfs: relocation: Fix leaking qgroups numbers
      on data extents) only fixes the problem partly.
      
      The previous fix is to trace all new data extents at transaction commit
      time when balance finishes.
      
      However balance is not done in a large transaction, every path
      replacement can happen in its own transaction.
      This makes the fix useless if transaction commits during relocation.
      
      For example:
      relocate_block_group()
      |-merge_reloc_roots()
      |  |- merge_reloc_root()
      |     |- btrfs_start_transaction()         <- Trans X
      |     |- replace_path()                    <- Cause leak
      |     |- btrfs_end_transaction_throttle()  <- Trans X commits here
      |     |                                       Leak not fixed
      |     |
      |     |- btrfs_start_transaction()         <- Trans Y
      |     |- replace_path()                    <- Cause leak
      |     |- btrfs_end_transaction_throttle()  <- Trans Y ends
      |                                             but not committed
      |-btrfs_join_transaction()                 <- Still trans Y
      |-qgroup_fix()                             <- Only fixes data leak
      |                                             in trans Y
      |-btrfs_commit_transaction()               <- Trans Y commits
      
      In that case, qgroup fixup can only fix data leak in trans Y, data leak
      in trans X is out of fix.
      
      So the correct fix should happen in the same transaction of
      replace_path().
      
      This patch fixes it by tracing both subtrees of tree block swap, so it
      can fix the problem and ensure all leaking and fix are in the same
      transaction, so no leak again.
      Reported-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      824d8dff
    • Q
      btrfs: Export and move leaf/subtree qgroup helpers to qgroup.c · 33d1f05c
      Qu Wenruo 提交于
      Move account_shared_subtree() to qgroup.c and rename it to
      btrfs_qgroup_trace_subtree().
      
      Do the same thing for account_leaf_items() and rename it to
      btrfs_qgroup_trace_leaf_items().
      
      Since all these functions are only for qgroup, move them to qgroup.c and
      export them is more appropriate.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      33d1f05c
    • Q
      btrfs: qgroup: Rename functions to make it follow reserve,trace,account steps · 50b3e040
      Qu Wenruo 提交于
      Rename btrfs_qgroup_insert_dirty_extent(_nolock) to
      btrfs_qgroup_trace_extent(_nolock), according to the new
      reserve/trace/account naming schema.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      50b3e040
    • Q
      btrfs: qgroup: Add comments explaining how btrfs qgroup works · 1d2beaa9
      Qu Wenruo 提交于
      Add explaination how btrfs qgroups work.
      
      Qgroup is split into 3 main phrases:
      1) Reserve
         To ensure qgroup doesn't exceed its limit
      
      2) Trace
         To info qgroup to trace which extent
      
      3) Account
         Calculate qgroup number change for each traced extent.
      
      This should save quite some time for new developers.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: NGoldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1d2beaa9
    • C
      btrfs: use bio_for_each_segment_all in __btrfsic_submit_bio · 1621f8f3
      Christoph Hellwig 提交于
      And remove the bogus check for a NULL return value from kmap, which
      can't happen.  While we're at it: I don't think that kmapping up to 256
      will work without deadlocks on highmem machines, a better idea would
      be to use vm_map_ram to map all of them into a single virtual address
      range.  Incidentally that would also simplify the code a lot.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1621f8f3
    • C
      btrfs: refactor __btrfs_lookup_bio_sums to use bio_for_each_segment_all · 4989d277
      Christoph Hellwig 提交于
      Rework the loop a little bit to use the generic bio_for_each_segment_all
      helper for iterating over the bio.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      4989d277
    • C
      btrfs: calculate end of bio offset properly · 2a4d0c90
      Christoph Hellwig 提交于
      Use the bvec offset and len members to prepare for multipage bvecs.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      2a4d0c90
    • C
      btrfs: use bi_size · 81381053
      Christoph Hellwig 提交于
      Instead of using bi_vcnt to calculate it.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      81381053
    • C
      btrfs: don't access the bio directly in btrfs_csum_one_bio · 6cd7ce49
      Christoph Hellwig 提交于
      Use bio_for_each_segment_all to iterate over the segments instead.
      This requires a bit of reshuffling so that we only lookup up the ordered
      item once inside the bio_for_each_segment_all loop.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6cd7ce49
    • C
      btrfs: don't access the bio directly in the direct I/O code · 6a2de22f
      Christoph Hellwig 提交于
      Just use bio_for_each_segment_all to iterate over all segments.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6a2de22f
    • C
      btrfs: don't access the bio directly in the raid5/6 code · 80ace3e4
      Christoph Hellwig 提交于
      Just use bio_for_each_segment_all to iterate over all segments.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NOmar Sandoval <osandov@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      80ace3e4
    • C
      btrfs: use bio iterators for the decompression handlers · 974b1adc
      Christoph Hellwig 提交于
      Pass the full bio to the decompression routines and use bio iterators
      to iterate over the data in the bio.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      974b1adc
    • J
      btrfs: Ensure proper sector alignment for btrfs_free_reserved_data_space · 0c476a5d
      Jeff Mahoney 提交于
      This fixes the WARN_ON on BTRFS_I(inode)->reserved_extents in
      btrfs_destroy_inode and the WARN_ON on nonzero delalloc bytes on umount
      with qgroups enabled.
      
      I was able to reproduce this by setting up a small (~500kb) quota limit
      and writing a file one byte at a time until I hit the limit.  The warnings
      would all hit on umount.
      
      The root cause is that we would reserve a block-sized range in both
      the reservation and the quota in btrfs_check_data_free_space, but if we
      encountered a problem (like e.g. EDQUOT), we would only release the single
      byte in the qgroup reservation.  That caused an iotree state split, which
      increased the number of outstanding extents, in turn disallowing releasing
      the metadata reservation.
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Reviewed-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      0c476a5d
    • J
      Btrfs: abort transaction if fill_holes() fails · f94480bd
      Josef Bacik 提交于
      At this point we will have dropped extent entries from the file, so if we fail
      to insert the new hole entries then we are leaving the fs in a corrupt state
      (albeit an easily fixed one).  Abort the transaciton if this happens so we can
      avoid corrupting the fs.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f94480bd