1. 18 Apr, 2017 13 commits
  2. 28 Feb, 2017 3 commits
  3. 17 Feb, 2017 2 commits
  4. 06 Dec, 2016 5 commits
  5. 29 Nov, 2016 1 commit
  6. 01 Nov, 2016 1 commit
  7. 27 Sep, 2016 1 commit
  8. 26 Jul, 2016 2 commits
  9. 08 Jun, 2016 2 commits
  10. 30 May, 2016 3 commits
    • Btrfs: fix race setting block group back to RW mode during device replace · 1a1a8b73
      Committed by Filipe Manana
      After it finishes processing a device extent, the device replace code sets
      the block group back to RW mode and only then sets the left cursor
      to match the logical end address of the block group, so that future writes
      into extents belonging to the block group go to both the source (old) and
      target (new) devices. However, from the moment we turn the block group
      back to RW mode there is a short time window, lasting until we update
      the left cursor's value, in which extents can be allocated from the block
      group and written to; such extents will not be copied/written to
      the target (new) device. Fix this by updating the left cursor's value
      before turning the block group back to RW mode.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
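The ordering described in this fix can be sketched as a small user-space C model. This is only an illustration under stated assumptions, not the kernel patch: `struct dev_replace_model`, `struct block_group_model`, `write_is_mirrored()` and `finish_device_extent()` are hypothetical names, a pthread mutex stands in for the device replace lock, and the rule "writes below the left cursor go to both devices" is a simplification of what the commit message describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the device replace object. */
struct dev_replace_model {
    pthread_mutex_t lock;   /* stands in for the device replace lock */
    uint64_t cursor_left;   /* writes below this address are mirrored */
};

/* Hypothetical, simplified stand-in for a block group. */
struct block_group_model {
    uint64_t start;         /* logical start address */
    uint64_t length;        /* size in bytes */
    bool read_only;         /* true while its device extent is copied */
};

/* Model assumption: a write reaches the target (new) device only if
 * its logical address lies below the left cursor. */
static bool write_is_mirrored(struct dev_replace_model *dr, uint64_t logical)
{
    bool mirrored;

    pthread_mutex_lock(&dr->lock);
    mirrored = logical < dr->cursor_left;
    pthread_mutex_unlock(&dr->lock);
    return mirrored;
}

/* Fixed ordering: advance the cursor first, then reopen the block group. */
static void finish_device_extent(struct dev_replace_model *dr,
                                 struct block_group_model *bg)
{
    uint64_t end = bg->start + bg->length;

    assert(bg->read_only);  /* the copy ran with the block group in RO mode */

    /* Step 1: move the left cursor to the logical end of the block group. */
    pthread_mutex_lock(&dr->lock);
    dr->cursor_left = end;
    pthread_mutex_unlock(&dr->lock);

    /* Step 2: only now allow new allocations from the block group. */
    bg->read_only = false;
}
```

With this order, any write that becomes possible after step 2 already sees the updated cursor. Swapping the two steps would reopen the window the commit message describes.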
    • Btrfs: fix unprotected assignment of the left cursor for device replace · 81e87a73
      Committed by Filipe Manana
      We were assigning new values to fields of the device replace object
      without holding the respective lock after processing each device extent.
      This is important for the left cursor field which can be accessed by a
      concurrent task running __btrfs_map_block (which, correctly, takes the
      device replace lock).
      So change these fields while holding the device replace lock.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
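A minimal sketch of the locking pattern this commit describes, assuming a user-space model in which a pthread mutex stands in for the device replace lock. All names here are hypothetical. In particular `cursor_right` is an assumed second field, included only to show why several fields should change together under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical model of the device replace object. */
struct dev_replace_model {
    pthread_mutex_t lock;   /* stands in for the device replace lock */
    uint64_t cursor_left;
    uint64_t cursor_right;  /* assumed field, for illustration only */
};

/* A consistent copy of both cursors, taken under the lock. */
struct cursor_snapshot {
    uint64_t left;
    uint64_t right;
};

/* Writer: record the range of the device extent about to be copied. */
static void begin_device_extent(struct dev_replace_model *dr,
                                uint64_t start, uint64_t len)
{
    assert(len > 0);
    pthread_mutex_lock(&dr->lock);
    dr->cursor_right = start + len;
    dr->cursor_left = start;
    pthread_mutex_unlock(&dr->lock);
}

/* Writer: after the copy, advance the left cursor while holding the lock. */
static void end_device_extent(struct dev_replace_model *dr)
{
    pthread_mutex_lock(&dr->lock);
    dr->cursor_left = dr->cursor_right;
    pthread_mutex_unlock(&dr->lock);
}

/* Reader: takes the same lock, so it never sees a half-updated pair. */
static struct cursor_snapshot read_cursors(struct dev_replace_model *dr)
{
    struct cursor_snapshot s;

    pthread_mutex_lock(&dr->lock);
    s.left = dr->cursor_left;
    s.right = dr->cursor_right;
    pthread_mutex_unlock(&dr->lock);
    return s;
}
```

A reader locking correctly gains nothing if the writer assigns the fields without the lock, which is the gap this commit closes.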
    • Btrfs: fix race setting block group readonly during device replace · f0e9b7d6
      Committed by Filipe Manana
      When we do a device replace, for each device extent we find on the
      source device, we set the corresponding block group to readonly mode to
      prevent writes into it while we copy the device extent from the source
      to the target device. However, just before we set the block group to
      readonly mode, some concurrent task might have already allocated an
      extent from it or decided it could perform a nocow write into one of
      its extents. This can cause the device replace process to miss copying
      an extent, because it uses the extent tree's commit root to search for
      extents, and it sets the left cursor to the logical end address of the
      block group only once it finishes searching for all extents belonging
      to the block group. The problem occurs if the respective ordered extents
      finish while we are searching for extents using the extent tree's
      commit root and no transaction commit happens while we are iterating
      the tree, since it is the delayed references created by the ordered
      extents (when they complete) that insert the extent items into the
      extent tree (using the non-commit root, of course).
      Example:
      
                CPU 1                                            CPU 2
      
       btrfs_dev_replace_start()
         btrfs_scrub_dev()
           scrub_enumerate_chunks()
             --> finds device extent belonging
                 to block group X
      
                                     <transaction N starts>
      
                                                            starts buffered write
                                                            against some inode
      
                                                            writepages is run against
                                                            that inode forcing delalloc
                                                            to run
      
                                                            btrfs_writepages()
                                                              extent_writepages()
                                                                extent_write_cache_pages()
                                                                  __extent_writepage()
                                                                    writepage_delalloc()
                                                                      run_delalloc_range()
                                                                        cow_file_range()
                                                                          btrfs_reserve_extent()
                                                                            --> allocates an extent
                                                                                from block group X
                                                                                (which is not yet
                                                                                 in RO mode)
                                                                          btrfs_add_ordered_extent()
                                                                            --> creates ordered extent Y
                                                              flush_epd_write_bio()
                                                                --> bio against the extent from
                                                                    block group X is submitted
      
             btrfs_inc_block_group_ro(bg X)
               --> sets block group X to readonly
      
             scrub_chunk(bg X)
               scrub_stripe(device extent from srcdev)
                 --> keeps searching for extent items
                     belonging to the block group using
                     the extent tree's commit root
                 --> it never blocks due to
                     fs_info->scrub_pause_req as no
                     one tries to commit transaction N
                 --> copies all extents found from the
                     source device into the target device
                 --> finishes search loop
      
                                                              bio completes
      
                                                              ordered extent Y completes
                                                              and creates delayed data
                                                              reference which will add an
                                                              extent item to the extent
                                                              tree when run (typically
                                                              at transaction commit time)
      
                                                                --> so the task doing the
                                                                    scrub/device replace
                                                                    at CPU 1 misses this
                                                                    and does not copy this
                                                                    extent into the new/target
                                                                    device
      
             btrfs_dec_block_group_ro(bg X)
               --> turns block group X back to RW mode
      
             dev_replace->cursor_left is set to the
             logical end offset of block group X
      
      So fix this by waiting for all pending cow and nocow writes to complete
      after setting a block group to readonly mode.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
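The race in the timeline above, and the effect of waiting, can be sketched with a single-threaded C model. This is a simplification under stated assumptions rather than kernel code. All names are hypothetical. "Waiting" is modelled by letting the in-flight writes run to completion, and a completed write is assumed to become visible to the copy loop immediately, whereas in the kernel that also involves running delayed references and a transaction commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a block group during replace. */
struct block_group_model {
    bool read_only;
    int pending_writes;     /* writes started before RO, not yet complete */
    int visible_extents;    /* extents the copy loop is able to find */
};

/* A write may only start while the block group is still in RW mode. */
static bool start_write(struct block_group_model *bg)
{
    if (bg->read_only)
        return false;
    bg->pending_writes++;
    return true;
}

/* Model assumption: completion makes the extent visible to the search. */
static void complete_write(struct block_group_model *bg)
{
    assert(bg->pending_writes > 0);
    bg->pending_writes--;
    bg->visible_extents++;
}

/* Copies one block group and returns how many extents were copied.
 * With wait_for_writes set, writes that raced with the switch to RO
 * are drained before the search loop starts, as the fix proposes. */
static int copy_block_group(struct block_group_model *bg, bool wait_for_writes)
{
    int copied;

    bg->read_only = true;

    if (wait_for_writes) {
        while (bg->pending_writes > 0)
            complete_write(bg);
    }

    copied = bg->visible_extents;   /* the search loop runs here */

    /* Without the wait, racing writes finish only after the search. */
    while (bg->pending_writes > 0)
        complete_write(bg);

    bg->read_only = false;
    return copied;
}
```

Starting from two visible extents and one in-flight write, the model copies three extents when it waits and only two when it does not, which corresponds to the extent missed at CPU 1 in the timeline.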
  11. 26 May, 2016 2 commits
  12. 16 May, 2016 1 commit
  13. 06 May, 2016 3 commits
  14. 29 Apr, 2016 1 commit