1. 20 Oct 2016 (6 commits)
  2. 10 Oct 2016 (4 commits)
  3. 06 Oct 2016 (9 commits)
    • xfs: garbage collect old cowextsz reservations · 83104d44
      Committed by Darrick J. Wong
      Trim CoW reservations made on behalf of a cowextsz hint if they get too
      old or we run low on quota, so long as we don't have dirty data awaiting
      writeback or directio operations in progress.
      
      Garbage collection of the cowextsize extents is kept separate from
      prealloc extent reaping because setting the CoW prealloc lifetime to a
      (much) higher value than the regular prealloc extent lifetime has been
      useful for combatting CoW fragmentation on VM hosts where the VMs
      experience bursty write behaviors and we can keep the utilization ratios
      low enough that we don't start to run out of space.  IOWs, it benefits
      us to keep the CoW fork reservations around for as long as we can unless
      we run out of blocks or hit inode reclaim.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: don't allow reflink when the AG is low on space · 6fa164b8
      Committed by Darrick J. Wong
      If the AG free space is down to the reserves, refuse to reflink our
      way out of space.  Hopefully userspace will make a real copy and/or go
      elsewhere.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: create a separate cow extent size hint for the allocator · f7ca3522
      Committed by Darrick J. Wong
      Create a per-inode extent size allocator hint for copy-on-write.  This
      hint is separate from the existing extent size hint so that CoW can
      take advantage of the fragmentation-reducing properties of extent size
      hints without disabling delalloc for regular writes.
      
      The extent size hint that's fed to the allocator during a copy on
      write operation is the greater of the cowextsize and regular extsize
      hint.
      
      During reflink, if we're sharing the entire source file to the entire
      destination file and the destination file doesn't already have a
      cowextsize hint, propagate the source file's cowextsize hint to the
      destination file.
      
      Furthermore, zero the bulkstat buffer prior to setting the fields
      so that we don't copy kernel memory contents into userspace.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
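The hint rules in this commit message can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: `struct toy_hints`, `toy_cow_alloc_hint` and `toy_propagate_hint` are hypothetical names, not the kernel's, and a value of 0 is assumed to mean "no hint set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-inode hint fields. */
struct toy_hints {
    uint32_t extsize;    /* regular extent size hint, in blocks (0 = unset) */
    uint32_t cowextsize; /* CoW extent size hint, in blocks (0 = unset) */
};

/* Hint fed to the allocator during CoW: the greater of the two hints. */
static uint32_t toy_cow_alloc_hint(const struct toy_hints *ip)
{
    assert(ip != NULL);
    return ip->cowextsize > ip->extsize ? ip->cowextsize : ip->extsize;
}

/*
 * On reflink, copy the source's cowextsize hint only when the whole
 * source file is shared to the whole destination file and the
 * destination has no cowextsize hint of its own.
 */
static void toy_propagate_hint(const struct toy_hints *src,
                               struct toy_hints *dst, bool whole_file)
{
    assert(src != NULL && dst != NULL);
    if (whole_file && dst->cowextsize == 0)
        dst->cowextsize = src->cowextsize;
}
```

Keeping the two hints in separate fields is what lets regular writes keep using delalloc while CoW still gets larger, less fragmented allocations.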
    • xfs: unshare a range of blocks via fallocate · 98cc2db5
      Committed by Darrick J. Wong
      Unshare all shared extents if the user calls fallocate with the new
      unshare mode flag set, so that we can guarantee that a subsequent
      write will not ENOSPC.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      [hch: pass inode instead of file to xfs_reflink_dirty_range,
            use iomap infrastructure for copy up]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
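From userspace, the unshare mode is requested through `fallocate(2)`. The sketch below assumes a Linux system whose headers provide `FALLOC_FL_UNSHARE_RANGE` (a fallback definition with the value from `linux/falloc.h` is included for older headers). The wrapper name `unshare_range` is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the fallocate() declaration in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

#ifndef FALLOC_FL_UNSHARE_RANGE
#define FALLOC_FL_UNSHARE_RANGE 0x40 /* value from linux/falloc.h */
#endif

/*
 * Ask the filesystem to unshare every shared extent in
 * [offset, offset + len) so that later writes to that range do not
 * need a CoW allocation. Returns 0 on success or a negative errno.
 */
static int unshare_range(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);
    if (fallocate(fd, FALLOC_FL_UNSHARE_RANGE, offset, len) == 0)
        return 0;
    return -errno;
}
```

A caller would typically open the file for writing and call `unshare_range(fd, 0, file_size)`. Filesystems that do not implement this mode are expected to fail the call, commonly with `EOPNOTSUPP`.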
    • xfs: add dedupe range vfs function · cc714660
      Committed by Darrick J. Wong
      Define a VFS function which allows userspace to request that the
      kernel reflink a range of blocks between two files if the ranges'
      contents match.  The function fits the new VFS ioctl that standardizes
      the checking for the btrfs EXTENT SAME ioctl.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: reflink extents from one file to another · 862bb360
      Committed by Darrick J. Wong
      Reflink extents from one file to another; that is to say, iteratively
      remove the mappings from the destination file, copy the mappings from
      the source file to the destination file, and increment the reference
      count of all the blocks that got remapped.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
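The remap loop described in this commit message can be modelled with a toy, block-granular filesystem. This is a sketch only: real XFS works on extents and btrees inside transactions, and every name here (`toy_reflink`, `TOY_HOLE`, the flat `refcount` array) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBLOCKS 16
#define TOY_HOLE    (-1) /* file block with no mapping */

/*
 * Remap the first nr file blocks of dst onto the physical blocks that
 * back src. For each file block:
 *   1. remove the destination's old mapping (dropping its reference),
 *   2. copy the source mapping into the destination,
 *   3. increment the reference count of the now-shared block.
 */
static void toy_reflink(int refcount[TOY_NBLOCKS], const int *src_map,
                        int *dst_map, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        if (dst_map[i] != TOY_HOLE) {
            assert(refcount[dst_map[i]] > 0);
            refcount[dst_map[i]]--;
        }
        dst_map[i] = src_map[i];
        if (src_map[i] != TOY_HOLE)
            refcount[src_map[i]]++;
    }
}
```

After the call both files point at the same physical blocks, and a block whose count dropped to zero is free to be reused.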
    • xfs: store in-progress CoW allocations in the refcount btree · 174edb0e
      Committed by Darrick J. Wong
      Due to the way the CoW algorithm in XFS works, there's an interval
      during which blocks allocated to handle a CoW can be lost -- if the FS
      goes down after the blocks are allocated but before the block
      remapping takes place.  This is exacerbated by the cowextsz hint --
      allocated reservations can sit around for a while, waiting to get
      used.
      
      Since the refcount btree doesn't normally store records with refcount
      of 1, we can use it to record these in-progress extents.  In-progress
      blocks cannot be shared because they're not user-visible, so there
      shouldn't be any conflicts with other programs.  This is a better
      solution than holding EFIs during writeback because (a) EFIs can't be
      relogged currently, (b) even if they could, EFIs are bound by
      available log space, which puts an unnecessary upper bound on how much
      CoW we can have in flight, and (c) we already have a mechanism to
      track blocks.
      
      At mount time, read the refcount records and free anything we find
      with a refcount of 1 because those were in-progress when the FS went
      down.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
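The mount-time cleanup can be illustrated with a flat array standing in for the refcount btree. This is a simplified sketch, not the kernel code: `struct toy_refc_rec` and `toy_recover_cow` are hypothetical, and "freeing" is modelled as counting the blocks and dropping the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified refcount record. */
struct toy_refc_rec {
    uint32_t startblock;
    uint32_t blockcount;
    uint32_t refcount;
};

/*
 * Walk the records at mount time. A refcount of 1 marks an in-progress
 * CoW allocation left over from before the filesystem went down, so
 * those blocks are freed and the record is removed. Records with a
 * refcount of 2 or more describe genuinely shared extents and are kept.
 * Returns the number of blocks freed; *nrecs is updated in place.
 */
static uint64_t toy_recover_cow(struct toy_refc_rec *recs, size_t *nrecs)
{
    uint64_t freed = 0;
    size_t keep = 0;

    assert(recs != NULL && nrecs != NULL);
    for (size_t i = 0; i < *nrecs; i++) {
        if (recs[i].refcount == 1) {
            freed += recs[i].blockcount;
            continue;
        }
        recs[keep++] = recs[i];
    }
    *nrecs = keep;
    return freed;
}
```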
    • xfs: implement CoW for directio writes · 0613f16c
      Committed by Darrick J. Wong
      For O_DIRECT writes to shared blocks, we have to CoW them just like
      we would with buffered writes.  For writes that are not block-aligned,
      just bounce them to the page cache.
      
      For block-aligned writes, however, we can do better than that.  Use
      the same mechanisms that we employ for buffered CoW to set up a
      delalloc reservation, allocate all the blocks at once, issue the
      writes against the new blocks and use the same ioend functions to
      remap the blocks after the write.  This should be fairly performant.
      
      Christoph discovered that xfs_reflink_allocate_cow_range may stumble
      over invalid entries in the extent array given that it drops the ilock
      but still expects the index to be stable.  Simply switching to a new
      lookup for every iteration still isn't correct, given that
      xfs_bmapi_allocate will trigger a BUG_ON() if it hits a hole, and
      there is nothing preventing an xfs_bunmapi_cow call from removing
      extents once we have dropped the ilock either.
      
      This patch duplicates the inner loop of xfs_bmapi_allocate into a
      helper for xfs_reflink_allocate_cow_range so that it can be done under
      the same ilock critical section as our CoW fork delayed allocation.
      The directio CoW warts will be revisited in a later patch.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
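The choice between the two direct I/O paths comes down to an alignment test. A minimal sketch, assuming a power-of-two filesystem block size; the enum and function names are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

enum toy_dio_path {
    TOY_DIO_COW_DIRECT,         /* allocate new blocks, write, then remap */
    TOY_DIO_BOUNCE_TO_PAGECACHE /* fall back to the buffered CoW path */
};

/*
 * Pick the path for an O_DIRECT write to shared blocks. Only writes
 * whose start offset and length are both multiples of the block size
 * can be issued straight to newly allocated blocks.
 */
static enum toy_dio_path toy_dio_cow_path(uint64_t pos, uint64_t len,
                                          uint32_t blocksize)
{
    /* blocksize must be a nonzero power of two for the mask trick */
    assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
    if (((pos | len) & (uint64_t)(blocksize - 1)) == 0)
        return TOY_DIO_COW_DIRECT;
    return TOY_DIO_BOUNCE_TO_PAGECACHE;
}
```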
    • xfs: move mappings from cow fork to data fork after copy-write · 43caeb18
      Committed by Darrick J. Wong
      After the write component of a copy-write operation finishes, clean up
      the bookkeeping left behind.  On error, we simply free the new blocks
      and pass the error up.  If we succeed, however, then we must remove
      the old data fork mapping and move the cow fork mapping to the data
      fork.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      [hch: Call the CoW failure function during xfs_cancel_ioend]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
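The completion step can be sketched with a toy inode that holds one physical block number per file block in each fork. All names are hypothetical, and the model ignores transactions, extents and locking. It only shows the success and error branches the message describes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HOLE (-1) /* no mapping at this file offset */

/* Hypothetical in-core inode with a data fork and a CoW fork. */
struct toy_inode {
    int data_fork[8];
    int cow_fork[8];
};

/*
 * Finish a CoW write at file block 'off'. 'error' is the result of the
 * write I/O (0 on success, negative errno on failure).
 *   - On error: free the newly allocated block and pass the error up.
 *   - On success: drop the old data fork mapping and move the CoW fork
 *     mapping into the data fork.
 */
static int toy_end_cow(struct toy_inode *ip, int *refcount, size_t off,
                       int error)
{
    int new_block = ip->cow_fork[off];

    assert(new_block != TOY_HOLE);
    ip->cow_fork[off] = TOY_HOLE;

    if (error) {
        refcount[new_block]--; /* free the new block */
        return error;
    }
    if (ip->data_fork[off] != TOY_HOLE)
        refcount[ip->data_fork[off]]--; /* old block loses one owner */
    ip->data_fork[off] = new_block;
    return 0;
}
```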
  4. 05 Oct 2016 (3 commits)
    • xfs: allocate delayed extents in CoW fork · ef473667
      Committed by Darrick J. Wong
      Modify the writepage handler to find and convert pending delalloc
      extents to real allocations.  Furthermore, when we're doing non-cow
      writes to a part of a file that already has a CoW reservation (the
      cowextsz hint that we set up in a subsequent patch facilitates this),
      promote the write to copy-on-write so that the entire extent can get
      written out as a single extent on disk, thereby reducing post-CoW
      fragmentation.
      
      Christoph moved the CoW support code in _map_blocks to a separate helper
      function, refactored other functions, and reduced the number of CoW fork
      lookups, so I merged those changes here to reduce churn.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • xfs: create delalloc extents in CoW fork · 2a06705c
      Committed by Darrick J. Wong
      Wire up iomap_begin to detect shared extents and create delayed allocation
      extents in the CoW fork:
      
       1) Check if we already have an extent in the COW fork for the area.
          If so, there is nothing to do and we can move along.
       2) Look up the block number for the current extent, and if there is
          none, it's not shared; move along.
       3) Unshare the current extent as far as we are going to write into it.
          For this we avoid an additional COW fork lookup and use the
          information we set aside in step 1) above.
       4) Go to 1) unless we've covered the whole range.
      
      Last but not least, this updates the xfs_reflink_reserve_cow_range calling
      convention to pass a byte offset and length, as that is what both callers
      expect anyway.  This patch has been refactored considerably as part of the
      iomap transition.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
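The four numbered steps in this commit message can be sketched as a loop over a toy, block-granular model. The real code walks extents and takes a byte offset and length. Here each file block is handled on its own, and all names (`toy_reserve_cow_range`, `TOY_DELALLOC`, the flat arrays) are hypothetical. Skipping blocks whose refcount is below 2 is this sketch's reading of "not shared".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HOLE     (-1) /* no mapping */
#define TOY_DELALLOC (-2) /* delayed allocation reservation, no block yet */

/*
 * Reserve CoW fork space for the file blocks [start, start + count).
 * Returns how many new delalloc reservations were created.
 */
static size_t toy_reserve_cow_range(const int *data_fork, int *cow_fork,
                                    const int *refcount, size_t start,
                                    size_t count)
{
    size_t reserved = 0;

    assert(data_fork != NULL && cow_fork != NULL && refcount != NULL);
    for (size_t off = start; off < start + count; off++) {
        /* 1) Already covered by the CoW fork: nothing to do. */
        if (cow_fork[off] != TOY_HOLE)
            continue;
        /* 2) No block, or the block is not shared: move along. */
        if (data_fork[off] == TOY_HOLE || refcount[data_fork[off]] < 2)
            continue;
        /* 3) Shared: set up a delalloc reservation in the CoW fork. */
        cow_fork[off] = TOY_DELALLOC;
        reserved++;
        /* 4) Loop until the whole range has been covered. */
    }
    return reserved;
}
```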
    • xfs: introduce the CoW fork · 3993baeb
      Committed by Darrick J. Wong
      Introduce a new in-core fork for storing copy-on-write delalloc
      reservations and allocated extents that are in the process of being
      written out.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>