1. 29 Jan, 2018 3 commits
  2. 22 Dec, 2017 1 commit
  3. 15 Dec, 2017 4 commits
  4. 09 Dec, 2017 1 commit
  5. 07 Nov, 2017 3 commits
  6. 27 Oct, 2017 1 commit
  7. 04 Oct, 2017 1 commit
  8. 02 Sep, 2017 1 commit
  9. 21 Jul, 2017 2 commits
  10. 20 Jun, 2017 2 commits
  11. 04 May, 2017 1 commit
    • xfs: reserve enough blocks to handle btree splits when remapping · fe0be23e
      Darrick J. Wong committed
      In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
      handle adding 1 extent.  This is problematic if we fragment free space,
      have to do CoW, and then have to perform multiple bmap btree expansions.
      Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
      handle btree splits, so log recovery fails after our first error causes
      the filesystem to go down.
      
      Therefore, refactor the transaction block reservation macros until we
      have a macro that works for our deferred (re)mapping activities, and fix
      both problems by using that macro.
      
      With 1k blocks we can hit this fairly often in g/187 if the scratch fs
      is big enough.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
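The reservation problem described in this commit can be illustrated with a small worst-case estimate. This is a minimal sketch under stated assumptions, not the kernel's macros: the function names, the `extents_per_leaf` parameter, and the model itself (one possible block allocation per btree level below the root for each leaf split) are hypothetical simplifications.

```c
#include <assert.h>

/*
 * Hypothetical model: adding one extent may split a block at every level
 * of the bmap btree below the root, so reserve one block per such level.
 */
static unsigned int extent_add_blocks(unsigned int btree_maxlevels)
{
    assert(btree_maxlevels >= 1);
    return btree_maxlevels - 1;
}

/*
 * Hypothetical model: adding nr_extents records may trigger one chain of
 * splits per extents_per_leaf records, so scale the single-extent
 * reservation by the number of leaves that could fill up.
 */
static unsigned int n_extent_add_blocks(unsigned int nr_extents,
                                        unsigned int extents_per_leaf,
                                        unsigned int btree_maxlevels)
{
    unsigned int splits;

    assert(extents_per_leaf > 0);
    /* Round up: a partially filled leaf can still split. */
    splits = (nr_extents + extents_per_leaf - 1) / extents_per_leaf;
    return splits * extent_add_blocks(btree_maxlevels);
}
```

Under this model, a reservation sized for a single extent covers only one chain of splits, which is why a remap that needs several bmap btree expansions can run out of reserved blocks.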
  12. 04 Apr, 2017 1 commit
  13. 08 Mar, 2017 1 commit
  14. 17 Feb, 2017 1 commit
  15. 10 Feb, 2017 1 commit
  16. 07 Feb, 2017 3 commits
  17. 03 Feb, 2017 1 commit
    • xfs: mark speculative prealloc CoW fork extents unwritten · 5eda4300
      Darrick J. Wong committed
      Christoph Hellwig pointed out that there's a potentially nasty race when
      performing simultaneous nearby directio cow writes:
      
      "Thread 1 writes a range from B to C
      
      "                    B --------- C
                                 p
      
      "a little later thread 2 writes from A to B
      
      "        A --------- B
                     p
      
      [editor's note: the 'p's denote cowextsize boundaries, which I added to
      make this clearer]
      
      "but the code preallocates beyond B into the range where thread
      "1 has just written, but ->end_io hasn't been called yet.
      "But once ->end_io is called thread 2 has already allocated
      "up to the extent size hint into the write range of thread 1,
      "so the end_io handler will splice the uninitialized blocks from
      "that preallocation back into the file right after B."
      
      We can avoid this race by ensuring that thread 1's end_io handler cannot
      accidentally remap the blocks that thread 2 allocated (as part of
      speculative preallocation) while preparing its own write.  The way we
      make this happen is by taking advantage of the unwritten extent flag as
      an intermediate step.
      
      Recall that when we begin the process of writing data to shared blocks,
      we create a delayed allocation extent in the CoW fork:
      
      D: --RRRRRRSSSRRRRRRRR---
      C: ------DDDDDDD---------
      
      When a thread prepares to CoW some dirty data out to disk, it will now
      convert the delalloc reservation into an /unwritten/ allocated extent in
      the cow fork.  The da conversion code tries to opportunistically
      allocate as much of a (speculatively prealloc'd) extent as possible, so
      we may end up allocating a larger extent than we're actually writing
      out:
      
      D: --RRRRRRSSSRRRRRRRR---
      U: ------UUUUUUU---------
      
      Next, we convert only the part of the extent that we're actively
      planning to write to normal (i.e. not unwritten) status:
      
      D: --RRRRRRSSSRRRRRRRR---
      U: ------UURRUUU---------
      
      If the write succeeds, the end_cow function will now scan the relevant
      range of the CoW fork for real extents and remap only the real extents
      into the data fork:
      
      D: --RRRRRRRRSRRRRRRRR---
      U: ------UU--UUU---------
      
      This ensures that we never obliterate valid data fork extents with
      unwritten blocks from the CoW fork.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
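The fork diagrams in this commit message can be replayed with a toy model that treats each fork as a string with one character per block. This is only a sketch, assuming the legend 'R' = real, 'S' = shared, 'D' = delalloc, 'U' = unwritten and '-' = hole; the function names are hypothetical and do not correspond to kernel functions.

```c
#include <assert.h>
#include <string.h>

/* Step 1: convert every delalloc ('D') block in the CoW fork to unwritten ('U'). */
static void cow_alloc_unwritten(char *cow)
{
    for (size_t i = 0; cow[i] != '\0'; i++)
        if (cow[i] == 'D')
            cow[i] = 'U';
}

/* Step 2: mark only the blocks actually being written, [start, start + len), as real. */
static void cow_convert_real(char *cow, size_t start, size_t len)
{
    assert(start + len <= strlen(cow));
    for (size_t i = start; i < start + len; i++)
        if (cow[i] == 'U')
            cow[i] = 'R';
}

/*
 * Step 3: on write completion, scan [start, start + len) of the CoW fork
 * and remap only real blocks into the data fork. Unwritten blocks stay in
 * the CoW fork, so they can never replace valid data fork extents.
 */
static void end_cow(char *data, char *cow, size_t start, size_t len)
{
    assert(strlen(data) == strlen(cow));
    assert(start + len <= strlen(cow));
    for (size_t i = start; i < start + len; i++) {
        if (cow[i] == 'R') {
            data[i] = 'R';
            cow[i] = '-';
        }
    }
}
```

Starting from the first pair of diagrams, calling `cow_alloc_unwritten(cow)`, then `cow_convert_real(cow, 8, 2)`, then `end_cow(data, cow, 6, 7)` walks through the same intermediate states as the message. Even though `end_cow` scans the whole preallocated range, only the two real blocks move into the data fork.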
  18. 23 Dec, 2016 1 commit
  19. 10 Dec, 2016 1 commit
  20. 30 Nov, 2016 1 commit
  21. 28 Nov, 2016 3 commits
    • xfs: clean up cow fork reservation and tag inodes correctly · 0260d8ff
      Brian Foster committed
      COW fork reservation is implemented via delayed allocation. The code is
      modeled after the traditional delalloc allocation code, but is slightly
      different in terms of how preallocation occurs. Rather than post-eof
      speculative preallocation, COW fork preallocation is implemented via a
      COW extent size hint that is designed to minimize fragmentation as a
      reflinked file is split over time.
      
      xfs_reflink_reserve_cow() still uses logic that is oriented towards
      dealing with post-eof speculative preallocation, however, and is stale
      or not necessarily correct. First, the EOF alignment to the COW extent
      size hint is implemented in xfs_bmapi_reserve_delalloc() (which does so
      correctly by aligning the start and end offsets) and so is not necessary
      in xfs_reflink_reserve_cow(). The backoff and retry logic on ENOSPC is
      also ineffective for the same reason, as xfs_bmapi_reserve_delalloc()
      will simply perform the same allocation request on the retry. Finally,
      since the COW extent size hint aligns the start and end offset of the
      range to allocate, the end_fsb != orig_end_fsb logic is not sufficient.
      Indeed, if a write request happens to end on an aligned offset, it is
      possible that we do not tag the inode for COW preallocation even though
      xfs_bmapi_reserve_delalloc() may have preallocated at the start offset.
      
      Kill the unnecessary, duplicate code in xfs_reflink_reserve_cow().
      Remove the inode tag logic as well since xfs_bmapi_reserve_delalloc()
      has been updated to tag the inode correctly.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
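The tagging problem described in this commit (a write that ends on an aligned offset but still gets preallocation at its start) can be shown with a short sketch. The struct and function names here are hypothetical, and the alignment helper is a simplified stand-in for what the message says xfs_bmapi_reserve_delalloc() does with the COW extent size hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open block range [start, end), in filesystem blocks. */
struct fsb_range {
    uint64_t start;
    uint64_t end;
};

/* Align the start down and the end up to a multiple of the extent size hint. */
static struct fsb_range align_to_hint(uint64_t start, uint64_t end, uint64_t hint)
{
    struct fsb_range r;

    assert(hint > 0 && start <= end);
    r.start = start - (start % hint);
    r.end = ((end + hint - 1) / hint) * hint;
    return r;
}

/* The check the message calls insufficient: it compares only the end offsets. */
static bool prealloc_by_end_check(struct fsb_range req, struct fsb_range got)
{
    return got.end != req.end;
}

/* A check that also notices preallocation in front of the requested range. */
static bool prealloc_occurred(struct fsb_range req, struct fsb_range got)
{
    return got.start != req.start || got.end != req.end;
}
```

For example, with a hint of 32 blocks, a request for blocks [5, 32) is aligned to [0, 32). The end offsets match, so the end-only check reports no preallocation even though five blocks were preallocated before the write.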
    • xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() · 974ae922
      Brian Foster committed
      Speculative preallocation is currently processed entirely by the callers
      of xfs_bmapi_reserve_delalloc(). The caller determines how much
      preallocation to include, adjusts the extent length and passes down the
      resulting request.
      
      While this works fine for post-eof speculative preallocation, it is not
      as reliable for COW fork preallocation. COW fork preallocation is
      implemented via the cowextszhint, which aligns the start offset as well
      as the length of the extent. Further, it is difficult for the caller to
      accurately identify when preallocation occurs because the returned
      extent could have been merged with neighboring extents in the fork.
      
      To simplify this situation and facilitate further COW fork preallocation
      enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
      preallocation parameter to incorporate into the allocation request. The
      preallocation blocks value is tacked onto the end of the request and
      adjusted to accommodate neighboring extents and extent size limits.
      Since xfs_bmapi_reserve_delalloc() now knows precisely how much
      preallocation was included in the allocation, it can also tag the inodes
      appropriately to support preallocation reclaim.
      
      Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
      use the preallocation mechanism. This patch should not change behavior
      outside of correctly tagging reflink inodes when start offset
      preallocation occurs (which the caller does not handle correctly).
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • xfs: always succeed when deduping zero bytes · fba3e594
      Darrick J. Wong committed
      It turns out that btrfs and xfs had differing interpretations of what
      to do when the dedupe length is zero.  Change xfs to follow btrfs'
      semantics so that the userland interface is consistent.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  22. 24 Nov, 2016 6 commits