1. 21 2月, 2019 2 次提交
    • C
      xfs: introduce an always_cow mode · 66ae56a5
      Christoph Hellwig 提交于
      Add a mode where XFS never overwrites existing blocks in place.  This
      is to aid debugging our COW code, and also put infatructure in place
      for things like possible future support for zoned block devices, which
      can't support overwrites.
      
      This mode is enabled globally by doing a:
      
          echo 1 > /sys/fs/xfs/debug/always_cow
      
      Note that the parameter is global to allow running all tests in xfstests
      easily in this mode, which would not easily be possible with a per-fs
      sysfs file.
      
      In always_cow mode persistent preallocations are disabled, and fallocate
      will fail when called with a 0 mode (with our without
      FALLOC_FL_KEEP_SIZE), and not create unwritten extent for zeroed space
      when called with FALLOC_FL_ZERO_RANGE or FALLOC_FL_UNSHARE_RANGE.
      
      There are a few interesting xfstests failures when run in always_cow
      mode:
      
       - generic/392 fails because the bytes used in the file used to test
         hole punch recovery are less after the log replay.  This is
         because the blocks written and then punched out are only freed
         with a delay due to the logging mechanism.
       - xfs/170 will fail as the already fragile file streams mechanism
         doesn't seem to interact well with the COW allocator
       - xfs/180 xfs/182 xfs/192 xfs/198 xfs/204 and xfs/208 will claim
         the file system is badly fragmented, but there is not much we
         can do to avoid that when always writing out of place
       - xfs/205 fails because overwriting a file in always_cow mode
         will require new space allocation and the assumption in the
         test thus don't work anymore.
       - xfs/326 fails to modify the file at all in always_cow mode after
         injecting the refcount error, leading to an unexpected md5sum
         after the remount, but that again is expected
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      66ae56a5
    • C
      xfs: also truncate holes covered by COW blocks · 12df89f2
      Christoph Hellwig 提交于
      This only matters if we want to write data through the COW fork that is
      not actually an overwrite of existing data.  Reasons for that are
      speculative COW fork allocations using the cowextsize, or a mode where
      we always write through the COW fork.  Currently both can't actually
      happen, but I plan to enable them.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      12df89f2
  2. 18 2月, 2019 5 次提交
  3. 15 2月, 2019 2 次提交
  4. 12 2月, 2019 2 次提交
    • B
      xfs: remove superfluous writeback mapping eof trimming · 3b350898
      Brian Foster 提交于
      Now that the cached writeback mapping is explicitly invalidated on
      data fork changes, the EOF trimming band-aid is no longer necessary.
      Remove xfs_trim_extent_eof() as well since it has no other users.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      3b350898
    • B
      xfs: validate writeback mapping using data fork seq counter · d9252d52
      Brian Foster 提交于
      The writeback code caches the current extent mapping across multiple
      xfs_do_writepage() calls to avoid repeated lookups for sequential
      pages backed by the same extent. This is known to be slightly racy
      with extent fork changes in certain difficult to reproduce
      scenarios. The cached extent is trimmed to within EOF to help avoid
      the most common vector for this problem via speculative
      preallocation management, but this is a band-aid that does not
      address the fundamental problem.
      
      Now that we have an xfs_ifork sequence counter mechanism used to
      facilitate COW writeback, we can use the same mechanism to validate
      consistency between the data fork and cached writeback mappings. On
      its face, this is somewhat of a big hammer approach because any
      change to the data fork invalidates any mapping currently cached by
      a writeback in progress regardless of whether the data fork change
      overlaps with the range under writeback. In practice, however, the
      impact of this approach is minimal in most cases.
      
      First, data fork changes (delayed allocations) caused by sustained
      sequential buffered writes are amortized across speculative
      preallocations. This means that a cached mapping won't be
      invalidated by each buffered write of a common file copy workload,
      but rather only on less frequent allocation events. Second, the
      extent tree is always entirely in-core so an additional lookup of a
      usable extent mostly costs a shared ilock cycle and in-memory tree
      lookup. This means that a cached mapping reval is relatively cheap
      compared to the I/O itself. Third, spurious invalidations don't
      impact ioend construction. This means that even if the same extent
      is revalidated multiple times across multiple writepage instances,
      we still construct and submit the same size ioend (and bio) if the
      blocks are physically contiguous.
      
      Update struct xfs_writepage_ctx with a new field to hold the
      sequence number of the data fork associated with the currently
      cached mapping. Check the wpc seqno against the data fork when the
      mapping is validated and reestablish the mapping whenever the fork
      has changed since the mapping was cached. This ensures that
      writeback always uses a valid extent mapping and thus prevents lost
      writebacks and stale delalloc block problems.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      d9252d52
  5. 04 2月, 2019 1 次提交
    • B
      xfs: eof trim writeback mapping as soon as it is cached · aa6ee4ab
      Brian Foster 提交于
      The cached writeback mapping is EOF trimmed to try and avoid races
      between post-eof block management and writeback that result in
      sending cached data to a stale location. The cached mapping is
      currently trimmed on the validation check, which leaves a race
      window between the time the mapping is cached and when it is trimmed
      against the current inode size.
      
      For example, if a new mapping is cached by delalloc conversion on a
      blocksize == page size fs, we could cycle various locks, perform
      memory allocations, etc.  in the writeback codepath before the
      associated mapping is eventually trimmed to i_size. This leaves
      enough time for a post-eof truncate and file append before the
      cached mapping is trimmed. The former event essentially invalidates
      a range of the cached mapping and the latter bumps the inode size
      such the trim on the next writepage event won't trim all of the
      invalid blocks. fstest generic/464 reproduces this scenario
      occasionally and causes a lost writeback and stale delalloc blocks
      warning on inode inactivation.
      
      To work around this problem, trim the cached writeback mapping as
      soon as it is cached in addition to on subsequent validation checks.
      This is a minor tweak to tighten the race window as much as possible
      until a proper invalidation mechanism is available.
      
      Fixes: 40214d12 ("xfs: trim writepage mapping to within eof")
      Cc: <stable@vger.kernel.org> # v4.14+
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      aa6ee4ab
  6. 18 10月, 2018 1 次提交
  7. 08 8月, 2018 1 次提交
  8. 01 8月, 2018 1 次提交
  9. 30 7月, 2018 1 次提交
  10. 12 7月, 2018 20 次提交
  11. 09 6月, 2018 1 次提交
  12. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  13. 02 6月, 2018 1 次提交
  14. 31 5月, 2018 1 次提交