1. 12 7月, 2018 2 次提交
  2. 25 6月, 2018 1 次提交
  3. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  4. 16 5月, 2018 3 次提交
  5. 10 5月, 2018 1 次提交
  6. 10 4月, 2018 1 次提交
  7. 24 3月, 2018 1 次提交
  8. 07 11月, 2017 2 次提交
  9. 27 10月, 2017 6 次提交
  10. 17 10月, 2017 1 次提交
    • B
      xfs: trim writepage mapping to within eof · 40214d12
      Brian Foster 提交于
      The writeback rework in commit fbcc0256 ("xfs: Introduce
      writeback context for writepages") introduced a subtle change in
      behavior with regard to the block mapping used across the
      ->writepages() sequence. The previous xfs_cluster_write() code would
      only flush pages up to EOF at the time of the writepage, thus
      ensuring that any pages due to file-extending writes would be
      handled on a separate cycle and with a new, updated block mapping.
      
      The updated code establishes a block mapping in xfs_writepage_map()
      that could extend beyond EOF if the file has post-eof preallocation.
      Because we now use the generic writeback infrastructure and pass the
      cached mapping to each writepage call, there is no implicit EOF
      limit in place. If eofblocks trimming occurs during ->writepages(),
      any post-eof portion of the cached mapping becomes invalid. The
      eofblocks code has no means to serialize against writeback because
      there are no pages associated with post-eof blocks. Therefore if an
      eofblocks trim occurs and is followed by a file-extending buffered
      write, not only has the mapping become invalid, but we could end up
      writing a page to disk based on the invalid mapping.
      
      Consider the following sequence of events:
      
      - A buffered write creates a delalloc extent and post-eof
        speculative preallocation.
      - Writeback starts and on the first writepage cycle, the delalloc
        extent is converted to real blocks (including the post-eof blocks)
        and the mapping is cached.
      - The file is closed and xfs_release() trims post-eof blocks. The
        cached writeback mapping is now invalid.
      - Another buffered write appends the file with a delalloc extent.
      - The concurrent writeback cycle picks up the just written page
        because the writeback range end is LLONG_MAX. xfs_writepage_map()
        attributes it to the (now invalid) cached mapping and writes the
        data to an incorrect location on disk (and where the file offset is
        still backed by a delalloc extent).
      
      This problem is reproduced by xfstests test generic/464, which
      triggers racing writes, appends, open/closes and writeback requests.
      
      To address this problem, trim the mapping used during writeback to
      within EOF when the mapping is validated. This ensures the mapping
      is revalidated for any pages encountered beyond EOF as of the time
      the current mapping was cached or last validated.
      Reported-by: NEryu Guan <eguan@redhat.com>
      Diagnosed-by: NEryu Guan <eguan@redhat.com>
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      40214d12
  11. 19 6月, 2017 1 次提交
    • D
      xfs: try to avoid blowing out the transaction reservation when bunmaping a shared extent · e1a4e37c
      Darrick J. Wong 提交于
      In a pathological scenario where we are trying to bunmapi a single
      extent in which every other block is shared, it's possible that trying
      to unmap the entire large extent in a single transaction can generate so
      many EFIs that we overflow the transaction reservation.
      
      Therefore, use a heuristic to guess at the number of blocks we can
      safely unmap from a reflink file's data fork in an single transaction.
      This should prevent problems such as the log head slamming into the tail
      and ASSERTs that trigger because we've exceeded the transaction
      reservation.
      
      Note that since bunmapi can fail to unmap the entire range, we must also
      teach the deferred unmap code to roll into a new transaction whenever we
      get low on reservation.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      [hch: random edits, all bugs are my fault]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      e1a4e37c
  12. 26 4月, 2017 1 次提交
    • C
      xfs: simplify validation of the unwritten extent bit · 0c1d9e4a
      Christoph Hellwig 提交于
      XFS only supports the unwritten extent bit in the data fork, and only if
      the file system has a version 5 superblock or the unwritten extent
      feature bit.
      
      We currently have two routines that validate the invariant:
      xfs_check_nostate_extents which return -EFSCORRUPTED when it's not met,
      and xfs_validate_extent that triggers and assert in debug build.
      
      Both of them iterate over all extents of an inode fork when called,
      which isn't very efficient.
      
      This patch instead adds a new helper that verifies the invariant one
      extent at a time, and calls it from the places where we iterate over
      all extents to converted them from or two the in-memory format.  The
      callers then return -EFSCORRUPTED when reading invalid extents from
      disk, or trigger an assert when writing them to disk.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0c1d9e4a
  13. 04 4月, 2017 1 次提交
  14. 24 1月, 2017 1 次提交
    • C
      xfs: fix COW writeback race · d2b3964a
      Christoph Hellwig 提交于
      Due to the way how xfs_iomap_write_allocate tries to convert the whole
      found extents from delalloc to real space we can run into a race
      condition with multiple threads doing writes to this same extent.
      For the non-COW case that is harmless as the only thing that can happen
      is that we call xfs_bmapi_write on an extent that has already been
      converted to a real allocation.  For COW writes where we move the extent
      from the COW to the data fork after I/O completion the race is, however,
      not quite as harmless.  In the worst case we are now calling
      xfs_bmapi_write on a region that contains hole in the COW work, which
      will trip up an assert in debug builds or lead to file system corruption
      in non-debug builds.  This seems to be reproducible with workloads of
      small O_DSYNC write, although so far I've not managed to come up with
      a with an isolated reproducer.
      
      The fix for the issue is relatively simple:  tell xfs_bmapi_write
      that we are only asked to convert delayed allocations and skip holes
      in that case.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      d2b3964a
  15. 28 11月, 2016 1 次提交
    • B
      xfs: track preallocation separately in xfs_bmapi_reserve_delalloc() · 974ae922
      Brian Foster 提交于
      Speculative preallocation is currently processed entirely by the callers
      of xfs_bmapi_reserve_delalloc(). The caller determines how much
      preallocation to include, adjusts the extent length and passes down the
      resulting request.
      
      While this works fine for post-eof speculative preallocation, it is not
      as reliable for COW fork preallocation. COW fork preallocation is
      implemented via the cowextszhint, which aligns the start offset as well
      as the length of the extent. Further, it is difficult for the caller to
      accurately identify when preallocation occurs because the returned
      extent could have been merged with neighboring extents in the fork.
      
      To simplify this situation and facilitate further COW fork preallocation
      enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
      preallocation parameter to incorporate into the allocation request. The
      preallocation blocks value is tacked onto the end of the request and
      adjusted to accommodate neighboring extents and extent size limits.
      Since xfs_bmapi_reserve_delalloc() now knows precisely how much
      preallocation was included in the allocation, it can also tag the inodes
      appropriately to support preallocation reclaim.
      
      Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
      use the preallocation mechanism. This patch should not change behavior
      outside of correctly tagging reflink inodes when start offset
      preallocation occurs (which the caller does not handle correctly).
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      974ae922
  16. 24 11月, 2016 2 次提交
  17. 20 10月, 2016 3 次提交
  18. 06 10月, 2016 1 次提交
  19. 05 10月, 2016 7 次提交
  20. 26 9月, 2016 1 次提交
    • D
      xfs: remote attribute blocks aren't really userdata · 292378ed
      Dave Chinner 提交于
      When adding a new remote attribute, we write the attribute to the
      new extent before the allocation transaction is committed. This
      means we cannot reuse busy extents as that violates crash
      consistency semantics. Hence we currently treat remote attribute
      extent allocation like userdata because it has the same overwrite
      ordering constraints as userdata.
      
      Unfortunately, this also allows the allocator to incorrectly apply
      extent size hints to the remote attribute extent allocation. This
      results in interesting failures, such as transaction block
      reservation overruns and in-memory inode attribute fork corruption.
      
      To fix this, we need to separate the busy extent reuse configuration
      from the userdata configuration. This changes the definition of
      XFS_BMAPI_METADATA slightly - it now means that allocation is
      metadata and reuse of busy extents is acceptible due to the metadata
      ordering semantics of the journal. If this flag is not set, it
      means the allocation is that has unordered data writeback, and hence
      busy extent reuse is not allowed. It no longer implies the
      allocation is for user data, just that the data write will not be
      strictly ordered. This matches the semantics for both user data
      and remote attribute block allocation.
      
      As such, This patch changes the "userdata" field to a "datatype"
      field, and adds a "no busy reuse" flag to the field.
      When we detect an unordered data extent allocation, we immediately set
      the no reuse flag. We then set the "user data" flags based on the
      inode fork we are allocating the extent to. Hence we only set
      userdata flags on data fork allocations now and consider attribute
      fork remote extents to be an unordered metadata extent.
      
      The result is that remote attribute extents now have the expected
      allocation semantics, and the data fork allocation behaviour is
      completely unchanged.
      
      It should be noted that there may be other ways to fix this (e.g.
      use ordered metadata buffers for the remote attribute extent data
      write) but they are more invasive and difficult to validate both
      from a design and implementation POV. Hence this patch takes the
      simple, obvious route to fixing the problem...
      Reported-and-tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      292378ed
  21. 19 9月, 2016 1 次提交
    • C
      xfs: rewrite and optimize the delalloc write path · 51446f5b
      Christoph Hellwig 提交于
      Currently xfs_iomap_write_delay does up to lookups in the inode
      extent tree, which is rather costly especially with the new iomap
      based write path and small write sizes.
      
      But it turns out that the low-level xfs_bmap_search_extents gives us
      all the information we need in the regular delalloc buffered write
      path:
      
       - it will return us an extent covering the block we are looking up
         if it exists.  In that case we can simply return that extent to
         the caller and are done
       - it will tell us if we are beyoned the last current allocated
         block with an eof return parameter.  In that case we can create a
         delalloc reservation and use the also returned information about
         the last extent in the file as the hint to size our delalloc
         reservation.
       - it can tell us that we are writing into a hole, but that there is
         an extent beyoned this hole.  In this case we can create a
         delalloc reservation that covers the requested size (possible
         capped to the next existing allocation).
      
      All that can be done in one single routine instead of bouncing up
      and down a few layers.  This reduced the CPU overhead of the block
      mapping routines and also simplified the code a lot.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      51446f5b
  22. 03 8月, 2016 1 次提交
    • D
      xfs: add owner field to extent allocation and freeing · 340785cc
      Darrick J. Wong 提交于
      For the rmap btree to work, we have to feed the extent owner
      information to the the allocation and freeing functions. This
      information is what will end up in the rmap btree that tracks
      allocated extents. While we technically don't need the owner
      information when freeing extents, passing it allows us to validate
      that the extent we are removing from the rmap btree actually
      belonged to the owner we expected it to belong to.
      
      We also define a special set of owner values for internal metadata
      that would otherwise have no owner. This allows us to tell the
      difference between metadata owned by different per-ag btrees, as
      well as static fs metadata (e.g. AG headers) and internal journal
      blocks.
      
      There are also a couple of special cases we need to take care of -
      during EFI recovery, we don't actually know who the original owner
      was, so we need to pass a wildcard to indicate that we aren't
      checking the owner for validity. We also need special handling in
      growfs, as we "free" the space in the last AG when extending it, but
      because it's new space it has no actual owner...
      
      While touching the xfs_bmap_add_free() function, re-order the
      parameters to put the struct xfs_mount first.
      
      Extend the owner field to include both the owner type and some sort
      of index within the owner.  The index field will be used to support
      reverse mappings when reflink is enabled.
      
      When we're freeing extents from an EFI, we don't have the owner
      information available (rmap updates have their own redo items).
      xfs_free_extent therefore doesn't need to do an rmap update. Make
      sure that the log replay code signals this correctly.
      
      This is based upon a patch originally from Dave Chinner. It has been
      extended to add more owner information with the intent of helping
      recovery operations when things go wrong (e.g. offset of user data
      block in a file).
      
      [dchinner: de-shout the xfs_rmap_*_owner helpers]
      [darrick: minor style fixes suggested by Christoph Hellwig]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      340785cc