1. 20 8月, 2021 1 次提交
  2. 19 8月, 2021 2 次提交
  3. 02 6月, 2021 1 次提交
  4. 23 1月, 2021 1 次提交
    • C
      xfs: Introduce error injection to allocate only minlen size extents for files · 30151967
      Chandan Babu R 提交于
      This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which
      helps userspace test programs to get xfs_bmap_btalloc() to always
      allocate minlen sized extents.
      
      This is required for test programs which need a guarantee that minlen
      extents allocated for a file do not get merged with their existing
      neighbours in the inode's BMBT. "Inode fork extent overflow check" for
      Directories, Xattrs and extension of realtime inodes need this since the
      file offset at which the extents are being allocated cannot be
      explicitly controlled from userspace.
      
      One way to use this error tag is to,
      1. Consume all of the free space by sequentially writing to a file.
      2. Punch alternate blocks of the file. This causes CNTBT to contain
         sufficient number of one block sized extent records.
      3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag.
      After step 3, xfs_bmap_btalloc() will issue space allocation
      requests for minlen sized extents only.
      
      ENOSPC error code is returned to userspace when there aren't any "one
      block sized" extents left in any of the AGs.
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NChandan Babu R <chandanrlinux@gmail.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      30151967
  5. 14 5月, 2020 1 次提交
  6. 12 3月, 2020 1 次提交
  7. 04 11月, 2019 2 次提交
  8. 24 9月, 2019 1 次提交
  9. 31 8月, 2019 1 次提交
  10. 13 12月, 2018 1 次提交
  11. 12 7月, 2018 1 次提交
  12. 09 6月, 2018 1 次提交
  13. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  14. 16 5月, 2018 2 次提交
  15. 10 5月, 2018 2 次提交
  16. 10 4月, 2018 1 次提交
  17. 12 3月, 2018 1 次提交
  18. 18 1月, 2018 1 次提交
  19. 27 10月, 2017 1 次提交
  20. 20 6月, 2017 1 次提交
  21. 04 4月, 2017 2 次提交
  22. 18 2月, 2017 1 次提交
  23. 10 1月, 2017 1 次提交
    • C
      xfs: adjust allocation length in xfs_alloc_space_available · 54fee133
      Christoph Hellwig 提交于
      We must decide in xfs_alloc_fix_freelist if we can perform an
      allocation from a given AG is possible or not based on the available
      space, and should not fail the allocation past that point on a
      healthy file system.
      
      But currently we have two additional places that second-guess
      xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
      maxlen parameter to remove the reservation before doing the
      allocation (but ignores the various minium freespace requirements),
      and xfs_alloc_fix_minleft tries to fix up the allocated length
      after we've found an extent, but ignores the reservations and also
      doesn't take the AGFL into account (and thus fails allocations
      for not matching minlen in some cases).
      
      Remove all these later fixups and just correct the maxlen argument
      inside xfs_alloc_fix_freelist once we have the AGF buffer locked.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      54fee133
  24. 26 9月, 2016 1 次提交
    • D
      xfs: remote attribute blocks aren't really userdata · 292378ed
      Dave Chinner 提交于
      When adding a new remote attribute, we write the attribute to the
      new extent before the allocation transaction is committed. This
      means we cannot reuse busy extents as that violates crash
      consistency semantics. Hence we currently treat remote attribute
      extent allocation like userdata because it has the same overwrite
      ordering constraints as userdata.
      
      Unfortunately, this also allows the allocator to incorrectly apply
      extent size hints to the remote attribute extent allocation. This
      results in interesting failures, such as transaction block
      reservation overruns and in-memory inode attribute fork corruption.
      
      To fix this, we need to separate the busy extent reuse configuration
      from the userdata configuration. This changes the definition of
      XFS_BMAPI_METADATA slightly - it now means that allocation is
      metadata and reuse of busy extents is acceptible due to the metadata
      ordering semantics of the journal. If this flag is not set, it
      means the allocation is that has unordered data writeback, and hence
      busy extent reuse is not allowed. It no longer implies the
      allocation is for user data, just that the data write will not be
      strictly ordered. This matches the semantics for both user data
      and remote attribute block allocation.
      
      As such, This patch changes the "userdata" field to a "datatype"
      field, and adds a "no busy reuse" flag to the field.
      When we detect an unordered data extent allocation, we immediately set
      the no reuse flag. We then set the "user data" flags based on the
      inode fork we are allocating the extent to. Hence we only set
      userdata flags on data fork allocations now and consider attribute
      fork remote extents to be an unordered metadata extent.
      
      The result is that remote attribute extents now have the expected
      allocation semantics, and the data fork allocation behaviour is
      completely unchanged.
      
      It should be noted that there may be other ways to fix this (e.g.
      use ordered metadata buffers for the remote attribute extent data
      write) but they are more invasive and difficult to validate both
      from a design and implementation POV. Hence this patch takes the
      simple, obvious route to fixing the problem...
      Reported-and-tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      292378ed
  25. 19 9月, 2016 1 次提交
    • D
      xfs: set up per-AG free space reservations · 3fd129b6
      Darrick J. Wong 提交于
      One unfortunate quirk of the reference count and reverse mapping
      btrees -- they can expand in size when blocks are written to *other*
      allocation groups if, say, one large extent becomes a lot of tiny
      extents.  Since we don't want to start throwing errors in the middle
      of CoWing, we need to reserve some blocks to handle future expansion.
      The transaction block reservation counters aren't sufficient here
      because we have to have a reserve of blocks in every AG, not just
      somewhere in the filesystem.
      
      Therefore, create two per-AG block reservation pools.  One feeds the
      AGFL so that rmapbt expansion always succeeds, and the other feeds all
      other metadata so that refcountbt expansion never fails.
      
      Use the count of how many reserved blocks we need to have on hand to
      create a virtual reservation in the AG.  Through selective clamping of
      the maximum length of allocation requests and of the length of the
      longest free extent, we can make it look like there's less free space
      in the AG unless the reservation owner is asking for blocks.
      
      In other words, play some accounting tricks in-core to make sure that
      we always have blocks available.  On the plus side, there's nothing to
      clean up if we crash, which is contrast to the strategy that the rough
      draft used (actually removing extents from the freespace btrees).
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      3fd129b6
  26. 03 8月, 2016 4 次提交
    • D
      xfs: don't update rmapbt when fixing agfl · 04f13060
      Darrick J. Wong 提交于
      Allow a caller of xfs_alloc_fix_freelist to disable rmapbt updates
      when fixing the AG freelist.  xfs_repair needs this during phase 5
      to be able to adjust the freelist while it's reconstructing the rmap
      btree; the missing entries will be added back at the very end of
      phase 5 once the AGFL contents settle down.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      04f13060
    • D
      xfs: rmap btree requires more reserved free space · 52548852
      Darrick J. Wong 提交于
      Originally-From: Dave Chinner <dchinner@redhat.com>
      
      The rmap btree is allocated from the AGFL, which means we have to
      ensure ENOSPC is reported to userspace before we run out of free
      space in each AG. The last allocation in an AG can cause a full
      height rmap btree split, and that means we have to reserve at least
      this many blocks *in each AG* to be placed on the AGFL at ENOSPC.
      Update the various space calculation functions to handle this.
      
      Also, because the macros are now executing conditional code and are
      called quite frequently, convert them to functions that initialise
      variables in the struct xfs_mount, use the new variables everywhere
      and document the calculations better.
      
      [darrick.wong@oracle.com: don't reserve blocks if !rmap]
      [dchinner@redhat.com: update m_ag_max_usable after growfs]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      52548852
    • D
      xfs: add owner field to extent allocation and freeing · 340785cc
      Darrick J. Wong 提交于
      For the rmap btree to work, we have to feed the extent owner
      information to the the allocation and freeing functions. This
      information is what will end up in the rmap btree that tracks
      allocated extents. While we technically don't need the owner
      information when freeing extents, passing it allows us to validate
      that the extent we are removing from the rmap btree actually
      belonged to the owner we expected it to belong to.
      
      We also define a special set of owner values for internal metadata
      that would otherwise have no owner. This allows us to tell the
      difference between metadata owned by different per-ag btrees, as
      well as static fs metadata (e.g. AG headers) and internal journal
      blocks.
      
      There are also a couple of special cases we need to take care of -
      during EFI recovery, we don't actually know who the original owner
      was, so we need to pass a wildcard to indicate that we aren't
      checking the owner for validity. We also need special handling in
      growfs, as we "free" the space in the last AG when extending it, but
      because it's new space it has no actual owner...
      
      While touching the xfs_bmap_add_free() function, re-order the
      parameters to put the struct xfs_mount first.
      
      Extend the owner field to include both the owner type and some sort
      of index within the owner.  The index field will be used to support
      reverse mappings when reflink is enabled.
      
      When we're freeing extents from an EFI, we don't have the owner
      information available (rmap updates have their own redo items).
      xfs_free_extent therefore doesn't need to do an rmap update. Make
      sure that the log replay code signals this correctly.
      
      This is based upon a patch originally from Dave Chinner. It has been
      extended to add more owner information with the intent of helping
      recovery operations when things go wrong (e.g. offset of user data
      block in a file).
      
      [dchinner: de-shout the xfs_rmap_*_owner helpers]
      [darrick: minor style fixes suggested by Christoph Hellwig]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      340785cc
    • D
      xfs: rmap btree add more reserved blocks · 8018026e
      Darrick J. Wong 提交于
      Originally-From: Dave Chinner <dchinner@redhat.com>
      
      XFS reserves a small amount of space in each AG for the minimum
      number of free blocks needed for operation. Adding the rmap btree
      increases the number of reserved blocks, but it also increases the
      complexity of the calculation as the free inode btree is optional
      (like the rmbt).
      
      Rather than calculate the prealloc blocks every time we need to
      check it, add a function to calculate it at mount time and store it
      in the struct xfs_mount, and convert the XFS_PREALLOC_BLOCKS macro
      just to use the xfs-mount variable directly.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      8018026e
  27. 21 6月, 2016 1 次提交
  28. 01 6月, 2016 1 次提交
  29. 04 1月, 2016 1 次提交
  30. 03 11月, 2015 1 次提交
    • D
      xfs: introduce BMAPI_ZERO for allocating zeroed extents · 3fbbbea3
      Dave Chinner 提交于
      To enable DAX to do atomic allocation of zeroed extents, we need to
      drive the block zeroing deep into the allocator. Because
      xfs_bmapi_write() can return merged extents on allocation that were
      only partially allocated (i.e. requested range spans allocated and
      hole regions, allocation into the hole was contiguous), we cannot
      zero the extent returned from xfs_bmapi_write() as that can
      overwrite existing data with zeros.
      
      Hence we have to drive the extent zeroing into the allocation code,
      prior to where we merge the extents into the BMBT and return the
      resultant map. This means we need to propagate this need down to
      the xfs_alloc_vextent() and issue the block zeroing at this point.
      
      While this functionality is being introduced for DAX, there is no
      reason why it is specific to DAX - we can per-zero blocks during the
      allocation transaction on any type of device. It's just slow (and
      usually slower than unwritten allocation and conversion) on
      traditional block devices so doesn't tend to get used. We can,
      however, hook hardware zeroing optimisations via sb_issue_zeroout()
      to this operation, so it may be useful in future and hence the
      "allocate zeroed blocks" API needs to be implementation neutral.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      3fbbbea3
  31. 22 6月, 2015 2 次提交
    • D
      xfs: clean up XFS_MIN_FREELIST macros · 496817b4
      Dave Chinner 提交于
      We no longer calculate the minimum freelist size from the on-disk
      AGF, so we don't need the macros used for this. That means the
      nested macros can be cleaned up, and turn this into an actual
      function so the logic is clear and concise. This will make it much
      easier to add support for the rmap btree when the time comes.
      
      This also gets rid of the XFS_AG_MAXLEVELS macro used by these
      freelist macros as it is simply a wrapper around a single variable.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      496817b4
    • D
      xfs: xfs_alloc_fix_freelist() can use incore perag structures · 50adbcb4
      Dave Chinner 提交于
      At the moment, xfs_alloc_fix_freelist() uses a mix of per-ag based
      access and agf buffer  based access to freelist and space usage
      information. However, once the AGF buffer is locked inside this
      function, it is guaranteed that both the in-memory and on-disk
      values are identical. xfs_alloc_fix_freelist() doesn't modify the
      values in the structures directly, so it is a read-only user of the
      infomration, and hence can use the per-ag structure exclusively for
      determining what it should do.
      
      This opens up an avenue for cleaning up a lot of duplicated logic
      whose only difference is the structure it gets the data from, and in
      doing so removes a lot of needless byte swapping overhead when
      fixing up the free list.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      50adbcb4