1. 10 4月, 2018 1 次提交
  2. 16 3月, 2018 2 次提交
  3. 12 3月, 2018 1 次提交
    • B
      xfs: account format bouncing into rmapbt swapext tx reservation · b3fed434
      Brian Foster 提交于
      The extent swap mechanism requires a unique implementation for
      rmapbt enabled filesystems. Because the rmapbt tracks extent owner
      information, extent swap must individually unmap and remap each
      extent between the two inodes.
      
      The rmapbt extent swap transaction block reservation currently
      accounts for the worst case bmapbt block and rmapbt block
      consumption based on the extent count of each inode. There is a
      corner case that exists due to the extent swap implementation that
      is not covered by this reservation, however.
      
      If one of the associated inodes is just over the max extent count
      used for extent format inodes (i.e., the inode is in btree format by
      a single extent), the unmap/remap cycle of the extent swap can
      bounce the inode between extent and btree format multiple times,
      almost as many times as there are extents in the inode (if the
      opposing inode happens to have one less, for example). Each back and
      forth cycle involves a block free and allocation, which isn't a
      problem except for that the initial transaction reservation must
      account for the total number of block allocations performed by the
      chain of deferred operations. If not, a block reservation overrun
      occurs and the filesystem shuts down.
      
      Update the rmapbt extent swap block reservation to check for this
      situation and add some block reservation slop to ensure the entire
      operation succeeds. We'd never likely require reservation for both
      inodes as fsr wouldn't defrag the file in that case, but the
      additional reservation is constrained by the data fork size so be
      cautious and check for both.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      b3fed434
  4. 29 1月, 2018 1 次提交
  5. 07 11月, 2017 2 次提交
  6. 27 10月, 2017 6 次提交
  7. 12 10月, 2017 1 次提交
  8. 04 10月, 2017 1 次提交
  9. 26 9月, 2017 1 次提交
  10. 02 9月, 2017 4 次提交
  11. 21 6月, 2017 2 次提交
    • D
      xfs: refactor the ifork block counting function · e7f5d5ca
      Darrick J. Wong 提交于
      Refactor the inode fork block counting function to count extents for us
      at the same time.  This will be used by the bmbt scrubber function.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      e7f5d5ca
    • D
      xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior · d29cb3e4
      Darrick J. Wong 提交于
      There is an inconsistency in the way that _bmap_count_blocks deals with
      delalloc reservations -- if the specified fork is in extents format,
      *count is set to the total number of blocks referenced by the in-core
      fork, including delalloc extents.  However, if the fork is in btree
      format, *count is set to the number of blocks referenced by the on-disk
      fork, which does /not/ include delalloc extents.
      
      For the lone existing caller of _bmap_count_blocks this hasn't been an
      issue because the function is only used to count xattr fork blocks
      (where there aren't any delalloc reservations).  However, when scrub
      comes along it will use this same function to check di_nblocks against
      both on-disk extent maps, so we need this behavior to be consistent.
      
      Therefore, fix _bmap_count_leaves not to include delalloc extents and
      remove unnecessary parameters.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      d29cb3e4
  12. 20 6月, 2017 2 次提交
  13. 19 6月, 2017 1 次提交
  14. 17 5月, 2017 2 次提交
  15. 26 4月, 2017 1 次提交
    • E
      xfs: more do_div cleanups · 4f1adf33
      Eric Sandeen 提交于
      On some architectures do_div does the pointer compare
      trick to make sure that we've sent it an unsigned 64-bit
      number.  (Why unsigned?  I don't know.)
      
      Fix up the few places that squawk about this; in
      xfs_bmap_wants_extents() we just used a bare int64_t so change
      that to unsigned.
      
      In xfs_adjust_extent_unmap_boundaries() all we wanted was the
      mod, and we have an xfs-specific function to handle that w/o
      side effects, which includes proper casting for do_div.
      
      In xfs_daddr_to_ag[b]no, we were using the wrong type anyway;
      XFS_BB_TO_FSBT returns a block in the filesystem, so use
      xfs_rfsblock_t not xfs_daddr_t, and gain the unsignedness
      from that type as a bonus.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      4f1adf33
  16. 12 4月, 2017 1 次提交
  17. 09 4月, 2017 1 次提交
  18. 04 4月, 2017 3 次提交
  19. 18 2月, 2017 1 次提交
  20. 17 2月, 2017 1 次提交
    • B
      xfs: don't reserve blocks for right shift transactions · 48af96ab
      Brian Foster 提交于
      The block reservation for the transaction allocated in
      xfs_shift_file_space() is an artifact of the original collapse range
      support. It exists to handle the case where a collapse range occurs,
      the initial extent is left shifted into a location that forms a
      contiguous boundary with the previous extent and thus the extents
      are merged. This code was subsequently refactored and reused for
      insert range (right shift) support.
      
      If an insert range occurs under low free space conditions, the
      extent at the starting offset is split before the first shift
      transaction is allocated. If the block reservation fails, this
      leaves separate, but contiguous extents around in the inode. While
      not a fatal problem, this is unexpected and will flag a warning on
      subsequent insert range operations on the inode. This problem has
      been reproduce intermittently by generic/270 running against a
      ramdisk device.
      
      Since right shift does not create new extent boundaries in the
      inode, a block reservation for extent merge is unnecessary. Update
      xfs_shift_file_space() to conditionally reserve fs blocks for left
      shift transactions only. This avoids the warning reproduced by
      generic/270.
      Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      48af96ab
  21. 31 1月, 2017 3 次提交
    • E
      xfs: remove unused full argument from bmap · 1dbba086
      Eric Sandeen 提交于
      The "full" argument was used only by the fiemap formatter,
      which is now gone with the iomap updates.
      
      Remove the unused arg.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NAlex Elder <elder@linaro.org>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      1dbba086
    • B
      xfs: fix eofblocks race with file extending async dio writes · e4229d6b
      Brian Foster 提交于
      It's possible for post-eof blocks to end up being used for direct I/O
      writes. dio write performs an upfront unwritten extent allocation, sends
      the dio and then updates the inode size (if necessary) on write
      completion. If a file release occurs while a file extending dio write is
      in flight, it is possible to mistake the post-eof blocks for speculative
      preallocation and incorrectly truncate them from the inode. This means
      that the resulting dio write completion can discover a hole and allocate
      new blocks rather than perform unwritten extent conversion.
      
      This requires a strange mix of I/O and is thus not likely to reproduce
      in real world workloads. It is intermittently reproduced by generic/299.
      The error manifests as an assert failure due to transaction overrun
      because the aforementioned write completion transaction has only
      reserved enough blocks for btree operations:
      
        XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
         file: fs/xfs//xfs_trans.c, line: 309
      
      The root cause is that xfs_free_eofblocks() uses i_size to truncate
      post-eof blocks from the inode, but async, file extending direct writes
      do not update i_size until write completion, long after inode locks are
      dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
      to the incorrect size.
      
      Update xfs_free_eofblocks() to serialize against dio similar to how
      extending writes are serialized against i_size updates before post-eof
      block zeroing. Specifically, wait on dio while under the iolock. This
      ensures that dio write completions have updated i_size before post-eof
      blocks are processed.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e4229d6b
    • B
      xfs: pull up iolock from xfs_free_eofblocks() · a36b9261
      Brian Foster 提交于
      xfs_free_eofblocks() requires the IOLOCK_EXCL lock, but is called from
      different contexts where the lock may or may not be held. The
      need_iolock parameter exists for this reason, to indicate whether
      xfs_free_eofblocks() must acquire the iolock itself before it can
      proceed.
      
      This is ugly and confusing. Simplify the semantics of
      xfs_free_eofblocks() to require the caller to acquire the iolock
      appropriately and kill the need_iolock parameter. While here, the mp
      param can be removed as well as the xfs_mount is accessible from the
      xfs_inode structure. This patch does not change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      a36b9261
  22. 27 1月, 2017 1 次提交
    • D
      xfs: fix bmv_count confusion w/ shared extents · c364b6d0
      Darrick J. Wong 提交于
      In a bmapx call, bmv_count is the total size of the array, including the
      zeroth element that userspace uses to supply the search key.  The output
      array starts at offset 1 so that we can set up the user for the next
      invocation.  Since we now can split an extent into multiple bmap records
      due to shared/unshared status, we have to be careful that we don't
      overflow the output array.
      
      In the original patch f86f4037 ("xfs: teach get_bmapx about shared
      extents and the CoW fork") I used cur_ext (the output index) to check
      for overflows, albeit with an off-by-one error.  Since nexleft no longer
      describes the number of unfilled slots in the output, we can rip all
      that out and use cur_ext for the overflow check directly.
      
      Failure to do this causes heap corruption in bmapx callers such as
      xfs_io and xfs_scrub.  xfs/328 can reproduce this problem.
      Reviewed-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      c364b6d0
  23. 30 11月, 2016 1 次提交