1. 22 10月, 2019 9 次提交
  2. 27 8月, 2019 1 次提交
  3. 03 7月, 2019 1 次提交
  4. 29 6月, 2019 6 次提交
  5. 13 6月, 2019 1 次提交
  6. 23 4月, 2019 1 次提交
  7. 15 4月, 2019 1 次提交
    • B
      xfs: don't account extra agfl blocks as available · 1ca89fbc
      Brian Foster 提交于
      The block allocation AG selection code has parameters that allow a
      caller to perform multiple allocations from a single AG and
      transaction (under certain conditions). The parameters specify the
      total block allocation count required by the transaction and the AG
      selection code selects and locks an AG that will be able to satisfy
      the overall requirement. If the available block accounting
      calculation turns out to be inaccurate and a subsequent allocation
      call fails with -ENOSPC, the resulting transaction cancel leads to
      filesystem shutdown because the transaction is dirty.
      
      This exact problem can be reproduced with a highly parallel space
      consumer and fsstress workload running long enough to a large
      filesystem against -ENOSPC conditions. A bmbt block allocation
      request made for inode extent to bmap format conversion after an
      extent allocation is expected to be satisfied by the same AG and the
      same transaction as the extent allocation. The bmbt block allocation
      fails, however, because the block availability of the AG has changed
      since the AG was selected (outside of the blocks used for the extent
      itself).
      
      The inconsistent block availability calculation is caused by the
      deferred block freeing behavior of the AGFL. This immediately
      removes extra blocks from the AGFL to free up AGFL slots, but rather
      than immediately freeing such blocks as was done in the past, the
      block free is deferred such that said blocks are not available for
      allocation until the current transaction commits. The AG selection
      logic currently considers all AGFL blocks as available and executes
      shortly before any extra AGFL blocks are freed. This means the block
      availability of the current AG can change before the first
      allocation even occurs, but in practice a failure is more likely to
      manifest via a subsequent allocation because extent allocation
      usually has a contiguity requirement larger than a single block that
      can't be satisfied from the AGFL.
      
      In general, XFS prefers operational robustness to absolute
      allocation efficiency. In other words, we prefer to return -ENOSPC
      slightly earlier at the expense of not being able to allocate every
      last block in an AG to avoid this kind of problem. As such, update
      the AG block availability calculation to consider extra AGFL blocks
      as unavailable since they are immediately removed following the
      calculation and will not become available until the current
      transaction commits.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      1ca89fbc
  8. 12 2月, 2019 2 次提交
  9. 13 12月, 2018 3 次提交
  10. 03 8月, 2018 2 次提交
  11. 01 8月, 2018 1 次提交
  12. 12 7月, 2018 3 次提交
  13. 28 6月, 2018 1 次提交
  14. 09 6月, 2018 1 次提交
  15. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  16. 06 6月, 2018 1 次提交
    • D
      xfs: validate btree records on retrieval · 9e6c08d4
      Dave Chinner 提交于
      So we don't check the validity of records as we walk the btree. When
      there are corrupt records in the free space btree (e.g. zero
      startblock/length or beyond EOAG) we just blindly use it and things
      go bad from there. That leads to assert failures on debug kernels
      like this:
      
      XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_alloc.c, line: 450
      ....
      Call Trace:
       xfs_alloc_fixup_trees+0x368/0x5c0
       xfs_alloc_ag_vextent_near+0x79a/0xe20
       xfs_alloc_ag_vextent+0x1d3/0x330
       xfs_alloc_vextent+0x5e9/0x870
      
      Or crashes like this:
      
      XFS (loop0): xfs_buf_find: daddr 0x7fb28 out of range, EOFS 0x8000
      .....
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
      ....
      Call Trace:
       xfs_bmap_add_extent_hole_real+0x67d/0x930
       xfs_bmapi_write+0x934/0xc90
       xfs_da_grow_inode_int+0x27e/0x2f0
       xfs_dir2_grow_inode+0x55/0x130
       xfs_dir2_sf_to_block+0x94/0x5d0
       xfs_dir2_sf_addname+0xd0/0x590
       xfs_dir_createname+0x168/0x1a0
       xfs_rename+0x658/0x9b0
      
      By checking that free space records pulled from the trees are
      within the valid range, we catch many of these corruptions before
      they can do damage.
      
      This is a generic btree record checking deficiency. We need to
      validate the records we fetch from all the different btrees before
      we use them to catch corruptions like this.
      
      This patch results in a corrupt record emitting an error message and
      returning -EFSCORRUPTED, and the higher layers catch that and abort:
      
       XFS (loop0): Size Freespace BTree record corruption in AG 0 detected!
       XFS (loop0): start block 0x0 block count 0x0
       XFS (loop0): Internal error xfs_trans_cancel at line 1012 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x42a/0x670
       .....
       Call Trace:
        dump_stack+0x85/0xcb
        xfs_trans_cancel+0x19f/0x1c0
        xfs_create+0x42a/0x670
        xfs_generic_create+0x1f6/0x2c0
        vfs_create+0xf9/0x180
        do_mknodat+0x1f9/0x210
        do_syscall_64+0x5a/0x180
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      .....
       XFS (loop0): xfs_do_force_shutdown(0x8) called from line 1013 of file fs/xfs/xfs_trans.c.  Return address = ffffffff81500868
       XFS (loop0): Corruption of in-memory data detected.  Shutting down filesystem
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      9e6c08d4
  17. 05 6月, 2018 1 次提交
  18. 16 5月, 2018 1 次提交
  19. 10 5月, 2018 3 次提交
    • B
      xfs: add bmapi nodiscard flag · fcb762f5
      Brian Foster 提交于
      Freed extents are unconditionally discarded when online discard is
      enabled. Define XFS_BMAPI_NODISCARD to allow callers to bypass
      discards when unnecessary. For example, this will be useful for
      eofblocks trimming.
      
      This patch does not change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      fcb762f5
    • B
      xfs: defer agfl block frees when dfops is available · f8f2835a
      Brian Foster 提交于
      The AGFL fixup code executes before every block allocation/free and
      rectifies the AGFL based on the current, dynamic allocation
      requirements of the fs. The AGFL must hold a minimum number of
      blocks to satisfy a worst case split of the free space btrees caused
      by the impending allocation operation. The AGFL is also updated to
      maintain the implicit requirement for a minimum number of free slots
      to satisfy a worst case join of the free space btrees.
      
      Since the AGFL caches individual blocks, AGFL reduction typically
      involves multiple, single block frees. We've had reports of
      transaction overrun problems during certain workloads that boil down
      to AGFL reduction freeing multiple blocks and consuming more space
      in the log than was reserved for the transaction.
      
      Since the objective of freeing AGFL blocks is to ensure free AGFL
      free slots are available for the upcoming allocation, one way to
      address this problem is to release surplus blocks from the AGFL
      immediately but defer the free of those blocks (similar to how
      file-mapped blocks are unmapped from the file in one transaction and
      freed via a deferred operation) until the transaction is rolled.
      This turns AGFL reduction into an operation with predictable log
      reservation consumption.
      
      Add the capability to defer AGFL block frees when a deferred ops
      list is available to the AGFL fixup code. Add a dfops pointer to the
      transaction to carry dfops through various contexts to the allocator
      context. Deferring AGFL frees is  conditional behavior based on
      whether the transaction pointer is populated. The long term
      objective is to reuse the transaction pointer to clean up all
      unrelated callchains that pass dfops on the stack along with a
      transaction and in doing so, consistently defer AGFL blocks from the
      allocator.
      
      A bit of customization is required to handle deferred completion
      processing because AGFL blocks are accounted against a per-ag
      reservation pool and AGFL blocks are not inserted into the extent
      busy list when freed (they are inserted when used and released back
      to the AGFL). Reuse the majority of the existing deferred extent
      free infrastructure and customize it appropriately to handle AGFL
      blocks.
      
      Note that this patch only adds infrastructure. It does not change
      behavior because no callers have been updated to pass ->t_agfl_dfops
      into the allocation code.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      f8f2835a
    • B
      xfs: create agfl block free helper function · 4223f659
      Brian Foster 提交于
      Refactor the AGFL block free code into a new helper such that it can
      be invoked from deferred context. No functional changes.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      4223f659