1. 27 7月, 2018 2 次提交
    • B
      xfs: drop unnecessary xfs_defer_finish() dfops parameter · 9e28a242
      Brian Foster 提交于
      Every caller of xfs_defer_finish() now passes the transaction and
      its associated ->t_dfops. The xfs_defer_ops parameter is therefore
      no longer necessary and can be removed.
      
      Since most xfs_defer_finish() callers also have to consider
      xfs_defer_cancel() on error, update the latter to also receive the
      transaction for consistency. The log recovery code contains an
      outlier case that cancels a dfops directly without an available
      transaction. Retain an internal wrapper to support this outlier case
      for the time being.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NBill O'Donnell <billodo@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      9e28a242
    • B
      xfs: support embedded dfops in transaction · e021a2e5
      Brian Foster 提交于
      The dfops structure used by multi-transaction operations is
      typically stored on the stack and carried around by the associated
      transaction. The lifecycle of dfops does not quite match that of the
      transaction, but they are tightly related in that the former depends
      on the latter.
      
      The relationship of these objects is tight enough that we can avoid
      the cumbersome boilerplate code required in most cases to manage
      them separately by just embedding an xfs_defer_ops in the
      transaction itself. This means that a transaction allocation returns
      with an initialized dfops, a transaction commit finishes pending
      deferred items before the tx commit, a transaction cancel cancels
      the dfops before the transaction and a transaction dup operation
      transfers the current dfops state to the new transaction.
      
      The dup operation is slightly complicated by the fact that we can no
      longer just copy a dfops pointer from the old transaction to the new
      transaction. This is solved through a dfops move helper that
      transfers the pending items and other dfops state across the
      transactions. This also requires that transaction rolling code
      always refer to the transaction for the current dfops reference.
      
      Finally, to facilitate incremental conversion to the internal dfops
      and continue to support the current external dfops mode of
      operation, create the new ->t_dfops_internal field with a layer of
      indirection. On allocation, ->t_dfops points to the internal dfops.
      This state is overridden by callers who re-init a local dfops on the
      transaction. Once ->t_dfops is overridden, the external dfops
      reference is maintained as the transaction rolls.
      
      This patch adds the fundamental ability to support an internal
      dfops. All codepaths that perform deferred processing continue to
      override the internal dfops until they are converted over in
      subsequent patches.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NBill O'Donnell <billodo@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e021a2e5
  2. 12 7月, 2018 2 次提交
  3. 25 6月, 2018 1 次提交
  4. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  5. 10 5月, 2018 4 次提交
    • D
      xfs: get rid of the log item descriptor · e6631f85
      Dave Chinner 提交于
      It's just a connector between a transaction and a log item. There's
      a 1:1 relationship between a log item descriptor and a log item,
      and a 1:1 relationship between a log item descriptor and a
      transaction. Both relationships are created and terminated at the
      same time, so why do we even have the descriptor?
      
      Replace it with a specific list_head in the log item and a new
      log item dirtied flag to replace the XFS_LID_DIRTY flag.
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      [darrick: fix up deferred agfl intent finish_item use of LID_DIRTY]
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      e6631f85
    • D
      xfs: add tracing to high level transaction operations · ba18781b
      Dave Chinner 提交于
      Because currently we have no idea what the transaction context we
      are operating in is, and I need to know that information to track
      down bugs in multiple log item joins to transactions.
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      ba18781b
    • D
      xfs: log item flags are racy · 22525c17
      Dave Chinner 提交于
      The log item flags contain a field that is protected by the AIL
      lock - the XFS_LI_IN_AIL flag. We use non-atomic RMW operations to
      set and clear these flags, but most of the updates and checks are
      not done with the AIL lock held and so are susceptible to update
      races.
      
      Fix this by changing the log item flags to use atomic bitops rather
      than be reliant on the AIL lock for update serialisation.
      Signed-Off-By: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      22525c17
    • B
      xfs: defer agfl block frees when dfops is available · f8f2835a
      Brian Foster 提交于
      The AGFL fixup code executes before every block allocation/free and
      rectifies the AGFL based on the current, dynamic allocation
      requirements of the fs. The AGFL must hold a minimum number of
      blocks to satisfy a worst case split of the free space btrees caused
      by the impending allocation operation. The AGFL is also updated to
      maintain the implicit requirement for a minimum number of free slots
      to satisfy a worst case join of the free space btrees.
      
      Since the AGFL caches individual blocks, AGFL reduction typically
      involves multiple, single block frees. We've had reports of
      transaction overrun problems during certain workloads that boil down
      to AGFL reduction freeing multiple blocks and consuming more space
      in the log than was reserved for the transaction.
      
      Since the objective of freeing AGFL blocks is to ensure free AGFL
      free slots are available for the upcoming allocation, one way to
      address this problem is to release surplus blocks from the AGFL
      immediately but defer the free of those blocks (similar to how
      file-mapped blocks are unmapped from the file in one transaction and
      freed via a deferred operation) until the transaction is rolled.
      This turns AGFL reduction into an operation with predictable log
      reservation consumption.
      
      Add the capability to defer AGFL block frees when a deferred ops
      list is available to the AGFL fixup code. Add a dfops pointer to the
      transaction to carry dfops through various contexts to the allocator
      context. Deferring AGFL frees is  conditional behavior based on
      whether the transaction pointer is populated. The long term
      objective is to reuse the transaction pointer to clean up all
      unrelated callchains that pass dfops on the stack along with a
      transaction and in doing so, consistently defer AGFL blocks from the
      allocator.
      
      A bit of customization is required to handle deferred completion
      processing because AGFL blocks are accounted against a per-ag
      reservation pool and AGFL blocks are not inserted into the extent
      busy list when freed (they are inserted when used and released back
      to the AGFL). Reuse the majority of the existing deferred extent
      free infrastructure and customize it appropriately to handle AGFL
      blocks.
      
      Note that this patch only adds infrastructure. It does not change
      behavior because no callers have been updated to pass ->t_agfl_dfops
      into the allocation code.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      f8f2835a
  6. 15 3月, 2018 1 次提交
  7. 12 3月, 2018 2 次提交
    • B
      xfs: shutdown if block allocation overruns tx reservation · 3e78b9a4
      Brian Foster 提交于
      The ->t_blk_res_used field tracks how many blocks have been used in
      the current transaction. This should never exceed the block
      reservation (->t_blk_res) for a particular transaction. We currently
      assert this condition in the transaction block accounting code, but
      otherwise take no additional action should this situation occur.
      
      The overrun generally has no effect if space ends up being available
      and the associated transaction commits. If the transaction is
      duplicated, however, the current block usage is used to determine
      the remaining block reservation to be transferred to the new
      transaction. If usage exceeds reservation, this calculation
      underflows and creates a transaction with an invalid and excessive
      reservation. When the second transaction commits, the release of
      unused blocks corrupts the in-core free space counters. With lazy
      superblock accounting enabled, this inconsistency eventually
      trickles to the on-disk superblock and corrupts the filesystem.
      
      Replace the transaction block usage accounting assert with an
      explicit overrun check. If the transaction overruns the reservation,
      shutdown the filesystem immediately to prevent corruption. Add a new
      assert to xfs_trans_dup() to catch any callers that might induce
      this invalid state in the future.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      3e78b9a4
    • M
      xfs: Rename xa_ elements to ail_ · 57e80956
      Matthew Wilcox 提交于
      This is a simple rename, except that xa_ail becomes ail_head.
      Signed-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      57e80956
  8. 09 1月, 2018 1 次提交
  9. 02 9月, 2017 1 次提交
  10. 04 5月, 2017 1 次提交
  11. 07 4月, 2017 1 次提交
  12. 04 4月, 2017 1 次提交
  13. 19 9月, 2016 1 次提交
    • D
      xfs: set up per-AG free space reservations · 3fd129b6
      Darrick J. Wong 提交于
      One unfortunate quirk of the reference count and reverse mapping
      btrees -- they can expand in size when blocks are written to *other*
      allocation groups if, say, one large extent becomes a lot of tiny
      extents.  Since we don't want to start throwing errors in the middle
      of CoWing, we need to reserve some blocks to handle future expansion.
      The transaction block reservation counters aren't sufficient here
      because we have to have a reserve of blocks in every AG, not just
      somewhere in the filesystem.
      
      Therefore, create two per-AG block reservation pools.  One feeds the
      AGFL so that rmapbt expansion always succeeds, and the other feeds all
      other metadata so that refcountbt expansion never fails.
      
      Use the count of how many reserved blocks we need to have on hand to
      create a virtual reservation in the AG.  Through selective clamping of
      the maximum length of allocation requests and of the length of the
      longest free extent, we can make it look like there's less free space
      in the AG unless the reservation owner is asking for blocks.
      
      In other words, play some accounting tricks in-core to make sure that
      we always have blocks available.  On the plus side, there's nothing to
      clean up if we crash, which is contrast to the strategy that the rough
      draft used (actually removing extents from the freespace btrees).
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      3fd129b6
  14. 14 9月, 2016 1 次提交
  15. 06 4月, 2016 2 次提交
  16. 15 3月, 2016 1 次提交
  17. 12 10月, 2015 1 次提交
    • B
      xfs: per-filesystem stats counter implementation · ff6d6af2
      Bill O'Donnell 提交于
      This patch modifies the stats counting macros and the callers
      to those macros to properly increment, decrement, and add-to
      the xfs stats counts. The counts for global and per-fs stats
      are correctly advanced, and cleared by writing a "1" to the
      corresponding clear file.
      
      global counts: /sys/fs/xfs/stats/stats
      per-fs counts: /sys/fs/xfs/sda*/stats/stats
      
      global clear:  /sys/fs/xfs/stats/stats_clear
      per-fs clear:  /sys/fs/xfs/sda*/stats/stats_clear
      
      [dchinner: cleaned up macro variables, removed CONFIG_FS_PROC around
       stats structures and macros. ]
      Signed-off-by: NBill O'Donnell <billodo@redhat.com>
      Reviewed-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ff6d6af2
  18. 19 8月, 2015 1 次提交
  19. 04 6月, 2015 5 次提交
  20. 23 2月, 2015 5 次提交
    • D
      xfs: replace xfs_mod_incore_sb_batched · 0bd5dded
      Dave Chinner 提交于
      Introduce helper functions for modifying fields in the superblock
      into xfs_trans.c, the only caller of xfs_mod_incore_sb_batch().  We
      can then use these directly in xfs_trans_unreserve_and_mod_sb() and
      so remove another user of the xfs_mode_incore_sb() API without
      losing any functionality or scalability of the transaction commit
      code..
      
      Based on a patch from Christoph Hellwig.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      0bd5dded
    • D
      xfs: introduce xfs_mod_frextents · bab98bbe
      Dave Chinner 提交于
      Add a new helper to modify the incore counter of free realtime
      extents. This matches the helpers used for inode and data block
      counters, and removes a significant users of the xfs_mod_incore_sb()
      interface.
      
      Based on a patch originally from Christoph Hellwig.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      bab98bbe
    • D
      xfs: use generic percpu counters for free block counter · 0d485ada
      Dave Chinner 提交于
      XFS has hand-rolled per-cpu counters for the superblock since before
      there was any generic implementation. The free block counter is
      special in that it is used for ENOSPC detection outside transaction
      contexts for for delayed allocation. This means that the counter
      needs to be accurate at zero. The current per-cpu counter code jumps
      through lots of hoops to ensure we never run past zero, but we don't
      need to make all those jumps with the generic counter
      implementation.
      
      The generic counter implementation allows us to pass a "batch"
      threshold at which the addition/subtraction to the counter value
      will be folded back into global value under lock. We can use this
      feature to reduce the batch size as we approach 0 in a very similar
      manner to the existing counters and their rebalance algorithm. If we
      use a batch size of 1 as we approach 0, then every addition and
      subtraction will be done against the global value and hence allow
      accurate detection of zero threshold crossing.
      
      Hence we can replace the handrolled, accurate-at-zero counters with
      generic percpu counters.
      
      Note: this removes just enough of the icsb infrastructure to compile
      without warnings. The rest will go in subsequent commits.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      0d485ada
    • D
      xfs: use generic percpu counters for free inode counter · e88b64ea
      Dave Chinner 提交于
      XFS has hand-rolled per-cpu counters for the superblock since before
      there was any generic implementation. The free inode counter is not
      used for any limit enforcement - the per-AG free inode counters are
      used during allocation to determine if there are inode available for
      allocation.
      
      Hence we don't need any of the complexity of the hand-rolled
      counters and we can simply replace them with generic per-cpu
      counters similar to the inode counter.
      
      This version introduces a xfs_mod_ifree() helper function from
      Christoph Hellwig.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e88b64ea
    • D
      xfs: use generic percpu counters for inode counter · 501ab323
      Dave Chinner 提交于
      XFS has hand-rolled per-cpu counters for the superblock since before
      there was any generic implementation. There are some warts around
      the  use of them for the inode counter as the hand rolled counter is
      designed to be accurate at zero, but has no specific accurracy at
      any other value. This design causes problems for the maximum inode
      count threshold enforcement, as there is no trigger that balances
      the counters as they get close tothe maximum threshold.
      
      Instead of designing new triggers for balancing, just replace the
      handrolled per-cpu counter with a generic counter.  This enables us
      to update the counter through the normal superblock modification
      funtions, but rather than do that we add a xfs_mod_icount() helper
      function (from Christoph Hellwig) and keep the percpu counter
      outside the superblock in the struct xfs_mount.
      
      This means we still need to initialise the per-cpu counter
      specifically when we read the superblock, and vice versa when we
      log/write it, but it does mean that we don't need to change any
      other code.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      501ab323
  21. 22 1月, 2015 1 次提交
    • D
      xfs: set superblock buffer type correctly · 3443a3bc
      Dave Chinner 提交于
      When the superblock is modified in a transaction, the commonly
      modified fields are not actually copied to the superblock buffer to
      avoid the buffer lock becoming a serialisation point. However, there
      are some other operations that modify the superblock fields within
      the transaction that don't directly log to the superblock but rely
      on the changes to be applied during the transaction commit (to
      minimise the buffer lock hold time).
      
      When we do this, we fail to mark the buffer log item as being a
      superblock buffer and that can lead to the buffer not being marked
      with the corect type in the log and hence causing recovery issues.
      Fix it by setting the type correctly, similar to xfs_mod_sb()...
      
      cc: <stable@vger.kernel.org> # 3.10 to current
      Tested-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      3443a3bc
  22. 28 11月, 2014 2 次提交
  23. 25 6月, 2014 1 次提交
    • D
      xfs: global error sign conversion · 2451337d
      Dave Chinner 提交于
      Convert all the errors the core XFs code to negative error signs
      like the rest of the kernel and remove all the sign conversion we
      do in the interface layers.
      
      Errors for conversion (and comparison) found via searches like:
      
      $ git grep " E" fs/xfs
      $ git grep "return E" fs/xfs
      $ git grep " E[A-Z].*;$" fs/xfs
      
      Negation points found via searches like:
      
      $ git grep "= -[a-z,A-Z]" fs/xfs
      $ git grep "return -[a-z,A-D,F-Z]" fs/xfs
      $ git grep " -[a-z].*;" fs/xfs
      
      [ with some bits I missed from Brian Foster ]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2451337d
  24. 22 6月, 2014 1 次提交