1. 19 3月, 2020 1 次提交
  2. 27 1月, 2020 1 次提交
  3. 17 1月, 2020 3 次提交
  4. 19 12月, 2019 1 次提交
  5. 19 11月, 2019 1 次提交
  6. 08 11月, 2019 1 次提交
  7. 27 8月, 2019 1 次提交
  8. 29 6月, 2019 7 次提交
  9. 15 4月, 2019 1 次提交
  10. 21 11月, 2018 1 次提交
  11. 29 9月, 2018 2 次提交
    • B
      xfs: refactor xfs_buf_log_item reference count handling · 95808459
      Brian Foster 提交于
      The xfs_buf_log_item structure has a reference counter with slightly
      tricky semantics. In the common case, a buffer is logged and
      committed in a transaction, committed to the on-disk log (added to
      the AIL) and then finally written back and removed from the AIL. The
      bli refcount covers two potentially overlapping timeframes:
      
       1. the bli is held in an active transaction
       2. the bli is pinned by the log
      
      The caveat to this approach is that the reference counter does not
      purely dictate the lifetime of the bli. IOW, when a dirty buffer is
      physically logged and unpinned, the bli refcount may go to zero as
      the log item is inserted into the AIL. Only once the buffer is
      written back can the bli finally be freed.
      
      The above semantics means that it is not enough for the various
      refcount decrementing contexts to release the bli on decrement to
      zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and
      unpin (->iop_unpin()) must all drop the associated reference and
      make additional checks to determine if the current context is
      responsible for freeing the item.
      
      For example, if a transaction holds but does not dirty a particular
      bli, the commit may drop the refcount to zero. If the bli itself is
      clean, it is also not AIL resident and must be freed at this time.
      The same is true for xfs_trans_brelse(). If the transaction dirties
      a bli and then aborts or an unpin results in an abort due to a log
      I/O error, the last reference count holder is expected to explicitly
      remove the item from the AIL and release it (since an abort means
      filesystem shutdown and metadata writeback will never occur).
      
      This leads to fairly complex checks being replicated in a few
      different places. Since ->iop_unlock() and xfs_trans_brelse() are
      nearly identical, refactor the logic into a common helper that
      implements and documents the semantics in one place. This patch does
      not change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      95808459
    • B
      xfs: don't unlock invalidated buf on aborted tx commit · d9183105
      Brian Foster 提交于
      xfstests generic/388,475 occasionally reproduce assertion failures
      in xfs_buf_item_unpin() when the final bli reference is dropped on
      an invalidated buffer and the buffer is not locked as it is expected
      to be. Invalidated buffers should remain locked on transaction
      commit until the final unpin, at which point the buffer is removed
      from the AIL and the bli is freed since stale buffers are not
      written back.
      
      The assert failures are associated with filesystem shutdown,
      typically due to log I/O errors injected by the test. The
      problematic situation can occur if the shutdown happens to cause a
      race between an active transaction that has invalidated a particular
      buffer and an I/O error on a log buffer that contains the bli
      associated with the same (now stale) buffer.
      
      Both transaction and log contexts acquire a bli reference. If the
      transaction has already invalidated the buffer by the time the I/O
      error occurs and ends up aborting due to shutdown, the transaction
      and log hold the last two references to a stale bli. If the
      transaction cancel occurs first, it treats the buffer as non-stale
      due to the aborted state: the bli reference is dropped and the
      buffer is released/unlocked. The log buffer I/O error handling
      eventually calls into xfs_buf_item_unpin(), drops the final
      reference to the bli and treats it as stale. The buffer wasn't left
      locked by xfs_buf_item_unlock(), however, so the assert fails and
      the buffer is double unlocked. The latter problem is mitigated by
      the fact that the fs is shutdown and no further damage is possible.
      
      ->iop_unlock() of an invalidated buffer should behave consistently
      with respect to the bli refcount, regardless of aborted state. If
      the refcount remains elevated on commit, we know the bli is awaiting
      an unpin (since it can't be in another transaction) and will be
      handled appropriately on log buffer completion. If the final bli
      reference of an invalidated buffer is dropped in ->iop_unlock(), we
      can assume the transaction has aborted because invalidation implies
      a dirty transaction. In the non-abort case, the log would have
      acquired a bli reference in ->iop_pin() and prevented bli release at
      ->iop_unlock() time. In the abort case the item must be freed and
      buffer unlocked because it wasn't pinned by the log.
      
      Rework xfs_buf_item_unlock() to simplify the currently circuitous
      and duplicate logic and leave invalidated buffers locked based on
      bli refcount, regardless of aborted state. This ensures that a
      pinned, stale buffer is always found locked when eventually
      unpinned.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      d9183105
  12. 09 6月, 2018 1 次提交
  13. 07 6月, 2018 1 次提交
    • D
      xfs: convert to SPDX license tags · 0b61f8a4
      Dave Chinner 提交于
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      0b61f8a4
  14. 10 5月, 2018 3 次提交
  15. 12 3月, 2018 1 次提交
  16. 29 1月, 2018 3 次提交
  17. 02 9月, 2017 4 次提交
  18. 23 8月, 2017 2 次提交
  19. 19 6月, 2017 1 次提交
    • B
      xfs: remove bli from AIL before release on transaction abort · 3d4b4a3e
      Brian Foster 提交于
      When a buffer is modified, logged and committed, it ultimately ends
      up sitting on the AIL with a dirty bli waiting for metadata
      writeback. If another transaction locks and invalidates the buffer
      (freeing an inode chunk, for example) in the meantime, the bli is
      flagged as stale, the dirty state is cleared and the bli remains in
      the AIL.
      
      If a shutdown occurs before the transaction that has invalidated the
      buffer is committed, the transaction is ultimately aborted. The log
      items are flagged as such and ->iop_unlock() handles the aborted
      items. Because the bli is clean (due to the invalidation),
      ->iop_unlock() unconditionally releases it. The log item may still
      reside in the AIL, however, which means the I/O completion handler
      may still run and attempt to access it. This results in assert
      failure due to the release of the bli while still present in the AIL
      and a subsequent NULL dereference and panic in the buffer I/O
      completion handling. This can be reproduced by running generic/388
      in repetition.
      
      To avoid this problem, update xfs_buf_item_unlock() to first check
      whether the bli is aborted and if so, remove it from the AIL before
      it is released. This ensures that the bli is no longer accessed
      during the shutdown sequence after it has been freed.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      3d4b4a3e
  20. 04 2月, 2017 1 次提交
  21. 14 9月, 2016 2 次提交
    • E
      xfs: normalize "infinite" retries in error configs · 77169812
      Eric Sandeen 提交于
      As it stands today, the "fail immediately" vs. "retry forever"
      values for max_retries and retry_timeout_seconds in the xfs metadata
      error configurations are not consistent.
      
      A retry_timeout_seconds of 0 means "retry forever," but a
      max_retries of 0 means "fail immediately."
      
      retry_timeout_seconds < 0 is disallowed, while max_retries == -1
      means "retry forever."
      
      Make this consistent across the error configs, such that a value of
      0 means "fail immediately" (i.e. wait 0 seconds, or retry 0 times),
      and a value of -1 always means "retry forever."
      
      This makes retry_timeout a signed long to accommodate the -1, even
      though it stores jiffies.  Given our limit of a 1 day maximum
      timeout, this should be sufficient even at much higher HZ values
      than we have available today.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      77169812
    • X
      xfs: fix signed integer overflow · 79c350e4
      Xie XiuQi 提交于
      Use 1U for unsigned int to avoid a overflow warning from UBSAN.
      
      [   31.910858] UBSAN: Undefined behaviour in fs/xfs/xfs_buf_item.c:889:25
      [   31.911252] signed integer overflow:
      [   31.911478] -2147483648 - 1 cannot be represented in type 'int'
      [   31.911846] CPU: 1 PID: 1011 Comm: tuned Tainted: G    B          ---- -------   3.10.0-327.28.3.el7.x86_64 #1
      [   31.911857] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 01/07/2011
      [   31.911866]  1ffff1004069cd3b 0000000076bec3fd ffff8802034e69a0 ffffffff81ee3140
      [   31.911883]  ffff8802034e69b8 ffffffff81ee31fd ffffffffa0ad79e0 ffff8802034e6b20
      [   31.911898]  ffffffff81ee46e2 0000002d515470c0 0000000000000001 0000000041b58ab3
      [   31.911913] Call Trace:
      [   31.911932]  [<ffffffff81ee3140>] dump_stack+0x1e/0x20
      [   31.911947]  [<ffffffff81ee31fd>] ubsan_epilogue+0x12/0x55
      [   31.911964]  [<ffffffff81ee46e2>] handle_overflow+0x1ba/0x215
      [   31.912083]  [<ffffffff81ee4798>] __ubsan_handle_sub_overflow+0x2a/0x31
      [   31.912204]  [<ffffffffa08676fb>] xfs_buf_item_log+0x34b/0x3f0 [xfs]
      [   31.912314]  [<ffffffffa0880490>] xfs_trans_log_buf+0x120/0x260 [xfs]
      [   31.912402]  [<ffffffffa079a890>] xfs_btree_log_recs+0x80/0xc0 [xfs]
      [   31.912490]  [<ffffffffa07a29f8>] xfs_btree_delrec+0x11a8/0x2d50 [xfs]
      [   31.913589]  [<ffffffffa07a86f9>] xfs_btree_delete+0xc9/0x260 [xfs]
      [   31.913762]  [<ffffffffa075b5cf>] xfs_free_ag_extent+0x63f/0xe20 [xfs]
      [   31.914339]  [<ffffffffa075ec0f>] xfs_free_extent+0x2af/0x3e0 [xfs]
      [   31.914641]  [<ffffffffa0801b2b>] xfs_bmap_finish+0x32b/0x4b0 [xfs]
      [   31.914841]  [<ffffffffa083c2e7>] xfs_itruncate_extents+0x3b7/0x740 [xfs]
      [   31.915216]  [<ffffffffa08342fa>] xfs_setattr_size+0x60a/0x860 [xfs]
      [   31.915471]  [<ffffffffa08345ea>] xfs_vn_setattr+0x9a/0xe0 [xfs]
      [   31.915590]  [<ffffffff8149ad38>] notify_change+0x5c8/0x8a0
      [   31.915607]  [<ffffffff81450f22>] do_truncate+0x122/0x1d0
      [   31.915640]  [<ffffffff8147beee>] do_last+0x15de/0x2c80
      [   31.915707]  [<ffffffff8147d777>] path_openat+0x1e7/0xcc0
      [   31.915802]  [<ffffffff81480824>] do_filp_open+0xa4/0x160
      [   31.915848]  [<ffffffff81453127>] do_sys_open+0x1b7/0x3f0
      [   31.915879]  [<ffffffff81453392>] SyS_open+0x32/0x40
      [   31.915897]  [<ffffffff81f08989>] system_call_fastpath+0x16/0x1b
      
      [  240.086809] UBSAN: Undefined behaviour in fs/xfs/xfs_buf_item.c:866:34
      [  240.086820] signed integer overflow:
      [  240.086830] -2147483648 - 1 cannot be represented in type 'int'
      [  240.086846] CPU: 1 PID: 12969 Comm: rm Tainted: G    B          ---- -------   3.10.0-327.28.3.el7.x86_64 #1
      [  240.086857] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 01/07/2011
      [  240.086868]  1ffff10040491def 00000000e2ea59c1 ffff88020248ef40 ffffffff81ee3140
      [  240.086885]  ffff88020248ef58 ffffffff81ee31fd ffffffffa0ad79e0 ffff88020248f0c0
      [  240.086901]  ffffffff81ee46e2 0000002d02488000 0000000000000001 0000000041b58ab3
      [  240.086915] Call Trace:
      [  240.086938]  [<ffffffff81ee3140>] dump_stack+0x1e/0x20
      [  240.086953]  [<ffffffff81ee31fd>] ubsan_epilogue+0x12/0x55
      [  240.086971]  [<ffffffff81ee46e2>] handle_overflow+0x1ba/0x215
      ...
      Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      79c350e4
  22. 22 7月, 2016 1 次提交
    • D
      xfs: allocate log vector buffers outside CIL context lock · b1c5ebb2
      Dave Chinner 提交于
      One of the problems we currently have with delayed logging is that
      under serious memory pressure we can deadlock memory reclaim. THis
      occurs when memory reclaim (such as run by kswapd) is reclaiming XFS
      inodes and issues a log force to unpin inodes that are dirty in the
      CIL.
      
      The CIL is pushed, but this will only occur once it gets the CIL
      context lock to ensure that all committing transactions are complete
      and no new transactions start being committed to the CIL while the
      push switches to a new context.
      
      The deadlock occurs when the CIL context lock is held by a
      committing process that is doing memory allocation for log vector
      buffers, and that allocation is then blocked on memory reclaim
      making progress. Memory reclaim, however, is blocked waiting for
      a log force to make progress, and so we effectively deadlock at this
      point.
      
      To solve this problem, we have to move the CIL log vector buffer
      allocation outside of the context lock so that memory reclaim can
      always make progress when it needs to force the log. The problem
      with doing this is that a CIL push can take place while we are
      determining if we need to allocate a new log vector buffer for
      an item and hence the current log vector may go away without
      warning. That means we canot rely on the existing log vector being
      present when we finally grab the context lock and so we must have a
      replacement buffer ready to go at all times.
      
      To ensure this, introduce a "shadow log vector" buffer that is
      always guaranteed to be present when we gain the CIL context lock
      and format the item. This shadow buffer may or may not be used
      during the formatting, but if the log item does not have an existing
      log vector buffer or that buffer is too small for the new
      modifications, we swap it for the new shadow buffer and format
      the modifications into that new log vector buffer.
      
      The result of this is that for any object we modify more than once
      in a given CIL checkpoint, we double the memory required
      to track dirty regions in the log. For single modifications then
      we consume the shadow log vectorwe allocate on commit, and that gets
      consumed by the checkpoint. However, if we make multiple
      modifications, then the second transaction commit will allocate a
      shadow log vector and hence we will end up with double the memory
      usage as only one of the log vectors is consumed by the CIL
      checkpoint. The remaining shadow vector will be freed when th elog
      item is freed.
      
      This can probably be optimised in future - access to the shadow log
      vector is serialised by the object lock (as opposited to the active
      log vector, which is controlled by the CIL context lock) and so we
      can probably free shadow log vector from some objects when the log
      item is marked clean on removal from the AIL.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      b1c5ebb2