1. 20 May, 2020 1 commit
    • xfs: use ordered buffers to initialize dquot buffers during quotacheck · 78bba5c8
      Committed by Darrick J. Wong
      While QAing the new xfs_repair quotacheck code, I uncovered a quota
      corruption bug resulting from a bad interaction between dquot buffer
      initialization and quotacheck.  The bug can be reproduced with the
      following sequence:
      
      # mkfs.xfs -f /dev/sdf
      # mount /dev/sdf /opt -o usrquota
      # su nobody -s /bin/bash -c 'touch /opt/barf'
      # sync
      # xfs_quota -x -c 'report -ahi' /opt
      User quota on /opt (/dev/sdf)
                              Inodes
      User ID      Used   Soft   Hard Warn/Grace
      ---------- ---------------------------------
      root            3      0      0  00 [------]
      nobody          1      0      0  00 [------]
      
      # xfs_io -x -c 'shutdown' /opt
      # umount /opt
      # mount /dev/sdf /opt -o usrquota
      # touch /opt/man2
      # xfs_quota -x -c 'report -ahi' /opt
      User quota on /opt (/dev/sdf)
                              Inodes
      User ID      Used   Soft   Hard Warn/Grace
      ---------- ---------------------------------
      root            1      0      0  00 [------]
      nobody          1      0      0  00 [------]
      
      # umount /opt
      
      Notice how the initial quotacheck set the root dquot icount to 3
      (rootino, rbmino, rsumino), but after shutdown -> remount -> recovery,
      xfs_quota reports that the root dquot has only 1 icount.  We haven't
      deleted anything from the filesystem, which means that quota is now
      under-counting.  This behavior is not limited to icount or the root
      dquot, but this is the shortest reproducer.
      
      I traced the cause of this discrepancy to the way that we handle ondisk
      dquot updates during quotacheck vs. regular fs activity.  Normally, when
      we allocate a disk block for a dquot, we log the buffer as a regular
      (dquot) buffer.  Subsequent updates to the dquots backed by that block
      are done via separate dquot log item updates, which means that they
      depend on the logged buffer update being written to disk before the
      dquot items.  Because individual dquots have their own LSN fields, that
      initial dquot buffer must always be recovered.
      
      However, the story changes for quotacheck, which can cause dquot block
      allocations but persists the final dquot counter values via a delwri
      list.  Because recovery doesn't gate dquot buffer replay on an LSN, this
      means that the initial dquot buffer can be replayed over the (newer)
      contents that were delwritten at the end of quotacheck.  In effect, this
      re-initializes the dquot counters after they've been updated.  If the
      log does not contain any other dquot items to recover, the obsolete
      dquot contents will not be corrected by log recovery.
      
      Because quotacheck uses a transaction to log the setting of the CHKD
      flags in the superblock, we skip quotacheck during the second mount
      call, which allows the incorrect icount to remain.
      
      Fix this by changing the ondisk dquot initialization function to use
      ordered buffers to write out fresh dquot blocks if it detects that we're
      running quotacheck.  If the system goes down before quotacheck can
      complete, the CHKD flags will not be set in the superblock and the next
      mount will run quotacheck again, which can fix uninitialized dquot
      buffers.  This requires amending the defer code to maintain ordered
      buffer state across defer rolls for the sake of the dquot allocation
      code.
      
      For regular operations we preserve the current behavior since the dquot
      items require properly initialized ondisk dquot records.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
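The replay hazard described in this commit message can be sketched with a toy model. This is not kernel code: `Disk`, `quotacheck`, `recover` and `reproduce` are hypothetical names, the log is a plain list, and "ordered" is reduced to "the freshly initialized dquot buffer contents are not written to the log". Under those assumptions the model reproduces the root icount values from the reproducer (3 before the shutdown, 1 after remount and `touch`).

```python
from typing import Optional


class Disk:
    """Toy ondisk state: only the root dquot's inode count."""

    def __init__(self):
        self.root_icount: Optional[int] = None


def quotacheck(disk, log, counted, ordered):
    """Initialize a fresh dquot block, then write the real counters via delwri."""
    if not ordered:
        # Old behavior: the zeroed dquot buffer is logged as a regular buffer.
        log.append(("dquot_buf_init", 0))
    # With an ordered buffer nothing is logged, so recovery has nothing to replay.
    disk.root_icount = counted  # delwri list flushed at the end of quotacheck


def recover(disk, log):
    """Log recovery: dquot buffer replay is not gated on an LSN."""
    for kind, icount in log:
        if kind == "dquot_buf_init":
            disk.root_icount = icount  # clobbers the newer delwritten value


def reproduce(ordered):
    disk, log = Disk(), []
    quotacheck(disk, log, counted=3, ordered=ordered)  # first mount
    recover(disk, log)       # shutdown, then remount; CHKD set, no quotacheck
    disk.root_icount += 1    # touch /opt/man2 as root
    return disk.root_icount


print(reproduce(ordered=False))  # 1: under-counted, as in the report above
print(reproduce(ordered=True))   # 4: the three initial inodes plus man2
```

The model leaves out the crash-before-completion case, where the CHKD flags are never set and the next mount simply reruns quotacheck.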
  2. 07 May, 2020 5 commits
  3. 29 Mar, 2020 1 commit
  4. 03 Mar, 2020 2 commits
  5. 27 Jan, 2020 1 commit
  6. 07 Jan, 2020 1 commit
  7. 19 Nov, 2019 3 commits
  8. 14 Nov, 2019 3 commits
  9. 30 Oct, 2019 1 commit
  10. 24 Oct, 2019 1 commit
    • xfs: don't set bmapi total block req where minleft is · da781e64
      Committed by Brian Foster
      xfs_bmapi_write() takes a total block requirement parameter that is
      passed down to the block allocation code and is used to specify the
      total block requirement of the associated transaction. This is used
      to try and select an AG that can not only satisfy the requested
      extent allocation, but can also accommodate subsequent allocations
      that might be required to complete the transaction. For example,
      additional bmbt block allocations may be required on insertion of
      the resulting extent to an inode data fork.
      
      While it's important for callers to calculate and reserve such extra
      blocks in the transaction, it is not necessary to pass the total
      value to xfs_bmapi_write() in all cases. The latter automatically
      sets minleft to ensure that sufficient free blocks remain after the
      allocation attempt to expand the format of the associated inode
      (e.g., extent to btree conversion, btree splits, etc.).
      Therefore, any callers that pass a total block requirement of the
      bmap mapping length plus worst case bmbt expansion essentially
      specify the additional reservation requirement twice. These callers
      can pass a total of zero to rely on the bmapi minleft policy.
      
      Beyond being superfluous, the primary motivation for this change is
      that the total reservation logic in the bmbt code is dubious in
      scenarios where minlen < maxlen and a maxlen extent cannot be
      allocated (which is more common for data extent allocations where
      contiguity is not required). The total value is based on maxlen in
      the xfs_bmapi_write() caller. If the bmbt code falls back to an
      allocation between minlen and maxlen, that allocation will not
      succeed until total is reset to minlen, which essentially throws
      away any additional reservation included in total by the caller. In
      addition, the total value is not reset until after alignment is
      dropped, which means that such callers drop alignment far more
      aggressively than necessary.
      
      Update all callers of xfs_bmapi_write() that pass a total block
      value of the mapping length plus bmbt reservation to instead pass
      zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation
      requirement. This trades off slightly less conservative AG selection
      for the ability to preserve alignment in more scenarios.
      xfs_bmapi_write() callers that incorporate unrelated or additional
      reservations in total beyond what is already included in minleft
      must continue to use the former.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
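The "reservation specified twice" argument can be illustrated with a small arithmetic sketch. The eligibility rule below is a deliberate simplification and an assumption of this example, not the kernel's allocator logic: an AG qualifies if, after setting aside `minleft` blocks, it still has room for the larger of `total` and the minimum extent length. The function name and the block counts are hypothetical.

```python
def ag_has_space(free_blocks, minlen, total, minleft):
    """Toy AG eligibility check (simplified assumption, not kernel code).

    `minleft` blocks are held back for later bmbt expansion; what remains
    must cover the larger of `total` and the minimum extent length.
    """
    return free_blocks - minleft >= max(total, minlen)


map_len = 100      # blocks being mapped
bmbt_worst = 4     # hypothetical worst-case bmbt expansion
free = 104         # exactly enough for the extent plus the bmbt blocks

# Caller passes total = mapping length + bmbt reservation, and minleft
# independently reserves the bmbt blocks again: the AG is rejected.
old_style = ag_has_space(free, minlen=map_len,
                         total=map_len + bmbt_worst, minleft=bmbt_worst)

# Caller passes total = 0 and relies on minleft alone: the AG qualifies.
new_style = ag_has_space(free, minlen=map_len, total=0, minleft=bmbt_worst)

print(old_style, new_style)  # False True
```

In this model the AG holds exactly the blocks the transaction could consume, yet the old-style call rejects it because the bmbt blocks are counted in both `total` and `minleft`.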
  11. 30 Aug, 2019 1 commit
  12. 27 Aug, 2019 1 commit
  13. 03 Jul, 2019 1 commit
  14. 29 Jun, 2019 1 commit
  15. 30 Apr, 2019 1 commit
    • xfs: always rejoin held resources during defer roll · 710d707d
      Committed by Darrick J. Wong
      During testing of xfs/141 on a V4 filesystem, I observed some
      inconsistent behavior with regard to resources that are held (i.e.
      remain locked) across a defer roll.  The transaction roll always gives
      the defer roll function a new transaction, even if committing the old
      transaction fails.  However, the defer roll function only rejoins the
      held resources if the transaction commit succeeded.  This means that
      callers of defer roll have to figure out whether the held resources are
      attached to the transaction being passed back.
      
      Worse yet, if the defer roll was part of a defer finish call, we have a
      third possibility: the defer finish could pass back a dirty transaction
      with dirty held resources and an error code.
      
      The only sane way to handle all of these scenarios is to require that
      the code that held the resource either cancel the transaction before
      unlocking and releasing the resources, or use functions that detach
      resources from a transaction properly (e.g. xfs_trans_brelse) if they
      need to drop the reference before committing or cancelling the
      transaction.
      
      In order to make this so, change the defer roll code to join held
      resources to the new transaction unconditionally and fix all the bhold
      callers to release the held buffers correctly.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
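The ambiguity this commit removes can be shown with a minimal model. All names here (`Transaction`, `commit`, `defer_roll`, the `always_rejoin` switch) are hypothetical stand-ins for illustration only; the one behavior taken from the commit message is that a roll always hands back a new transaction, even when committing the old one fails.

```python
EIO = 5


class Transaction:
    """Toy transaction: just the set of resources joined to it."""

    def __init__(self):
        self.items = set()

    def join(self, resource):
        self.items.add(resource)


def commit(tp, fail):
    """Committing detaches everything from the old transaction either way."""
    tp.items.clear()
    return -EIO if fail else 0


def defer_roll(tp, held, fail=False, always_rejoin=True):
    """Roll to a new transaction and rejoin the held resources."""
    error = commit(tp, fail)
    new_tp = Transaction()  # handed back even if the commit failed
    if error == 0 or always_rejoin:
        for resource in held:
            new_tp.join(resource)
    return new_tp, error


held = {"agf_buf"}

# Old behavior: on failure the caller gets a transaction without its buffer.
tp, err = defer_roll(Transaction(), held, fail=True, always_rejoin=False)
print(err, sorted(tp.items))  # -5 []

# New behavior: the held buffer is always attached to the returned transaction.
tp, err = defer_roll(Transaction(), held, fail=True, always_rejoin=True)
print(err, sorted(tp.items))  # -5 ['agf_buf']
```

With unconditional rejoining, a caller can follow one rule on every path: cancel the returned transaction before unlocking and releasing what it held.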
  16. 08 Aug, 2018 1 commit
  17. 03 Aug, 2018 2 commits
    • xfs: cancel dfops on xfs_defer_finish() error · 9b1f4e98
      Committed by Brian Foster
      The current semantics of xfs_defer_finish() require the caller to
      call xfs_defer_cancel() on error. This is slightly inconsistent with
      transaction commit error handling where a failed commit cleans up
      the transaction before returning.
      
      More significantly, the only requirement for exposure of
      ->dop_pending outside of xfs_defer_finish() is so that
      xfs_defer_cancel() can drain it on error. Since the only recourse of
      xfs_defer_finish() errors is cancellation, mirror the transaction
      logic and cancel remaining dfops before returning from
      xfs_defer_finish() with an error.
      
      Besides simplifying xfs_defer_finish() semantics, this ensures that
      xfs_defer_finish() always returns with an empty ->dop_pending and
      thus facilitates removal of the list from xfs_defer_ops.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
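A small sketch of the "cancel on error" semantics, mirroring how a failed transaction commit cleans up after itself. The classes and functions below are hypothetical simplifications; only the idea that the finish routine drains the pending list itself before returning an error comes from the commit message.

```python
class DeferOps:
    """Toy dfops: a list of pending deferred work items."""

    def __init__(self, pending):
        self.pending = list(pending)


def defer_cancel(dfops):
    """Drop all remaining deferred work."""
    dfops.pending.clear()


def defer_finish(dfops, finish_one):
    """Process pending items in order; on error, cancel the rest and return."""
    while dfops.pending:
        error = finish_one(dfops.pending[0])
        if error:
            defer_cancel(dfops)  # caller no longer has to do this
            return error
        dfops.pending.pop(0)
    return 0


def finish_one(item):
    """Hypothetical worker that fails on the item named 'b'."""
    return -5 if item == "b" else 0


dfops = DeferOps(["a", "b", "c"])
error = defer_finish(dfops, finish_one)
print(error, dfops.pending)  # -5 []
```

Because the pending list is empty on every return path, nothing outside the finish routine needs access to it.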
    • xfs: automatic dfops buffer relogging · 82ff27bc
      Committed by Brian Foster
      Buffers that are held across deferred operations are explicitly
      joined to the dfops structure to ensure appropriate relogging.
      While buffers are currently joined explicitly, we can detect the
      conditions that require relogging at dfops finish time by inspecting
      the transaction item list for held buffers.
      
      Replace the xfs_defer_bjoin() infrastructure with such detection and
      automatic relogging of held buffers. This eliminates the need for
      the per-dfops buffer list, replaced by an on-stack variant in
      xfs_defer_trans_roll().
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
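The detection step can be pictured as a scan over the transaction's item list. This is an illustrative sketch only: `BufLogItem`, `InodeLogItem`, `held_buffers` and `defer_trans_roll` are made-up Python stand-ins, and the real roll does more than this. It shows how a temporary, function-local list can take the place of a per-dfops buffer list that callers had to populate explicitly.

```python
from dataclasses import dataclass


@dataclass
class BufLogItem:
    name: str
    held: bool = False   # True if the buffer stays locked across the roll


@dataclass
class InodeLogItem:
    name: str


def held_buffers(tp_items):
    """Find held buffers by inspecting the transaction item list."""
    return [it for it in tp_items if isinstance(it, BufLogItem) and it.held]


def defer_trans_roll(tp_items):
    """Roll: collect held buffers locally, then rejoin them to the new list."""
    saved = held_buffers(tp_items)   # local list, no per-dfops state
    new_tp_items = []                # the fresh transaction's item list
    new_tp_items.extend(saved)       # rejoin so the buffers are relogged
    return new_tp_items


items = [BufLogItem("agf", held=True), BufLogItem("agi"), InodeLogItem("ip")]
print([it.name for it in defer_trans_roll(items)])  # ['agf']
```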
  18. 27 Jul, 2018 2 commits
  19. 12 Jul, 2018 7 commits
  20. 07 Jun, 2018 1 commit
    • xfs: convert to SPDX license tags · 0b61f8a4
      Committed by Dave Chinner
      Remove the verbose license text from XFS files and replace them
      with SPDX tags. This does not change the license of any of the code,
      merely refers to the common, up-to-date license files in LICENSES/.
      
      This change was mostly scripted. fs/xfs/Makefile and
      fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
      and modified by the following command:
      
      for f in `git grep -l "GNU General" fs/xfs/` ; do
      	echo $f
      	cat $f | awk -f hdr.awk > $f.new
      	mv -f $f.new $f
      done
      
      And the hdr.awk script that did the modification (including
      detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
      is as follows:
      
      $ cat hdr.awk
      BEGIN {
      	hdr = 1.0
      	tag = "GPL-2.0"
      	str = ""
      }
      
      /^ \* This program is free software/ {
      	hdr = 2.0;
      	next
      }
      
      /any later version./ {
      	tag = "GPL-2.0+"
      	next
      }
      
      /^ \*\// {
      	if (hdr > 0.0) {
      		print "// SPDX-License-Identifier: " tag
      		print str
      		print $0
      		str=""
      		hdr = 0.0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \* / {
      	if (hdr > 1.0)
      		next
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      	next
      }
      
      /^ \*/ {
      	if (hdr > 0.0)
      		next
      	print $0
      	next
      }
      
      // {
      	if (hdr > 0.0) {
      		if (str != "")
      			str = str "\n"
      		str = str $0
      		next
      	}
      	print $0
      }
      
      END { }
      $
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
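To see what the hdr.awk script does to a file header, here is a line-by-line Python port applied to a made-up sample. The port is an illustration written for this page, not part of the commit; `convert_header` and the sample header text are hypothetical, and the rule order follows the awk script above.

```python
import re


def convert_header(lines):
    """Python port of hdr.awk: replace license text with an SPDX tag."""
    hdr = 1            # 1: in the leading comment, 2: in license text, 0: done
    tag = "GPL-2.0"
    kept = ""          # header lines to re-emit after the SPDX line
    out = []

    def keep(buf, line):
        return line if buf == "" else buf + "\n" + line

    for line in lines:
        if re.search(r"^ \* This program is free software", line):
            hdr = 2
        elif re.search(r"any later version.", line):
            tag = "GPL-2.0+"
        elif re.search(r"^ \*/", line):
            if hdr > 0:
                out += ["// SPDX-License-Identifier: " + tag, kept, line]
                kept, hdr = "", 0
            else:
                out.append(line)
        elif re.search(r"^ \* ", line):
            if hdr > 1:
                continue                  # drop the license boilerplate
            if hdr > 0:
                kept = keep(kept, line)   # keep copyright lines
            else:
                out.append(line)
        elif re.search(r"^ \*", line):
            if hdr == 0:
                out.append(line)          # bare " *" lines in the header go
        else:
            if hdr > 0:
                kept = keep(kept, line)
            else:
                out.append(line)
    return "\n".join(out)


sample = [
    "/*",
    " * Copyright (c) 2000 Example Corp.",
    " * All Rights Reserved.",
    " *",
    " * This program is free software; you can redistribute it and/or",
    " * modify it under the terms of the GNU General Public License as",
    " * published by the Free Software Foundation.",
    " */",
    '#include "xfs.h"',
]
print(convert_header(sample))
```

For this sample the output is the SPDX line, followed by the comment reduced to its two copyright lines, followed by the `#include` line unchanged.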
  21. 16 5月, 2018 1 次提交
  22. 10 5月, 2018 2 次提交