1. 13 12月, 2013 1 次提交
  2. 06 12月, 2013 1 次提交
  3. 24 10月, 2013 5 次提交
    • D
      xfs: decouple inode and bmap btree header files · a4fbe6ab
      Dave Chinner 提交于
      Currently the xfs_inode.h header has a dependency on the definition
      of the BMAP btree records as the inode fork includes an array of
      xfs_bmbt_rec_host_t objects in it's definition.
      
      Move all the btree format definitions from xfs_btree.h,
      xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
      xfs_format.h to continue the process of centralising the on-disk
      format definitions. With this done, the xfs inode definitions are no
      longer dependent on btree header files.
      
      The enables a massive culling of unnecessary includes, with close to
      200 #include directives removed from the XFS kernel code base.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      a4fbe6ab
    • D
      xfs: decouple log and transaction headers · 239880ef
      Dave Chinner 提交于
      xfs_trans.h has a dependency on xfs_log.h for a couple of
      structures. Most code that does transactions doesn't need to know
      anything about the log, but this dependency means that they have to
      include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
      files and clean up the includes to be in dependency order.
      
      In doing this, remove the direct include of xfs_trans_reserve.h from
      xfs_trans.h so that we remove the dependency between xfs_trans.h and
      xfs_mount.h. Hence the xfs_trans.h include can be moved to the
      indicate the actual dependencies other header files have on it.
      
      Note that these are kernel only header files, so this does not
      translate to any userspace changes at all.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      239880ef
    • D
      xfs: split dquot buffer operations out · 9aede1d8
      Dave Chinner 提交于
      Parts of userspace want to be able to read and modify dquot buffers
      (e.g. xfs_db) so we need to split out the reading and writing of
      these buffers so it is easy to shared code with libxfs in userspace.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      9aede1d8
    • D
      xfs: unify directory/attribute format definitions · 57062787
      Dave Chinner 提交于
      The on-disk format definitions for the directory and attribute
      structures are spread across 3 header files right now, only one of
      which is dedicated to defining on-disk structures and their
      manipulation (xfs_dir2_format.h). Pull all the format definitions
      into a single header file - xfs_da_format.h - and switch all the
      code over to point at that.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      57062787
    • D
      xfs: create a shared header file for format-related information · 70a9883c
      Dave Chinner 提交于
      All of the buffer operations structures are needed to be exported
      for xfs_db, so move them all to a common location rather than
      spreading them all over the place. They are verifying the on-disk
      format, so while xfs_format.h might be a good place, it is not part
      of the on disk format.
      
      Hence we need to create a new header file that we centralise these
      related definitions. Start by moving the bffer operations
      structures, and then also move all the other definitions that have
      crept into xfs_log_format.h and xfs_format.h as there was no other
      shared header file to put them in.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      70a9883c
  4. 18 10月, 2013 1 次提交
  5. 05 10月, 2013 2 次提交
  6. 01 10月, 2013 2 次提交
  7. 25 9月, 2013 1 次提交
    • D
      xfs: log recovery lsn ordering needs uuid check · 566055d3
      Dave Chinner 提交于
      After a fair number of xfstests runs, xfs/182 started to fail
      regularly with a corrupted directory - a directory read verifier was
      failing after recovery because it found a block with a XARM magic
      number (remote attribute block) rather than a directory data block.
      
      The first time I saw this repeated failure I did /something/ and the
      problem went away, so I was never able to find the underlying
      problem. Test xfs/182 failed again today, and I found the root
      cause before I did /something else/ that made it go away.
      
      Tracing indicated that the block in question was being correctly
      logged, the log was being flushed by sync, but the buffer was not
      being written back before the shutdown occurred. Tracing also
      indicated that log recovery was also reading the block, but then
      never writing it before log recovery invalidated the cache,
      indicating that it was not modified by log recovery.
      
      More detailed analysis of the corpse indicated that the filesystem
      had a uuid of "a4131074-1872-4cac-9323-2229adbcb886" but the XARM
      block had a uuid of "8f32f043-c3c9-e7f8-f947-4e7f989c05d3", which
      indicated it was a block from an older filesystem. The reason that
      log recovery didn't replay it was that the LSN in the XARM block was
      larger than the LSN of the transaction being replayed, and so the
      block was not overwritten by log recovery.
      
      Hence, log recovery cant blindly trust the magic number and LSN in
      the block - it must verify that it belongs to the filesystem being
      recovered before using the LSN. i.e. if the UUIDs don't match, we
      need to unconditionally recovery the change held in the log.
      
      This patch was first tested on a block device that was repeatedly
      causing xfs/182 to fail with the same failure on the same block with
      the same directory read corruption signature (i.e. XARM block). It
      did not fail, and hasn't failed since.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      566055d3
  8. 12 9月, 2013 1 次提交
  9. 11 9月, 2013 1 次提交
    • D
      xfs: recovery of swap extents operations for CRC filesystems · 638f4416
      Dave Chinner 提交于
      This is the recovery side of the btree block owner change operation
      performed by swapext on CRC enabled filesystems. We detect that an
      owner change is needed by the flag that has been placed on the inode
      log format flag field. Because the inode recovery is being replayed
      after the buffers that make up the BMBT in the given checkpoint, we
      can walk all the buffers and directly modify them when we see the
      flag set on an inode.
      
      Because the inode can be relogged and hence present in multiple
      chekpoints with the "change owner" flag set, we could do multiple
      passes across the inode to do this change. While this isn't optimal,
      we can't directly ignore the flag as there may be multiple
      independent swap extent operations being replayed on the same inode
      in different checkpoints so we can't ignore them.
      
      Further, because the owner change operation uses ordered buffers, we
      might have buffers that are newer on disk than the current
      checkpoint and so already have the owner changed in them. Hence we
      cannot just peek at a buffer in the tree and check that it has the
      correct owner and assume that the change was completed.
      
      So, for the moment just brute force the owner change every time we
      see an inode with the flag set. Note that we have to be careful here
      because the owner of the buffers may point to either the old owner
      or the new owner. Currently the verifier can't verify the owner
      directly, so there is no failure case here right now. If we verify
      the owner exactly in future, then we'll have to take this into
      account.
      
      This was tested in terms of normal operation via xfstests - all of
      the fsr tests now pass without failure. however, we really need to
      modify xfs/227 to stress v3 inodes correctly to ensure we fully
      cover this case for v5 filesystems.
      
      In terms of recovery testing, I used a hacked version of xfs_fsr
      that held the temp inode open for a few seconds before exiting so
      that the filesystem could be shut down with an open owner change
      recovery flags set on at least the temp inode. fsr leaves the temp
      inode unlinked and in btree format, so this was necessary for the
      owner change to be reliably replayed.
      
      logprint confirmed the tmp inode in the log had the correct flag set:
      
      INO: cnt:3 total:3 a:0x69e9e0 len:56 a:0x69ea20 len:176 a:0x69eae0 len:88
              INODE: #regs:3   ino:0x44  flags:0x209   dsize:88
      	                                 ^^^^^
      
      0x200 is set, indicating a data fork owner change needed to be
      replayed on inode 0x44.  A printk in the revoery code confirmed that
      the inode change was recovered:
      
      XFS (vdc): Mounting Filesystem
      XFS (vdc): Starting recovery (logdev: internal)
      recovering owner change ino 0x44
      XFS (vdc): Version 5 superblock detected. This kernel L support enabled!
      Use of these features in this kernel is at your own risk!
      XFS (vdc): Ending recovery (logdev: internal)
      
      The script used to test this was:
      
      $ cat ./recovery-fsr.sh
      #!/bin/bash
      
      dev=/dev/vdc
      mntpt=/mnt/scratch
      testfile=$mntpt/testfile
      
      umount $mntpt
      mkfs.xfs -f -m crc=1 $dev
      mount $dev $mntpt
      chmod 777 $mntpt
      
      for i in `seq 10000 -1 0`; do
              xfs_io -f -d -c "pwrite $(($i * 4096)) 4096" $testfile > /dev/null 2>&1
      done
      xfs_bmap -vp $testfile |head -20
      
      xfs_fsr -d -v $testfile &
      sleep 10
      /home/dave/src/xfstests-dev/src/godown -f $mntpt
      wait
      umount $mntpt
      
      xfs_logprint -t $dev |tail -20
      time mount $dev $mntpt
      xfs_bmap -vp $testfile
      umount $mntpt
      $
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      638f4416
  10. 10 9月, 2013 1 次提交
  11. 31 8月, 2013 2 次提交
    • D
      xfs: inode buffers may not be valid during recovery readahead · d8914002
      Dave Chinner 提交于
      CRC enabled filesystems fail log recovery with 100% reliability on
      xfstests xfs/085 with the following failure:
      
      XFS (vdb): Mounting Filesystem
      XFS (vdb): Starting recovery (logdev: internal)
      XFS (vdb): Corruption detected. Unmount and run xfs_repair
      XFS (vdb): bad inode magic/vsn daddr 144 #0 (magic=0)
      XFS: Assertion failed: 0, file: fs/xfs/xfs_inode_buf.c, line: 95
      
      The problem is that the inode buffer has not been recovered before
      the readahead on the inode buffer is issued. The checkpoint being
      recovered actually allocates the inode chunk we are doing readahead
      from, so what comes from disk during readahead is essentially
      random and the verifier barfs on it.
      
      This inode buffer readahead problem affects non-crc filesystems,
      too, but xfstests does not trigger it at all on such
      configurations....
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      d8914002
    • D
      xfs: check LSN ordering for v5 superblocks during recovery · 50d5c8d8
      Dave Chinner 提交于
      Log recovery has some strict ordering requirements which unordered
      or reordered metadata writeback can defeat. This can occur when an
      item is logged in a transaction, written back to disk, and then
      logged in a new transaction before the tail of the log is moved past
      the original modification.
      
      The result of this is that when we read an object off disk for
      recovery purposes, the buffer that we read may not contain the
      object type that recovery is expecting and hence at the end of the
      checkpoint being recovered we have an invalid object in memory.
      
      This isn't usually a problem, as recovery will then replay all the
      other checkpoints and that brings the object back to a valid and
      correct state, but the issue is that while the object is in the
      invalid state it can be flushed to disk. This results in the object
      verifier failing and triggering a corruption shutdown of log
      recover. This is correct behaviour for the verifiers - the problem
      is that we are not detecting that the object we've read off disk is
      newer than the transaction we are replaying.
      
      All metadata in v5 filesystems has the LSN of it's last modification
      stamped in it. This enabled log recover to read that field and
      determine the age of the object on disk correctly. If the LSN of the
      object on disk is older than the transaction being replayed, then we
      replay the modification. If the LSN of the object matches or is more
      recent than the transaction's LSN, then we should avoid overwriting
      the object as that is what leads to the transient corrupt state.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      50d5c8d8
  12. 29 8月, 2013 2 次提交
    • D
      xfs: fix bad dquot buffer size in log recovery readahead · 0f0d3345
      Dave Chinner 提交于
      xfstests xfs/087 fails 100% reliably with this assert:
      
      XFS (vdb): Mounting Filesystem
      XFS (vdb): Starting recovery (logdev: internal)
      XFS: Assertion failed: bp->b_flags & XBF_STALE, file: fs/xfs/xfs_buf.c, line: 548
      
      while trying to read a dquot buffer in xlog_recover_dquot_ra_pass2().
      
      The issue is that the buffer length to read that is passed to
      xfs_buf_readahead is in units of filesystem blocks, not disk blocks.
      (i.e. FSB, not daddr). Fix it but putting the correct conversion in
      place.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      0f0d3345
    • D
      xfs: don't account buffer cancellation during log recovery readahead · 84a5b730
      Dave Chinner 提交于
      When doing readhaead in log recovery, we check to see if buffers are
      cancelled before doing readahead. If we find a cancelled buffer,
      however, we always decrement the reference count we have on it, and
      that means that readahead is causing a double decrement of the
      cancelled buffer reference count.
      
      This results in log recovery *replaying cancelled buffers* as the
      actual recovery pass does not find the cancelled buffer entry in the
      commit phase of the second pass across a transaction. On debug
      kernels, this results in an ASSERT failure like so:
      
      XFS: Assertion failed: !(flags & XFS_BLF_CANCEL), file: fs/xfs/xfs_log_recover.c, line: 1815
      
      xfstests generic/311 reproduces this ASSERT failure with 100%
      reproducability.
      
      Fix it by making readahead only peek at the buffer cancelled state
      rather than the full accounting that xlog_check_buffer_cancelled()
      does.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      84a5b730
  13. 24 8月, 2013 1 次提交
  14. 21 8月, 2013 3 次提交
  15. 14 8月, 2013 2 次提交
  16. 13 8月, 2013 5 次提交
  17. 25 7月, 2013 2 次提交
    • D
      xfs: di_flushiter considered harmful · e1b4271a
      Dave Chinner 提交于
      When we made all inode updates transactional, we no longer needed
      the log recovery detection for inodes being newer on disk than the
      transaction being replayed - it was redundant as replay of the log
      would always result in the latest version of the inode would be on
      disk. It was redundant, but left in place because it wasn't
      considered to be a problem.
      
      However, with the new "don't read inodes on create" optimisation,
      flushiter has come back to bite us. Essentially, the optimisation
      made always initialises flushiter to zero in the create transaction,
      and so if we then crash and run recovery and the inode already on
      disk has a non-zero flushiter it will skip recovery of that inode.
      As a result, log recovery does the wrong thing and we end up with a
      corrupt filesystem.
      
      Because we have to support old kernel to new kernel upgrades, we
      can't just get rid of the flushiter support in log recovery as we
      might be upgrading from a kernel that doesn't have fully transactional
      inode updates.  Unfortunately, for v4 superblocks there is no way to
      guarantee that log recovery knows about this fact.
      
      We cannot add a new inode format flag to say it's a "special inode
      create" because it won't be understood by older kernels and so
      recovery could do the wrong thing on downgrade. We cannot specially
      detect the combination of zero mode/non-zero flushiter on disk to
      non-zero mode, zero flushiter in the log item during recovery
      because wrapping of the flushiter can result in false detection.
      
      Hence that makes this "don't use flushiter" optimisation limited to
      a disk format that guarantees that we don't need it. And that means
      the only fix here is to limit the "no read IO on create"
      optimisation to version 5 superblocks....
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      
      (cherry picked from commit e60896d8)
      e1b4271a
    • D
      xfs: di_flushiter considered harmful · e60896d8
      Dave Chinner 提交于
      When we made all inode updates transactional, we no longer needed
      the log recovery detection for inodes being newer on disk than the
      transaction being replayed - it was redundant as replay of the log
      would always result in the latest version of the inode would be on
      disk. It was redundant, but left in place because it wasn't
      considered to be a problem.
      
      However, with the new "don't read inodes on create" optimisation,
      flushiter has come back to bite us. Essentially, the optimisation
      made always initialises flushiter to zero in the create transaction,
      and so if we then crash and run recovery and the inode already on
      disk has a non-zero flushiter it will skip recovery of that inode.
      As a result, log recovery does the wrong thing and we end up with a
      corrupt filesystem.
      
      Because we have to support old kernel to new kernel upgrades, we
      can't just get rid of the flushiter support in log recovery as we
      might be upgrading from a kernel that doesn't have fully transactional
      inode updates.  Unfortunately, for v4 superblocks there is no way to
      guarantee that log recovery knows about this fact.
      
      We cannot add a new inode format flag to say it's a "special inode
      create" because it won't be understood by older kernels and so
      recovery could do the wrong thing on downgrade. We cannot specially
      detect the combination of zero mode/non-zero flushiter on disk to
      non-zero mode, zero flushiter in the log item during recovery
      because wrapping of the flushiter can result in false detection.
      
      Hence that makes this "don't use flushiter" optimisation limited to
      a disk format that guarantees that we don't need it. And that means
      the only fix here is to limit the "no read IO on create"
      optimisation to version 5 superblocks....
      Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      e60896d8
  18. 28 6月, 2013 1 次提交
    • D
      xfs: Inode create item recovery · 28c8e41a
      Dave Chinner 提交于
      When we find a icreate transaction, we need to get and initialise
      the buffers in the range that has been passed. Extract and verify
      the information in the item record, then loop over the range
      initialising and issuing the buffer writes delayed.
      
      Support an arbitrary size range to initialise so that in
      future when we allocate inodes in much larger chunks all kernels
      that understand this transaction can still recover them.
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      28c8e41a
  19. 15 6月, 2013 2 次提交
    • D
      xfs: don't shutdown log recovery on validation errors · d302cf1d
      Dave Chinner 提交于
      Unfortunately, we cannot guarantee that items logged multiple times
      and replayed by log recovery do not take objects back in time. When
      they are taken back in time, the go into an intermediate state which
      is corrupt, and hence verification that occurs on this intermediate
      state causes log recovery to abort with a corruption shutdown.
      
      Instead of causing a shutdown and unmountable filesystem, don't
      verify post-recovery items before they are written to disk. This is
      less than optimal, but there is no way to detect this issue for
      non-CRC filesystems If log recovery successfully completes, this
      will be undone and the object will be consistent by subsequent
      transactions that are replayed, so in most cases we don't need to
      take drastic action.
      
      For CRC enabled filesystems, leave the verifiers in place - we need
      to call them to recalculate the CRCs on the objects anyway. This
      recovery problem can be solved for such filesystems - we have a LSN
      stamped in all metadata at writeback time that we can to determine
      whether the item should be replayed or not. This is a separate piece
      of work, so is not addressed by this patch.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      
      (cherry picked from commit 9222a9cf)
      d302cf1d
    • D
      xfs: don't shutdown log recovery on validation errors · 9222a9cf
      Dave Chinner 提交于
      Unfortunately, we cannot guarantee that items logged multiple times
      and replayed by log recovery do not take objects back in time. When
      they are taken back in time, the go into an intermediate state which
      is corrupt, and hence verification that occurs on this intermediate
      state causes log recovery to abort with a corruption shutdown.
      
      Instead of causing a shutdown and unmountable filesystem, don't
      verify post-recovery items before they are written to disk. This is
      less than optimal, but there is no way to detect this issue for
      non-CRC filesystems If log recovery successfully completes, this
      will be undone and the object will be consistent by subsequent
      transactions that are replayed, so in most cases we don't need to
      take drastic action.
      
      For CRC enabled filesystems, leave the verifiers in place - we need
      to call them to recalculate the CRCs on the objects anyway. This
      recovery problem can be solved for such filesystems - we have a LSN
      stamped in all metadata at writeback time that we can to determine
      whether the item should be replayed or not. This is a separate piece
      of work, so is not addressed by this patch.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      9222a9cf
  20. 06 6月, 2013 4 次提交
    • D
      xfs: inode unlinked list needs to recalculate the inode CRC · ad868afd
      Dave Chinner 提交于
      The inode unlinked list manipulations operate directly on the inode
      buffer, and so bypass the inode CRC calculation mechanisms. Hence an
      inode on the unlinked list has an invalid CRC. Fix this by
      recalculating the CRC whenever we modify an unlinked list pointer in
      an inode, ncluding during log recovery. This is trivial to do and
      results in  unlinked list operations always leaving a consistent
      inode in the buffer.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      
      (cherry picked from commit 0a32c26e)
      ad868afd
    • D
      xfs: fix log recovery transaction item reordering · 75406170
      Dave Chinner 提交于
      There are several constraints that inode allocation and unlink
      logging impose on log recovery. These all stem from the fact that
      inode alloc/unlink are logged in buffers, but all other inode
      changes are logged in inode items. Hence there are ordering
      constraints that recovery must follow to ensure the correct result
      occurs.
      
      As it turns out, this ordering has been working mostly by chance
      than good management. The existing code moves all buffers except
      cancelled buffers to the head of the list, and everything else to
      the tail of the list. The problem with this is that is interleaves
      inode items with the buffer cancellation items, and hence whether
      the inode item in an cancelled buffer gets replayed is essentially
      left to chance.
      
      Further, this ordering causes problems for log recovery when inode
      CRCs are enabled. It typically replays the inode unlink buffer long before
      it replays the inode core changes, and so the CRC recorded in an
      unlink buffer is going to be invalid and hence any attempt to
      validate the inode in the buffer is going to fail. Hence we really
      need to enforce the ordering that the inode alloc/unlink code has
      expected log recovery to have since inode chunk de-allocation was
      introduced back in 2003...
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      
      (cherry picked from commit a775ad77)
      75406170
    • D
      xfs: rework dquot CRCs · bb9b8e86
      Dave Chinner 提交于
      Calculating dquot CRCs when the backing buffer is written back just
      doesn't work reliably. There are several places which manipulate
      dquots directly in the buffers, and they don't calculate CRCs
      appropriately, nor do they always set the buffer up to calculate
      CRCs appropriately.
      
      Firstly, if we log a dquot buffer (e.g. during allocation) it gets
      logged without valid CRC, and so on recovery we end up with a dquot
      that is not valid.
      
      Secondly, if we recover/repair a dquot, we don't have a verifier
      attached to the buffer and hence CRCs are not calculated on the way
      down to disk.
      
      Thirdly, calculating the CRC after we've changed the contents means
      that if we re-read the dquot from the buffer, we cannot verify the
      contents of the dquot are valid, as the CRC is invalid.
      
      So, to avoid all the dquot CRC errors that are being detected by the
      read verifier, change to using the same model as for inodes. That
      is, dquot CRCs are calculated and written to the backing buffer at
      the time the dquot is flushed to the backing buffer. If we modify
      the dquot directly in the backing buffer, calculate the CRC
      immediately after the modification is complete. Hence the dquot in
      the on-disk buffer should always have a valid CRC.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      
      (cherry picked from commit 6fcdc59d)
      bb9b8e86
    • D
      xfs: inode unlinked list needs to recalculate the inode CRC · 0a32c26e
      Dave Chinner 提交于
      The inode unlinked list manipulations operate directly on the inode
      buffer, and so bypass the inode CRC calculation mechanisms. Hence an
      inode on the unlinked list has an invalid CRC. Fix this by
      recalculating the CRC whenever we modify an unlinked list pointer in
      an inode, ncluding during log recovery. This is trivial to do and
      results in  unlinked list operations always leaving a consistent
      inode in the buffer.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      0a32c26e