1. 04 Dec, 2012 1 commit
    • xfs: fix sparse reported log CRC endian issue · f9668a09
      Dave Chinner committed
      Not a bug as such, just warning noise from the xlog_cksum()
      returning a __be32 type when it should be returning a __le32 type.
      
      On Wed, Nov 28, 2012 at 08:30:59AM -0500, Christoph Hellwig wrote:
      > But why are we storing the crc field little endian while all other on
      > disk formats are big endian? (And yes I realize it might as well have
      > been me who did that back in the idea, but I still have no idea why)
      
      Because the CRC always returns the calculation in LE format, even on BE
      systems. So rather than always having to byte swap it everywhere and
      have all the force casts and annotations for sparse, it seems simpler to
      just make it a __le32 everywhere....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
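The endianness point above can be illustrated with a minimal sketch. This is not the kernel's xlog_cksum(): the names `crc32c` and `pack_log_crc` are hypothetical, and the bitwise CRC32C loop only stands in for the kernel's optimised implementation. The field is always packed little-endian, so the stored bytes do not depend on the host byte order.

```python
import struct

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C (slow, but short enough to read)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def pack_log_crc(payload: bytes) -> bytes:
    """Hypothetical helper: encode the CRC as a little-endian 32-bit field."""
    return struct.pack("<I", crc32c(payload))
```

Reading the field back with `struct.unpack("<I", ...)` recovers the same integer on any host, which is the property the __le32 annotation documents.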
  2. 30 Nov, 2012 3 commits
    • xfs: fix stray dquot unlock when reclaiming dquots · b870553c
      Dave Chinner committed
      When we fail to get a dquot lock during reclaim, we jump to an error
      handler that unlocks the dquot. This is wrong as we didn't lock the
      dquot, and unlocking it means whoever is holding the lock has had
      it silently taken away, and hence it results in a lock imbalance.
      
      Found by inspection while modifying the code for the numa-lru
      patchset. This fixes a random hang I've been seeing on xfstest 232
      for the past several months.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
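The locking rule described above (only release what you actually acquired) can be sketched in a few lines. This is an illustration with a user-space lock, not the XFS dquot reclaim code; `reclaim_one` is a hypothetical name.

```python
import threading


def reclaim_one(dquot_lock: threading.Lock) -> bool:
    """Hypothetical reclaim step: skip the item if its lock is contended."""
    if not dquot_lock.acquire(blocking=False):
        # We never took the lock, so the bail-out path must not release it.
        return False
    try:
        # ... reclaim work would happen here while the lock is held ...
        return True
    finally:
        dquot_lock.release()  # balanced with the successful acquire above
```

If the failure path released the lock instead, the current holder would lose it without knowing, which is the imbalance the commit removes.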
    • xfs: fix direct IO nested transaction deadlock. · 437a255a
      Dave Chinner committed
      The direct IO path can do a nested transaction reservation when
      writing past the EOF. The first transaction is the append
      transaction for setting the filesize at IO completion, but we can
      also need a transaction for allocation of blocks. If the log is low
      on space due to reservations and a small log, the append transaction
      can be granted after waiting for space as the only active transaction
      in the system. This then attempts a reservation for an allocation,
      which there isn't space in the log for, and the reservation sleeps.
      The result is that there is nothing left in the system to wake up
      all the processes waiting for log space to come free.
      
      The stack trace that shows this deadlock is relatively innocuous:
      
       xlog_grant_head_wait
       xlog_grant_head_check
       xfs_log_reserve
       xfs_trans_reserve
       xfs_iomap_write_direct
       __xfs_get_blocks
       xfs_get_blocks_direct
       do_blockdev_direct_IO
       __blockdev_direct_IO
       xfs_vm_direct_IO
       generic_file_direct_write
       xfs_file_dio_aio_write
       xfs_file_aio_write
       do_sync_write
       vfs_write
      
      This was discovered on a filesystem with a log of only 10MB, and a
      log stripe unit of 256k which increased the base reservations by
      512k. Hence an allocation transaction requires 1.2MB of log space to
      be available instead of only 260k, and so greatly increased the
      chance that there wouldn't be enough log space available for the
      nested transaction to succeed. The key to reproducing it is this
      mkfs command:
      
      mkfs.xfs -f -d agcount=16,su=256k,sw=12 -l su=256k,size=2560b $SCRATCH_DEV
      
      The test case was 1000 fsstress processes running with random
      freezes and unfreezes every few seconds. Thanks to Eryu Guan
      (eguan@redhat.com) for writing the test that found this on a system
      with a somewhat unique default configuration....
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Andrew Dahl <adahl@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
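The nested reservation problem described above can be modelled with a toy accounting class. This is a sketch under stated assumptions, not the XFS log grant code: `LogSpace`, `try_reserve` and `nested_write_would_block` are hypothetical names, the sizes are arbitrary units, and a failed reservation returns False where the real code would sleep.

```python
class LogSpace:
    """Toy model of log grant space; all names here are hypothetical."""

    def __init__(self, total: int) -> None:
        self.free = total

    def try_reserve(self, amount: int) -> bool:
        """Take space if available; a real reservation would sleep instead."""
        if amount > self.free:
            return False
        self.free -= amount
        return True

    def release(self, amount: int) -> None:
        self.free += amount


def nested_write_would_block(log: LogSpace, append_res: int, alloc_res: int) -> bool:
    """Return True if the inner reservation cannot be satisfied while the
    outer one is held, i.e. the caller would sleep holding log space."""
    if not log.try_reserve(append_res):
        return True  # even the outer reservation has to wait
    blocked = not log.try_reserve(alloc_res)
    if not blocked:
        log.release(alloc_res)
    log.release(append_res)
    return blocked
```

When the outer holder is the only active transaction and the inner request blocks, nothing else is left to free space, which matches the hang in the stack trace.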
    • xfs: byte range granularity for XFS_IOC_ZERO_RANGE · ef9d8733
      Dave Chinner committed
      XFS_IOC_ZERO_RANGE simply does not work properly for non page cache
      aligned ranges. Neither test 242 nor 290 exercises this correctly, so
      the behaviour is completely busted even though the tests pass.
      
      Fix it to support full byte range granularity as was originally
      intended for this ioctl.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
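One way to picture byte range granularity is to split an arbitrary range into an unaligned head, a page-aligned middle and an unaligned tail. The sketch below only shows that arithmetic; it is not how the ioctl is implemented, `split_zero_range` is a hypothetical name, and the 4096-byte page size is an assumption.

```python
PAGE_SIZE = 4096  # assumed page size for illustration


def split_zero_range(start, length, page_size=PAGE_SIZE):
    """Split [start, start + length) into (head, middle, tail).

    head and tail are the partial-page pieces, middle is page aligned.
    Each piece is an (offset, length) tuple, or None when it is empty.
    """
    if length <= 0:
        return None, None, None
    end = start + length
    first_aligned = -(-start // page_size) * page_size  # round start up
    last_aligned = (end // page_size) * page_size       # round end down
    if first_aligned > last_aligned:
        # The whole range sits inside a single page.
        return (start, length), None, None

    def piece(lo, hi):
        return (lo, hi - lo) if hi > lo else None

    return (
        piece(start, first_aligned),
        piece(first_aligned, last_aligned),
        piece(last_aligned, end),
    )
```

A test that only uses page-aligned offsets exercises the middle piece alone, which is consistent with the tests passing while unaligned ranges were broken.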
  3. 27 Nov, 2012 2 commits
  4. 20 Nov, 2012 2 commits
    • xfs: add CRC checks to the log · 0e446be4
      Christoph Hellwig committed
      Implement CRCs for the log buffers.  We re-use a field in
      struct xlog_rec_header that was used for a weak checksum of the
      log buffer payload in debug builds before.
      
      The new checksumming uses the crc32c checksum we will use elsewhere
      in XFS, and also protects the record header and additional cycle data.
      
      Due to this there are some interesting changes in xlog_sync, as we
      need to do the cycle wrapping for the split buffer case much earlier,
      as we would touch the buffer after generating the checksum otherwise.
      
      The CRC calculation is always enabled, even for non-CRC filesystems,
      as adding this CRC does not change the log format. On non-CRC
      filesystems, only issue an alert if a CRC mismatch is found and
      allow recovery to continue - this will act as an indicator that
      log recovery problems are a result of log corruption. On CRC enabled
      filesystems, however, log recovery will fail.
      
      Note that existing debug kernels will write a simple checksum value
      to the log, so the first time this is run on a filesystem that was
      last used on a debug kernel it will throw CRC mismatch warning
      errors. These can be ignored.
      
      Initially based on a patch from Dave Chinner, then modified
      significantly by Christoph Hellwig.  Modified again by Dave Chinner
      to get to this version.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
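The mismatch policy described in this message (alert and continue on non-CRC filesystems, fail recovery on CRC enabled ones) reduces to a small decision function. The sketch below is illustrative only; `CrcOutcome` and `check_log_record_crc` are hypothetical names, not kernel symbols.

```python
from enum import Enum


class CrcOutcome(Enum):
    OK = "ok"
    WARN_AND_CONTINUE = "warn"
    FAIL_RECOVERY = "fail"


def check_log_record_crc(stored: int, computed: int, fs_has_crc: bool) -> CrcOutcome:
    """Decide what recovery does with one log record (hypothetical helper)."""
    if stored == computed:
        return CrcOutcome.OK
    # A mismatch is fatal only when the filesystem has CRCs enabled;
    # otherwise it is reported and recovery carries on.
    return CrcOutcome.FAIL_RECOVERY if fs_has_crc else CrcOutcome.WARN_AND_CONTINUE
```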
    • xfs: add CRC infrastructure · bc02e869
      Christoph Hellwig committed
       - add a mount feature bit for CRC enabled filesystems
       - add some helpers for generating and verifying the CRCs
       - add a copy_uuid helper
      
      The checksumming helpers are loosely based on similar ones in sctp,
      all other bits come from Dave Chinner.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  5. 16 Nov, 2012 22 commits
  6. 15 Nov, 2012 5 commits
  7. 14 Nov, 2012 5 commits
    • xfs: make growfs initialise the AGFL header · de497688
      Dave Chinner committed
      For verification purposes, AGFLs need to be initialised to a known
      set of values. For upcoming CRC changes, they are also headers that
      need to be initialised. Currently, growfs does neither for the AGFLs
      - it ignores them completely. Add initialisation of the AGFL to be
      full of invalid block numbers (NULLAGBLOCK) to put the
      infrastructure in place needed for CRC support.
      
      Includes a comment clarification from Jeff Liu.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Rich Johnston <rjohnston@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
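Filling the AGFL with invalid block numbers can be pictured as writing a sector full of sentinel values. The sketch below assumes that NULLAGBLOCK is the all-ones 32-bit value, that entries are 32-bit big-endian, and that the AGFL occupies one 512-byte sector with no header; `init_agfl` is a hypothetical name and this is not the growfs code.

```python
import struct

NULLAGBLOCK = 0xFFFFFFFF  # assumed "no block" sentinel: all bits set
SECTOR_SIZE = 512         # assumed sector size for this illustration


def init_agfl(sector_size: int = SECTOR_SIZE) -> bytes:
    """Return a sector in which every 32-bit slot holds NULLAGBLOCK."""
    entries = sector_size // 4
    return struct.pack(f">{entries}I", *([NULLAGBLOCK] * entries))
```

With every slot set to a known invalid value, a verifier can tell an unused entry from stale data left on disk.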
    • xfs: growfs: use uncached buffers for new headers · fd23683c
      Dave Chinner committed
      When writing the new AG headers to disk, we can't attach write
      verifiers because they depend on the struct xfs_perag attached to
      the buffer being fully initialised, and growfs can't fully
      initialise it until later in the process.

      The simplest way to avoid this problem is to use uncached buffers
      for writing the new headers. These buffers don't have the xfs_perag
      attached to them, so it's simple for the write verifier to detect
      this and skip the checks that need the xfs_perag.

      This enables us to attach the appropriate buffer ops to the buffer
      and hence calculate CRCs on the way to disk. It also means that the
      buffer is torn down immediately, and so the first access to the AG
      headers will re-read the header from disk and perform full
      verification of the buffer. This way we can also catch corruptions
      due to problems that went undetected in growfs.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Rich Johnston <rjohnston@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
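The idea of a write verifier that skips per-AG checks when no per-AG structure is attached can be sketched as follows. Everything here is hypothetical (the `Buffer` fields, the `ag_blocks` key, the specific checks); it only shows the shape of the control flow, not the XFS verifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    magic_ok: bool
    free_blocks: int
    perag: Optional[dict] = None  # uncached buffers carry no per-AG structure


def write_verify(buf: Buffer) -> bool:
    """Hypothetical write verifier for an AG header buffer."""
    if not buf.magic_ok:
        return False  # structural checks always run
    if buf.perag is None:
        return True   # uncached buffer: skip checks that need per-AG state
    return buf.free_blocks <= buf.perag["ag_blocks"]
```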
    • xfs: use btree block initialisation functions in growfs · b64f3a39
      Dave Chinner committed
      Factor xfs_btree_init_block() to be independent of the btree cursor,
      and use the function to initialise btree blocks in the growfs code.
      This makes adding support for different format btree blocks simple.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Rich Johnston <rjohnston@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: add more attribute tree trace points. · ee73259b
      Dave Chinner committed
      Added when debugging recent attribute tree problems to more finely
      trace code execution through the maze of twisty passages that makes
      up the attr code.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: drop buffer io reference when a bad bio is built · 37eb17e6
      Dave Chinner committed
      Error handling in xfs_buf_ioapply_map() does not handle IO reference
      counts correctly. We increment the b_io_remaining count before
      building the bio, but then fail to decrement it in the failure case.
      This leads to the buffer never running IO completion and releasing
      the reference that the IO holds, so at unmount we can leak the
      buffer. This leak is captured by this assert failure during unmount:
      
      XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 273
      
      This is not a new bug - the b_io_remaining accounting has had this
      problem for a long, long time - it's just very hard to get a
      zero length bio being built by this code...
      
      Further, the buffer IO error can be overwritten on a multi-segment
      buffer by subsequent bio completions for partial sections of the
      buffer. Hence we should only set the buffer error status if the
      buffer is not already carrying an error status. This ensures that a
      partial IO error on a multi-segment buffer will not be lost. This
      part of the problem is a regression, however.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
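Both halves of this fix (dropping the IO reference on the failure path, and keeping the first error on a multi-segment buffer) can be shown with a small model. This is a sketch, not xfs_buf_ioapply_map(): `Buf`, `submit_segment` and `complete_segment` are hypothetical names, and errors are plain negative integers.

```python
class Buf:
    """Toy buffer with an outstanding-IO count and a sticky error."""

    def __init__(self) -> None:
        self.io_remaining = 0
        self.error = 0

    def set_error_once(self, err: int) -> None:
        # Keep the first error; later completions must not overwrite it.
        if self.error == 0:
            self.error = err


def submit_segment(buf: Buf, nr_pages: int) -> bool:
    buf.io_remaining += 1       # take the IO reference before building
    if nr_pages == 0:           # nothing could be added: the bad bio case
        buf.io_remaining -= 1   # drop the reference taken above
        buf.set_error_once(-5)  # -5 stands in for EIO
        return False
    return True


def complete_segment(buf: Buf, err: int) -> None:
    if err:
        buf.set_error_once(err)
    buf.io_remaining -= 1
```

Without the decrement in the failure branch the count never returns to zero, so completion never runs and the buffer is leaked.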