1. 29 7月, 2015 1 次提交
    • D
      xfs: remote attribute headers contain an invalid LSN · e3c32ee9
      Dave Chinner 提交于
      In recent testing, a system that crashed failed log recovery on
      restart with a bad symlink buffer magic number:
      
      XFS (vda): Starting recovery (logdev: internal)
      XFS (vda): Bad symlink block magic!
      XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060
      
      On examination of the log via xfs_logprint, none of the symlink
      buffers in the log had a bad magic number, nor were any other types
      of buffer log format headers mis-identified as symlink buffers.
      Tracing was used to find the buffer the kernel was tripping over,
      and xfs_db identified it's contents as:
      
      000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e
      020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d
      040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
      060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
      .....
      
      This is a remote attribute buffer, which are notable in that they
      are not logged but are instead written synchronously by the remote
      attribute code so that they exist on disk before the attribute
      transactions are committed to the journal.
      
      The above remote attribute block has an invalid LSN in it - cycle
      0xd002000, block 0 - which means when log recovery comes along to
      determine if the transaction that writes to the underlying block
      should be replayed, it sees a block that has a future LSN and so
      does not replay the buffer data in the transaction. Instead, it
      validates the buffer magic number and attaches the buffer verifier
      to it.  It is this buffer magic number check that is failing in the
      above assert, indicating that we skipped replay due to the LSN of
      the underlying buffer.
      
      The problem here is that the remote attribute buffers cannot have a
      valid LSN placed into them, because the transaction that contains 
      the attribute tree pointer changes and the block allocation that the
      attribute data is being written to hasn't yet been committed. Hence
      the LSN field in the attribute block is completely unwritten,
      thereby leaving the underlying contents of the block in the LSN
      field. It could have any value, and hence a future overwrite of the
      block by log recovery may or may not work correctly.
      
      Fix this by always writing an invalid LSN to the remote attribute
      block, as any buffer in log recovery that needs to write over the
      remote attribute should occur. We are protected from having old data
      written over the attribute by the fact that freeing the block before
      the remote attribute is written will result in the buffer being
      marked stale in the log and so all changes prior to the buffer stale
      transaction will be cancelled by log recovery.
      
      Hence it is safe to ignore the LSN in the case or synchronously
      written, unlogged metadata such as remote attribute blocks, and to
      ensure we do that correctly, we need to write an invalid LSN to all
      remote attribute blocks to trigger immediate recovery of metadata
      that is written over the top.
      
      As a further protection for filesystems that may already have remote
      attribute blocks with bad LSNs on disk, change the log recovery code
      to always trigger immediate recovery of metadata over remote
      attribute blocks.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e3c32ee9
  2. 22 6月, 2015 2 次提交
  3. 04 6月, 2015 2 次提交
    • C
      xfs: saner xfs_trans_commit interface · 70393313
      Christoph Hellwig 提交于
      The flags argument to xfs_trans_commit is not useful for most callers, as
      a commit of a transaction without a permanent log reservation must pass
      0 here, and all callers for a transaction with a permanent log reservation
      except for xfs_trans_roll must pass XFS_TRANS_RELEASE_LOG_RES.  So remove
      the flags argument from the public xfs_trans_commit interfaces, and
      introduce low-level __xfs_trans_commit variant just for xfs_trans_roll
      that regrants a log reservation instead of releasing it.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      70393313
    • C
      xfs: remove the flags argument to xfs_trans_cancel · 4906e215
      Christoph Hellwig 提交于
      xfs_trans_cancel takes two flags arguments: XFS_TRANS_RELEASE_LOG_RES and
      XFS_TRANS_ABORT.  Both of them are a direct product of the transaction
      state, and can be deducted:
      
       - any dirty transaction needs XFS_TRANS_ABORT to be properly canceled,
         and XFS_TRANS_ABORT is a noop for a transaction that is not dirty.
       - any transaction with a permanent log reservation needs
         XFS_TRANS_RELEASE_LOG_RES to be properly canceled, and passing
         XFS_TRANS_RELEASE_LOG_RES for a transaction without a permanent
         log reservation is invalid.
      
      So just remove the flags argument and do the right thing.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      4906e215
  4. 29 5月, 2015 2 次提交
  5. 23 2月, 2015 1 次提交
  6. 04 12月, 2014 1 次提交
  7. 28 11月, 2014 3 次提交
  8. 02 10月, 2014 2 次提交
    • D
      xfs: introduce xfs_buf_submit[_wait] · 595bff75
      Dave Chinner 提交于
      There is a lot of cookie-cutter code that looks like:
      
      	if (shutdown)
      		handle buffer error
      	xfs_buf_iorequest(bp)
      	error = xfs_buf_iowait(bp)
      	if (error)
      		handle buffer error
      
      spread through XFS. There's significant complexity now in
      xfs_buf_iorequest() to specifically handle this sort of synchronous
      IO pattern, but there's all sorts of nasty surprises in different
      error handling code dependent on who owns the buffer references and
      the locks.
      
      Pull this pattern into a single helper, where we can hide all the
      synchronous IO warts and hence make the error handling for all the
      callers much saner. This removes the need for a special extra
      reference to protect IO completion processing, as we can now hold a
      single reference across dispatch and waiting, simplifying the sync
      IO smeantics and error handling.
      
      In doing this, also rename xfs_buf_iorequest to xfs_buf_submit and
      make it explicitly handle on asynchronous IO. This forces all users
      to be switched specifically to one interface or the other and
      removes any ambiguity between how the interfaces are to be used. It
      also means that xfs_buf_iowait() goes away.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      595bff75
    • D
      xfs: xfs_buf_ioend and xfs_buf_iodone_work duplicate functionality · e8aaba9a
      Dave Chinner 提交于
      We do some work in xfs_buf_ioend, and some work in
      xfs_buf_iodone_work, but much of that functionality is the same.
      This work can all be done in a single function, leaving
      xfs_buf_iodone just a wrapper to determine if we should execute it
      by workqueue or directly. hence rename xfs_buf_iodone_work to
      xfs_buf_ioend(), and add a new xfs_buf_ioend_async() for places that
      need async processing.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e8aaba9a
  9. 29 9月, 2014 5 次提交
  10. 09 9月, 2014 2 次提交
    • E
      xfs: deduplicate xlog_do_recovery_pass() · 970fd3f0
      Eric Sandeen 提交于
      In xlog_do_recovery_pass(), there are 2 distinct cases:
      non-wrapped and wrapped log recovery.
      
      If we find a wrapped log, we recover around the end
      of the log, and then handle the rest of recovery
      exactly as in the non-wrapped case - using exactly the same
      (duplicated) code.
      
      Rather than having the same code in both cases, we can
      get the wrapped portion out of the way first if needed,
      and then recover the non-wrapped portion of the log.
      
      There should be no functional change here, just code
      reorganization & deduplication.
      
      The patch looks a bit bigger than it really is; the last
      hunk is whitespace changes (un-indenting).
      
      Tested with xfstests "check -g log" on a stock configuration.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      970fd3f0
    • B
      xfs: export log_recovery_delay to delay mount time log recovery · 2e227178
      Brian Foster 提交于
      XFS log recovery has been discovered to have race conditions with
      buffers when I/O errors occur. External tools are available to simulate
      I/O errors to XFS, but this alone is not sufficient for testing log
      recovery. XFS unconditionally resets the inactive region of the log
      prior to log recovery to avoid confusion over processing any partially
      written log records that might have been written before an unclean
      shutdown. Therefore, unconditional write I/O failures at mount time are
      caught by the reset sequence rather than log recovery and hinder the
      ability to test the latter.
      
      The device-mapper dm-flakey module uses an up/down timer to define a
      cycle for when to fail I/Os. Create a pre log recovery delay tunable
      that can be used to coordinate XFS log recovery with I/O errors
      simulated by dm-flakey. This facilitates coordination in userspace that
      allows the reset of stale log blocks to succeed and writes due to log
      recovery to fail. For example, define a dm-flakey instance with an
      uptime long enough to allow log reset to succeed and a log recovery
      delay long enough to allow the dm-flakey uptime to expire.
      
      The 'log_recovery_delay' sysfs tunable is exported under
      /sys/fs/xfs/debug and is only enabled for kernels compiled in XFS debug
      mode. The value is exported in units of seconds and allows for a delay
      of up to 60 seconds. Note that this is for XFS debug and test
      instrumentation purposes only and should not be used by applications. No
      delay is enabled by default.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2e227178
  11. 04 8月, 2014 2 次提交
    • D
      xfs: dquot recovery needs verifiers · ad3714b8
      Dave Chinner 提交于
      dquot recovery should add verifiers to the dquot buffers that it
      recovers changes into. Unfortunately, it doesn't attached the
      verifiers to the buffers in a consistent manner. For example,
      xlog_recover_dquot_pass2() reads dquot buffers without a verifier
      and then writes it without ever having attached a verifier to the
      buffer.
      
      Further, dquot buffer recovery may write a dquot buffer that has not
      been modified, or indeed, shoul dbe written because quotas are not
      enabled and hence changes to the buffer were not replayed. In this
      case, we again write buffers without verifiers attached because that
      doesn't happen until after the buffer changes have been replayed.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      ad3714b8
    • D
      xfs: ensure verifiers are attached to recovered buffers · 67dc288c
      Dave Chinner 提交于
      Crash testing of CRC enabled filesystems has resulted in a number of
      reports of bad CRCs being detected after the filesystem was mounted.
      Errors such as the following were being seen:
      
      XFS (sdb3): Mounting V5 Filesystem
      XFS (sdb3): Starting recovery (logdev: internal)
      XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1
      XFS (sdb3): Unmount and run xfs_repair
      XFS (sdb3): First 64 bytes of corrupted metadata buffer:
      ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40  XAGF...........@
      ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01  ..mS..w.........
      ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03  ................
      ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00  ................
      XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1
      
      The errors were typically being seen in AGF, AGI and their related
      btree block buffers some time after log recovery had run. Often it
      wasn't until later subsequent mounts that the problem was
      discovered. The common symptom was a buffer with the correct
      contents, but a CRC and an LSN that matched an older version of the
      contents.
      
      Some debug added to _xfs_buf_ioapply() indicated that buffers were
      being written without verifiers attached to them from log recovery,
      and Jan Kara isolated the cause to log recovery readahead an dit's
      interactions with buffers that had a more recent LSN on disk than
      the transaction being recovered. In this case, the buffer did not
      get a verifier attached, and os when the second phase of log
      recovery ran and recovered EFIs and unlinked inodes, the buffers
      were modified and written without the verifier running. Hence they
      had up to date contents, but stale LSNs and CRCs.
      
      Fix it by attaching verifiers to buffers we skip due to future LSN
      values so they don't escape into the buffer cache without the
      correct verifier attached.
      
      This patch is based on analysis and a patch from Jan Kara.
      
      cc: <stable@vger.kernel.org>
      Reported-by: NJan Kara <jack@suse.cz>
      Reported-by: NFanael Linithien <fanael4@gmail.com>
      Reported-by: NGrozdan <neutrino8@gmail.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      67dc288c
  12. 25 6月, 2014 1 次提交
    • D
      xfs: global error sign conversion · 2451337d
      Dave Chinner 提交于
      Convert all the errors the core XFs code to negative error signs
      like the rest of the kernel and remove all the sign conversion we
      do in the interface layers.
      
      Errors for conversion (and comparison) found via searches like:
      
      $ git grep " E" fs/xfs
      $ git grep "return E" fs/xfs
      $ git grep " E[A-Z].*;$" fs/xfs
      
      Negation points found via searches like:
      
      $ git grep "= -[a-z,A-Z]" fs/xfs
      $ git grep "return -[a-z,A-D,F-Z]" fs/xfs
      $ git grep " -[a-z].*;" fs/xfs
      
      [ with some bits I missed from Brian Foster ]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2451337d
  13. 22 6月, 2014 2 次提交
  14. 24 4月, 2014 1 次提交
  15. 14 4月, 2014 2 次提交
  16. 17 12月, 2013 1 次提交
    • C
      xfs: remove xfsbdstrat error · 83a0adc3
      Christoph Hellwig 提交于
      The xfsbdstrat helper is a small but useless wrapper for xfs_buf_iorequest that
      handles the case of a shut down filesystem.  Most of the users have private,
      uncached buffers that can just be freed in this case, but the complex error
      handling in xfs_bioerror_relse messes up the case when it's called without
      a locked buffer.
      
      Remove xfsbdstrat and opencode the error handling in the callers.  All but
      one can simply return an error and don't need to deal with buffer state,
      and the one caller that cares about the buffer state could do with a major
      cleanup as well, but we'll defer that to later.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      83a0adc3
  17. 13 12月, 2013 3 次提交
  18. 06 12月, 2013 1 次提交
  19. 24 10月, 2013 5 次提交
    • D
      xfs: decouple inode and bmap btree header files · a4fbe6ab
      Dave Chinner 提交于
      Currently the xfs_inode.h header has a dependency on the definition
      of the BMAP btree records as the inode fork includes an array of
      xfs_bmbt_rec_host_t objects in it's definition.
      
      Move all the btree format definitions from xfs_btree.h,
      xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
      xfs_format.h to continue the process of centralising the on-disk
      format definitions. With this done, the xfs inode definitions are no
      longer dependent on btree header files.
      
      The enables a massive culling of unnecessary includes, with close to
      200 #include directives removed from the XFS kernel code base.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      a4fbe6ab
    • D
      xfs: decouple log and transaction headers · 239880ef
      Dave Chinner 提交于
      xfs_trans.h has a dependency on xfs_log.h for a couple of
      structures. Most code that does transactions doesn't need to know
      anything about the log, but this dependency means that they have to
      include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
      files and clean up the includes to be in dependency order.
      
      In doing this, remove the direct include of xfs_trans_reserve.h from
      xfs_trans.h so that we remove the dependency between xfs_trans.h and
      xfs_mount.h. Hence the xfs_trans.h include can be moved to the
      indicate the actual dependencies other header files have on it.
      
      Note that these are kernel only header files, so this does not
      translate to any userspace changes at all.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      239880ef
    • D
      xfs: split dquot buffer operations out · 9aede1d8
      Dave Chinner 提交于
      Parts of userspace want to be able to read and modify dquot buffers
      (e.g. xfs_db) so we need to split out the reading and writing of
      these buffers so it is easy to shared code with libxfs in userspace.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      9aede1d8
    • D
      xfs: unify directory/attribute format definitions · 57062787
      Dave Chinner 提交于
      The on-disk format definitions for the directory and attribute
      structures are spread across 3 header files right now, only one of
      which is dedicated to defining on-disk structures and their
      manipulation (xfs_dir2_format.h). Pull all the format definitions
      into a single header file - xfs_da_format.h - and switch all the
      code over to point at that.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      57062787
    • D
      xfs: create a shared header file for format-related information · 70a9883c
      Dave Chinner 提交于
      All of the buffer operations structures are needed to be exported
      for xfs_db, so move them all to a common location rather than
      spreading them all over the place. They are verifying the on-disk
      format, so while xfs_format.h might be a good place, it is not part
      of the on disk format.
      
      Hence we need to create a new header file that we centralise these
      related definitions. Start by moving the bffer operations
      structures, and then also move all the other definitions that have
      crept into xfs_log_format.h and xfs_format.h as there was no other
      shared header file to put them in.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      70a9883c
  20. 18 10月, 2013 1 次提交