1. 05 12月, 2016 2 次提交
  2. 05 10月, 2016 2 次提交
  3. 04 10月, 2016 2 次提交
  4. 26 9月, 2016 5 次提交
    • B
      xfs: log recovery tracepoints to track current lsn and buffer submission · 5cd9cee9
      Brian Foster 提交于
      Log recovery has particular rules around buffer submission along with
      tricky corner cases where independent transactions can share an LSN. As
      such, it can be difficult to follow when/why buffers are submitted
      during recovery.
      
      Add a couple tracepoints to post the current LSN of a record when a new
      record is being processed and when a buffer is being skipped due to LSN
      ordering. Also, update the recover item class to include the LSN of the
      current transaction for the item being processed.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      5cd9cee9
    • B
      xfs: update metadata LSN in buffers during log recovery · 60a4a222
      Brian Foster 提交于
      Log recovery is currently broken for v5 superblocks in that it never
      updates the metadata LSN of buffers written out during recovery. The
      metadata LSN is recorded in various bits of metadata to provide recovery
      ordering criteria that prevents transient corruption states reported by
      buffer write verifiers. Without such ordering logic, buffer updates can
      be replayed out of order and lead to false positive transient corruption
      states. This is generally not a corruption vector on its own, but
      corruption detection shuts down the filesystem and ultimately prevents a
      mount if it occurs during log recovery. This requires an xfs_repair run
      that clears the log and potentially loses filesystem updates.
      
      This problem is avoided in most cases as metadata writes during normal
      filesystem operation update the metadata LSN appropriately. The problem
      with log recovery not updating metadata LSNs manifests if the system
      happens to crash shortly after log recovery itself. In this scenario, it
      is possible for log recovery to complete all metadata I/O such that the
      filesystem is consistent. If a crash occurs after that point but before
      the log tail is pushed forward by subsequent operations, however, the
      next mount performs the same log recovery over again. If a buffer is
      updated multiple times in the dirty range of the log, an earlier update
      in the log might not be valid based on the current state of the
      associated buffer after all of the updates in the log had been replayed
      (before the previous crash). If a verifier happens to detect such a
      problem, the filesystem claims corruption and immediately shuts down.
      
      This commonly manifests in practice as directory block verifier failures
      such as the following, likely due to directory verifiers being
      particularly detailed in their checks as compared to most others:
      
        ...
        Mounting V5 Filesystem
        XFS (dm-0): Starting recovery (logdev: internal)
        XFS (dm-0): Internal error XFS_WANT_CORRUPTED_RETURN at line ... of \
          file fs/xfs/libxfs/xfs_dir2_data.c.  Caller xfs_dir3_data_verify ...
        ...
      
      Update log recovery to update the metadata LSN of recovered buffers.
      Since metadata LSNs are already updated by write verifer functions via
      attached log items, attach a dummy log item to the buffer during
      validation and explicitly set the LSN of the current transaction. This
      ensures that the metadata LSN of a buffer is updated based on whether
      the recovery I/O actually completes, and if so, that subsequent recovery
      attempts identify that the buffer is already up to date with respect to
      the current transaction.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      60a4a222
    • B
      xfs: don't warn on buffers not being recovered due to LSN · 040c52c0
      Brian Foster 提交于
      The log recovery buffer validation function is invoked in cases where a
      buffer update may be skipped due to LSN ordering. If the validation
      function happens to come across directory conversion situations (e.g., a
      dir3 block to data conversion), it may warn about seeing a buffer log
      format of one type and a buffer with a magic number of another.
      
      This warning is not valid as the buffer update is ultimately skipped.
      This is indicated by a current_lsn of NULLCOMMITLSN provided by the
      caller. As such, update xlog_recover_validate_buf_type() to only warn in
      such cases when a buffer update is expected.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      040c52c0
    • B
      xfs: pass current lsn to log recovery buffer validation · 22db9af2
      Brian Foster 提交于
      The current LSN must be available to the buffer validation function to
      provide the ability to update the metadata LSN of the buffer. Pass the
      current_lsn value down to xlog_recover_validate_buf_type() in
      preparation.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      22db9af2
    • B
      xfs: rework log recovery to submit buffers on LSN boundaries · 12818d24
      Brian Foster 提交于
      The fix to log recovery to update the metadata LSN in recovered buffers
      introduces the requirement that a buffer is submitted only once per
      current LSN. Log recovery currently submits buffers on transaction
      boundaries. This is not sufficient as the abstraction between log
      records and transactions allows for various scenarios where multiple
      transactions can share the same current LSN. If independent transactions
      share an LSN and both modify the same buffer, log recovery can
      incorrectly skip updates and leave the filesystem in an inconsisent
      state.
      
      In preparation for proper metadata LSN updates during log recovery,
      update log recovery to submit buffers for write on LSN change boundaries
      rather than transaction boundaries. Explicitly track the current LSN in
      a new struct xlog field to handle the various corner cases of when the
      current LSN may or may not change.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      12818d24
  5. 03 8月, 2016 6 次提交
  6. 06 4月, 2016 2 次提交
  7. 07 3月, 2016 5 次提交
    • D
      xfs: reinitialise per-AG structures if geometry changes during recovery · a798011c
      Dave Chinner 提交于
      If a crash occurs immediately after a filesystem grow operation, the
      updated superblock geometry is found only in the log. After we
      recover the log, the superblock is reread and re-initialised and so
      has the new geometry in memory. If the new geometry has more AGs
      than prior to the grow operation, then the new AGs will not have
      in-memory xfs_perag structurea associated with them.
      
      This will result in an oops when the first metadata buffer from a
      new AG is looked up in the buffer cache, as the block lies within
      the new geometry but then fails to find a perag structure on lookup.
      This is easily fixed by simply re-initialising the perag structure
      after re-reading the superblock at the conclusion of the first pahse
      of log recovery.
      
      This, however, does not fix the case of log recovery requiring
      access to metadata in the newly grown space. Fortunately for us,
      because the in-core superblock has not been updated, this will
      result in detection of access beyond the end of the filesystem
      and so recovery will fail at that point. If this proves to be
      a problem, then we can address it separately to the current
      reported issue.
      Reported-by: NAlex Lyakas <alex@zadarastorage.com>
      Tested-by: NAlex Lyakas <alex@zadarastorage.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      a798011c
    • B
      xfs: only run torn log write detection on dirty logs · 7f6aff3a
      Brian Foster 提交于
      XFS uses CRC verification over a sub-range of the head of the log to
      detect and handle torn writes. This torn log write detection currently
      runs unconditionally at mount time, regardless of whether the log is
      dirty or clean. This is problematic in cases where a filesystem might
      end up being moved across different, incompatible (i.e., opposite
      byte-endianness) architectures.
      
      The problem lies in the fact that log data is not necessarily written in
      an architecture independent format. For example, certain bits of data
      are written in native endian format. Further, the size of certain log
      data structures differs (i.e., struct xlog_rec_header) depending on the
      word size of the cpu. This leads to false positive crc verification
      errors and ultimately failed mounts when a cleanly unmounted filesystem
      is mounted on a system with an incompatible architecture from data that
      was written near the head of the log.
      
      Update the log head/tail discovery code to run torn write detection only
      when the log is not clean. This means something other than an unmount
      record resides at the head of the log and log recovery is imminent. It
      is a requirement to run log recovery on the same type of host that had
      written the content of the dirty log and therefore CRC failures are
      legitimate corruptions in that scenario.
      Reported-by: NJan Beulich <JBeulich@suse.com>
      Tested-by: NJan Beulich <JBeulich@suse.com>
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      7f6aff3a
    • B
      xfs: refactor in-core log state update to helper · 717bc0eb
      Brian Foster 提交于
      Once the record at the head of the log is identified and verified, the
      in-core log state is updated based on the record. This includes
      information such as the current head block and cycle, the start block of
      the last record written to the log, the tail lsn, etc.
      
      Once torn write detection is conditional, this logic will need to be
      reused. Factor the code to update the in-core log data structures into a
      new helper function. This patch does not change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      717bc0eb
    • B
      xfs: refactor unmount record detection into helper · 65b99a08
      Brian Foster 提交于
      Once the mount sequence has identified the head and tail blocks of the
      physical log, the record at the head of the log is located and examined
      for an unmount record to determine if the log is clean. This currently
      occurs after torn write verification of the head region of the log.
      
      This must ultimately be separated from torn write verification and may
      need to be called again if the log head is walked back due to a torn
      write (to determine whether the new head record is an unmount record).
      Separate this logic into a new helper function. This patch does not
      change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      65b99a08
    • B
      xfs: separate log head record discovery from verification · 82ff6cc2
      Brian Foster 提交于
      The code that locates the log record at the head of the log is buried in
      the log head verification function. This is fine when torn write
      verification occurs unconditionally, but this behavior is problematic
      for filesystems that might be moved across systems with different
      architectures.
      
      In preparation for separating examination of the log head for unmount
      records from torn write detection, lift the record location logic out of
      the log verification function and into the caller. This patch does not
      change behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      82ff6cc2
  8. 10 2月, 2016 5 次提交
  9. 09 2月, 2016 6 次提交
  10. 08 2月, 2016 1 次提交
  11. 12 1月, 2016 1 次提交
    • D
      xfs: handle dquot buffer readahead in log recovery correctly · 7d6a13f0
      Dave Chinner 提交于
      When we do dquot readahead in log recovery, we do not use a verifier
      as the underlying buffer may not have dquots in it. e.g. the
      allocation operation hasn't yet been replayed. Hence we do not want
      to fail recovery because we detect an operation to be replayed has
      not been run yet. This problem was addressed for inodes in commit
      d8914002 ("xfs: inode buffers may not be valid during recovery
      readahead") but the problem was not recognised to exist for dquots
      and their buffers as the dquot readahead did not have a verifier.
      
      The result of not using a verifier is that when the buffer is then
      next read to replay a dquot modification, the dquot buffer verifier
      will only be attached to the buffer if *readahead is not complete*.
      Hence we can read the buffer, replay the dquot changes and then add
      it to the delwri submission list without it having a verifier
      attached to it. This then generates warnings in xfs_buf_ioapply(),
      which catches and warns about this case.
      
      Fix this and make it handle the same readahead verifier error cases
      as for inode buffers by adding a new readahead verifier that has a
      write operation as well as a read operation that marks the buffer as
      not done if any corruption is detected.  Also make sure we don't run
      readahead if the dquot buffer has been marked as cancelled by
      recovery.
      
      This will result in readahead either succeeding and the buffer
      having a valid write verifier, or readahead failing and the buffer
      state requiring the subsequent read to resubmit the IO with the new
      verifier.  In either case, this will result in the buffer always
      ending up with a valid write verifier on it.
      
      Note: we also need to fix the inode buffer readahead error handling
      to mark the buffer with EIO. Brian noticed the code I copied from
      there wrong during review, so fix it at the same time. Add comments
      linking the two functions that handle readahead verifier errors
      together so we don't forget this behavioural link in future.
      
      cc: <stable@vger.kernel.org> # 3.12 - current
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      7d6a13f0
  12. 05 1月, 2016 1 次提交
    • B
      xfs: detect and trim torn writes during log recovery · 7088c413
      Brian Foster 提交于
      Certain types of storage, such as persistent memory, do not provide
      sector atomicity for writes. This means that if a crash occurs while XFS
      is writing log records, only part of those records might make it to the
      storage. This is problematic because log recovery uses the cycle value
      packed at the top of each log block to locate the head/tail of the log.
      This can lead to CRC verification failures during log recovery and an
      unmountable fs for a filesystem that is otherwise consistent.
      
      Update log recovery to incorporate log record CRC verification as part
      of the head/tail discovery process. Once the head is located via the
      traditional algorithm, run a CRC-only pass over the records up to the
      head of the log. If CRC verification fails, assume that the records are
      torn as a matter of policy and trim the head block back to the start of
      the first bad record.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      7088c413
  13. 04 1月, 2016 2 次提交
    • B
      xfs: refactor log record start detection into a new helper · eed6b462
      Brian Foster 提交于
      As part of the head/tail discovery process, log recovery locates the
      head block and then reverse seeks to find the start of the last active
      record in the log. This is non-trivial as the record itself could have
      wrapped around the end of the physical log. Log recovery torn write
      detection potentially needs to walk further behind the last record in
      the log, as multiple log I/Os can be in-flight at one time during a
      crash event.
      
      Therefore, refactor the reverse log record header search mechanism into
      a new helper that supports the ability to seek past an arbitrary number
      of log records (or until the tail is hit). Update the head/tail search
      mechanism to call the new helper, but otherwise there is no change in
      log recovery behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      eed6b462
    • B
      xfs: support a crc verification only log record pass · 6528250b
      Brian Foster 提交于
      Log recovery torn write detection uses CRC verification over a range of
      the active log to identify torn writes. Since the generic log recovery
      pass code implements a superset of the functionality required for CRC
      verification, it can be easily modified to support a CRC verification
      only pass.
      
      Create a new CRC pass type and update the log record processing helper
      to skip everything beyond CRC verification when in this mode. This pass
      will be invoked in subsequent patches to implement torn write detection.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      6528250b