1. 05 8月, 2015 35 次提交
  2. 02 8月, 2015 1 次提交
    • A
      link_path_walk(): be careful when failing with ENOTDIR · 97242f99
      Al Viro 提交于
      In RCU mode we might end up with dentry evicted just we check
      that it's a directory.  In such case we should return ECHILD
      rather than ENOTDIR, so that pathwalk would be retries in non-RCU
      mode.
      
      Breakage had been introduced in commit b18825a7 - prior to that
      we were looking at nd->inode, which had been fetched before
      verifying that ->d_seq was still valid.  That form of check
      would only be satisfied if at some point the pathname prefix
      would indeed have resolved to a non-directory.  The fix consists
      of checking ->d_seq after we'd run into a non-directory dentry,
      and failing with ECHILD in case of mismatch.
      
      Note that all branches since 3.12 have that problem...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      97242f99
  3. 31 7月, 2015 2 次提交
  4. 29 7月, 2015 2 次提交
    • D
      xfs: remote attributes need to be considered data · df150ed1
      Dave Chinner 提交于
      We don't log remote attribute contents, and instead write them
      synchronously before we commit the block allocation and attribute
      tree update transaction. As a result we are writing to the allocated
      space before the allcoation has been made permanent.
      
      As a result, we cannot consider this allocation to be a metadata
      allocation. Metadata allocation can take blocks from the free list
      and so reuse them before the transaction that freed the block is
      committed to disk. This behaviour is perfectly fine for journalled
      metadata changes as log recovery will ensure the free operation is
      replayed before the overwrite, but for remote attribute writes this
      is not the case.
      
      Hence we have to consider the remote attribute blocks to contain
      data and allocate accordingly. We do this by dropping the
      XFS_BMAPI_METADATA flag from the block allocation. This means the
      allocation will not use blocks that are on the busy list without
      first ensuring that the freeing transaction has been committed to
      disk and the blocks removed from the busy list. This ensures we will
      never overwrite a freed block without first ensuring that it is
      really free.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      
      df150ed1
    • D
      xfs: remote attribute headers contain an invalid LSN · e3c32ee9
      Dave Chinner 提交于
      In recent testing, a system that crashed failed log recovery on
      restart with a bad symlink buffer magic number:
      
      XFS (vda): Starting recovery (logdev: internal)
      XFS (vda): Bad symlink block magic!
      XFS: Assertion failed: 0, file: fs/xfs/xfs_log_recover.c, line: 2060
      
      On examination of the log via xfs_logprint, none of the symlink
      buffers in the log had a bad magic number, nor were any other types
      of buffer log format headers mis-identified as symlink buffers.
      Tracing was used to find the buffer the kernel was tripping over,
      and xfs_db identified it's contents as:
      
      000: 5841524d 00000000 00000346 64d82b48 8983e692 d71e4680 a5f49e2c b317576e
      020: 00000000 00602038 00000000 006034ce d0020000 00000000 4d4d4d4d 4d4d4d4d
      040: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
      060: 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d 4d4d4d4d
      .....
      
      This is a remote attribute buffer, which are notable in that they
      are not logged but are instead written synchronously by the remote
      attribute code so that they exist on disk before the attribute
      transactions are committed to the journal.
      
      The above remote attribute block has an invalid LSN in it - cycle
      0xd002000, block 0 - which means when log recovery comes along to
      determine if the transaction that writes to the underlying block
      should be replayed, it sees a block that has a future LSN and so
      does not replay the buffer data in the transaction. Instead, it
      validates the buffer magic number and attaches the buffer verifier
      to it.  It is this buffer magic number check that is failing in the
      above assert, indicating that we skipped replay due to the LSN of
      the underlying buffer.
      
      The problem here is that the remote attribute buffers cannot have a
      valid LSN placed into them, because the transaction that contains 
      the attribute tree pointer changes and the block allocation that the
      attribute data is being written to hasn't yet been committed. Hence
      the LSN field in the attribute block is completely unwritten,
      thereby leaving the underlying contents of the block in the LSN
      field. It could have any value, and hence a future overwrite of the
      block by log recovery may or may not work correctly.
      
      Fix this by always writing an invalid LSN to the remote attribute
      block, as any buffer in log recovery that needs to write over the
      remote attribute should occur. We are protected from having old data
      written over the attribute by the fact that freeing the block before
      the remote attribute is written will result in the buffer being
      marked stale in the log and so all changes prior to the buffer stale
      transaction will be cancelled by log recovery.
      
      Hence it is safe to ignore the LSN in the case or synchronously
      written, unlogged metadata such as remote attribute blocks, and to
      ensure we do that correctly, we need to write an invalid LSN to all
      remote attribute blocks to trigger immediate recovery of metadata
      that is written over the top.
      
      As a further protection for filesystems that may already have remote
      attribute blocks with bad LSNs on disk, change the log recovery code
      to always trigger immediate recovery of metadata over remote
      attribute blocks.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      e3c32ee9