1. 29 1月, 2013 5 次提交
    • D
      xfs: fix shutdown hang on invalid inode during create · 9f87832a
      Dave Chinner 提交于
      When the new inode verify in xfs_iread() fails, the create
      transaction is aborted and a shutdown occurs. The subsequent unmount
      then hangs in xfs_wait_buftarg() on a buffer that has an elevated
      hold count. Debug showed that it was an AGI buffer getting stuck:
      
      [   22.576147] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
      [   22.976213] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
      [   23.376206] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
      [   23.776325] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
      
      The trace of this buffer leading up to the shutdown (trimmed for
      brevity) looks like:
      
      xfs_buf_init:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_get_map
      xfs_buf_get:         bno 0x2 len 0x200 hold 1 caller xfs_buf_read_map
      xfs_buf_read:        bno 0x2 len 0x200 hold 1 caller xfs_trans_read_buf_map
      xfs_buf_iorequest:   bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
      xfs_buf_hold:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_iorequest
      xfs_buf_rele:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_iorequest
      xfs_buf_iowait:      bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
      xfs_buf_ioerror:     bno 0x2 len 0x200 hold 1 caller xfs_buf_bio_end_io
      xfs_buf_iodone:      bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_ioend
      xfs_buf_iowait_done: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
      xfs_buf_hold:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_item_init
      xfs_trans_read_buf:  bno 0x2 len 0x200 hold 2 recur 0 refcount 1
      xfs_trans_brelse:    bno 0x2 len 0x200 hold 2 recur 0 refcount 1
      xfs_buf_item_relse:  bno 0x2 nblks 0x1 hold 2 caller xfs_trans_brelse
      xfs_buf_rele:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_relse
      xfs_buf_unlock:      bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
      xfs_buf_rele:        bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
      xfs_buf_trylock:     bno 0x2 nblks 0x1 hold 2 caller _xfs_buf_find
      xfs_buf_find:        bno 0x2 len 0x200 hold 2 caller xfs_buf_get_map
      xfs_buf_get:         bno 0x2 len 0x200 hold 2 caller xfs_buf_read_map
      xfs_buf_read:        bno 0x2 len 0x200 hold 2 caller xfs_trans_read_buf_map
      xfs_buf_hold:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_init
      xfs_trans_read_buf:  bno 0x2 len 0x200 hold 3 recur 0 refcount 1
      xfs_trans_log_buf:   bno 0x2 len 0x200 hold 3 recur 0 refcount 1
      xfs_buf_item_unlock: bno 0x2 len 0x200 hold 3 flags DIRTY liflags ABORTED
      xfs_buf_unlock:      bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock
      xfs_buf_rele:        bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock
      
      And that is the AGI buffer from cold cache read into memory to
      transaction abort. You can see at transaction abort the bli is dirty
      and only has a single reference. The item is not pinned, and it's
      not in the AIL. Hence the only reference to it is this transaction.
      
      The problem is that the xfs_buf_item_unlock() call is dropping the
      last reference to the xfs_buf_log_item attached to the buffer (which
      holds a reference to the buffer), but it is not freeing the
      xfs_buf_log_item. Hence nothing will ever release the buffer, and
      the unmount hangs waiting for this reference to go away.
      
      The fix is simple - xfs_buf_item_unlock needs to detect the last
      reference going away in this case and free the xfs_buf_log_item to
      release the reference it holds on the buffer.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      9f87832a
    • D
      xfs: limit speculative prealloc near ENOSPC thresholds · f2a45956
      Dave Chinner 提交于
      There is a window on small filesytsems where specualtive
      preallocation can be larger than that ENOSPC throttling thresholds,
      resulting in specualtive preallocation trying to reserve more space
      than there is space available. This causes immediate ENOSPC to be
      triggered, prealloc to be turned off and flushing to occur. One the
      next write (i.e. next 4k page), we do exactly the same thing, and so
      effective drive into synchronous 4k writes by triggering ENOSPC
      flushing on every page while in the window between the prealloc size
      and the ENOSPC prealloc throttle threshold.
      
      Fix this by checking to see if the prealloc size would consume all
      free space, and throttle it appropriately to avoid premature
      ENOSPC...
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      f2a45956
    • D
      xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end · eb178619
      Dave Chinner 提交于
      When _xfs_buf_find is passed an out of range address, it will fail
      to find a relevant struct xfs_perag and oops with a null
      dereference. This can happen when trying to walk a filesystem with a
      metadata inode that has a partially corrupted extent map (i.e. the
      block number returned is corrupt, but is otherwise intact) and we
      try to read from the corrupted block address.
      
      In this case, just fail the lookup. If it is readahead being issued,
      it will simply not be done, but if it is real read that fails we
      will get an error being reported.  Ideally this case should result
      in an EFSCORRUPTED error being reported, but we cannot return an
      error through xfs_buf_read() or xfs_buf_get() so this lookup failure
      may result in ENOMEM or EIO errors being reported instead.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      eb178619
    • B
      xfs: pull up stack_switch check into xfs_bmapi_write · d26978dd
      Brian Foster 提交于
      The stack_switch check currently occurs in __xfs_bmapi_allocate,
      which means the stack switch only occurs when xfs_bmapi_allocate()
      is called in a loop. Pull the check up before the loop in
      xfs_bmapi_write() such that the first iteration of the loop has
      consistent behavior.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      d26978dd
    • E
      xfs: Do not return EFSCORRUPTED when filesystem probe finds no XFS magic · 1bee12b8
      Eric Sandeen 提交于
      98021821 changed the return value from EWRONGFS (aka EINVAL)
      to EFSCORRUPTED which doesn't seem to be handled properly by
      the root filesystem probe.
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Tested-by: NSergei Trofimovich <slyfox@gentoo.org>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      1bee12b8
  2. 17 1月, 2013 6 次提交
  3. 22 12月, 2012 29 次提交