1. 28 3月, 2012 2 次提交
    • J
      xfs: Fix oops on IO error during xlog_recover_process_iunlinks() · d97d32ed
      Jan Kara 提交于
      When an IO error happens during inode deletion run from
      xlog_recover_process_iunlinks() filesystem gets shutdown. Thus any subsequent
      attempt to read buffers fails. Code in xlog_recover_process_iunlinks() does not
      count with the fact that read of a buffer which was read a while ago can
      really fail which results in the oops on
        agi = XFS_BUF_TO_AGI(agibp);
      
      Fix the problem by cleaning up the buffer handling in
      xlog_recover_process_iunlinks() as suggested by Dave Chinner. We release buffer
      lock but keep buffer reference to AG buffer. That is enough for buffer to stay
      pinned in memory and we don't have to call xfs_read_agi() all the time.
      
      CC: stable@kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      d97d32ed
    • D
      xfs: fix fstrim offset calculations · a66d6363
      Dave Chinner 提交于
      xfs_ioc_fstrim() doesn't treat the incoming offset and length
      correctly. It treats them as a filesystem block address, rather than
      a disk address. This is wrong because the range passed in is a
      linear representation, while the filesystem block address notation
      is a sparse representation. Hence we cannot convert the range direct
      to filesystem block units and then use that for calculating the
      range to trim.
      
      While this sounds dangerous, the problem is limited to calculating
      what AGs need to be trimmed. The code that calcuates the actual
      ranges to trim gets the right result (i.e. only ever discards free
      space), even though it uses the wrong ranges to limit what is
      trimmed. Hence this is not a bug that endangers user data.
      
      Fix this by treating the range as a disk address range and use the
      appropriate functions to convert the range into the desired formats
      for calculations.
      
      Further, fix the first free extent lookup (the longest) to actually
      find the largest free extent. Currently this lookup uses a <=
      lookup, which results in finding the extent to the left of the
      largest because we can never get an exact match on the largest
      extent. This is due to the fact that while we know it's size, we
      don't know it's location and so the exact match fails and we move
      one record to the left to get the next largest extent. Instead, use
      a >= search so that the lookup returns the largest extent regardless
      of the fact we don't get an exact match on it.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      a66d6363
  2. 27 3月, 2012 3 次提交
    • D
      xfs: Account log unmount transaction correctly · 3948659e
      Dave Chinner 提交于
      There have been a few reports of this warning appearing recently:
      
      XFS (dm-4): xlog_space_left: head behind tail
       tail_cycle = 129, tail_bytes = 20163072
       GH   cycle = 129, GH   bytes = 20162880
      
      The common cause appears to be lots of freeze and unfreeze cycles,
      and the output from the warnings indicates that we are leaking
      around 8 bytes of log space per freeze/unfreeze cycle.
      
      When we freeze the filesystem, we write an unmount record and that
      uses xlog_write directly - a special type of transaction,
      effectively. What it doesn't do, however, is correctly account for
      the log space it uses. The unmount record writes an 8 byte structure
      with a special magic number into the log, and the space this
      consumes is not accounted for in the log ticket tracking the
      operation. Hence we leak 8 bytes every unmount record that is
      written.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      3948659e
    • D
      xfs: don't cache inodes read through bulkstat · 5132ba8f
      Dave Chinner 提交于
      When we read inodes via bulkstat, we generally only read them once
      and then throw them away - they never get used again. If we retain
      them in cache, then it simply causes the working set of inodes and
      other cached items to be reclaimed just so the inode cache can grow.
      
      Avoid this problem by marking inodes read by bulkstat not to be
      cached and check this flag in .drop_inode to determine whether the
      inode should be added to the VFS LRU or not. If the inode lookup
      hits an already cached inode, then don't set the flag. If the inode
      lookup hits an inode marked with no cache flag, remove the flag and
      allow it to be cached once the current reference goes away.
      
      Inodes marked as not cached will get cleaned up by the background
      inode reclaim or via memory pressure, so they will still generate
      some short term cache pressure. They will, however, be reclaimed
      much sooner and in preference to cache hot inodes.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      5132ba8f
    • C
      xfs: trace xfs_name strings correctly · f6161375
      Christoph Hellwig 提交于
      Strings store in an xfs_name structure are often not NUL terminated,
      print them using the correct printf specifiers that make use of the
      string length store in the xfs_name structure.
      Reported-by: NBrian Candler <B.Candler@pobox.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      f6161375
  3. 23 3月, 2012 4 次提交
    • D
      xfs: introduce an allocation workqueue · c999a223
      Dave Chinner 提交于
      We currently have significant issues with the amount of stack that
      allocation in XFS uses, especially in the writeback path. We can
      easily consume 4k of stack between mapping the page, manipulating
      the bmap btree and allocating blocks from the free list. Not to
      mention btree block readahead and other functionality that issues IO
      in the allocation path.
      
      As a result, we can no longer fit allocation in the writeback path
      in the stack space provided on x86_64. To alleviate this problem,
      introduce an allocation workqueue and move all allocations to a
      seperate context. This can be easily added as an interposing layer
      into xfs_alloc_vextent(), which takes a single argument structure
      and does not return until the allocation is complete or has failed.
      
      To do this, add a work structure and a completion to the allocation
      args structure. This allows xfs_alloc_vextent to queue the args onto
      the workqueue and wait for it to be completed by the worker. This
      can be done completely transparently to the caller.
      
      The worker function needs to ensure that it sets and clears the
      PF_TRANS flag appropriately as it is being run in an active
      transaction context. Work can also be queued in a memory reclaim
      context, so a rescuer is needed for the workqueue.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      c999a223
    • D
      xfs: Fix open flag handling in open_by_handle code · 1a1d7724
      Dave Chinner 提交于
      Sparse identified some unsafe handling of open flags in the xfs open
      by handle ioctl code. Update the code to use the correct access
      macros to ensure that we handle the open flags correctly.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      1a1d7724
    • K
      xfs: fix deadlock in xfs_rtfree_extent · 5575acc7
      Kamal Dasu 提交于
      To fix the deadlock caused by repeatedly calling xfs_rtfree_extent
      
       - removed xfs_ilock() and xfs_trans_ijoin() from xfs_rtfree_extent(),
         instead added asserts that the inode is locked and has an inode_item
         attached to it.
       - in xfs_bunmapi() when dealing with an inode with the rt flag
         call xfs_ilock() and xfs_trans_ijoin() so that the
         reference count is bumped on the inode and attached it to the
         transaction before calling into xfs_bmap_del_extent, similar to
         what we do in xfs_bmap_rtalloc.
      Signed-off-by: NKamal Dasu <kdasu.kdev@gmail.com>
      Reviewed-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      5575acc7
    • G
      fs: xfs: fix section mismatch in linux-next · 1c2ccc66
      Gerard Snitselaar 提交于
      xfs_qm_exit() is called in init_xfs_fs().
      Signed-off-by: NGerard Snitselaar <dev@snitselaar.org>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      1c2ccc66
  4. 16 3月, 2012 4 次提交
  5. 15 3月, 2012 6 次提交
  6. 14 3月, 2012 5 次提交
  7. 06 3月, 2012 3 次提交
  8. 01 3月, 2012 3 次提交
  9. 26 2月, 2012 1 次提交
    • A
      xfs: only take the ILOCK in xfs_reclaim_inode() · ad637a10
      Alex Elder 提交于
      At the end of xfs_reclaim_inode(), the inode is locked in order to
      we wait for a possible concurrent lookup to complete before the
      inode is freed.  This synchronization step was taking both the ILOCK
      and the IOLOCK, but the latter was causing lockdep to produce
      reports of the possibility of deadlock.
      
      It turns out that there's no need to acquire the IOLOCK at this
      point anyway.  It may have been required in some earlier version of
      the code, but there should be no need to take the IOLOCK in
      xfs_iget(), so there's no (longer) any need to get it here for
      synchronization.  Add an assertion in xfs_iget() as a reminder
      of this assumption.
      
      Dave Chinner diagnosed this on IRC, and Christoph Hellwig suggested
      no longer including the IOLOCK.  I just put together the patch.
      Signed-off-by: NAlex Elder <elder@dreamhost.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      ad637a10
  10. 23 2月, 2012 9 次提交