1. 09 11月, 2012 5 次提交
  2. 18 10月, 2012 7 次提交
    • D
      xfs: remove xfs_iget.c · 33479e05
      Dave Chinner 提交于
      The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
      along with the other inode cache functions. This removes all functionality from
      xfs_iget.c, so the file can simply be removed.
      
      This move results in various functions now only having the scope of a single
      file (e.g. xfs_inode_free()), so clean up all the definitions and exported
      prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      33479e05
    • D
      xfs: rename xfs_sync.[ch] to xfs_icache.[ch] · 6d8b79cf
      Dave Chinner 提交于
      xfs_sync.c now only contains inode reclaim functions and inode cache
      iteration functions. It is not related to sync operations anymore.
      Rename to xfs_icache.c to reflect it's contents and prepare for
      consolidation with the other inode cache file that exists
      (xfs_iget.c).
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      6d8b79cf
    • D
      xfs: move xfs_quiesce_attr() into xfs_super.c · c7eea6f7
      Dave Chinner 提交于
      Both callers of xfs_quiesce_attr() are in xfs_super.c, and there's
      nothing really sync-specific about this functionality so it doesn't
      really matter where it lives. Move it to benext to it's callers, so
      all the remount/sync_fs code is in the one place.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      c7eea6f7
    • D
      xfs: syncd workqueue is no more · 5889608d
      Dave Chinner 提交于
      With the syncd functions moved to the log and/or removed, the syncd
      workqueue is the only remaining bit left. It is used by the log
      covering/ail pushing work, as well as by the inode reclaim work.
      
      Given how cheap workqueues are these days, give the log and inode
      reclaim work their own work queues and kill the syncd work queue.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      5889608d
    • D
      xfs: xfs_sync_data is redundant. · 9aa05000
      Dave Chinner 提交于
      We don't do any data writeback from XFS any more - the VFS is
      completely responsible for that, including for freeze. We can
      replace the remaining caller with a VFS level function that
      achieves the same thing, but without conflicting with current
      writeback work.
      
      This means we can remove the flush_work and xfs_flush_inodes() - the
      VFS functionality completely replaces the internal flush queue for
      doing this writeback work in a separate context to avoid stack
      overruns.
      
      This does have one complication - it cannot be called with page
      locks held.  Hence move the flushing of delalloc space when ENOSPC
      occurs back up into xfs_file_aio_buffered_write when we don't hold
      any locks that will stall writeback.
      
      Unfortunately, writeback_inodes_sb_if_idle() is not sufficient to
      trigger delalloc conversion fast enough to prevent spurious ENOSPC
      whent here are hundreds of writers, thousands of small files and GBs
      of free RAM.  Hence we need to use sync_sb_inodes() to block callers
      while we wait for writeback like the previous xfs_flush_inodes
      implementation did.
      
      That means we have to hold the s_umount lock here, but because this
      call can nest inside i_mutex (the parent directory in the create
      case, held by the VFS), we have to use down_read_trylock() to avoid
      potential deadlocks. In practice, this trylock will succeed on
      almost every attempt as unmount/remount type operations are
      exceedingly rare.
      
      Note: we always need to pass a count of zero to
      generic_file_buffered_write() as the previously written byte count.
      We only do this by accident before this patch by the virtue of ret
      always being zero when there are no errors. Make this explicit
      rather than needing to specifically zero ret in the ENOSPC retry
      case.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Tested-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      9aa05000
    • D
      xfs: sync work is now only periodic log work · f661f1e0
      Dave Chinner 提交于
      The only thing the periodic sync work does now is flush the AIL and
      idle the log. These are really functions of the log code, so move
      the work to xfs_log.c and rename it appropriately.
      
      The only wart that this leaves behind is the xfssyncd_centisecs
      sysctl, otherwise the xfssyncd is dead. Clean up any comments that
      related to xfssyncd to reflect it's passing.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      f661f1e0
    • D
      xfs: xfs_syncd_stop must die · 33c7a2bc
      Dave Chinner 提交于
      xfs_syncd_start and xfs_syncd_stop tie a bunch of unrelated
      functionailty together that actually have different start and stop
      requirements. Kill these functions and open code the start/stop
      methods for each of the background functions.
      
      Subsequent patches will move the start/stop functions around to the
      correct places to avoid races and shutdown issues.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      33c7a2bc
  3. 14 3月, 2012 1 次提交
  4. 24 12月, 2011 1 次提交
    • C
      xfs: log all dirty inodes in xfs_fs_sync_fs · be4f1ac8
      Christoph Hellwig 提交于
      Since Linux 2.6.36 the writeback code has introduces various measures for
      live lock prevention during sync().  Unfortunately some of these are
      actively harmful for the XFS model, where the inode gets marked dirty for
      metadata from the data I/O handler.
      
      The older_than_this checks that are now more strictly enforced since
      
          writeback: avoid livelocking WB_SYNC_ALL writeback
      
      by only calling into __writeback_inodes_sb and thus only sampling the
      current cut off time once.  But on a slow enough devices the previous
      asynchronous sync pass might not have fully completed yet, and thus XFS
      might mark metadata dirty only after that sampling of the cut off time for
      the blocking pass already happened.  I have not myself reproduced this
      myself on a real system, but by introducing artificial delay into the
      XFS I/O completion workqueues it can be reproduced easily.
      
      Fix this by iterating over all XFS inodes in ->sync_fs and log all that
      are dirty.  This might log inode that only got redirtied after the
      previous pass, but given how cheap delayed logging of inodes is it
      isn't a major concern for performance.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Tested-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      be4f1ac8
  5. 13 8月, 2011 1 次提交
    • C
      xfs: remove subdirectories · c59d87c4
      Christoph Hellwig 提交于
      Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
      annoying subdirectories in the XFS source code.  Besides the large
      amount of file rename the only changes are to the Makefile, a few
      files including headers with the subdirectory prefix, and the binary
      sysctl compat code that includes a header under fs/xfs/ from
      kernel/.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      c59d87c4
  6. 21 7月, 2011 1 次提交
  7. 08 7月, 2011 1 次提交
  8. 08 4月, 2011 1 次提交
    • D
      xfs: introduce a xfssyncd workqueue · c6d09b66
      Dave Chinner 提交于
      All of the work xfssyncd does is background functionality. There is
      no need for a thread per filesystem to do this work - it can al be
      managed by a global workqueue now they manage concurrency
      effectively.
      
      Introduce a new gglobal xfssyncd workqueue, and convert the periodic
      work to use this new functionality. To do this, use a delayed work
      construct to schedule the next running of the periodic sync work
      for the filesystem. When the sync work is complete, queue a new
      delayed work for the next running of the sync work.
      
      For laptop mode, we wait on completion for the sync works, so ensure
      that the sync work queuing interface can flush and wait for work to
      complete to enable the work queue infrastructure to replace the
      current sequence number and wakeup that is used.
      
      Because the sync work does non-trivial amounts of work, mark the
      new work queue as CPU intensive.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NAlex Elder <aelder@sgi.com>
      c6d09b66
  9. 19 10月, 2010 2 次提交
  10. 27 7月, 2010 1 次提交
  11. 20 7月, 2010 1 次提交
  12. 30 4月, 2010 1 次提交
    • D
      xfs: add a shrinker to background inode reclaim · 9bf729c0
      Dave Chinner 提交于
      On low memory boxes or those with highmem, kernel can OOM before the
      background reclaims inodes via xfssyncd. Add a shrinker to run inode
      reclaim so that it inode reclaim is expedited when memory is low.
      
      This is more complex than it needs to be because the VM folk don't
      want a context added to the shrinker infrastructure. Hence we need
      to add a global list of XFS mount structures so the shrinker can
      traverse them.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      9bf729c0
  13. 16 1月, 2010 2 次提交
  14. 12 12月, 2009 1 次提交
    • C
      xfs: simplify inode teardown · 848ce8f7
      Christoph Hellwig 提交于
      Currently the reclaim code for the case where we don't reclaim the
      final reclaim is overly complicated.  We know that the inode is clean
      but instead of just directly reclaiming the clean inode we go through
      the whole process of marking the inode reclaimable just to directly
      reclaim it from the calling context.  Besides being overly complicated
      this introduces a race where iget could recycle an inode between
      marked reclaimable and actually being reclaimed leading to panics.
      
      This patch gets rid of the existing reclaim path, and replaces it with
      a simple call to xfs_ireclaim if the inode was clean.  While we're at
      it we also use the slightly more lax xfs_inode_clean check we'd use
      later to determine if we need to flush the inode here.
      
      Finally get rid of xfs_reclaim function and place the remaining small
      bits of reclaim code directly into xfs_fs_destroy_inode.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reported-by: NPatrick Schreurs <patrick@news-service.com>
      Reported-by: NTommy van Leeuwen <tommy@news-service.com>
      Tested-by: NPatrick Schreurs <patrick@news-service.com>
      Reviewed-by: NAlex Elder <aelder@sgi.com>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      848ce8f7
  15. 01 9月, 2009 1 次提交
  16. 18 8月, 2009 1 次提交
    • C
      xfs: fix locking in xfs_iget_cache_hit · a022fe09
      Christoph Hellwig 提交于
      The locking in xfs_iget_cache_hit currently has numerous problems:
      
       - we clear the reclaim tag without i_flags_lock which protects
         modifications to it
       - we call inode_init_always which can sleep with pag_ici_lock
         held (this is oss.sgi.com BZ #819)
       - we acquire and drop i_flags_lock a lot and thus provide no
         consistency between the various flags we set/clear under it
      
      This patch fixes all that with a major revamp of the locking in
      the function.  The new version acquires i_flags_lock early and
      only drops it once we need to call into inode_init_always or before
      calling xfs_ilock.
      
      This patch fixes a bug seen in the wild where we race modifying the
      reclaim tag.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NFelix Blyakher <felixb@sgi.com>
      Reviewed-by: NEric Sandeen <sandeen@sandeen.net>
      Signed-off-by: NFelix Blyakher <felixb@sgi.com>
      a022fe09
  17. 17 8月, 2009 1 次提交
    • C
      xfs: fix locking in xfs_iget_cache_hit · bc990f5c
      Christoph Hellwig 提交于
      The locking in xfs_iget_cache_hit currently has numerous problems:
      
       - we clear the reclaim tag without i_flags_lock which protects
         modifications to it
       - we call inode_init_always which can sleep with pag_ici_lock
         held (this is oss.sgi.com BZ #819)
       - we acquire and drop i_flags_lock a lot and thus provide no
         consistency between the various flags we set/clear under it
      
      This patch fixes all that with a major revamp of the locking in
      the function.  The new version acquires i_flags_lock early and
      only drops it once we need to call into inode_init_always or before
      calling xfs_ilock.
      
      This patch fixes a bug seen in the wild where we race modifying the
      reclaim tag.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NFelix Blyakher <felixb@sgi.com>
      Reviewed-by: NEric Sandeen <sandeen@sandeen.net>
      Signed-off-by: NFelix Blyakher <felixb@sgi.com>
      bc990f5c
  18. 02 7月, 2009 1 次提交
  19. 08 6月, 2009 5 次提交
  20. 07 4月, 2009 3 次提交
    • D
      xfs: block callers of xfs_flush_inodes() correctly · e43afd72
      Dave Chinner 提交于
      xfs_flush_inodes() currently uses a magic timeout to wait for
      some inodes to be flushed before returning. This isn't
      really reliable but used to be the best that could be done
      due to deadlock potential of waiting for the entire flush.
      
      Now the inode flush is safe to execute while we hold page
      and inode locks, we can wait for all the inodes to flush
      synchronously. Convert the wait mechanism to a completion
      to do this efficiently. This should remove all remaining
      spurious ENOSPC errors from the delayed allocation reservation
      path.
      
      This is extracted almost line for line from a larger patch
      from Mikulas Patocka.
      Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      e43afd72
    • D
      xfs: make inode flush at ENOSPC synchronous · 5825294e
      Dave Chinner 提交于
      When we are writing to a single file and hit ENOSPC, we trigger a background
      flush of the inode and try again.  Because we hold page locks and the iolock,
      the flush won't proceed until after we release these locks. This occurs once
      we've given up and ENOSPC has been reported. Hence if this one is the only
      dirty inode in the system, we'll get an ENOSPC prematurely.
      
      To fix this, remove the async flush from the allocation routines and move
      it to the top of the write path where we can do a synchronous flush
      and retry the write again. Only retry once as a second ENOSPC indicates
      that we really are ENOSPC.
      
      This avoids a page cache deadlock when trying to do this flush synchronously
      in the allocation layer that was identified by Mikulas Patocka.
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      5825294e
    • D
      xfs: use xfs_sync_inodes() for device flushing · a8d770d9
      Dave Chinner 提交于
      Currently xfs_device_flush calls sync_blockdev() which is
      a no-op for XFS as all it's metadata is held in a different
      address to the one sync_blockdev() works on.
      
      Call xfs_sync_inodes() instead to flush all the delayed
      allocation blocks out. To do this as efficiently as possible,
      do it via two passes - one to do an async flush of all the
      dirty blocks and a second to wait for all the IO to complete.
      This requires some modification to the xfs-sync_inodes_ag()
      flush code to do efficiently.
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      a8d770d9
  21. 09 2月, 2009 1 次提交
  22. 30 10月, 2008 1 次提交