1. 13 Aug, 2011 2 commits
  2. 27 Jul, 2010 1 commit
    • xfs: drop dmapi hooks · 288699fe
      Committed by Christoph Hellwig
      Dmapi support was never merged upstream, but we still have a lot of hooks
      bloating XFS for it, all over the fast paths of the filesystem.
      
      This patch drops over 700 lines of dmapi overhead.  If we ever get HSM
      support in mainline, at least the namespace events can be handled much
      more sanely in the VFS than in the individual filesystem, so these hooks
      are not much help for future work.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
  3. 21 Jul, 2010 1 commit
  4. 29 May, 2010 1 commit
  5. 24 May, 2010 1 commit
    • xfs: Improve scalability of busy extent tracking · ed3b4d6c
      Committed by Dave Chinner
      When we free a metadata extent, we record it in the per-AG busy
      extent array so that it is not re-used before the freeing
      transaction hits the disk. This array is fixed size, so when it
      overflows we make further allocation transactions synchronous
      because we cannot track more freed extents until those transactions
      hit the disk and are completed. Under heavy mixed allocation and
      freeing workloads with large log buffers, we can overflow this array
      quite easily.
      
      Further, the array is sparsely populated, which means that inserts
      need to search for a free slot, and array searches often have to
      search many more slots than are actually used to check all the
      busy extents. Quite inefficient, really.
      
      To enable this aspect of extent freeing to scale better, we need
      a structure that can grow dynamically. While in other areas of
      XFS we have used radix trees, the extents being freed are at random
      locations on disk so are better suited to being indexed by an rbtree.
      
      So, use a per-AG rbtree indexed by block number to track busy
      extents.  This incurs a memory allocation when marking an extent
      busy, but should not occur too often in low memory situations. This
      should scale to an arbitrary number of extents so should not be a
      limitation for features such as in-memory aggregation of
      transactions.
      
      However, there are still situations where we can't avoid allocating
      busy extents (such as allocation from the AGFL). To minimise the
      overhead of such occurrences, we need to avoid doing a synchronous
      log force while holding the AGF locked to ensure that the previous
      transactions are safely on disk before we use the extent. We can do
      this by marking the transaction doing the allocation as synchronous
      rather than issuing a log force.
      
      Because of the locking involved and the ordering of transactions,
      the synchronous transaction provides the same guarantees as a
      synchronous log force because it ensures that all the prior
      transactions are already on disk when the synchronous transaction
      hits the disk. i.e. it preserves the free->allocate order of the
      extent correctly in recovery.
      
      By doing this, we avoid holding the AGF locked while log writes are
      in progress, hence reducing the length of time the lock is held and
      therefore we increase the rate at which we can allocate and free
      from the allocation group, thereby increasing overall throughput.
      
      The only problem with this approach is that when a metadata buffer is
      marked stale (e.g. a directory block is removed), the buffer remains
      pinned and locked until the log goes to disk. The issue here is that
      if that stale buffer is reallocated in a subsequent transaction, the
      attempt to lock that buffer in the transaction will hang waiting for
      the log to go to disk to unlock and unpin the buffer. Hence if
      someone tries to lock a pinned, stale, locked buffer we need to
      push on the log to get it unlocked ASAP. Effectively we are trading
      off a guaranteed log force for a much less common trigger for log
      force to occur.
      
      Ideally we should not reallocate busy extents. That is a much more
      complex fix to the problem as it involves direct intervention in the
      allocation btree searches in many places. This is left to a future
      set of modifications.
      
      Finally, now that we track busy extents in allocated memory, we
      don't need the descriptors in the transaction structure to point to
      them. We can replace the complex busy chunk infrastructure with a
      simple linked list of busy extents. This allows us to remove a large
      chunk of code, making the overall change a net reduction in code
      size.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
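The busy extent tracking described in the commit above can be sketched in user-space C. This is a minimal illustration, not the XFS implementation: every name (`busy_extent`, `busy_tree`, `busy_insert`, `busy_search`) is hypothetical, a plain unbalanced binary search tree stands in for the kernel rbtree, and stored extents are assumed not to overlap each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the XFS structures. */
struct busy_extent {
    uint64_t bno;                      /* first block of the freed extent */
    uint64_t len;                      /* extent length in blocks */
    struct busy_extent *left, *right;
};

/* One tree per allocation group, keyed by starting block number.
 * An unbalanced BST stands in for the kernel rbtree to keep this short. */
struct busy_tree {
    struct busy_extent *root;
    size_t count;
};

/* Mark [bno, bno + len) busy. Allocates one node per extent, so the
 * structure grows dynamically. Returns false if allocation fails. */
static bool busy_insert(struct busy_tree *t, uint64_t bno, uint64_t len)
{
    assert(len > 0);
    struct busy_extent *n = malloc(sizeof(*n));
    if (!n)
        return false;
    n->bno = bno;
    n->len = len;
    n->left = n->right = NULL;

    /* Walk down to the empty link where the new key belongs. */
    struct busy_extent **link = &t->root;
    while (*link)
        link = (bno < (*link)->bno) ? &(*link)->left : &(*link)->right;
    *link = n;
    t->count++;
    return true;
}

/* Return true if [bno, bno + len) overlaps any busy extent.
 * Correct only under the assumption that stored extents are disjoint. */
static bool busy_search(const struct busy_tree *t, uint64_t bno, uint64_t len)
{
    const struct busy_extent *n = t->root;
    while (n) {
        if (bno + len <= n->bno)
            n = n->left;               /* query lies entirely below n */
        else if (bno >= n->bno + n->len)
            n = n->right;              /* query lies entirely above n */
        else
            return true;               /* ranges intersect */
    }
    return false;
}
```

An allocator following this scheme would call `busy_search` on a candidate extent before reusing it. What happens on a hit (for example, making the allocating transaction synchronous, as the commit message describes) is outside this sketch.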
  6. 22 May, 2010 2 commits
    • quota: unify ->set_dqblk · c472b432
      Committed by Christoph Hellwig
      Pass the larger struct fs_disk_quota to the ->set_dqblk operation so
      that the Q_SETQUOTA and Q_XSETQUOTA operations can be implemented
      with a single filesystem operation and we can retire the ->set_xquota
      operation.  The additional information (RT-subvolume accounting and
      warn counts) is left zero for the VFS quota implementation.
      
      Add new fieldmask values for setting the block and inode count
      values, which are required for the VFS quota but weren't for XFS.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
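The idea of serving both quota commands through one operation that takes the larger structure plus a field mask can be sketched as follows. This is only an illustration under stated assumptions: the `DEMO_DQ_*` bits, `demo_disk_quota`, `demo_dquot`, and `demo_set_dqblk` are hypothetical stand-ins, not the kernel's `struct fs_disk_quota` layout or its real mask values.

```c
#include <stdint.h>

/* Hypothetical mask bits; the real kernel constants are not reproduced. */
enum {
    DEMO_DQ_BHARD  = 1 << 0,   /* block hard limit is valid */
    DEMO_DQ_IHARD  = 1 << 1,   /* inode hard limit is valid */
    DEMO_DQ_BCOUNT = 1 << 2,   /* block usage count is valid */
    DEMO_DQ_ICOUNT = 1 << 3,   /* inode usage count is valid */
};

/* Stand-in for the larger on-the-wire structure passed by both commands. */
struct demo_disk_quota {
    uint32_t fieldmask;        /* which of the fields below to apply */
    uint64_t blk_hardlimit;
    uint64_t ino_hardlimit;
    uint64_t bcount;
    uint64_t icount;
};

/* Stand-in for the filesystem's in-memory quota record. */
struct demo_dquot {
    uint64_t blk_hardlimit;
    uint64_t ino_hardlimit;
    uint64_t bcount;
    uint64_t icount;
};

/* Single "set" operation: copy only the fields the caller flagged.
 * Fields whose mask bit is clear are left untouched. */
static void demo_set_dqblk(struct demo_dquot *dq,
                           const struct demo_disk_quota *src)
{
    if (src->fieldmask & DEMO_DQ_BHARD)
        dq->blk_hardlimit = src->blk_hardlimit;
    if (src->fieldmask & DEMO_DQ_IHARD)
        dq->ino_hardlimit = src->ino_hardlimit;
    if (src->fieldmask & DEMO_DQ_BCOUNT)
        dq->bcount = src->bcount;
    if (src->fieldmask & DEMO_DQ_ICOUNT)
        dq->icount = src->icount;
}
```

With a mask in place, a caller that only knows about limits and a caller that also sets usage counts can share the same entry point, each setting just the bits for the fields it fills in.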
    • quota: unify ->get_dqblk · b9b2dd36
      Committed by Christoph Hellwig
      Pass the larger struct fs_disk_quota to the ->get_dqblk operation so
      that the Q_GETQUOTA and Q_XGETQUOTA operations can be implemented
      with a single filesystem operation and we can retire the ->get_xquota
      operation.  The additional information (RT-subvolume accounting and
      warn counts) is left zero for the VFS quota implementation.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
  7. 05 Mar, 2010 2 commits
    • quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota · ac0e7737
      Committed by Christoph Hellwig
      We already do these checks in the generic code.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
    • quota: clean up Q_XQUOTASYNC · 8c4e4acd
      Committed by Christoph Hellwig
      Currently Q_XQUOTASYNC calls into the quota_sync method, but XFS does something
      entirely different in it than the rest of the filesystems.  xfs_quota which
      calls Q_XQUOTASYNC expects an asynchronous data writeout to flush delayed
      allocations, while the "VFS" quota support wants to flush changes to the quota
      file.
      
      So make Q_XQUOTASYNC call into the writeback code directly and make the
      quota_sync method optional as XFS doesn't need it in the sense expected by the
      rest of the quota code.
      
      GFS2 was using limited XFS-style quota and has a quota_sync method fitting
      neither the style used by vfs_quota_sync nor xfs_fs_quota_sync.  I left it
      in for now as per discussion with Steve it expects to be called from the
      sync path this way.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
  8. 30 Oct, 2009 1 commit
  9. 22 Sep, 2009 1 commit
  10. 08 Jun, 2009 1 commit
    • xfs: split xfs_sync_inodes · 075fe102
      Committed by Christoph Hellwig
      xfs_sync_inodes is used to write back either file data or inode metadata.
      In general we always do these separately, except for one fishy case in
      xfs_fs_put_super that does both.  So separate xfs_sync_inodes into
      separate xfs_sync_data and xfs_sync_attr functions.  In xfs_fs_put_super
      we first call the data sync and then the attr sync as that was the previous
      order.  The moved log force in that path doesn't make a difference because
      we will force the log again as part of the real unmount process.
      
      The filesystem read-only checks are not performed by the new functions but
      instead moved into the callers, given that most callers already have them
      further up in the stack.  Also add debug checks that we do not pass in
      incorrect flags in the new xfs_sync_data and xfs_sync_attr functions and
      fix the one place that did pass in a wrong flag.
      
      Also remove a comment mentioning xfs_sync_inodes that has been incorrect
      for a while because we always take either the iolock or ilock in the
      sync path these days.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
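The shape of the split described in the commit above (two separate sync functions, flag sanity checks inside them, and the read-only check left to the caller) might look roughly like this. It is a hedged sketch only: `demo_mount`, the `DEMO_SYNC_*` flags, and all three functions are hypothetical, and which flags each function accepts is an assumption made for illustration rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags; not the kernel's SYNC_* definitions. */
enum {
    DEMO_SYNC_TRYLOCK = 1 << 0,
    DEMO_SYNC_WAIT    = 1 << 1,
};

/* Hypothetical mount structure that just counts sync calls. */
struct demo_mount {
    bool readonly;
    int data_syncs;
    int attr_syncs;
};

/* Write back file data. The assert is a debug check that rejects any
 * flag outside the set this function is assumed to understand. */
static int demo_sync_data(struct demo_mount *mp, int flags)
{
    assert((flags & ~(DEMO_SYNC_TRYLOCK | DEMO_SYNC_WAIT)) == 0);
    mp->data_syncs++;
    return 0;
}

/* Write back inode metadata; assumed here to accept only the wait flag. */
static int demo_sync_attr(struct demo_mount *mp, int flags)
{
    assert((flags & ~DEMO_SYNC_WAIT) == 0);
    mp->attr_syncs++;
    return 0;
}

/* A caller that needs both: the read-only check lives here, not in the
 * sync functions, and data is synced before attributes. */
static int demo_sync_all(struct demo_mount *mp)
{
    if (mp->readonly)
        return 0;

    int error = demo_sync_data(mp, DEMO_SYNC_WAIT);
    if (error)
        return error;
    return demo_sync_attr(mp, DEMO_SYNC_WAIT);
}
```

Keeping the flag checks as assertions means a wrong flag is caught in debug builds at the call that passed it, which is the kind of check the commit message says exposed one incorrect caller.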
  11. 09 Feb, 2009 1 commit