1. 19 10月, 2010 2 次提交
  2. 02 9月, 2010 1 次提交
    • D
      xfs: improve buffer cache hash scalability · 9bc08a45
      Dave Chinner 提交于
      When doing large parallel file creates on a 16p machines, large amounts of
      time is being spent in _xfs_buf_find(). A system wide profile with perf top
      shows this:
      
                1134740.00 19.3% _xfs_buf_find
                 733142.00 12.5% __ticket_spin_lock
      
      The problem is that the hash contains 45,000 buffers, and the hash table width
      is only 256 buffers. That means we've got around 200 buffers per chain, and
      searching it is quite expensive. The hash table size needs to increase.
      
      Secondly, every time we do a lookup, we promote the buffer we find to the head
      of the hash chain. This is causing cachelines to be dirtied and causes
      invalidation of cachelines across all CPUs that may have walked the hash chain
      recently. hence every walk of the hash chain is effectively a cold cache walk.
      Remove the promotion to avoid this invalidation.
      
      The results are:
      
                1045043.00 21.2% __ticket_spin_lock
                 326184.00  6.6% _xfs_buf_find
      
      A 70% drop in the CPU usage when looking up buffers. Unfortunately that does
      not result in an increase in performance underthis workload as contention on
      the inode_lock soaks up most of the reduction in CPU usage.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      9bc08a45
  3. 27 7月, 2010 3 次提交
    • C
      xfs: kill the b_strat callback in xfs_buf · 939d723b
      Christoph Hellwig 提交于
      The b_strat callback is used by xfs_buf_iostrategy to perform additional
      checks before submitting a buffer.  It is used in xfs_bwrite and when
      writing out delayed buffers.  In xfs_bwrite it we can de-virtualize the
      call easily as b_strat is set a few lines above the call to
      xfs_buf_iostrategy.  For the delayed buffers the rationale is a bit
      more complicated:
      
       - there are three callers of xfs_buf_delwri_queue, which places buffers
         on the delwri list:
          (1) xfs_bdwrite - this sets up b_strat, so it's fine
          (2) xfs_buf_iorequest.  None of the callers can have XBF_DELWRI set:
      	- xlog_bdstrat is only used for log buffers, which are never delwri
      	- _xfs_buf_read explicitly clears the delwri flag
      	- xfs_buf_iodone_work retries log buffers only
      	- xfsbdstrat - only used for reads, superblock writes without the
      	  delwri flag, log I/O and file zeroing with explicitly allocated
      	  buffers.
      	- xfs_buf_iostrategy - only calls xfs_buf_iorequest if b_strat is
      	  not set
          (3) xfs_buf_unlock
      	- only puts the buffer on the delwri list if the DELWRI flag is
      	  already set.  The DELWRI flag is only ever set in xfs_bwrite,
      	  xfs_buf_iodone_callbacks, or xfs_trans_log_buf.  For
      	  xfs_buf_iodone_callbacks and xfs_trans_log_buf we require
      	  an initialized buf item, which means b_strat was set to
      	  xfs_bdstrat_cb in xfs_buf_item_init.
      
      Conclusion: we can just get rid of the callback and replace it with
      explicit calls to xfs_bdstrat_cb.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      939d723b
    • C
      xfs: do not use emums for flags used in tracing · 807cbbdb
      Christoph Hellwig 提交于
      The tracing code can't print flags defined as enums.  Most flags that
      we want to print are defines as macros already, but move the few remaining
      ones over to make the trace output more useful.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      807cbbdb
    • C
      xfs: simplify buffer pinning · 4d16e924
      Christoph Hellwig 提交于
      Get rid of the xfs_buf_pin/xfs_buf_unpin/xfs_buf_ispin helpers and opencode
      them in their only callers, just like we did for the inode pinning a while
      ago.  Also remove duplicate trace points - the bufitem tracepoints cover
      all the information that is present in a buffer tracepoint.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      4d16e924
  4. 19 5月, 2010 1 次提交
    • J
      xfs: add blockdev name to kthreads · e2a07812
      Jan Engelhardt 提交于
      This allows to see in `ps` and similar tools which kthreads are
      allotted to which block device/filesystem, similar to what jbd2
      does. As the process name is a fixed 16-char array, no extra
      space is needed in tasks.
      
        PID TTY      STAT   TIME COMMAND
          2 ?        S      0:00 [kthreadd]
        197 ?        S      0:00  \_ [jbd2/sda2-8]
        198 ?        S      0:00  \_ [ext4-dio-unwrit]
        204 ?        S      0:00  \_ [flush-8:0]
       2647 ?        S      0:00  \_ [xfs_mru_cache]
       2648 ?        S      0:00  \_ [xfslogd/0]
       2649 ?        S      0:00  \_ [xfsdatad/0]
       2650 ?        S      0:00  \_ [xfsconvertd/0]
       2651 ?        S      0:00  \_ [xfsbufd/ram0]
       2652 ?        S      0:00  \_ [xfsaild/ram0]
       2653 ?        S      0:00  \_ [xfssyncd/ram0]
      Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
      Reviewed-by: NDave Chinner <david@fromorbit.com>
      e2a07812
  5. 04 2月, 2010 1 次提交
  6. 02 2月, 2010 1 次提交
    • D
      xfs: Don't issue buffer IO direct from AIL push V2 · d808f617
      Dave Chinner 提交于
      All buffers logged into the AIL are marked as delayed write.
      When the AIL needs to push the buffer out, it issues an async write of the
      buffer. This means that IO patterns are dependent on the order of
      buffers in the AIL.
      
      Instead of flushing the buffer, promote the buffer in the delayed
      write list so that the next time the xfsbufd is run the buffer will
      be flushed by the xfsbufd. Return the state to the xfsaild that the
      buffer was promoted so that the xfsaild knows that it needs to cause
      the xfsbufd to run to flush the buffers that were promoted.
      
      Using the xfsbufd for issuing the IO allows us to dispatch all
      buffer IO from the one queue. This means that we can make much more
      enlightened decisions on what order to flush buffers to disk as
      we don't have multiple places issuing IO. Optimisations to xfsbufd
      will be in a future patch.
      
      Version 2
      - kill XFS_ITEM_FLUSHING as it is now unused.
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      d808f617
  7. 22 1月, 2010 1 次提交
  8. 20 1月, 2010 1 次提交
  9. 16 1月, 2010 3 次提交
  10. 17 12月, 2009 1 次提交
  11. 15 12月, 2009 1 次提交
    • C
      xfs: event tracing support · 0b1b213f
      Christoph Hellwig 提交于
      Convert the old xfs tracing support that could only be used with the
      out of tree kdb and xfsidbg patches to use the generic event tracer.
      
      To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
      all xfs trace channels by:
      
         echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
      
      or alternatively enable single events by just doing the same in one
      event subdirectory, e.g.
      
         echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable
      
      or set more complex filters, etc. In Documentation/trace/events.txt
      all this is desctribed in more detail.  To reads the events do a
      
         cat /sys/kernel/debug/tracing/trace
      
      Compared to the last posting this patch converts the tracing mostly to
      the one tracepoint per callsite model that other users of the new
      tracing facility also employ.  This allows a very fine-grained control
      of the tracing, a cleaner output of the traces and also enables the
      perf tool to use each tracepoint as a virtual performance counter,
           allowing us to e.g. count how often certain workloads git various
           spots in XFS.  Take a look at
      
          http://lwn.net/Articles/346470/
      
      for some examples.
      
      Also the btree tracing isn't included at all yet, as it will require
      additional core tracing features not in mainline yet, I plan to
      deliver it later.
      
      And the really nice thing about this patch is that it actually removes
      many lines of code while adding this nice functionality:
      
       fs/xfs/Makefile                |    8
       fs/xfs/linux-2.6/xfs_acl.c     |    1
       fs/xfs/linux-2.6/xfs_aops.c    |   52 -
       fs/xfs/linux-2.6/xfs_aops.h    |    2
       fs/xfs/linux-2.6/xfs_buf.c     |  117 +--
       fs/xfs/linux-2.6/xfs_buf.h     |   33
       fs/xfs/linux-2.6/xfs_fs_subr.c |    3
       fs/xfs/linux-2.6/xfs_ioctl.c   |    1
       fs/xfs/linux-2.6/xfs_ioctl32.c |    1
       fs/xfs/linux-2.6/xfs_iops.c    |    1
       fs/xfs/linux-2.6/xfs_linux.h   |    1
       fs/xfs/linux-2.6/xfs_lrw.c     |   87 --
       fs/xfs/linux-2.6/xfs_lrw.h     |   45 -
       fs/xfs/linux-2.6/xfs_super.c   |  104 ---
       fs/xfs/linux-2.6/xfs_super.h   |    7
       fs/xfs/linux-2.6/xfs_sync.c    |    1
       fs/xfs/linux-2.6/xfs_trace.c   |   75 ++
       fs/xfs/linux-2.6/xfs_trace.h   | 1369 +++++++++++++++++++++++++++++++++++++++++
       fs/xfs/linux-2.6/xfs_vnode.h   |    4
       fs/xfs/quota/xfs_dquot.c       |  110 ---
       fs/xfs/quota/xfs_dquot.h       |   21
       fs/xfs/quota/xfs_qm.c          |   40 -
       fs/xfs/quota/xfs_qm_syscalls.c |    4
       fs/xfs/support/ktrace.c        |  323 ---------
       fs/xfs/support/ktrace.h        |   85 --
       fs/xfs/xfs.h                   |   16
       fs/xfs/xfs_ag.h                |   14
       fs/xfs/xfs_alloc.c             |  230 +-----
       fs/xfs/xfs_alloc.h             |   27
       fs/xfs/xfs_alloc_btree.c       |    1
       fs/xfs/xfs_attr.c              |  107 ---
       fs/xfs/xfs_attr.h              |   10
       fs/xfs/xfs_attr_leaf.c         |   14
       fs/xfs/xfs_attr_sf.h           |   40 -
       fs/xfs/xfs_bmap.c              |  507 +++------------
       fs/xfs/xfs_bmap.h              |   49 -
       fs/xfs/xfs_bmap_btree.c        |    6
       fs/xfs/xfs_btree.c             |    5
       fs/xfs/xfs_btree_trace.h       |   17
       fs/xfs/xfs_buf_item.c          |   87 --
       fs/xfs/xfs_buf_item.h          |   20
       fs/xfs/xfs_da_btree.c          |    3
       fs/xfs/xfs_da_btree.h          |    7
       fs/xfs/xfs_dfrag.c             |    2
       fs/xfs/xfs_dir2.c              |    8
       fs/xfs/xfs_dir2_block.c        |   20
       fs/xfs/xfs_dir2_leaf.c         |   21
       fs/xfs/xfs_dir2_node.c         |   27
       fs/xfs/xfs_dir2_sf.c           |   26
       fs/xfs/xfs_dir2_trace.c        |  216 ------
       fs/xfs/xfs_dir2_trace.h        |   72 --
       fs/xfs/xfs_filestream.c        |    8
       fs/xfs/xfs_fsops.c             |    2
       fs/xfs/xfs_iget.c              |  111 ---
       fs/xfs/xfs_inode.c             |   67 --
       fs/xfs/xfs_inode.h             |   76 --
       fs/xfs/xfs_inode_item.c        |    5
       fs/xfs/xfs_iomap.c             |   85 --
       fs/xfs/xfs_iomap.h             |    8
       fs/xfs/xfs_log.c               |  181 +----
       fs/xfs/xfs_log_priv.h          |   20
       fs/xfs/xfs_log_recover.c       |    1
       fs/xfs/xfs_mount.c             |    2
       fs/xfs/xfs_quota.h             |    8
       fs/xfs/xfs_rename.c            |    1
       fs/xfs/xfs_rtalloc.c           |    1
       fs/xfs/xfs_rw.c                |    3
       fs/xfs/xfs_trans.h             |   47 +
       fs/xfs/xfs_trans_buf.c         |   62 -
       fs/xfs/xfs_vnodeops.c          |    8
       70 files changed, 2151 insertions(+), 2592 deletions(-)
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      0b1b213f
  12. 12 12月, 2009 1 次提交
  13. 07 3月, 2009 1 次提交
  14. 04 3月, 2009 1 次提交
  15. 22 12月, 2008 1 次提交
  16. 11 12月, 2008 1 次提交
  17. 04 12月, 2008 1 次提交
    • C
      kill xfs_buf_iostart · 5d765b97
      Christoph Hellwig 提交于
      xfs_buf_iostart is a "shared" helper for xfs_buf_read_flags,
      xfs_bawrite, and xfs_bdwrite - except that there isn't much shared
      code but rather special cases for each caller.
      
      So remove this function and move the functionality to the caller.
      xfs_bawrite and xfs_bdwrite are now big enough to be moved out of
      line and the xfs_buf_read_flags is moved into a new helper called
      _xfs_buf_read.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NNiv Sardi <xaiki@sgi.com>
      5d765b97
  18. 11 10月, 2008 1 次提交
  19. 13 8月, 2008 1 次提交
  20. 28 7月, 2008 1 次提交
    • C
      [XFS] sort out opening and closing of the block devices · 19f354d4
      Christoph Hellwig 提交于
      Currently closing the rt/log block device is done in the wrong spot, and
      far too early. So revampt it:
      
      - xfs_blkdev_put moved out of xfs_free_buftarg into the caller so that
      
      it is done after tearing down the buftarg completely.
      
      - call to xfs_unmountfs_close moved from xfs_mountfs into caller so
      
      that it's done after tearing down the filesystem completely.
      
      - xfs_unmountfs_close is renamed to xfs_close_devices and made static
      
      in xfs_super.c
      
      - opening of the block devices is split into a helper xfs_open_devices
      
      that is symetric in use to xfs_close_devices
      
      - xfs_unmountfs can now lose struct cred
      
      - error handling around device opening sanitized in xfs_fs_fill_super
      
      SGI-PV: 981951
      SGI-Modid: xfs-linux-melb:xfs-kern:31193a
      Signed-off-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NDavid Chinner <dgc@sgi.com>
      Signed-off-by: NLachlan McIlroy <lachlan@sgi.com>
      19f354d4
  21. 23 5月, 2008 1 次提交
  22. 18 4月, 2008 1 次提交
  23. 07 2月, 2008 3 次提交
  24. 14 7月, 2007 1 次提交
  25. 08 5月, 2007 1 次提交
  26. 10 2月, 2007 1 次提交
    • D
      [XFS] Current usage of buftarg flags is incorrect. · 5e6a07df
      David Chinner 提交于
      The {test,set,clear}_bit() operations take a bit index for the bit to
      operate on. The XBT_* flags are defined as bit fields which is incorrect,
      not to mention the way the bit fields are enumerated is broken too. This
      was only working by chance.
      
      Fix the definitions of the flags and make the code using them use the
      {test,set,clear}_bit() operations correctly.
      
      SGI-PV: 958639
      SGI-Modid: xfs-linux-melb:xfs-kern:27565a
      Signed-off-by: NDavid Chinner <dgc@sgi.com>
      Signed-off-by: NTim Shimmin <tes@sgi.com>
      5e6a07df
  27. 28 9月, 2006 2 次提交
  28. 28 7月, 2006 1 次提交
  29. 01 7月, 2006 1 次提交
  30. 11 1月, 2006 2 次提交
  31. 02 11月, 2005 1 次提交