1. 10 Aug, 2010 5 commits
    • A
      convert remaining ->clear_inode() to ->evict_inode() · b57922d9
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      b57922d9
    • A
      simplify checks for I_CLEAR/I_FREEING · a4ffdde6
      Al Viro committed
      add I_CLEAR instead of replacing I_FREEING with it.  I_CLEAR is
      equivalent to I_FREEING for almost all code looking at either;
      it's there to keep track of having called clear_inode() exactly
      once per inode lifetime, at some point after having set I_FREEING.
      I_CLEAR and I_FREEING never get set at the same time with the
      current code, so we can switch to setting i_flags to I_FREEING | I_CLEAR
      instead of I_CLEAR without loss of information.  As a result of
      this change, checks become simpler and the amount of code that needs
      to know about I_CLEAR shrinks a lot.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a4ffdde6
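The simplification described in a4ffdde6 can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel code: the `DEMO_*` flag values and all helper names are hypothetical stand-ins for I_FREEING and I_CLEAR. It shows why a check only needs to test one bit once the "cleared" state keeps the "freeing" bit set.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define DEMO_FREEING (1u << 0)
#define DEMO_CLEAR   (1u << 1)

/* Old scheme: the cleared state replaces FREEING with CLEAR. */
static inline unsigned cleared_state_old(void)
{
    return DEMO_CLEAR;
}

/* New scheme: the cleared state keeps FREEING and adds CLEAR. */
static inline unsigned cleared_state_new(void)
{
    return DEMO_FREEING | DEMO_CLEAR;
}

/* Old check: "inode is going away" has to test both bits. */
static inline bool going_away_old(unsigned state)
{
    return (state & (DEMO_FREEING | DEMO_CLEAR)) != 0;
}

/* New check: CLEAR never appears without FREEING, so one bit suffices. */
static inline bool going_away_new(unsigned state)
{
    return (state & DEMO_FREEING) != 0;
}
```

Only code that needs to know whether the clear step has already run still has to look at the `DEMO_CLEAR` bit; every other caller can use the single-bit check.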
    • C
      xfs: new truncate sequence · fa9b227e
      Christoph Hellwig committed
      Convert XFS to the new truncate sequence.  We still can have errors after
      updating the file size in xfs_setattr, but these are real I/O errors and lead
      to a transaction abort and filesystem shutdown, so they are not an issue.
      
      Errors from ->write_begin and ->write_end can now be handled correctly because
      we can actually get rid of the delalloc extents, while previously the buffer
      state was stripped in block_invalidatepage.
      
      There is still no error handling for ->direct_IO, because doing so will need
      some major restructuring given that we only have the iolock shared and do not
      hold i_mutex at all.  Fortunately leaving the normally allocated blocks behind
      there is not a major issue and this will get cleaned up by xfs_free_eofblocks
      later.
      
      Note: the patch is against Al's vfs.git tree as that contains the necessary
      preparations.  I'd prefer to get it applied there so that we can get some
      testing in linux-next.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      fa9b227e
    • C
      get rid of block_write_begin_newtrunc · 155130a4
      Christoph Hellwig committed
      Move the call to vmtruncate to get rid of excessive blocks to the callers
      in preparation for the new truncate sequence and rename the non-truncating
      version to block_write_begin.
      
      While we're at it also remove several unused arguments to block_write_begin.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      155130a4
    • C
      sort out blockdev_direct_IO variants · eafdc7d1
      Christoph Hellwig committed
      Move the call to vmtruncate to get rid of excessive blocks to the callers
      in preparation for the new truncate calling sequence.  This was only done
      for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant
      was not needed anyway.  Get rid of blockdev_direct_IO_no_locking and
      its _newtrunc variant while at it as just opencoding the two additional
      parameters is shorter than the name suffix.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      eafdc7d1
  2. 27 Jul, 2010 35 commits
    • C
      direct-io: move aio_complete into ->end_io · 552ef802
      Christoph Hellwig committed
      Filesystems with unwritten extent support must not complete an AIO request
      until the transaction to convert the extent has been committed.  That means
      the aio_complete calls need to be moved into the ->end_io callback so
      that the filesystem can control when to call it exactly.
      
      This makes a bit of a mess out of dio_complete and makes the ->end_io
      callback prototype even more complicated.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      552ef802
    • C
      xfs: simplify and speed up direct I/O completions · 209fb87a
      Christoph Hellwig committed
      Our current handling of direct I/O completions is rather suboptimal,
      because we defer it to a workqueue more often than needed, and we
      perform a much too aggressive flush of the workqueue in case unwritten
      extent conversions happen.
      
      This patch changes the direct I/O reads to not even use a completion
      handler, as we don't bother to use it at all, and to perform the unwritten
      extent conversions in caller context for synchronous direct I/O.
      
      For a small I/O size direct I/O workload on a consumer grade SSD, such as
      the untar of a kernel tree inside qemu, this patch gives speedups of
      about 5%, getting us much closer to the speed of a native block device
      or a fully allocated XFS file.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      209fb87a
    • C
      xfs: move aio completion after unwritten extent conversion · fb511f21
      Christoph Hellwig committed
      If we write into an unwritten extent using AIO we need to complete the AIO
      request after the extent conversion has finished.  Without that a read could
      race to see the extent still unwritten and return zeros.  For synchronous
      I/O we already take care of that by flushing the xfsconvertd workqueue (which
      might be a bit of overkill).
      
      To do that add iocb and result fields to struct xfs_ioend, so that we can
      call aio_complete from xfs_end_io after the extent conversion has happened.
      Note that we need a new result field as io_error is used for positive errno
      values, while the AIO code can return negative error values and positive
      transfer sizes.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      fb511f21
    • C
      direct-io: move aio_complete into ->end_io · 40e2e973
      Christoph Hellwig committed
      Filesystems with unwritten extent support must not complete an AIO request
      until the transaction to convert the extent has been committed.  That means
      the aio_complete calls need to be moved into the ->end_io callback so
      that the filesystem can control when to call it exactly.
      
      This makes a bit of a mess out of dio_complete and makes the ->end_io
      callback prototype even more complicated.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      40e2e973
    • D
      xfs: fix big endian build · 696123fc
      Dave Chinner committed
      Commit 0fd7275cc42ab734eaa1a2c747e65479bd1e42af ("xfs: fix gcc 4.6
      set but not read and unused statement warnings") failed to convert
      some code inside XFS_NATIVE_HOST (big endian host code only) and
      hence fails to build on such machines. Fix it.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      696123fc
    • C
      xfs: clean up xfs_bmap_get_bp · ecd7f082
      Christoph Hellwig committed
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      ecd7f082
    • C
      xfs: simplify xfs_truncate_file · 5d18898b
      Christoph Hellwig committed
      xfs_truncate_file is only used for truncating quota files.  Move it to
      xfs_qm_syscalls.c so it can be marked static and take advantage of the
      fact by removing the unused page cache validation and taking the iget
      into the helper.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      5d18898b
    • C
      xfs: kill the b_strat callback in xfs_buf · 939d723b
      Christoph Hellwig committed
      The b_strat callback is used by xfs_buf_iostrategy to perform additional
      checks before submitting a buffer.  It is used in xfs_bwrite and when
      writing out delayed buffers.  In xfs_bwrite we can de-virtualize the
      call easily as b_strat is set a few lines above the call to
      xfs_buf_iostrategy.  For the delayed buffers the rationale is a bit
      more complicated:
      
       - there are three callers of xfs_buf_delwri_queue, which places buffers
         on the delwri list:
          (1) xfs_bdwrite - this sets up b_strat, so it's fine
          (2) xfs_buf_iorequest.  None of the callers can have XBF_DELWRI set:
      	- xlog_bdstrat is only used for log buffers, which are never delwri
      	- _xfs_buf_read explicitly clears the delwri flag
      	- xfs_buf_iodone_work retries log buffers only
      	- xfsbdstrat - only used for reads, superblock writes without the
      	  delwri flag, log I/O and file zeroing with explicitly allocated
      	  buffers.
      	- xfs_buf_iostrategy - only calls xfs_buf_iorequest if b_strat is
      	  not set
          (3) xfs_buf_unlock
      	- only puts the buffer on the delwri list if the DELWRI flag is
      	  already set.  The DELWRI flag is only ever set in xfs_bwrite,
      	  xfs_buf_iodone_callbacks, or xfs_trans_log_buf.  For
      	  xfs_buf_iodone_callbacks and xfs_trans_log_buf we require
      	  an initialized buf item, which means b_strat was set to
      	  xfs_bdstrat_cb in xfs_buf_item_init.
      
      Conclusion: we can just get rid of the callback and replace it with
      explicit calls to xfs_bdstrat_cb.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      939d723b
    • C
      xfs: remove obsolete osyncisosync mount option · a64afb05
      Christoph Hellwig committed
      Since Linux 2.6.33 the kernel has support for real O_SYNC, which made
      the osyncisosync option a no-op.  Warn the users about this and remove
      the mount flag for it.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      a64afb05
    • C
      xfs: clean up filestreams helpers · 0664ce8d
      Christoph Hellwig committed
      Move xfs_filestream_peek_ag, xfs_filestream_get_ag and xfs_filestream_put_ag
      from xfs_filestream.h to xfs_filestream.c where their only callers are, and
      remove the inline marker while we're at it to let the compiler decide on the
      inlining.  Also don't return a value from xfs_filestream_put_ag because
      we don't need it.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      0664ce8d
    • C
      xfs: fix gcc 4.6 set but not read and unused statement warnings · 73523a2e
      Christoph Hellwig committed
      [hch: dropped a few hunks that need structural changes instead]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      73523a2e
    • T
      xfs: Fix build when CONFIG_XFS_POSIX_ACL=n · 0f1a932f
      Tony Luck committed
      When CONFIG_XFS_POSIX_ACL is not set "xfs_check_acl" is #defined
      to NULL - which breaks the code attempting to add a tracepoint
      on this function.
      
      Only define the tracepoint when the function exists.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      0f1a932f
    • K
      xfs: fix unsigned underflow in xfs_free_eofblocks · 3f34885c
      Kulikov Vasiliy committed
      map_len is unsigned, so the check map_len <= 0 is buggy: it can never
      catch a value that should be below zero.  So, check the exact expression
      instead of map_len.
      Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      3f34885c
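The class of bug fixed in 3f34885c can be reproduced in a few lines of standalone C. This is a sketch under assumptions, not the XFS code: the function names and the `last`/`end` parameters are hypothetical, chosen only to show how an unsigned difference wraps around instead of going negative, and how comparing the operands directly avoids the problem.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Buggy pattern: the difference is stored in an unsigned variable, so when
 * last < end it wraps to a huge positive value and "<= 0" is only ever
 * true for an exact zero.
 */
static inline bool nothing_to_free_buggy(uint64_t last, uint64_t end)
{
    uint64_t map_len = last - end;

    return map_len <= 0;
}

/* Fixed pattern: compare the operands themselves before subtracting. */
static inline bool nothing_to_free_fixed(uint64_t last, uint64_t end)
{
    return last <= end;
}
```

With `last = 5` and `end = 10` the buggy variant reports that there is something to free, while the fixed variant correctly reports that there is nothing.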
    • D
      xfs: use GFP_NOFS for page cache allocation · aea1b953
      Dave Chinner committed
      Avoid a lockdep warning by preventing page cache allocation from
      recursing back into the filesystem during memory reclaim.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      aea1b953
    • D
      xfs: fix memory reclaim recursion deadlock on locked inode buffer · 4a7edddc
      Dave Chinner committed
      Calling into memory reclaim with a locked inode buffer can deadlock
      if memory reclaim tries to lock the inode buffer during inode
      teardown. Convert the relevant memory allocations to use KM_NOFS to
      avoid this deadlock condition.
      Reported-by: Peter Watkins <treestem@gmail.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      4a7edddc
    • D
      xfs: fix xfs_trans_add_item() lockdep warnings · 43869706
      Dave Chinner committed
      xfs_trans_add_item() is called with ip->i_ilock held, which means it
      is unsafe for memory reclaim to recurse back into the filesystem
      (ilock is required in writeback). Hence the allocation needs to be
      KM_NOFS to avoid recursion.
      
      Lockdep report indicating memory allocation being called with the
      ip->i_ilock held is as follows:
      
      [ 1749.866796] =================================
      [ 1749.867788] [ INFO: inconsistent lock state ]
      [ 1749.868327] 2.6.35-rc3-dgc+ #25
      [ 1749.868741] ---------------------------------
      [ 1749.868741] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
      [ 1749.868741] dd/2835 [HC0[0]:SC0[0]:HE1:SE1] takes:
      [ 1749.868741]  (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
      [ 1749.868741] {IN-RECLAIM_FS-W} state was registered at:
      [ 1749.868741]   [<ffffffff810b3a97>] __lock_acquire+0x437/0x1450
      [ 1749.868741]   [<ffffffff810b4b56>] lock_acquire+0xa6/0x160
      [ 1749.868741]   [<ffffffff810a20b5>] down_write_nested+0x65/0xb0
      [ 1749.868741]   [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
      [ 1749.868741]   [<ffffffff8134e819>] xfs_reclaim_inode+0x99/0x310
      [ 1749.868741]   [<ffffffff8134f56b>] xfs_inode_ag_walk+0x8b/0x150
      [ 1749.868741]   [<ffffffff8134f6bb>] xfs_inode_ag_iterator+0x8b/0xf0
      [ 1749.868741]   [<ffffffff8134f7a8>] xfs_reclaim_inode_shrink+0x88/0x90
      [ 1749.868741]   [<ffffffff81119d07>] shrink_slab+0x137/0x1a0
      [ 1749.868741]   [<ffffffff8111bbe1>] balance_pgdat+0x421/0x6a0
      [ 1749.868741]   [<ffffffff8111bf7d>] kswapd+0x11d/0x320
      [ 1749.868741]   [<ffffffff8109ce56>] kthread+0x96/0xa0
      [ 1749.868741]   [<ffffffff81035de4>] kernel_thread_helper+0x4/0x10
      [ 1749.868741] irq event stamp: 4234335
      [ 1749.868741] hardirqs last  enabled at (4234335): [<ffffffff81147d25>] kmem_cache_free+0x115/0x220
      [ 1749.868741] hardirqs last disabled at (4234334): [<ffffffff81147c4d>] kmem_cache_free+0x3d/0x220
      [ 1749.868741] softirqs last  enabled at (4233112): [<ffffffff81084dd2>] __do_softirq+0x142/0x260
      [ 1749.868741] softirqs last disabled at (4233095): [<ffffffff81035edc>] call_softirq+0x1c/0x50
      [ 1749.868741] 
      [ 1749.868741] other info that might help us debug this:
      [ 1749.868741] 2 locks held by dd/2835:
      [ 1749.868741]  #0:  (&(&ip->i_iolock)->mr_lock#2){+.+.+.}, at: [<ffffffff81316edd>] xfs_ilock_nowait+0xed/0x200
      [ 1749.868741]  #1:  (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
      [ 1749.868741] 
      [ 1749.868741] stack backtrace:
      [ 1749.868741] Pid: 2835, comm: dd Not tainted 2.6.35-rc3-dgc+ #25
      [ 1749.868741] Call Trace:
      [ 1749.868741]  [<ffffffff810b1faa>] print_usage_bug+0x18a/0x190
      [ 1749.868741]  [<ffffffff8104264f>] ? save_stack_trace+0x2f/0x50
      [ 1749.868741]  [<ffffffff810b2400>] ? check_usage_backwards+0x0/0xf0
      [ 1749.868741]  [<ffffffff810b2f11>] mark_lock+0x331/0x400
      [ 1749.868741]  [<ffffffff810b3047>] mark_held_locks+0x67/0x90
      [ 1749.868741]  [<ffffffff810b3111>] lockdep_trace_alloc+0xa1/0xe0
      [ 1749.868741]  [<ffffffff81147419>] kmem_cache_alloc+0x39/0x1e0
      [ 1749.868741]  [<ffffffff8133f954>] kmem_zone_alloc+0x94/0xe0
      [ 1749.868741]  [<ffffffff8133f9be>] kmem_zone_zalloc+0x1e/0x50
      [ 1749.868741]  [<ffffffff81335f02>] xfs_trans_add_item+0x72/0xb0
      [ 1749.868741]  [<ffffffff81339e41>] xfs_trans_ijoin+0xa1/0xd0
      [ 1749.868741]  [<ffffffff81319f82>] xfs_itruncate_finish+0x312/0x5d0
      [ 1749.868741]  [<ffffffff8133cb87>] xfs_free_eofblocks+0x227/0x280
      [ 1749.868741]  [<ffffffff8133cd18>] xfs_release+0x138/0x190
      [ 1749.868741]  [<ffffffff813464c5>] xfs_file_release+0x15/0x20
      [ 1749.868741]  [<ffffffff81150ebf>] fput+0x13f/0x260
      [ 1749.868741]  [<ffffffff8114d8c2>] filp_close+0x52/0x80
      [ 1749.868741]  [<ffffffff8114d9a9>] sys_close+0xb9/0x120
      [ 1749.868741]  [<ffffffff81034ff2>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      43869706
    • D
      xfs: simplify and remove xfs_ireclaim · 2f11feab
      Dave Chinner committed
      xfs_ireclaim has to get and put the pag structure because it is only
      called with the inode to reclaim. The one caller of this function
      already has a reference on the pag and a pointer to it, so move the
      radix tree delete to the caller and remove xfs_ireclaim completely.
      This avoids a xfs_perag_get/put on every inode being reclaimed.
      
      The overhead was noticed in a bug report at:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=16348
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      2f11feab
    • D
      xfs: don't block on buffer read errors · ec53d1db
      Dave Chinner committed
      xfs_buf_read() fails to detect dispatch errors before attempting to
      wait on synchronous IO. If there was an error, it will get stuck
      forever, waiting for an I/O that was never started. Make sure the
      error is detected correctly.
      
      Further, such a failure can leave locked pages in the page cache
      which will cause a later operation to hang on the page. Ensure that
      we correctly process pages in the buffers when we get a dispatch
      error.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      ec53d1db
    • D
      xfs: move inode shrinker unregister even earlier · a4190f90
      Dave Chinner committed
      I missed Dave Chinner's second revision of this change, and pushed
      his first version out to the repository instead.
      
      	commit a476c59ebb279d738718edc0e3fb76aab3687114
      	Author: Dave Chinner <dchinner@redhat.com>
      
      This commit compensates for that by moving a block of code up a bit
      further, with a result that matches the effect of Dave's second
      version.
      
      Dave's first version was:
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Dave's second version was:
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      a4190f90
    • C
      xfs: remove a dmapi leftover · fa17b25e
      Christoph Hellwig committed
      The open_exec file operation is only added by the external dmapi
      patch.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      fa17b25e
    • C
      xfs: writepage always has buffers · 78558fe8
      Christoph Hellwig committed
      These days we always have buffers thanks to ->page_mkwrite.  And we
      already have an assert a few lines above tripping in case that was
      not true due to a bug.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      78558fe8
    • C
      xfs: allow writeback from kswapd · d4f7a5cb
      Christoph Hellwig committed
      We only need to disable I/O from direct or memcg reclaim.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      d4f7a5cb
    • C
      xfs: remove incorrect log write optimization · 651701d7
      Christoph Hellwig committed
      We do need a barrier for the first buffer of a split log write.
      Otherwise we might incorrectly stamp the tail LSN into transactions
      in the first part of the split write, or not flush data I/O before
      updating the inode size.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      651701d7
    • D
      xfs: unregister inode shrinker before freeing filesystem structures · 2727ccc9
      Dave Chinner committed
      Currently we don't remove the XFS mount from the shrinker list until
      late in the unmount path. By this time, we have already torn down
      the internals of the filesystem (e.g. the per-ag structures), and
      hence if the shrinker is executed between the teardown and the
      unregistering, the shrinker will get NULL per-ag structure pointers
      and panic trying to dereference them.
      
      Fix this by removing the xfs mount from the shrinker list before
      tearing down its internal structures.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      2727ccc9
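The ordering problem described in 2727ccc9 can be modelled with a toy user-space sketch. Everything here is hypothetical (the `demo_mount` structure, its fields and the helper names are illustrative assumptions, and no real shrinker API is used); it only shows why the mount has to leave the shrinker list before its per-AG data goes away.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a mounted filesystem; not a kernel structure. */
struct demo_mount {
    int  *perag;       /* stands in for the per-AG structures */
    bool  registered;  /* true while the mount is on the shrinker list */
};

/*
 * What a shrinker pass would do with this mount:
 *   0  = mount not registered, nothing is touched
 *   1  = per-AG data present, safe to scan
 *  -1  = mount registered but per-AG data gone (would dereference NULL)
 */
static inline int demo_shrinker_pass(const struct demo_mount *m)
{
    if (!m->registered)
        return 0;
    return m->perag != NULL ? 1 : -1;
}

/* The window in the old ordering: internals torn down, still registered. */
static inline void demo_teardown_only(struct demo_mount *m)
{
    m->perag = NULL;
}

/* Fixed ordering: leave the shrinker list first, then tear down. */
static inline void demo_unmount_fixed(struct demo_mount *m)
{
    m->registered = false;
    m->perag = NULL;
}
```

A pass that runs after `demo_teardown_only()` returns -1, which corresponds to the panic in the commit message, while a pass after `demo_unmount_fixed()` returns 0 because the mount is no longer visible to it.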
    • C
      xfs: split xfs_itrace_entry · cca28fb8
      Christoph Hellwig committed
      Replace the xfs_itrace_entry catchall with specific trace points.  For
      most simple callers we now use the simple inode class, which used to
      be the iget class, but add more detailed tracing for namespace events,
      which now includes the name of the directory entries manipulated.
      
      Remove the xfs_inactive trace point, which is a duplicate of the clear_inode
      one, and the xfs_change_file_space trace point, which is immediately
      followed by the more specific alloc/free space trace points.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      cca28fb8
    • C
      xfs: remove xfs_iput · f2d67614
      Christoph Hellwig committed
      xfs_iput is just a small wrapper for xfs_iunlock + IRELE.  Having this
      out of line wrapper means the trace events in those two can't track
      their caller properly.  So just remove the wrapper and opencode the
      unlock + rele in the few callers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      f2d67614
    • C
      xfs: remove xfs_iput_new · ef35e925
      Christoph Hellwig committed
      We never get an i_mode of 0 or a locked VFS inode unless we pass in the
      XFS_IGET_CREATE flag to xfs_iget, which makes xfs_iput_new equivalent to
      xfs_iput for the only caller.  In addition to that xfs_nfs_get_inode
      does not even need to lock the inode given that the generation never changes
      for a live inode, so just pass a 0 lock_flags to xfs_iget and release
      the inode using IRELE in the error path.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      ef35e925
    • C
      xfs: some iget tracing cleanups / fixes · d2e078c3
      Christoph Hellwig committed
      The xfs_iget_alloc/found tracepoints are a bit misnamed and misplaced.
      Rename them to xfs_iget_hit/xfs_iget_miss and move them to the beginning
      of the xfs_iget_cache_hit/miss functions.  Add a new xfs_iget_reclaim_fail
      tracepoint for the case where we fail to re-initialize a VFS inode,
      and add a second instance of the xfs_iget_skip tracepoint for the case
      of a failed igrab() call.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      d2e078c3
    • C
      xfs: do not use enums for flags used in tracing · 807cbbdb
      Christoph Hellwig committed
      The tracing code can't print flags defined as enums.  Most flags that
      we want to print are defined as macros already, but move the few remaining
      ones over to make the trace output more useful.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      807cbbdb
    • C
      xfs: remove explicit xfs_sync_data/xfs_sync_attr calls on umount · 64c86149
      Christoph Hellwig committed
      On the final put of a superblock the VFS already calls sync_filesystem
      for us to write out all data and wait for it.  No need to start another
      asynchronous writeback inside ->put_super.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      64c86149
    • C
      xfs: small cleanups for xfs_iomap / __xfs_get_blocks · f2bde9b8
      Christoph Hellwig committed
      Remove the flags argument to __xfs_get_blocks as we can easily derive
      it from the direct argument, and remove the unused BMAPI_MMAP flag.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      f2bde9b8
    • C
      xfs: reduce stack usage in xfs_iomap · 3070451e
      Christoph Hellwig committed
      xfs_iomap passes a xfs_bmbt_irec pointer to xfs_iomap_write_direct and
      xfs_iomap_write_allocate to give them the results of our read-only
      xfs_bmapi query.  Instead of allocating a new xfs_bmbt_irec on the stack
      for the next call to xfs_bmapi, reuse the one we got passed as it's not
      used after this point.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      3070451e
    • C
      xfs: avoid synchronous transaction in xfs_fs_write_inode · 7a36c8a9
      Christoph Hellwig committed
      We already rely on the fact that the sync code will cause a synchronous
      log force later on (currently via xfs_fs_sync_fs -> xfs_quiesce_data ->
      xfs_sync_data), so no need to do this here.  This allows us to avoid
      a lot of synchronous log forces during sync, which pays off especially
      with delayed logging enabled.  Some compilebench numbers that show
      this:
      
      xfs (delayed logging, 256k logbufs)
      ===================================
      
      intial create		  25.94 MB/s	  25.75 MB/s	  25.64 MB/s
      create			   8.54 MB/s	   9.12 MB/s	   9.15 MB/s
      patch			   2.47 MB/s	   2.47 MB/s	   3.17 MB/s
      compile			  29.65 MB/s	  30.51 MB/s	  27.33 MB/s
      clean			  90.92 MB/s	  98.83 MB/s	 128.87 MB/s
      read tree		  11.90 MB/s	  11.84 MB/s	   8.56 MB/s
      read compiled		  28.75 MB/s	  29.96 MB/s	  24.25 MB/s
      delete tree		8.39 seconds	8.12 seconds	8.46 seconds
      delete compiled		8.35 seconds	8.44 seconds	5.11 seconds
      stat tree		6.03 seconds	5.59 seconds	5.19 seconds
      stat compiled tree	9.00 seconds	9.52 seconds	8.49 seconds
      
      xfs + write_inode log_force removal
      ===================================
      intial create		  25.87 MB/s	  25.76 MB/s	  25.87 MB/s
      create			  15.18 MB/s	  14.80 MB/s	  14.94 MB/s
      patch			   3.13 MB/s	   3.14 MB/s	   3.11 MB/s
      compile			  36.74 MB/s	  37.17 MB/s	  36.84 MB/s
      clean			 226.02 MB/s	 222.58 MB/s	 217.94 MB/s
      read tree		  15.14 MB/s	  15.02 MB/s	  15.14 MB/s
      read compiled tree	  29.30 MB/s	  29.31 MB/s	  29.32 MB/s
      delete tree		6.22 seconds	6.14 seconds	6.15 seconds
      delete compiled tree	5.75 seconds	5.92 seconds	5.81 seconds
      stat tree		4.60 seconds	4.51 seconds	4.56 seconds
      stat compiled tree	4.07 seconds	3.87 seconds	3.96 seconds
      
      In addition to that also remove the delwri inode flush that is unnecessary
      now that bulkstat is always coherent.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      7a36c8a9
    • C
      xfs: simplify xfs_vm_writepage · 20cb52eb
      Christoph Hellwig committed
      The writepage implementation in XFS still tries to deal with dirty but
      unmapped buffers which used to be caused by writes through shared mmaps.  Since
      the introduction of ->page_mkwrite these can't happen anymore, so remove the
      code dealing with them.
      
      Note that the all_bh variable which causes us to start I/O on all buffers on
      the pages was controlled by the count of unmapped buffers, which also
      included those not actually dirty.  It's now unconditionally initialized to
      0 but set to 1 for the case of small file size extensions.  It probably can
      be removed entirely, but that's left for another patch.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      20cb52eb
    • C
      xfs: simplify xfs_vm_releasepage · 89f3b363
      Christoph Hellwig committed
      Currently the xfs releasepage implementation has code to deal with converting
      delayed allocated and unwritten space.  But we never get called for those as
      we always convert delayed and unwritten space when cleaning a page, or drop
      the state from the buffers in block_invalidatepage.  We still keep a WARN_ON
      on those cases for now, but remove all the code dealing with them, which allows
      us to fold xfs_page_state_convert into xfs_vm_writepage and remove the !startio
      case from the whole writeback path.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      89f3b363