1. 12 Oct 2011 (2 commits)
    • xfs: don't serialise adjacent concurrent direct IO appending writes · 7271d243
      Authored by Dave Chinner
      For append write workloads, extending the file requires a certain
      amount of exclusive locking to be done up front to ensure sanity in
      things like ensuring that we've zeroed any allocated regions
      between the old EOF and the start of the new IO.
      
      For single threads, this typically isn't a problem, and for large
      IOs we don't serialise enough for it to be a problem for two
      threads on really fast block devices. However for smaller IO and
      larger thread counts we have a problem.
      
      Take 4 concurrent sequential, single block sized and aligned IOs.
      After the first IO is submitted but before it completes, we end up
      with this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^       ^
            |       |
            |       |
            |       |
            |       \- ip->i_new_size
            \- ip->i_size
      
      And the IO is done without exclusive locking because offset <=
      ip->i_size. When we submit IO 2, we see offset > ip->i_size, and
      grab the IO lock exclusive, because there is a chance we need to do
      EOF zeroing. However, there is already an IO in progress that avoids
      the need for EOF zeroing because offset <= ip->i_new_size, so we
      could avoid holding the IO lock exclusive for this. Hence after
      submission of the second IO, we'd end up in this state:
      
              IO 1    IO 2    IO 3    IO 4
            +-------+-------+-------+-------+
            ^               ^
            |               |
            |               |
            |               |
            |               \- ip->i_new_size
            \- ip->i_size
      
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the
      IO lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when IO is complete, so effectively it
      prevents dispatch of concurrent direct IO writes to the same inode.
      
      And so you can see that for the third concurrent IO, we could avoid
      exclusive locking for the same reason we could have avoided it for
      the second IO.
      
      Fixing this is a bit more complex than that, because we need to hold
      a write-submission local value of ip->i_new_size so that clearing
      the value is only done if no other thread has updated it before our
      IO completes.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
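      The last paragraph of that message is the subtle part: each appending write
      has to remember the ip->i_new_size value it published at submission, and at
      completion may only clear the field if no later write has raised it further.
      Below is a minimal, self-contained model of that compare-and-clear idea. It
      is a sketch, not the patch: the names (model_inode, submit_append,
      complete_append) are invented, and C11 atomics stand in for the inode
      locking the real XFS code uses to protect i_size and i_new_size.

          #include <stdatomic.h>
          #include <stdint.h>
          #include <stdio.h>

          struct model_inode {
              _Atomic uint64_t i_size;      /* visible EOF */
              _Atomic uint64_t i_new_size;  /* highest in-flight write end, 0 = none */
          };

          /* At submission: publish our new in-flight EOF, remember what we wrote. */
          static uint64_t submit_append(struct model_inode *ip, uint64_t new_end)
          {
              uint64_t prev = atomic_load(&ip->i_new_size);

              /* Only ever raise i_new_size; retry if another writer races us. */
              while (prev < new_end &&
                     !atomic_compare_exchange_weak(&ip->i_new_size, &prev, new_end))
                  ;
              return new_end;               /* the write-submission local value */
          }

          /* At completion: move i_size forward, then clear i_new_size only if
           * no other thread has updated it since our submission. */
          static void complete_append(struct model_inode *ip, uint64_t my_new_size)
          {
              uint64_t cur = atomic_load(&ip->i_size);

              while (cur < my_new_size &&
                     !atomic_compare_exchange_weak(&ip->i_size, &cur, my_new_size))
                  ;
              atomic_compare_exchange_strong(&ip->i_new_size, &my_new_size, 0);
          }

          int main(void)
          {
              struct model_inode ip = { 0, 0 };
              uint64_t io1 = submit_append(&ip, 4096);
              uint64_t io2 = submit_append(&ip, 8192);

              complete_append(&ip, io1);    /* must not clear: IO 2 raised it */
              printf("after IO 1: i_new_size = %llu\n",
                     (unsigned long long)atomic_load(&ip.i_new_size));
              complete_append(&ip, io2);    /* last completer clears it */
              printf("after IO 2: i_new_size = %llu\n",
                     (unsigned long long)atomic_load(&ip.i_new_size));
              return 0;
          }

      With this rule, the write that finishes last is the one that clears
      i_new_size, which is exactly why an earlier completion must not wipe the
      value another in-flight write is still relying on.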
    • xfs: don't serialise direct IO reads on page cache checks · 0c38a251
      Authored by Dave Chinner
      There is no need to grab the i_mutex or the IO lock in exclusive
      mode if we don't need to invalidate the page cache. Taking these
      locks on every direct IO effectively serialises them, as taking the
      IO lock in exclusive mode has to wait for all shared holders to drop
      the lock. That only happens when IO is complete, so effectively it
      prevents dispatch of concurrent direct IO reads to the same inode.
      
      Fix this by taking the IO lock shared to check the page cache state,
      and only then drop it and take the IO lock exclusively if there is
      work to be done. Hence for the normal direct IO case, no exclusive
      locking will occur.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Joern Engel <joern@logfs.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
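      The fix described above is a common locking pattern: check state under a
      shared lock, and fall back to the exclusive lock only when there is real
      work to do. The sketch below models it with a POSIX rwlock standing in for
      the XFS IO lock; the names (model_inode, dio_read, cached_pages) are
      invented for illustration and are not the functions the patch touches.

          #include <pthread.h>
          #include <stdbool.h>
          #include <stdio.h>

          struct model_inode {
              pthread_rwlock_t iolock;       /* stands in for the XFS IO lock */
              bool             cached_pages; /* stands in for mapping->nrpages != 0 */
          };

          static void dio_read(struct model_inode *ip)
          {
              /* Common case: the shared lock is enough to check the page cache. */
              pthread_rwlock_rdlock(&ip->iolock);

              if (ip->cached_pages) {
                  /* Rare case: drop the shared lock and retake it exclusively so
                   * the cached pages can be flushed and invalidated.  Real code
                   * must recheck the state here, since it may have changed while
                   * no lock was held. */
                  pthread_rwlock_unlock(&ip->iolock);
                  pthread_rwlock_wrlock(&ip->iolock);
                  ip->cached_pages = false;  /* stands in for the invalidation */
              }

              /* ... submit the direct IO, then drop whichever lock is held. */
              pthread_rwlock_unlock(&ip->iolock);
          }

          int main(void)
          {
              struct model_inode ip = { .cached_pages = false };

              pthread_rwlock_init(&ip.iolock, NULL);
              dio_read(&ip);                 /* stays on the shared lock */
              ip.cached_pages = true;
              dio_read(&ip);                 /* takes the exclusive lock once */
              pthread_rwlock_destroy(&ip.iolock);
              return 0;
          }

      Because concurrent readers can all hold the shared lock at once, the common
      case (no cached pages) no longer serialises direct IO reads against each
      other.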
  2. 14 Sep 2011 (1 commit)
  3. 01 Sep 2011 (2 commits)
    • xfs: fix ->write_inode return values · 58d84c4e
      Authored by Christoph Hellwig
      Currently we always redirty an inode that was attempted to be written out
      synchronously but has already been cleaned by an AIL push internally,
      which is rather bogus.  Fix that by doing the i_update_core check early
      on and returning 0 for it.  Also do the same for async calls, as doing
      any work for those is just as pointless.  While we're at it, also fix
      the sign of the EIO return in case of a filesystem shutdown, and fix the
      completely nonsensical locking around xfs_log_inode.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      (cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97)
      Signed-off-by: Alex Elder <aelder@sgi.com>
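      A compact model of the control flow that message describes follows. It is
      an interpretation of the text, not the patch itself: the helper and field
      names (model_write_inode, update_core, fs_shut_down) are invented, and only
      the decisions mirror the message, namely return a negative errno on
      shutdown, and return 0 early for a clean in-core inode and for async
      writeback, where doing any work is pointless.

          #include <errno.h>
          #include <stdbool.h>
          #include <stdio.h>

          enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

          struct model_inode {
              bool update_core;   /* models i_update_core: in-core inode changed */
              bool fs_shut_down;  /* models a forced filesystem shutdown */
          };

          /* Returns 0 or a negative errno, the convention the commit fixes. */
          static int model_write_inode(const struct model_inode *ip, enum sync_mode mode)
          {
              if (ip->fs_shut_down)
                  return -EIO;            /* negative, not positive, EIO */

              /* Clean in-core inode: nothing to write back, don't redirty it. */
              if (!ip->update_core)
                  return 0;

              /* Async writeback: doing any work here is just as pointless. */
              if (mode != WB_SYNC_ALL)
                  return 0;

              /* Sync case: this is where the inode would be logged/flushed. */
              return 0;
          }

          int main(void)
          {
              struct model_inode clean = { .update_core = false };
              struct model_inode dead  = { .update_core = true, .fs_shut_down = true };

              printf("clean inode, sync : %d\n", model_write_inode(&clean, WB_SYNC_ALL));
              printf("shut-down fs      : %d\n", model_write_inode(&dead, WB_SYNC_ALL));
              return 0;
          }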
    • xfs: fix xfs_mark_inode_dirty during umount · 866e4ed7
      Authored by Christoph Hellwig
      During umount we do not add a dirty inode to the lru and wait for it to
      become clean first, but force writeback of data and metadata with
      I_WILL_FREE set.  Currently there is no way for XFS to detect that the
      inode has been redirtied for metadata operations, as we skip the
      mark_inode_dirty call during teardown.  Fix this by setting i_update_core
      manually in that case, so that the inode gets flushed during inode reclaim.
      
      Alternatively we could enable calling mark_inode_dirty for inodes in
      I_WILL_FREE state, and let the VFS dirty tracking handle this.  I decided
      against this as we will get better I/O patterns from reclaim compared to
      the synchronous writeout in write_inode_now, and always marking the inode
      dirty in some way from xfs_mark_inode_dirty is a better safety net in
      either case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      (cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856)
      Signed-off-by: Alex Elder <aelder@sgi.com>
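      A small model of the behaviour that message describes follows. It is a
      sketch of the idea rather than the patch: the structure and helper names
      are invented, I_WILL_FREE here is a local stand-in for the VFS flag of the
      same name, and the point is simply that the XFS-side dirty flag is always
      recorded while the VFS dirtying is skipped for an inode about to be freed.

          #include <stdbool.h>
          #include <stdio.h>

          #define I_WILL_FREE (1u << 0)      /* local stand-in for the VFS flag */

          struct model_inode {
              unsigned int i_state;          /* models inode->i_state */
              bool         i_update_core;    /* models the XFS in-core dirty flag */
              bool         vfs_dirty;        /* models VFS dirty-list membership */
          };

          static void model_mark_inode_dirty(struct model_inode *inode)
          {
              /* Always record that the in-core inode changed, so inode reclaim
               * will flush it even if the VFS never sees it as dirty. */
              inode->i_update_core = true;

              /* During the final umount writeback the inode is I_WILL_FREE and
               * will not go back on the dirty list, so skip the VFS dirtying. */
              if (inode->i_state & I_WILL_FREE)
                  return;

              inode->vfs_dirty = true;
          }

          int main(void)
          {
              struct model_inode live  = { 0 };
              struct model_inode dying = { .i_state = I_WILL_FREE };

              model_mark_inode_dirty(&live);
              model_mark_inode_dirty(&dying);
              printf("live : update_core=%d vfs_dirty=%d\n",
                     live.i_update_core, live.vfs_dirty);
              printf("dying: update_core=%d vfs_dirty=%d\n",
                     dying.i_update_core, dying.vfs_dirty);
              return 0;
          }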
  4. 25 Aug 2011 (1 commit)
  5. 23 Aug 2011 (1 commit)
  6. 13 Aug 2011 (4 commits)
  7. 11 Aug 2011 (1 commit)
  8. 10 Aug 2011 (1 commit)
  9. 01 Aug 2011 (3 commits)
  10. 30 Jul 2011 (1 commit)
  11. 27 Jul 2011 (6 commits)
  12. 26 Jul 2011 (17 commits)