1. 22 Oct, 2013 1 commit
  2. 13 Aug, 2013 2 commits
    • xfs: kill xfs_vnodeops.[ch] · c24b5dfa
      Dave Chinner committed
      Now we have xfs_inode.c for holding kernel-only XFS inode
      operations, move all the inode operations from xfs_vnodeops.c to
      this new file as it holds another set of kernel-only inode
      operations. The name of this file traces back to the days of Irix
      and its vnodes, which we no longer have.
      
      Essentially this move consolidates the inode locking functions
      and a bunch of XFS inode operations into the one file. Eventually
      the high level functions will be merged into the VFS interface
      functions in xfs_iops.c.
      
      This leaves only internal preallocation, EOF block manipulation and
      hole punching functions in vnodeops.c. Move these to xfs_bmap_util.c
      where we are already consolidating various in-kernel physical extent
      manipulation and querying functions.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: reshuffle dir2 definitions around for userspace · 2b9ab5ab
      Dave Chinner committed
      Many of the definitions within xfs_dir2_priv.h are needed in
      userspace outside libxfs. Definitions within xfs_dir2_priv.h are
      wholly contained within libxfs, so we need to shuffle some of the
      definitions around to keep consistency across files shared between
      user and kernel space.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  3. 03 Jul, 2013 1 commit
    • vfs: export lseek_execute() to modules · 46a1c2c7
      Jie Liu committed
      For those file systems (btrfs/ext4/ocfs2/tmpfs) that support
      SEEK_DATA/SEEK_HOLE, we end up duplicating the logic in
      lseek_execute() that updates the current file offset to the
      desired offset if it is valid. ceph does a similar thing in
      ceph_llseek().
      
      To reduce the duplication, this patch makes lseek_execute()
      publicly accessible so that we can call it directly from the
      underlying file systems.
      
      Thanks to Dave Chinner for this suggestion.
      
      [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]
      
      v1->v2:
      - Add kernel-doc comments for lseek_execute()
      - Call lseek_execute() in ceph->llseek()
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: Ted Tso <tytso@mit.edu>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Sage Weil <sage@inktank.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
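The offset check that the commit above exports can be sketched in userspace C. This is only a model of the behaviour the message describes (validate the offset, then update the position): `struct file_model`, `setpos_model` and the `unsigned_offsets` flag are hypothetical names chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct file. */
struct file_model {
    long long pos;          /* current file offset */
    unsigned long version;  /* reset whenever the offset changes */
    bool unsigned_offsets;  /* true if negative-looking offsets are legal */
};

/*
 * Validate a candidate offset and, if acceptable, make it the current
 * position. Returns the new offset on success or -EINVAL on failure,
 * leaving the position untouched in the failure case.
 */
long long setpos_model(struct file_model *f, long long offset,
                       long long maxsize)
{
    assert(f != NULL);

    if (offset < 0 && !f->unsigned_offsets)
        return -EINVAL;
    if (offset > maxsize)
        return -EINVAL;

    if (offset != f->pos) {
        f->pos = offset;
        f->version = 0;
    }
    return offset;
}
```

A file system's SEEK_DATA/SEEK_HOLE handler would compute the target offset itself and then hand it to a helper of this shape, instead of open-coding the same checks.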
  4. 29 Jun, 2013 1 commit
  5. 08 May, 2013 1 commit
  6. 28 Apr, 2013 1 commit
  7. 10 Apr, 2013 1 commit
  8. 23 Feb, 2013 1 commit
  9. 30 Nov, 2012 1 commit
  10. 16 Nov, 2012 2 commits
  11. 15 Nov, 2012 1 commit
  12. 18 Oct, 2012 1 commit
    • xfs: xfs_sync_data is redundant. · 9aa05000
      Dave Chinner committed
      We don't do any data writeback from XFS any more - the VFS is
      completely responsible for that, including for freeze. We can
      replace the remaining caller with a VFS level function that
      achieves the same thing, but without conflicting with current
      writeback work.
      
      This means we can remove the flush_work and xfs_flush_inodes() - the
      VFS functionality completely replaces the internal flush queue for
      doing this writeback work in a separate context to avoid stack
      overruns.
      
      This does have one complication - it cannot be called with page
      locks held.  Hence move the flushing of delalloc space when ENOSPC
      occurs back up into xfs_file_aio_buffered_write when we don't hold
      any locks that will stall writeback.
      
      Unfortunately, writeback_inodes_sb_if_idle() is not sufficient to
      trigger delalloc conversion fast enough to prevent spurious ENOSPC
      when there are hundreds of writers, thousands of small files and GBs
      of free RAM.  Hence we need to use sync_sb_inodes() to block callers
      while we wait for writeback like the previous xfs_flush_inodes
      implementation did.
      
      That means we have to hold the s_umount lock here, but because this
      call can nest inside i_mutex (the parent directory in the create
      case, held by the VFS), we have to use down_read_trylock() to avoid
      potential deadlocks. In practice, this trylock will succeed on
      almost every attempt as unmount/remount type operations are
      exceedingly rare.
      
      Note: we always need to pass a count of zero to
      generic_file_buffered_write() as the previously written byte count.
      Before this patch we only did this by accident, by virtue of ret
      always being zero when there are no errors. Make this explicit
      rather than needing to specifically zero ret in the ENOSPC retry
      case.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
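The flush-and-retry flow described in the commit above can be illustrated with a small userspace model. Everything here is an assumption made for the sketch: `struct fs_model`, the idea that a flush returns a fixed amount of over-reserved space, and the `umount_in_progress` flag standing in for a failed `down_read_trylock()` on `s_umount`. It shows only the control flow: try the write, flush once on ENOSPC without holding page locks, then retry exactly once.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a file system's space accounting. */
struct fs_model {
    long free_blocks;         /* blocks available for new reservations */
    long delalloc_slack;      /* space a writeback pass would give back */
    bool umount_in_progress;  /* models the trylock on s_umount failing */
    int flush_count;          /* how many flushes actually ran */
};

/* One write attempt: reserve the blocks or fail with -ENOSPC. */
long try_buffered_write(struct fs_model *fs, long blocks)
{
    if (blocks > fs->free_blocks)
        return -ENOSPC;
    fs->free_blocks -= blocks;
    return blocks;
}

/* Flush delalloc space, but only if the "trylock" succeeds. */
void flush_delalloc(struct fs_model *fs)
{
    if (fs->umount_in_progress)
        return;                /* trylock failed: skip rather than deadlock */
    fs->free_blocks += fs->delalloc_slack;
    fs->delalloc_slack = 0;
    fs->flush_count++;
}

/* Write path: on the first ENOSPC, flush and retry once. */
long buffered_write(struct fs_model *fs, long blocks)
{
    bool flushed = false;
    long ret;

    assert(fs != NULL);
    for (;;) {
        ret = try_buffered_write(fs, blocks);
        if (ret != -ENOSPC || flushed)
            return ret;
        flush_delalloc(fs);    /* called with no page locks held */
        flushed = true;
    }
}
```

Keeping the retry in the outermost write function mirrors the point made in the message: the flush has to run from a context that holds no locks capable of stalling writeback.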
  13. 09 Oct, 2012 1 commit
    • mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4
      Konstantin Khlebnikov committed
      Move actual pte filling for non-linear file mappings into the new special
      vma operation: ->remap_pages().
      
      Filesystems must implement this method to get non-linear mapping support;
      a filesystem that uses filemap_fault() can use generic_file_remap_pages().
      
      Now device drivers can implement this method and obtain nonlinear vma support.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Carsten Otte <cotte@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 25 Aug, 2012 4 commits
  15. 31 Jul, 2012 1 commit
    • xfs: Convert to new freezing code · d9457dc0
      Jan Kara committed
      Generic code now blocks all writers from standard write paths. So we add
      blocking of all writers coming from ioctl (we get a protection of ioctl against
      racing remount read-only as a bonus) and convert xfs_file_aio_write() to a
      non-racy freeze protection. We also keep freeze protection on transaction
      start to block internal filesystem writes such as removal of preallocated
      blocks.
      
      CC: Ben Myers <bpm@sgi.com>
      CC: Alex Elder <elder@kernel.org>
      CC: xfs@oss.sgi.com
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  16. 15 Jun, 2012 2 commits
  17. 02 Jun, 2012 1 commit
    • fs: introduce inode operation ->update_time · c3b2da31
      Josef Bacik committed
      Btrfs has to make sure we have space to allocate new blocks in order to modify
      the inode, so updating time can fail.  We've gotten around this by having our
      own file_update_time but this is kind of a pain, and Christoph has indicated he
      would like to make xfs do something different with atime updates.  So introduce
      ->update_time, where we will deal with i_version and a/m/c time updates and
      indicate which changes need to be made.  The normal version just does what it
      has always done, updates the time and marks the inode dirty, and then
      filesystems can choose to do something different.
      
      I've gone through all of the users of file_update_time and made them check for
      errors with the exception of the fault code since it's complicated and I wasn't
      quite sure what to do there, also Jan is going to be pushing the file time
      updates into page_mkwrite for those who have it so that should satisfy btrfs and
      make it not a big deal to check the file_update_time() return code in the
      generic fault path. Thanks,
      Signed-off-by: Josef Bacik <josef@redhat.com>
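The dispatch pattern described in the commit above (an optional per-filesystem hook, with a default that updates the times and marks the inode dirty) can be sketched as follows. The types, the `T_*` flag names and the `nospace_update_time` hook are hypothetical stand-ins for illustration; only the shape of the fallback logic is taken from the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical flags naming which fields the caller wants updated. */
enum { T_ATIME = 1, T_MTIME = 2, T_CTIME = 4, T_VERSION = 8 };

struct inode_model;

/* Optional per-filesystem hook; NULL means "use the default". */
struct inode_ops_model {
    int (*update_time)(struct inode_model *inode, time_t now, int flags);
};

struct inode_model {
    const struct inode_ops_model *ops;
    time_t atime, mtime, ctime;
    unsigned long version;
    int dirty;
};

/* Default behaviour: apply the requested updates and mark the inode dirty. */
int generic_update_time_model(struct inode_model *inode, time_t now, int flags)
{
    if (flags & T_VERSION)
        inode->version++;
    if (flags & T_ATIME)
        inode->atime = now;
    if (flags & T_MTIME)
        inode->mtime = now;
    if (flags & T_CTIME)
        inode->ctime = now;
    inode->dirty = 1;
    return 0;
}

/* Dispatcher: prefer the filesystem hook, which is allowed to fail. */
int update_time_model(struct inode_model *inode, time_t now, int flags)
{
    assert(inode != NULL);
    if (inode->ops && inode->ops->update_time)
        return inode->ops->update_time(inode, now, flags);
    return generic_update_time_model(inode, now, flags);
}

/* Example hook for a filesystem that cannot reserve space for the update. */
int nospace_update_time(struct inode_model *inode, time_t now, int flags)
{
    (void)inode;
    (void)now;
    (void)flags;
    return -ENOSPC;
}
```

Because the dispatcher returns an error code, callers have to check it, which is the change the message says was made to the users of file_update_time.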
  18. 15 May, 2012 5 commits
  19. 14 Mar, 2012 2 commits
  20. 18 Jan, 2012 4 commits
    • xfs: cleanup xfs_file_aio_write · d0606464
      Christoph Hellwig committed
      With all the size field updates out of the way xfs_file_aio_write can
      be further simplified by pushing all iolock handling into
      xfs_file_dio_aio_write and xfs_file_buffered_aio_write and using
      the generic generic_write_sync helper for synchronous writes.
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: always return with the iolock held from xfs_file_aio_write_checks · 5bf1f262
      Christoph Hellwig committed
      While xfs_iunlock is fine with 0 lockflags the calling conventions are much
      cleaner if xfs_file_aio_write_checks never returns without the iolock held.
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: remove the i_new_size field in struct xfs_inode · 2813d682
      Christoph Hellwig committed
      Now that we use the VFS i_size field throughout XFS there is no need for the
      i_new_size field any more given that the VFS i_size field gets updated
      in ->write_end before unlocking the page, and thus is always uptodate when
      writeback could see a page.  Removing i_new_size also has the advantage that
      we will never have to trim back di_size during a failed buffered write,
      given that it never gets updated past i_size.
      
      Note that currently the generic direct I/O code only updates i_size after
      calling our end_io handler, which requires a small workaround to make
      sure di_size actually makes it to disk.  I hope to fix this properly in
      the generic code.
      
      A downside is that we lose the support for parallel non-overlapping O_DIRECT
      appending writes that recently was added.  I don't think keeping the complex
      and fragile i_new_size infrastructure for this is a good tradeoff - if we
      really care about parallel appending writers we should investigate turning
      the iolock into a range lock, which would also allow for parallel
      non-overlapping buffered writers.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: remove the i_size field in struct xfs_inode · ce7ae151
      Christoph Hellwig committed
      There is no fundamental need to keep an in-memory inode size copy in the XFS
      inode.  We already have the on-disk value in the dinode, and the separate
      in-memory copy that we need for regular files only in the XFS inode.
      
      Remove the xfs_inode i_size field and change the XFS_ISIZE macro to use the
      VFS inode i_size field for regular files.  Switch code that was directly
      accessing the i_size field in the xfs_inode to XFS_ISIZE, or in cases where
      we are limited to regular files direct access of the VFS inode i_size field.
      
      This also allows dropping some fairly complicated code in the write path
      which dealt with keeping the xfs_inode i_size uptodate with the VFS i_size
      that is getting updated inside ->write_end.
      
      Note that we do not bother resetting the VFS i_size when truncating a file
      that gets freed to zero as there is no point in doing so because the VFS inode
      is no longer in use at this point.  Just relax the assert in xfs_ifree to
      only check the on-disk size instead.
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  21. 02 Dec, 2011 1 commit
  22. 12 Oct, 2011 5 commits
    • xfs: optimize fsync on directories · 1da2f2db
      Christoph Hellwig committed
      Directories are only updated transactionally, which means fsync only
      needs to flush the log if the inode is currently dirty. It need not
      bother with checking for dirty data or non-transactional updates, and
      most importantly it doesn't have to flush disk caches except as part
      of a transaction commit.
      
      While the first two optimizations can't easily be measured, the
      latter actually makes a difference when doing lots of fsync that do
      not actually have to commit the inode, e.g. because an earlier fsync
      already pushed the log far enough.
      
      The new xfs_dir_fsync is identical to xfs_nfs_commit_metadata except
      for the prototype, but I'm not sure creating a common helper for the
      two is worth it given how simple the functions are.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
    • xfs: simplify xfs_trans_ijoin* again · ddc3415a
      Christoph Hellwig committed
      There is no reason to keep a reference to the inode even if we unlock
      it during transaction commit because we never drop a reference between
      the ijoin and commit.  Also use this fact to merge xfs_trans_ijoin_ref
      back into xfs_trans_ijoin - the third argument decides if an unlock
      is needed now.
      
      I'm actually starting to wonder if allowing inodes to be unlocked
      at transaction commit really is worth the effort.  The only real
      benefit is that they can be unlocked earlier when committing a
      synchronous transaction, but that could be solved by doing the
      log force manually after the unlock, too.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
    • xfs: unlock the inode before log force in xfs_fsync · b1037058
      Christoph Hellwig committed
      Only read the LSN we need to push to with the ilock held, and then release
      it before we do the log force to improve concurrency.
      
      This also removes the only direct caller of _xfs_trans_commit, thus
      allowing it to be merged into the plain xfs_trans_commit again.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
    • xfs: introduce xfs_bmapi_read() · 5c8ed202
      Dave Chinner committed
      xfs_bmapi() currently handles both extent map reading and
      allocation. As a result, the code is littered with "if (wr)"
      branches to conditionally do allocation operations if required.
      This makes the code much harder to follow and causes significant
      indent issues with the code.
      
      Given that read mapping is much simpler than allocation, we can
      split out read mapping from xfs_bmapi() and reuse the logic that
      we have already factored out to do all the hard work of handling the
      extent map manipulations. This results in a much simpler function for
      the common extent read operations, and will allow the allocation
      code to be simplified in another commit.
      
      Once xfs_bmapi_read() is implemented, convert all the callers of
      xfs_bmapi() that are only reading extents to use the new function.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      
    • xfs: avoid direct I/O write vs buffered I/O race · c58cb165
      Christoph Hellwig committed
      Currently a buffered reader or writer can add pages to the pagecache
      while we are waiting for the iolock in xfs_file_dio_aio_write.  Prevent
      this by re-checking mapping->nrpages after we got the iolock, and if
      necessary upgrade the lock to exclusive mode.  To simplify this a bit
      only take the ilock inside of xfs_file_aio_write_checks.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
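The re-check described in the commit above can be sketched as a single-threaded model. No real locking happens here: the lock is just a recorded mode, and the hypothetical `while_waiting` callback simulates another thread adding pagecache pages while the caller is blocked. All names are invented for the sketch; only the "pick a mode, lock, re-check, upgrade if needed" sequence is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

enum iolock_mode { IOLOCK_NONE, IOLOCK_SHARED, IOLOCK_EXCL };

/* Hypothetical inode model: cached page count plus the lock mode held. */
struct dio_inode_model {
    unsigned long nrpages;    /* pages currently in the pagecache */
    enum iolock_mode held;    /* lock mode currently recorded */
    void (*while_waiting)(struct dio_inode_model *ip); /* simulated racer */
};

/* "Acquire" the lock; the callback models activity during the wait. */
void model_lock(struct dio_inode_model *ip, enum iolock_mode mode)
{
    if (ip->while_waiting)
        ip->while_waiting(ip);
    ip->held = mode;
}

void model_unlock(struct dio_inode_model *ip)
{
    ip->held = IOLOCK_NONE;
}

/*
 * Direct I/O write locking: start shared when no pages are cached, then
 * re-check after acquiring and upgrade to exclusive if pages appeared.
 */
enum iolock_mode dio_write_lock(struct dio_inode_model *ip)
{
    enum iolock_mode mode;

    assert(ip != NULL);
    mode = ip->nrpages ? IOLOCK_EXCL : IOLOCK_SHARED;
    model_lock(ip, mode);

    if (ip->nrpages && mode == IOLOCK_SHARED) {
        model_unlock(ip);
        mode = IOLOCK_EXCL;
        model_lock(ip, mode);
    }
    return mode;
}

/* Simulated buffered writer that adds a page during the lock wait. */
void racing_buffered_write(struct dio_inode_model *ip)
{
    ip->nrpages++;
}
```

The upgrade is done by dropping the shared lock and retaking it exclusively, so the decision made before the first acquisition is never trusted on its own.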