1. 18 Apr, 2008 5 commits
    • [XFS] Propagate errors from xfs_trans_commit(). · e5720eec
      David Chinner committed
      xfs_trans_commit() can return errors when there are problems in the
      transaction subsystem. They are indicative that the entire transaction may
      be incomplete, and hence the error should be propagated as there is a good
      possibility that there is something fatally wrong in the filesystem. Catch
      and propagate or warn about commit errors in the places where they are
      currently ignored.
      
      SGI-PV: 980084
      SGI-Modid: xfs-linux-melb:xfs-kern:30795a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Niv Sardi <xaiki@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
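The difference between ignoring and propagating a commit error can be sketched as follows. This is a minimal illustration, not the XFS code: `demo_trans`, `demo_trans_commit()` and both callers are hypothetical names, and the positive-errno convention is an assumption of the sketch.

```c
#include <errno.h>

/* Hypothetical stand-in for a transaction; not the real struct xfs_trans. */
struct demo_trans {
    int shutdown;   /* nonzero simulates a failed transaction subsystem */
};

/* Hypothetical commit routine: returns 0 on success or a positive errno. */
static int demo_trans_commit(struct demo_trans *tp)
{
    return tp->shutdown ? EIO : 0;
}

/* Before: the commit result is silently dropped. */
static int demo_update_ignoring_error(struct demo_trans *tp)
{
    (void)demo_trans_commit(tp);
    return 0;
}

/* After: the commit result is handed back to the caller. */
static int demo_update_propagating_error(struct demo_trans *tp)
{
    int error = demo_trans_commit(tp);

    if (error)
        return error;
    return 0;
}
```

With the second form, a caller several layers up can still see that the transaction may be incomplete and react to it.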
    • [XFS] Use xfs_inode_clean() in more places · 33540408
      David Chinner committed
      Remove open-coded checks for whether the inode is clean and replace
      them with an inlined function.
      
      SGI-PV: 977461
      SGI-Modid: xfs-linux-melb:xfs-kern:30503a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
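Replacing open-coded checks with an inlined predicate might look roughly like this. The structures and field names are simplified, hypothetical stand-ins; the real `xfs_inode_clean()` tests XFS-specific state that this sketch does not attempt to reproduce.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; the real XFS structures differ. */
struct demo_log_item {
    unsigned int dirty_fields;      /* nonzero if logged fields are dirty */
};

struct demo_inode {
    struct demo_log_item *log_item; /* NULL if the inode was never logged */
    unsigned int update_core;       /* nonzero if the core needs writing */
};

/* One inlined predicate replaces the same test repeated at each call site. */
static inline bool demo_inode_clean(const struct demo_inode *ip)
{
    return (ip->log_item == NULL || ip->log_item->dirty_fields == 0) &&
           ip->update_core == 0;
}
```

Keeping the test in one place means a later change to what "clean" means only has to be made once.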
    • [XFS] Remove the xfs_icluster structure · bad55843
      David Chinner committed
      Remove the xfs_icluster structure and replace with a radix tree lookup.
      
      We don't need to keep a list of inodes in each cluster around anymore as
      we can look them up quickly when we need to. The only time we need to do
      this now is during inode writeback.
      
      Factor the inode cluster writeback code out of xfs_iflush and convert it
      to use radix_tree_gang_lookup() instead of walking a list of inodes built
      when we first read in the inodes.
      
      This removes three pointers from each xfs_inode structure and the
      xfs_icluster structure per inode cluster. Hence we reduce the cache
      footprint of the xfs_inodes by 5-10%, depending on cluster sparseness.
      
      To be truly efficient we need a radix_tree_gang_lookup_range() call to
      stop searching once we are past the end of the cluster instead of trying
      to find a full cluster's worth of inodes.
      
      Before (ia64):
      
      $ cat /sys/slab/xfs_inode/object_size
      536
      
      After:
      
      $ cat /sys/slab/xfs_inode/object_size
      512
      
      SGI-PV: 977460
      SGI-Modid: xfs-linux-melb:xfs-kern:30502a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
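The range-bounded lookup that the message wishes for can be illustrated with a small sketch. It assumes a sorted array as a stand-in for the radix tree and a power-of-two number of inodes per cluster; `demo_cluster_lookup()` is a hypothetical name and not a kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Collect cached inode numbers that belong to the cluster containing `ino`.
 * `cached` is a sorted array standing in for the radix tree, and
 * `per_cluster` must be a power of two. Returns the number of entries
 * written to `out`.
 */
static size_t demo_cluster_lookup(const uint64_t *cached, size_t ncached,
                                  uint64_t ino, uint64_t per_cluster,
                                  uint64_t *out, size_t max_out)
{
    uint64_t first = ino & ~(per_cluster - 1); /* first inode in the cluster */
    uint64_t end = first + per_cluster;        /* one past the last inode */
    size_t found = 0;

    for (size_t i = 0; i < ncached && found < max_out; i++) {
        if (cached[i] < first)
            continue;
        if (cached[i] >= end)
            break;      /* past the cluster: stop searching */
        out[found++] = cached[i];
    }
    return found;
}
```

The early `break` is the point of the exercise: a sparse cluster no longer forces the search to keep going until it has found a full cluster's worth of entries.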
    • [XFS] Don't block pdflush when writing back inodes · a3f74ffb
      David Chinner committed
      When pdflush is writing back inodes, it can get stuck on inode cluster
      buffers that are currently under I/O. This occurs when we write data to
      multiple inodes in the same inode cluster at the same time.
      
      Effectively, delayed allocation marks the inode dirty during the data
      writeback. Hence if the inode cluster was flushed during the writeback of
      the first inode, the writeback of the second inode will block waiting for
      the inode cluster write to complete before writing it again for the newly
      dirtied inode.
      
      We want to prevent this so that we don't block pdflush and slow down all
      of writeback. Hence we introduce a non-blocking async inode flush flag
      that pdflush uses. If this flag is set, we use non-blocking operations
      (e.g. try locks) wherever we can to avoid blocking or issuing extra I/O.
      
      SGI-PV: 970925
      SGI-Modid: xfs-linux-melb:xfs-kern:30501a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
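The try-lock idea can be sketched in user-space C. This is only an illustration under stated assumptions: the flush modes and `demo_flush()` are hypothetical names, a C11 `atomic_flag` stands in for the buffer lock, and the spin loop stands in for a sleeping lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flush modes; the names are illustrative only. */
enum demo_flush_mode { DEMO_FLUSH_SYNC, DEMO_FLUSH_ASYNC_NOBLOCK };

struct demo_cluster_buf {
    atomic_flag locked;     /* set while the buffer is under I/O */
    int writes_issued;
};

/* Returns true if the buffer was written, false if the flush was skipped. */
static bool demo_flush(struct demo_cluster_buf *bp, enum demo_flush_mode mode)
{
    if (mode == DEMO_FLUSH_ASYNC_NOBLOCK) {
        /* Try lock: give up immediately if someone else holds it. */
        if (atomic_flag_test_and_set(&bp->locked))
            return false;
    } else {
        /* Blocking path: wait until the lock is free. */
        while (atomic_flag_test_and_set(&bp->locked))
            ;
    }
    bp->writes_issued++;
    atomic_flag_clear(&bp->locked);
    return true;
}
```

A background flusher calling with the non-blocking mode simply moves on to other work when the buffer is busy and relies on a later pass to pick up the skipped inode.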
    • [XFS] Factor xfs_itobp() and xfs_inotobp(). · 4ae29b43
      David Chinner committed
      The only difference between the functions is that one passes an inode
      for the lookup and the other passes an inode number. However, they don't
      do the same validity checking or set all the same state on the returned
      buffer, although they should.
      
      Factor the functions into a common implementation.
      
      SGI-PV: 970925
      SGI-Modid: xfs-linux-melb:xfs-kern:30500a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
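The shape of such a refactoring, two entry points sharing one implementation, can be sketched like this. All names, the validity check and the inodes-per-buffer constant are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

struct fact_inode { uint64_t ino; };
struct fact_buf { uint64_t blkno; int validated; };

#define FACT_INODES_PER_BUF 32u /* assumed value for the sketch */

/* Shared implementation: same validation and buffer state for every caller. */
static int fact_imap_to_bp(uint64_t ino, struct fact_buf *bp)
{
    if (ino == 0)
        return -1;      /* reject an invalid inode number */
    bp->blkno = ino / FACT_INODES_PER_BUF;
    bp->validated = 1;
    return 0;
}

/* Lookup by inode number. */
static int fact_inotobp(uint64_t ino, struct fact_buf *bp)
{
    return fact_imap_to_bp(ino, bp);
}

/* Lookup by inode: a thin wrapper around the same code path. */
static int fact_itobp(const struct fact_inode *ip, struct fact_buf *bp)
{
    return fact_imap_to_bp(ip->ino, bp);
}
```

Because both wrappers funnel into one helper, they cannot drift apart in what they check or what state they leave on the buffer.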
  2. 10 Apr, 2008 1 commit
  3. 07 Feb, 2008 10 commits
  4. 18 Dec, 2007 1 commit
    • [XFS] Don't wait for pending I/Os when purging blocks beyond eof. · c734c79b
      Lachlan McIlroy committed
      On last close of a file we purge blocks beyond eof. The same code is used
      when we truncate the file size down. In this case we need to wait for any
      pending I/Os for dirty pages beyond the new eof. For the last close case
      we are not changing the file size and therefore do not need to wait for
      any I/Os to complete. This fixes a performance bottleneck where writes
      into the page cache and cache flushes can become mutually exclusive.
      
      SGI-PV: 964002
      SGI-Modid: xfs-linux-melb:xfs-kern:30220a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Peter Leckie <pleckie@sgi.com>
  5. 16 Oct, 2007 7 commits
  6. 15 Oct, 2007 5 commits
  7. 14 Jul, 2007 4 commits
    • [XFS] Clean up function name handling in tracing code · 3a59c94c
      Eric Sandeen committed
      Remove the hardcoded "fnames" for tracing, and just embed them in tracing
      macros via __FUNCTION__. Kills a lot of #ifdefs too.
      
      SGI-PV: 967353
      SGI-Modid: xfs-linux-melb:xfs-kern:29099a
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
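Embedding the function name in a tracing macro, instead of passing a hardcoded string, can be sketched as follows. `DEMO_TRACE` and `demo_trace_buf` are hypothetical; the sketch uses the standard C99 `__func__`, whereas the commit message refers to the GCC spelling `__FUNCTION__`.

```c
#include <stdio.h>

/* The most recent trace record; a real tracer would use a ring buffer. */
static char demo_trace_buf[128];

/* Record the caller's name automatically; no per-function name string. */
#define DEMO_TRACE(msg) \
    snprintf(demo_trace_buf, sizeof(demo_trace_buf), "%s: %s", \
             __func__, (msg))

static void demo_iget(void)
{
    DEMO_TRACE("enter");
}
```

Since the macro expands inside the calling function, `__func__` resolves to that function's name, so renaming a function can never leave a stale trace label behind.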
    • [XFS] Quota inode has no parent. · b11f94d5
      David Chinner committed
      Avoid using a special "zero inode" as the parent of the quota inode as
      this can confuse the filestreams code into thinking the quota inode has a
      parent. We do not want the quota inode to follow filestreams allocation
      rules, so pass a NULL as the parent inode and detect this condition when
      doing stream associations.
      
      SGI-PV: 964469
      SGI-Modid: xfs-linux-melb:xfs-kern:29098a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
    • [XFS] Concurrent Multi-File Data Streams · 2a82b8be
      David Chinner committed
      In media spaces, video is often stored in a frame-per-file format. When
      dealing with uncompressed realtime HD video streams in this format, it is
      crucial that files do not get fragmented and that multiple files are
      placed contiguously on disk.
      
      When multiple streams are being ingested and played out at the same time,
      it is critical that the filesystem does not cross the streams and
      interleave them together as this creates seek and readahead cache miss
      latency and prevents both ingest and playout from meeting frame rate
      targets.
      
      This patch set introduces a "stream of files" concept into the allocator
      to place all the data from a single stream contiguously on disk so that
      RAID array readahead can be used effectively. Each additional stream gets
      placed in a different allocation group within the filesystem, thereby
      ensuring that we don't cross any streams. When an AG fills up, we select
      a new AG that is not in use for the stream.
      
      The core of the functionality is the stream tracking - each inode that we
      create in a directory needs to be associated with the directory's stream.
      Hence every time we create a file, we look up the directory's stream
      object and associate the new file with that object.
      
      Once we have a stream object for a file, we use the AG that the stream
      object points to for allocations. If we can't allocate in that AG (e.g.
      it is full) we move the entire stream to another AG. Other inodes in the
      same stream are moved to the new AG on their next allocation (i.e. lazy
      update).
      
      Stream objects are kept in a cache and hold a reference on the inode.
      Hence the inode cannot be reclaimed while there is an outstanding stream
      reference. This means that on unlink we need to remove the stream
      association and we also need to flush all the associations on certain
      events that want to reclaim all unreferenced inodes (e.g. filesystem
      freeze).
      
      SGI-PV: 964469
      SGI-Modid: xfs-linux-melb:xfs-kern:29096a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Barry Naujok <bnaujok@sgi.com>
      Signed-off-by: Donald Douwsma <donaldd@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      Signed-off-by: Vlad Apostolov <vapo@sgi.com>
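The stream-to-AG association described above can be sketched with a toy allocator. Everything here is a hypothetical simplification: one stream object per directory, a fixed number of allocation groups, and a first-fit choice of a new AG that no other stream is using.

```c
#define DEMO_AG_COUNT 4

struct demo_fs {
    int free_blocks[DEMO_AG_COUNT]; /* free space per allocation group */
    int streams[DEMO_AG_COUNT];     /* number of streams using each AG */
};

struct demo_stream { int ag; };     /* one per directory in this sketch */

/* Pick an AG with enough space that no stream is using; -1 if none. */
static int demo_pick_ag(const struct demo_fs *fs, int len)
{
    for (int ag = 0; ag < DEMO_AG_COUNT; ag++)
        if (fs->streams[ag] == 0 && fs->free_blocks[ag] >= len)
            return ag;
    return -1;
}

/* Allocate `len` blocks for a file in stream `st`; returns the AG or -1. */
static int demo_stream_alloc(struct demo_fs *fs, struct demo_stream *st,
                             int len)
{
    if (fs->free_blocks[st->ag] < len) {
        int ag = demo_pick_ag(fs, len);

        if (ag < 0)
            return -1;
        fs->streams[st->ag]--;      /* move the whole stream */
        fs->streams[ag]++;
        st->ag = ag;
    }
    fs->free_blocks[st->ag] -= len;
    return st->ag;
}
```

Because every file in a directory consults the shared stream object at allocation time, moving the stream updates the other files lazily: they pick up the new AG on their next allocation.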
    • [XFS] Use is_power_of_2 instead of open coding checks · 16a087d8
      Vignesh Babu committed
      SGI-PV: 966576
      SGI-Modid: xfs-linux-melb:xfs-kern:28950a
      Signed-off-by: Vignesh Babu <vignesh.babu@wipro.com>
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
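For reference, the check that such a helper centralizes is the usual single-bit test. The function below is a user-space sketch with a hypothetical name, written to illustrate the idiom rather than to quote the kernel source.

```c
#include <stdbool.h>

/* A nonzero value is a power of two iff it has exactly one bit set. */
static inline bool demo_is_power_of_2(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

Open-coded versions of this test often forget the `n != 0` case, which is one reason a shared helper is preferable.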
  8. 08 May, 2007 3 commits
    • [XFS] Fix to prevent the notorious 'NULL files' problem after a crash. · ba87ea69
      Lachlan McIlroy committed
      The problem that has been addressed is that of synchronising updates of
      the file size with writes that extend a file. Without the fix the update
      of a file's size, as a result of a write beyond eof, is independent of
      when the cached data is flushed to disk. Often the file size update would
      be written to the filesystem log before the data is flushed to disk. When
      a system crashes between these two events and the filesystem log is
      replayed on mount the file's size will be set but since the contents never
      made it to disk the file is full of holes. If some of the cached data was
      flushed to disk then it may just be a section of the file at the end that
      has holes.
      
      There are existing fixes to help alleviate this problem, particularly in
      the case where a file has been truncated, that force cached data to be
      flushed to disk when the file is closed. If the system crashes while the
      file(s) are still open then this flushing will never occur.
      
      The fix that we have implemented is to introduce a second file size,
      called the in-memory file size, that represents the current file size as
      viewed by the user. The existing file size, called the on-disk file size,
      is the one that gets written to the filesystem log, and we only update it
      when it is safe to do so. When we write to a file beyond eof, we only
      update the in-memory file size in the write operation. Later, when the I/O
      operation that flushes the cached data to disk completes, an I/O
      completion routine will update the on-disk file size. The on-disk file
      size will be updated to the maximum offset of the I/O or to the value of
      the in-memory file size if the I/O includes eof.
      
      SGI-PV: 958522
      SGI-Modid: xfs-linux-melb:xfs-kern:28322a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
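The two-size scheme can be sketched in a few lines. The structure and function names are hypothetical, and the sketch models only the size bookkeeping, not the page cache or the log.

```c
#include <stdint.h>

struct demo_file {
    uint64_t mem_size;   /* in-memory size: what the user sees */
    uint64_t disk_size;  /* on-disk size: what would be logged */
};

/* Buffered write: only the in-memory size moves past the old eof. */
static void demo_write(struct demo_file *f, uint64_t offset, uint64_t len)
{
    if (offset + len > f->mem_size)
        f->mem_size = offset + len;
}

/* I/O completion for the byte range [offset, offset + len). */
static void demo_io_done(struct demo_file *f, uint64_t offset, uint64_t len)
{
    uint64_t end = offset + len;

    if (end > f->mem_size)
        end = f->mem_size;      /* the I/O included eof */
    if (end > f->disk_size)
        f->disk_size = end;     /* completion never shrinks the size */
}
```

In this model the on-disk size only advances after the data covering that range has been written, so a crash between the write and the flush leaves the logged size pointing at data that actually reached disk.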
    • [XFS] propagate return codes from flush routines · d3cf2094
      Lachlan McIlroy committed
      This patch handles error return values in fs_flush_pages and
      fs_flushinval_pages. It changes the prototype of fs_flushinval_pages so we
      can propagate the errors and handle them at higher layers. I also modified
      xfs_itruncate_start so that it could propagate the error further.
      
      SGI-PV: 961990
      SGI-Modid: xfs-linux-melb:xfs-kern:28231a
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Stewart Smith <stewart@flamingspork.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
    • [XFS] The last argument "lsn" of xfs_trans_commit() is always called with · 1c72bf90
      Eric Sandeen committed
      NULL.
      
      Patch provided by Eric Sandeen.
      
      SGI-PV: 961693
      SGI-Modid: xfs-linux-melb:xfs-kern:28199a
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
  9. 10 Feb, 2007 4 commits