1. 12 Oct, 2011 1 commit
  2. 27 Jul, 2011 1 commit
  3. 26 Jul, 2011 1 commit
  4. 13 Jul, 2011 1 commit
    • xfs: reshuffle dir2 headers · 57926640
      Committed by Christoph Hellwig
      Replace the current mess of dir2 headers with just three that have a clear
      purpose:
      
       - xfs_dir2_format.h for all format definitions, including the inline helpers
         to access our variable size structures
       - xfs_dir2_priv.h for all prototypes that are internal to the dir2 code
         and not needed by anything outside of the directory code.  For this
         purpose xfs_da_btree.c, and phase6.c in xfs_repair are considered part
         of the directory code.
       - xfs_dir2.h for the public interface to the directory code
      
      In addition to the reshuffle I have also updated the comments to not only
      match the new file structure, but also to describe the directory format
      better.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
  5. 08 Jul, 2011 2 commits
  6. 25 May, 2011 7 commits
  7. 07 Mar, 2011 4 commits
  8. 23 Feb, 2011 1 commit
  9. 08 Feb, 2011 2 commits
  10. 28 Jan, 2011 3 commits
    • xfs: xfs_bmap_add_extent_delay_real should init br_startblock · 24446fc6
      Committed by bpm@sgi.com
      When filling in the middle of a previous delayed allocation in
      xfs_bmap_add_extent_delay_real, set br_startblock of the new delay
      extent to the right to nullstartblock instead of 0 before inserting
      the extent into the ifork (xfs_iext_insert), rather than setting
      br_startblock afterward.
      
      Adding the extent into the ifork with br_startblock=0 can lead to
      the extent being copied into the btree by xfs_bmap_extent_to_btree
      if we happen to convert from extents format to btree format before
      updating br_startblock with the correct value.  The unexpected
      addition of this delay extent to the btree can cause a subsequent
      XFS_WANT_CORRUPTED_GOTO filesystem shutdown in several
      xfs_bmap_add_extent_delay_real cases where we are converting a delay
      extent to real and unexpectedly find an extent already inserted.
      For example:
      
      911         case BMAP_LEFT_FILLING:
      912                 /*
      913                  * Filling in the first part of a previous delayed allocation.
      914                  * The left neighbor is not contiguous.
      915                  */
      916                 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_);
      917                 xfs_bmbt_set_startoff(ep, new_endoff);
      918                 temp = PREV.br_blockcount - new->br_blockcount;
      919                 xfs_bmbt_set_blockcount(ep, temp);
      920                 xfs_iext_insert(ip, idx, 1, new, state);
      921                 ip->i_df.if_lastex = idx;
      922                 ip->i_d.di_nextents++;
      923                 if (cur == NULL)
      924                         rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
      925                 else {
      926                         rval = XFS_ILOG_CORE;
      927                         if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
      928                                         new->br_startblock, new->br_blockcount,
      929                                         &i)))
      930                                 goto done;
      931                         XFS_WANT_CORRUPTED_GOTO(i == 0, done);
      
      With the bogus extent in the btree we shut down the filesystem at
      line 931.  The conversion from extents to btree format happens when the
      number of extents in the inode increases above ip->i_df.if_ext_max.
      xfs_bmap_extent_to_btree copies extents from the ifork into the
      btree, ignoring all delalloc extents which are denoted by
      br_startblock having some value of nullstartblock.
      
      SGI-PV: 1013221
      Signed-off-by: Ben Myers <bpm@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
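The failure mode described in the commit above can be sketched in a few lines. This is only an illustration under stated assumptions, not the kernel code: `struct ext`, `DELALLOC_MASK`, `null_start()`, `is_delalloc()` and `copy_real_extents()` are hypothetical names, and the sentinel encoding is invented for the example rather than taken from the XFS on-disk format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-core extent record (not the XFS structure). */
struct ext {
    uint64_t startoff;
    uint64_t startblock;
    uint64_t blockcount;
};

/* Assumed sentinel encoding: high bits set marks a delayed allocation. */
#define DELALLOC_MASK UINT64_C(0xFFFF000000000000)

static uint64_t null_start(uint64_t reserved)
{
    return DELALLOC_MASK | reserved;
}

static bool is_delalloc(uint64_t startblock)
{
    return (startblock & DELALLOC_MASK) == DELALLOC_MASK;
}

/*
 * Stand-in for an extents-to-btree conversion: copy every record that
 * is not marked as delayed allocation. A delay extent inserted with
 * startblock 0 is indistinguishable from a real one and gets copied.
 */
static size_t copy_real_extents(const struct ext *in, size_t n,
                                struct ext *out)
{
    size_t copied = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_delalloc(in[i].startblock))
            continue;
        out[copied++] = in[i];
    }
    return copied;
}
```

Setting the sentinel before the record becomes visible to the conversion, rather than afterward, closes the window in which the record can be mistaken for a real extent.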
    • xfs: prevent extsize alignment from exceeding maximum extent size · 4ce15989
      Committed by Dave Chinner
      When doing delayed allocation, if the allocation size is for a
      maximally sized extent, extent size alignment can push it over this
      limit. This results in an assert failure in xfs_bmbt_set_allf() as
      the extent length is too large to fit in the extent record.
      
      Fix this by ensuring that we allow for space that extent size
      alignment requires (up to 2 * (extsize - 1) blocks as we have to
      handle both head and tail alignment) when limiting the maximum size
      of the extent.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
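The arithmetic behind the limit in the commit above can be sketched as follows. This is a minimal illustration, not the actual patch: `MAX_EXTENT_LEN` (assumed here to be the largest value of a 21-bit length field) and `clamp_for_extsize()` are names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limit: largest length that fits a 21-bit extent length field. */
#define MAX_EXTENT_LEN UINT64_C(0x001fffff)

/*
 * Hypothetical helper: cap a requested length so that aligning both the
 * head and the tail to extsize cannot push it past MAX_EXTENT_LEN.
 * In the worst case alignment adds (extsize - 1) blocks at each end.
 */
static uint64_t clamp_for_extsize(uint64_t len, uint64_t extsize)
{
    uint64_t limit = MAX_EXTENT_LEN;

    if (extsize > 1) {
        uint64_t slack = 2 * (extsize - 1);

        /* Keep at least one block if extsize is unreasonably large. */
        limit = slack < MAX_EXTENT_LEN ? MAX_EXTENT_LEN - slack : 1;
    }
    return len > limit ? limit : len;
}
```

With an extent size hint of 16 blocks, for example, a maximally sized request is reduced by 30 blocks, so the aligned result still fits in the length field.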
    • xfs: limit extent length for allocation to AG size · 14b064ce
      Committed by Dave Chinner
      Delayed allocation extents can be larger than AGs, so when trying to
      convert a large range we may scan every AG inside
      xfs_bmap_alloc_nullfb() looking for one with a free extent larger
      than an AG, which none can have. This causes excessive CPU usage
      when there are lots of AGs. We should instead stop at the first AG
      that offers the maximum possible allocation size.
      
      The same problem occurs when doing preallocation of a range larger
      than an AG.
      
      Fix the problem by limiting real allocation lengths to the maximum
      that an AG can support. This means if we have empty AGs, we'll stop
      the search at the first of them. If there are no empty AGs, we'll
      still scan them all, but that is a different problem....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
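The effect of clamping the request to the AG size can be sketched with a toy search loop. The names `find_ag()` and `clamp_to_ag()`, and the array of per-AG longest free extents, are assumptions made for this example and do not correspond to the real allocator interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical search: longest[i] is the longest free extent in AG i.
 * Returns the first AG that can satisfy len, or -1 if none can.
 * *scanned reports how many AGs were examined.
 */
static int find_ag(const uint64_t *longest, size_t nags, uint64_t len,
                   size_t *scanned)
{
    *scanned = 0;
    for (size_t i = 0; i < nags; i++) {
        (*scanned)++;
        if (longest[i] >= len)
            return (int)i;
    }
    return -1;
}

/* Limit a request to the most that a single AG could ever provide. */
static uint64_t clamp_to_ag(uint64_t len, uint64_t max_ag_len)
{
    return len > max_ag_len ? max_ag_len : len;
}
```

An unclamped request larger than an AG can never match and forces a full scan. A clamped request stops at the first empty AG, and only degrades to a full scan when no AG is empty, which matches the caveat at the end of the commit message.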
  11. 01 Dec, 2010 2 commits
    • xfs: delayed alloc blocks beyond EOF are valid after writeback · 309c8480
      Committed by Dave Chinner
      There is an assumption in parts of XFS that flushing a dirty
      file will make all the delayed allocation blocks disappear from an
      inode. That is, that after calling xfs_flush_pages() then
      ip->i_delayed_blks will be zero.
      
      This is an invalid assumption as we may have speculative
      preallocation beyond EOF and these blocks are recorded in
      ip->i_delayed_blks. A flush of the dirty pages of an inode will not
      change the state of these blocks beyond EOF, so a non-zero
      delalloc block count after a flush is valid.
      
      The bmap code has an invalid ASSERT() that needs to be removed, and
      the swapext code has a bug in that while it swaps the data forks
      around, it fails to swap the i_delayed_blks counter associated with
      the fork and hence can get the block accounting wrong.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: fix failed write truncation handling. · c726de44
      Committed by Dave Chinner
      Since the move to the new truncate sequence we call xfs_setattr to
      truncate down excessively instantiated blocks.  As shown by the testcase
      in kernel.org BZ #22452 that doesn't work too well.  Due to the confusion
      between the internal inode size and the VFS inode i_size, it zeroes data
      that it shouldn't.
      
      But full blown truncate seems like overkill here.  We only instantiate
      delayed allocations in the write path, and given that we never released
      the iolock we can't have converted them to real allocations yet either.
      
      The only nasty case is pre-existing preallocation which we need to skip.
      We already do this for page discard during writeback, so make the delayed
      allocation block punching a generic function and call it from the failed
      write path as well as xfs_aops_discard_page. The callers are
      responsible for ensuring that partial blocks are not truncated away,
      and that they hold the ilock.
      
      Based on a fix originally from Christoph Hellwig. This version used
      filesystem blocks as the range unit.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
  12. 19 Oct, 2010 2 commits
  13. 03 Sep, 2010 1 commit
    • xfs: Make fiemap work with sparse files · 9af25465
      Committed by Tao Ma
      In xfs_vn_fiemap, we set bmv_count to fi_extent_max + 1 and want
      to return fi_extent_max extents, but this does not work for
      a sparse file. The reason is that xfs_getbmap also records
      holes in 'out', while 'out' is allocated with bmv_count
      (fi_extent_max + 1) entries, which does not account for holes. So
      in the worst case, if the 'out' vector looks like
      [hole, extent, hole, extent, hole, ... hole, extent, hole],
      we will only return half of fi_extent_max extents.
      
      This patch adds a new flag, BMV_IF_NO_HOLES, for bmv_iflags.
      With this flag set, we don't use an entry of 'out' in xfs_getbmap
      for a hole. The solution is a bit ugly, as it simply does not
      increment the index of the 'out' vector. I felt that it is not easy
      to skip holes at the very beginning since we have complicated checks
      and some functions like xfs_getbmapx_fix_eof_hole that adjust 'out'.
      
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
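The hole-skipping idea in the commit above can be sketched with a small copy loop. This is an illustration only: `struct rec`, `fill_out()` and `count_extents()` are hypothetical, and the `skip_holes` argument merely stands in for the role the commit gives to BMV_IF_NO_HOLES. The sketch skips a hole outright, whereas the commit describes leaving the index unchanged so the slot is reused; the resulting output is the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping record: either a real extent or a hole. */
struct rec {
    long start;
    long len;
    bool hole;
};

/*
 * Copy records from in[] into out[], which has room for cap entries.
 * When skip_holes is set, a hole does not consume an output slot.
 * Returns the number of output slots used.
 */
static size_t fill_out(const struct rec *in, size_t n,
                       struct rec *out, size_t cap, bool skip_holes)
{
    size_t used = 0;

    for (size_t i = 0; i < n && used < cap; i++) {
        if (skip_holes && in[i].hole)
            continue;
        out[used++] = in[i];
    }
    return used;
}

/* Count the entries in r[] that are real extents. */
static size_t count_extents(const struct rec *r, size_t n)
{
    size_t extents = 0;

    for (size_t i = 0; i < n; i++)
        if (!r[i].hole)
            extents++;
    return extents;
}
```

For an alternating hole/extent file and a fixed output capacity, the loop without the flag spends roughly half of its slots on holes, which is the shortfall the commit describes.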
  14. 27 Jul, 2010 7 commits
  15. 19 May, 2010 1 commit
  16. 02 Mar, 2010 1 commit
  17. 20 Jan, 2010 1 commit
  18. 16 Jan, 2010 2 commits
    • xfs: Replace per-ag array with a radix tree · 1c1c6ebc
      Committed by Dave Chinner
      The use of an array for the per-ag structures requires reallocation
      of the array when growing the filesystem. This requires locking
      access to the array to avoid use after free situations, and the
      locking is difficult to get right. To avoid needing to reallocate an
      array, change the per-ag structures to an allocated object per ag
      and index them using a tree structure.
      
      The AGs are always densely indexed (hence the use of an array), but
      the number supported is 2^32 and lookups tend to be random and hence
      indexing needs to scale. A simple choice is a radix tree - it works
      well with this sort of index.  This change also removes another
      large contiguous allocation from the mount/growfs path in XFS.
      
      The growing process now needs to change to only initialise the new
      AGs required for the extra space, and as such only needs to
      exclusively lock the tree for inserts. The rest of the code only
      needs to lock the tree while doing lookups, and hence this will
      remove all the deadlocks that currently occur on the m_perag_lock as
      it is now an innermost lock. The lock is also changed to a spinlock
      from a read/write lock as the hold time is now extremely short.
      
      To complete the picture, the per-ag structures will need to be
      reference counted to ensure that we don't free/modify them while
      they are still in use.  This will be done in a subsequent patch.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • xfs: convert remaining direct references to m_perag · 44b56e0a
      Committed by Dave Chinner
      Convert the remaining direct lookups of the per ag structures to use
      get/put accesses. Ensure that the loops across AGs and prior users
      of the interface balance gets and puts correctly.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
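The get/put discipline mentioned in the commit above can be sketched as a loop that takes and releases one reference per AG. Everything here is an assumption for illustration: `struct perag`, `perag_get()`, `perag_put()` and `total_free()` are hypothetical, the lookup is a plain array rather than a tree, and the counter is a non-atomic integer with no locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-AG object with a plain (non-atomic) reference count. */
struct perag {
    unsigned int agno;
    int ref;
    unsigned long freeblks;
};

/* Look up an AG and take a reference; returns NULL if out of range. */
static struct perag *perag_get(struct perag *table, unsigned int nags,
                               unsigned int agno)
{
    if (agno >= nags)
        return NULL;
    table[agno].ref++;
    return &table[agno];
}

/* Drop a reference taken by perag_get(). */
static void perag_put(struct perag *pag)
{
    assert(pag->ref > 0);
    pag->ref--;
}

/* Walk every AG, pairing each get with exactly one put. */
static unsigned long total_free(struct perag *table, unsigned int nags)
{
    unsigned long sum = 0;

    for (unsigned int agno = 0; agno < nags; agno++) {
        struct perag *pag = perag_get(table, nags, agno);

        if (!pag)
            continue;
        sum += pag->freeblks;
        perag_put(pag);
    }
    return sum;
}
```

A loop that exits early must still drop the reference it holds, which is the kind of imbalance the commit says it checks for.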