1. 03 Aug, 2016 23 commits
  2. 22 Jul, 2016 1 commit
    • libxfs: directory node splitting does not have an extra block · 160ae76f
      Committed by Dave Chinner
      xfsprogs source commit 4280e59dcbc4cd8e01585efe788a68eb378048e8
      
      xfs_da3_split() has to handle all three versions of the
      directory/attribute btree structure. The attr tree is v1, the dir
      tree is v2 or v3. The main difference between the v1 and v2/3 trees
      is the way tree nodes are split - in the v1 tree we can require a
      double split to occur because the object to be inserted may be
      larger than the space made by splitting a leaf. In this case we need
      to do a double split - one to split the full leaf, then another to
      allocate an empty leaf block in the correct location for the new
      entry.  This does not happen with dir (v2/v3) formats as the objects
      being inserted are always guaranteed to fit into the new space in
      the split blocks.
      
      Indeed, for directories there *may* be an extra block on this buffer
      pointer. However, it's guaranteed not to be a leaf block (i.e. a
      directory data block) - the directory code only ever places hash
      index or free space blocks in this pointer (as a cursor of
      sorts), and so to use it as a directory data block will immediately
      corrupt the directory.
      
      The problem is that the code assumes that there may be extra blocks
      that we need to link into the tree once we've split the root, but
      this is not true for either dir or attr trees, because the extra
      attr block is always consumed by the last node split before we split
      the root. Hence the linking in an extra block is always wrong at the
      root split level, and this manifests itself in repair as a directory
      corruption in a repaired directory, leaving the directory rebuild
      incomplete.
      
      This is a dir v2 zero-day bug - it was in the initial dir v2 commit
      that was made back in February 1998.
      
      Fix this by ensuring the linking of the blocks after the root split
      never tries to make use of the extra blocks that may be held in the
      cursor. They are held there for other purposes and should never be
      touched by the root splitting code.
      
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
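
      The double-split condition described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_leaf`, `toy_split` and `toy_needs_double_split` are hypothetical names, the leaf is reduced to a byte counter, and none of this is the actual xfs_da3_split() code. It shows why a variable-sized (attr, v1) entry may not fit into either half of a split leaf, while a small entry always does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy leaf: tracks only how many bytes of a fixed capacity are used. */
struct toy_leaf {
    size_t capacity;
    size_t used;
};

/* Split a leaf by moving half of its used bytes into a new sibling leaf. */
static void toy_split(struct toy_leaf *old_leaf, struct toy_leaf *new_leaf)
{
    new_leaf->capacity = old_leaf->capacity;
    new_leaf->used = old_leaf->used / 2;
    old_leaf->used -= new_leaf->used;
}

/*
 * Return true when a single split does not free enough room for the new
 * entry in either half, i.e. a second split (an extra empty leaf) would be
 * needed. In this toy model that can only happen when the entry is larger
 * than the space freed by the split.
 */
static bool toy_needs_double_split(size_t capacity, size_t used,
                                   size_t entry_size)
{
    struct toy_leaf old_leaf = { capacity, used };
    struct toy_leaf new_leaf;

    assert(used <= capacity);
    toy_split(&old_leaf, &new_leaf);

    return entry_size > old_leaf.capacity - old_leaf.used &&
           entry_size > new_leaf.capacity - new_leaf.used;
}
```

      With a full 4096-byte leaf, a 3000-byte entry does not fit into either 2048-byte gap, so the toy model reports a double split; a 64-byte entry fits after a single split.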
      
  3. 20 Jul, 2016 5 commits
  4. 21 Jun, 2016 4 commits
  5. 01 Jun, 2016 2 commits
  6. 18 May, 2016 1 commit
    • xfs: optimise xfs_iext_destroy · 32b43ab6
      Committed by Alex Lyakas
      When unmounting XFS, we call:
      
      xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy
      
      This goes over the whole indirection array and calls
      xfs_iext_irec_remove for each one of the erps (from the last one to
      the first one). As a result, we keep shrinking (reallocating
      actually) the indirection array until we shrink out all of its
      elements. When we have files with huge numbers of extents, umount
      takes 30-80 sec, depending on the number of files that XFS loaded
      and the number of indirection entries of each file. The unmount
      stack looks like:
      
      [<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
      [<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs]
      [<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs]
      [<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs]
      [<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs]
      [<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs]
      [<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs]
      [<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs]
      [<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs]
      [<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs]
      [<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100
      [<ffffffff811ea207>] kill_block_super+0x27/0x70
      [<ffffffff811ea519>] deactivate_locked_super+0x49/0x60
      [<ffffffff811eaaee>] deactivate_super+0x4e/0x70
      [<ffffffff81207593>] cleanup_mnt+0x43/0x90
      [<ffffffff81207632>] __cleanup_mnt+0x12/0x20
      [<ffffffff8108f8e7>] task_work_run+0xa7/0xe0
      [<ffffffff81014ff7>] do_notify_resume+0x97/0xb0
      [<ffffffff81717c6f>] int_signal+0x12/0x17
      
      Further, this reallocation prevents us from freeing the extent list
      from an RCU callback as allocation can block. Hence if the extent
      list is in indirect format, optimise the freeing of the extent list
      to only use kmem_free calls by freeing entire extent buffer pages at
      a time, rather than extent by extent.
      
      [dchinner: simplified freeing loop based on Christoph's suggestion]
      Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      
  7. 06 Apr, 2016 4 commits