1. 12 Dec, 2009 1 commit
  2. 30 Oct, 2009 1 commit
    • xfs: free temporary cursor in xfs_dialloc · 3b826386
      Eric Sandeen committed
      Commit bd169565 seems to have a slight regression where this code path:
      
          if (!--searchdistance) {
              /*
               * Not in range - save last search
               * location and allocate a new inode
               */
              ...
              goto newino;
          }
      
      doesn't free the temporary cursor (tcur) that got dup'd in
      this function.
      
      This leaks an item in the xfs_btree_cur zone, and it's caught
      on module unload:
      
      ===========================================================
      BUG xfs_btree_cur: Objects remaining on kmem_cache_close()
      -----------------------------------------------------------
      
      It seems like maybe a single free at the end of the function might
      be cleaner, but for now put a del_cursor right in this code block
      similar to the handling in the rest of the function.
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
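      The leak described above can be reproduced in miniature. The sketch
      below is not the XFS code: `live_cursors`, `dup_cursor`, `del_cursor`
      and `search_nearby` are hypothetical stand-ins for the btree cursor
      zone and the search loop in xfs_dialloc. It only illustrates the rule
      the fix restores, namely that every exit path releases the duplicated
      cursor before leaving.

      ```c
      #include <assert.h>
      #include <stdlib.h>

      /* Hypothetical stand-in for the xfs_btree_cur zone: counts live cursors. */
      static int live_cursors;

      struct cursor { int pos; };

      /* Duplicate a cursor; the caller owns the copy and must delete it. */
      static struct cursor *dup_cursor(const struct cursor *cur)
      {
          struct cursor *tcur = malloc(sizeof(*tcur));
          assert(tcur != NULL);
          *tcur = *cur;
          live_cursors++;
          return tcur;
      }

      static void del_cursor(struct cursor *cur)
      {
          free(cur);
          live_cursors--;
      }

      /*
       * Toy search loop. Returns 1 if target is found within searchdistance
       * steps, 0 if the caller should fall back to allocating a new chunk.
       * Assumes searchdistance >= 1.
       */
      static int search_nearby(const struct cursor *cur, int searchdistance,
                               int target)
      {
          struct cursor *tcur = dup_cursor(cur);

          for (;;) {
              if (tcur->pos == target) {
                  del_cursor(tcur);
                  return 1;
              }
              if (!--searchdistance) {
                  /* Not in range: without this free, tcur would leak. */
                  del_cursor(tcur);
                  goto newino;
              }
              tcur->pos++;
          }
      newino:
          return 0;
      }
      ```

      After any mix of successful and failed searches, `live_cursors`
      should be back at zero, which is the toy equivalent of the cache
      being empty at module unload.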
  3. 02 Sep, 2009 8 commits
  4. 29 Mar, 2009 1 commit
  5. 09 Feb, 2009 1 commit
  6. 19 Jan, 2009 1 commit
  7. 16 Jan, 2009 1 commit
  8. 01 Dec, 2008 8 commits
  9. 30 Oct, 2008 8 commits
  10. 29 Apr, 2008 1 commit
    • [XFS] Don't initialise new inode generation numbers to zero · 359346a9
      David Chinner committed
      When we allocate new inode chunks, we initialise the generation numbers
      to zero. This works fine until we delete a chunk and then reallocate it,
      resulting in the same inode numbers but with a reset generation count.
      This can result in inode/generation pairs of different inodes occurring
      relatively close together.
      
      Given that the inode/gen pair makes up the "unique" portion of an NFS
      filehandle on XFS, this can result in file handles cached on clients being
      seen on the wire from the server but refer to a different file. This
      causes .... issues for NFS clients.
      
      Hence we need a unique generation number initialisation for each inode to
      prevent reuse of a small portion of the generation number space. Use a
      random number to initialise the generation number so we don't need to keep
      any new state on disk whilst making the new number difficult to guess from
      previous allocations.
      
      SGI-PV: 979416
      SGI-Modid: xfs-linux-melb:xfs-kern:31001a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
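      A minimal sketch of the collision this commit describes, and of how a
      random starting generation avoids it. Everything here is
      illustrative: `toy_inode`, `init_inode`, `handle_matches` and the
      xorshift32 generator are assumptions made for the example, and a
      kernel would draw from its own random source instead.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Toy PRNG (xorshift32) with a fixed nonzero seed, for illustration only. */
      static uint32_t prng_state = 0x9e3779b9u;

      static uint32_t toy_random32(void)
      {
          uint32_t x = prng_state;
          x ^= x << 13;
          x ^= x >> 17;
          x ^= x << 5;
          prng_state = x;
          return x;
      }

      struct toy_inode {
          uint64_t ino; /* inode number */
          uint32_t gen; /* generation number */
      };

      /* Initialise one inode of a freshly allocated chunk. */
      static void init_inode(struct toy_inode *ip, uint64_t ino, int random_gen)
      {
          assert(ip != NULL);
          ip->ino = ino;
          ip->gen = random_gen ? toy_random32() : 0;
      }

      /* A cached handle is accepted only if both inode number and generation match. */
      static int handle_matches(const struct toy_inode *ip, uint64_t ino,
                                uint32_t gen)
      {
          return ip->ino == ino && ip->gen == gen;
      }
      ```

      With zero initialisation, a handle cached before the chunk was freed
      still matches the inode that later reuses the same number. With a
      random starting value the stale handle is rejected, and nothing extra
      has to be stored on disk.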
  11. 18 Apr, 2008 1 commit
    • [XFS] Account for inode cluster alignment in all allocations · 75de2a91
      David Chinner committed
      At ENOSPC, we can get a filesystem shutdown due to cancelling a dirty
      transaction in xfs_mkdir or xfs_create. This is due to the initial
      allocation attempt not taking into account inode alignment and hence we
      can prepare the AGF freelist for allocation when it's not actually
      possible to do an allocation. This results in inode allocation returning
      ENOSPC with a dirty transaction, and hence we shut down the filesystem.
      
      Because the first allocation is an exact allocation attempt, we must tell
      the allocator that the alignment does not affect the allocation attempt.
      i.e. we will accept any extent alignment as long as the extent starts at
      the block we want. Unfortunately, this means that if the longest free
      extent is less than the length + alignment necessary for fallback
      allocation attempts but is long enough to attempt a non-aligned
      allocation, we will modify the free list.
      
      If we then have the exact allocation fail, all other allocation attempts
      will also fail due to the alignment constraint being taken into account.
      Hence the initial attempt needs to set the "alignment slop" field so that
      alignment, while not required, must be taken into account when determining
      if there is enough space left in the AG to do the allocation.
      
      That means if the exact allocation fails, we will not dirty the freelist
      if there is not enough space available for a subsequent allocation to
      succeed. Hence we get an ENOSPC error back to userspace without shutting
      down the filesystem.
      
      SGI-PV: 978886
      SGI-Modid: xfs-linux-melb:xfs-kern:30699a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
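      The space check at the heart of this change can be sketched as a
      single comparison. The structure and function below are
      simplifications written for this example (`toy_alloc_args` and
      `space_available` are hypothetical names, and the exact formula is an
      assumption). The point is that the slop term makes the exact attempt
      as demanding as the aligned fallback attempts, so the free list is
      only touched when a later attempt could still succeed.

      ```c
      #include <assert.h>
      #include <stdint.h>

      struct toy_alloc_args {
          uint32_t minlen;       /* minimum extent length wanted */
          uint32_t alignment;    /* alignment required by this attempt, >= 1 */
          uint32_t minalignslop; /* extra room reserved for later aligned attempts */
      };

      /*
       * Return 1 if the longest free extent in the AG is large enough that
       * preparing (and so dirtying) the free list is worthwhile.
       */
      static int space_available(const struct toy_alloc_args *args,
                                 uint32_t longest)
      {
          assert(args->alignment >= 1);
          uint64_t needed = (uint64_t)args->minlen + args->alignment
                          + args->minalignslop - 1;
          return needed <= longest;
      }
      ```

      For a 16-block request with a longest free extent of 20 blocks, an
      exact attempt with alignment 1 and no slop passes the check, while
      the aligned fallback (alignment 8) fails it. Setting the slop to 7 on
      the exact attempt makes it fail up front as well, before anything is
      modified.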
  12. 10 Apr, 2008 1 commit
  13. 29 Feb, 2008 1 commit
  14. 14 Feb, 2008 1 commit
  15. 15 Oct, 2007 1 commit
    • [XFS] dinode endianess annotations · 347d1c01
      Christoph Hellwig committed
      Biggest bit is duplicating the dinode structure so we have one annotated for
      native endianness and one for disk endianness. The other significant change
      is that xfs_xlate_dinode_core is split into one helper per direction to
      allow for proper annotations, everything else is trivial.
      
      As a sidenote, splitting out the incore dinode means we can move it into
      xfs_inode.h in a later patch, severely improving on the include hell in
      xfs.
      
      SGI-PV: 968563
      SGI-Modid: xfs-linux-melb:xfs-kern:29476a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
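      To illustrate the "one structure per byte order, one helper per
      direction" layout, here is a small self-contained sketch. It is not
      the kernel code: the two-field structures and the `toy_` names are
      invented for the example, and raw byte arrays stand in for the
      annotated big-endian types that a static checker would track in the
      kernel.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* On-disk layout: big-endian values held as raw bytes. */
      struct toy_dinode_disk {
          uint8_t magic[2];
          uint8_t gen[4];
      };

      /* In-core layout: native-endian integers. */
      struct toy_dinode_core {
          uint16_t magic;
          uint32_t gen;
      };

      /* Disk to core: the only place big-endian bytes are decoded. */
      static void toy_dinode_from_disk(struct toy_dinode_core *to,
                                       const struct toy_dinode_disk *from)
      {
          assert(to != NULL && from != NULL);
          to->magic = (uint16_t)((from->magic[0] << 8) | from->magic[1]);
          to->gen = ((uint32_t)from->gen[0] << 24) |
                    ((uint32_t)from->gen[1] << 16) |
                    ((uint32_t)from->gen[2] << 8) |
                    (uint32_t)from->gen[3];
      }

      /* Core to disk: the only place native values are encoded. */
      static void toy_dinode_to_disk(struct toy_dinode_disk *to,
                                     const struct toy_dinode_core *from)
      {
          assert(to != NULL && from != NULL);
          to->magic[0] = (uint8_t)(from->magic >> 8);
          to->magic[1] = (uint8_t)(from->magic & 0xff);
          to->gen[0] = (uint8_t)(from->gen >> 24);
          to->gen[1] = (uint8_t)((from->gen >> 16) & 0xff);
          to->gen[2] = (uint8_t)((from->gen >> 8) & 0xff);
          to->gen[3] = (uint8_t)(from->gen & 0xff);
      }
      ```

      Because the two types are distinct, passing a disk structure where a
      core structure is expected is a compile error, which is the same
      class of mistake the annotations are meant to catch.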
  16. 14 Jul, 2007 1 commit
    • [XFS] Lazy Superblock Counters · 92821e2b
      David Chinner committed
      When we have a couple of hundred transactions on the fly at once, they all
      typically modify the on disk superblock in some way.
      create/unlink/mkdir/rmdir modify inode counts, allocation/freeing modify
      free block counts.
      
      When these counts are modified in a transaction, they must eventually lock
      the superblock buffer and apply the mods. The buffer then remains locked
      until the transaction is committed into the incore log buffer. The result
      of this is that with enough transactions on the fly the incore superblock
      buffer becomes a bottleneck.
      
      The result of contention on the incore superblock buffer is that
      transaction rates fall - the more pressure that is put on the superblock
      buffer, the slower things go.
      
      The key to removing the contention is to not require the superblock fields
      in question to be locked. We do that by not marking the superblock dirty
      in the transaction. IOWs, we modify the incore superblock but do not
      modify the cached superblock buffer. In short, we do not log superblock
      modifications to critical fields in the superblock on every transaction.
      In fact we only do it just before we write the superblock to disk every
      sync period or just before unmount.
      
      This creates an interesting problem - if we don't log or write out the
      fields in every transaction, then how do the values get recovered after a
      crash? The answer is simple - we keep enough duplicate, logged information
      in other structures that we can reconstruct the correct count after log
      recovery has been performed.
      
      It is the AGF and AGI structures that contain the duplicate information;
      after recovery, we walk every AGI and AGF and sum their individual
      counters to get the correct value, and we do a transaction into the log to
      correct them. An optimisation of this is that if we have a clean unmount
      record, we know the value in the superblock is correct, so we can avoid
      the summation walk under normal conditions and so mount/recovery times do
      not change under normal operation.
      
      One wrinkle that was discovered during development was that the blocks
      used in the freespace btrees are never accounted for in the AGF counters.
      This was once a valid optimisation to make; when the filesystem is full,
      the free space btrees are empty and consume no space. Hence when it
      matters, the "accounting" is correct. But that means that when we do the
      AGF summations, we would not have a correct count and xfs_check would
      complain. Hence a new counter was added to track the number of blocks used
      by the free space btrees. This is an *on-disk format change*.
      
      As a result of this, lazy superblock counters are a mkfs option and at the
      moment on Linux there is no way to convert an old filesystem. This is
      possible - xfs_db can be used to twiddle the right bits and then
      xfs_repair will do the format conversion for you. Similarly, you can
      convert backwards as well. At some point we'll add functionality to
      xfs_admin to do the bit twiddling easily....
      
      SGI-PV: 964999
      SGI-Modid: xfs-linux-melb:xfs-kern:28652a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
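      The recovery-time summation walk can be sketched as follows. The
      structures and the `rebuild_sb_counters` name are hypothetical, and
      the sketch assumes that blocks held by the free space btrees count
      towards the free block total (which is why the new per-AG counter is
      needed for the sum to come out right). Other per-AG contributions,
      such as the free list, are left out for brevity.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Per-AG counters, as kept (and logged) in the AGI and AGF headers. */
      struct toy_ag {
          uint32_t icount;    /* AGI: allocated inodes */
          uint32_t ifree;     /* AGI: free inodes */
          uint32_t freeblks;  /* AGF: free blocks */
          uint32_t btreeblks; /* AGF: blocks used by the free space btrees */
      };

      /* Global counters kept in the superblock. */
      struct toy_sb {
          uint64_t icount;
          uint64_t ifree;
          uint64_t fdblocks;
      };

      /*
       * Rebuild the superblock counters from the per-AG values. After a
       * clean unmount the superblock is already correct, so the walk is
       * skipped.
       */
      static void rebuild_sb_counters(struct toy_sb *sb,
                                      const struct toy_ag *ags,
                                      size_t agcount, int clean_unmount)
      {
          assert(sb != NULL && ags != NULL);
          if (clean_unmount)
              return;

          uint64_t icount = 0, ifree = 0, fdblocks = 0;
          for (size_t i = 0; i < agcount; i++) {
              icount += ags[i].icount;
              ifree += ags[i].ifree;
              /* Assumption: btree blocks are counted as free space. */
              fdblocks += (uint64_t)ags[i].freeblks + ags[i].btreeblks;
          }
          sb->icount = icount;
          sb->ifree = ifree;
          sb->fdblocks = fdblocks;
      }
      ```

      The cost of the walk grows with the number of AGs, which is why
      skipping it after a clean unmount keeps normal mount times unchanged.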
  17. 10 Feb, 2007 1 commit
    • [XFS] Keep stack usage down for 4k stacks by using noinline. · 7989cb8e
      David Chinner committed
      gcc-4.1 and more recent aggressively inline static functions which
      increases XFS stack usage by ~15% in critical paths. Prevent this from
      occurring by adding noinline to the STATIC definition.
      
      Also uninline some functions that are too large to be inlined and were
      causing problems with CONFIG_FORCED_INLINING=y.
      
      Finally, clean up all the different users of inline, __inline and
      __inline__ and put them under one STATIC_INLINE macro. For debug kernels
      the STATIC_INLINE macro uninlines those functions.
      
      SGI-PV: 957159
      SGI-Modid: xfs-linux-melb:xfs-kern:27585a
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: David Chatterton <chatz@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
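      A rough sketch of the macro arrangement described above, using the
      GCC/Clang `noinline` attribute directly. The `TOY_` macro names, the
      `TOY_DEBUG` switch and both functions are made up for the example.
      The actual XFS definitions may differ in detail.

      ```c
      #include <assert.h>

      /* GCC/Clang-specific attribute that forbids inlining a function. */
      #define TOY_NOINLINE __attribute__((noinline))

      /* Ordinary file-local functions are never inlined into their callers. */
      #define TOY_STATIC static TOY_NOINLINE

      /* Small helpers may be inlined, except in debug builds. */
      #ifdef TOY_DEBUG
      #define TOY_STATIC_INLINE static TOY_NOINLINE
      #else
      #define TOY_STATIC_INLINE static inline
      #endif

      /*
       * Has a sizeable local buffer. If the compiler inlined this, the
       * buffer would become part of the caller's stack frame for the whole
       * lifetime of the caller.
       */
      TOY_STATIC int checksum_block(int seed)
      {
          unsigned char scratch[256];
          int sum = 0;

          assert(seed >= 0 && seed < 256);
          for (int i = 0; i < 256; i++) {
              scratch[i] = (unsigned char)(seed + i);
              sum += scratch[i];
          }
          return sum;
      }

      /* Tiny helper where inlining costs no meaningful stack. */
      TOY_STATIC_INLINE int twice(int x)
      {
          return 2 * x;
      }
      ```

      Keeping the decision in two macros means the policy can be changed
      in one place rather than at every function definition.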
  18. 28 Sep, 2006 2 commits