1. 28 11月, 2014 2 次提交
  2. 25 6月, 2014 1 次提交
    • D
      xfs: global error sign conversion · 2451337d
      Dave Chinner 提交于
      Convert all the errors the core XFs code to negative error signs
      like the rest of the kernel and remove all the sign conversion we
      do in the interface layers.
      
      Errors for conversion (and comparison) found via searches like:
      
      $ git grep " E" fs/xfs
      $ git grep "return E" fs/xfs
      $ git grep " E[A-Z].*;$" fs/xfs
      
      Negation points found via searches like:
      
      $ git grep "= -[a-z,A-Z]" fs/xfs
      $ git grep "return -[a-z,A-D,F-Z]" fs/xfs
      $ git grep " -[a-z].*;" fs/xfs
      
      [ with some bits I missed from Brian Foster ]
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBrian Foster <bfoster@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      2451337d
  3. 13 12月, 2013 2 次提交
  4. 24 10月, 2013 1 次提交
    • D
      xfs: decouple log and transaction headers · 239880ef
      Dave Chinner 提交于
      xfs_trans.h has a dependency on xfs_log.h for a couple of
      structures. Most code that does transactions doesn't need to know
      anything about the log, but this dependency means that they have to
      include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
      files and clean up the includes to be in dependency order.
      
      In doing this, remove the direct include of xfs_trans_reserve.h from
      xfs_trans.h so that we remove the dependency between xfs_trans.h and
      xfs_mount.h. Hence the xfs_trans.h include can be moved to the
      indicate the actual dependencies other header files have on it.
      
      Note that these are kernel only header files, so this does not
      translate to any userspace changes at all.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NBen Myers <bpm@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      239880ef
  5. 14 8月, 2013 1 次提交
  6. 25 5月, 2013 1 次提交
  7. 21 5月, 2013 1 次提交
  8. 06 4月, 2013 1 次提交
    • D
      xfs: don't free EFIs before the EFDs are committed · 666d644c
      Dave Chinner 提交于
      Filesystems are occasionally being shut down with this error:
      
      xfs_trans_ail_delete_bulk: attempting to delete a log item that is
      not in the AIL.
      
      It was diagnosed to be related to the EFI/EFD commit order when the
      EFI and EFD are in different checkpoints and the EFD is committed
      before the EFI here:
      
      http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
      
      The real problem is that a single bit cannot fully describe the
      states that the EFI/EFD processing can be in. These completion
      states are:
      
      EFI			EFI in AIL	EFD		Result
      committed/unpinned	Yes		committed	OK
      committed/pinned	No		committed	Shutdown
      uncommitted		No		committed	Shutdown
      
      
      Note that the "result" field is what should happen, not what does
      happen. The current logic is broken and handles the first two cases
      correctly by luck.  That is, the code will free the EFI if the
      XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
      inverted logic "works" because if both EFI and EFD are committed,
      then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
      bit, and the second frees the EFI item. Hence as long as
      xfs_efi_item_committed() has been called, everything appears to be
      fine.
      
      It is the third case where the logic fails - where
      xfs_efd_item_committed() is called before xfs_efi_item_committed(),
      and that results in the EFI being freed before it has been
      committed. That is the bug that triggered the shutdown, and hence
      keeping track of whether the EFI has been committed or not is
      insufficient to correctly order the EFI/EFD operations w.r.t. the
      AIL.
      
      What we really want is this: the EFI is always placed into the
      AIL before the last reference goes away. The only way to guarantee
      that is that the EFI is not freed until after it has been unpinned
      *and* the EFD has been committed. That is, restructure the logic so
      that the only case that can occur is the first case.
      
      This can be done easily by replacing the XFS_EFI_COMMITTED with an
      EFI reference count. The EFI is initialised with it's own count, and
      that is not released until it is unpinned. However, there is a
      complication to this method - the high level EFI/EFD code in
      xfs_bmap_finish() does not hold direct references to the EFI
      structure, and runs a transaction commit between the EFI and EFD
      processing. Hence the EFI can be freed even before the EFD is
      created using such a method.
      
      Further, log recovery uses the AIL for tracking EFI/EFDs that need
      to be recovered, but it uses the AIL *differently* to the EFI
      transaction commit. Hence log recovery never pins or unpins EFIs, so
      we can't drop the EFI reference count indirectly to free the EFI.
      
      However, this doesn't prevent us from using a reference count here.
      There is a 1:1 relationship between EFIs and EFDs, so when we
      initialise the EFI we can take a reference count for the EFD as
      well. This solves the xfs_bmap_finish() issue - the EFI will never
      be freed until the EFD is processed. In terms of log recovery,
      during the committing of the EFD we can look for the
      XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
      thereby ensuring everything works correctly there as well.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      666d644c
  9. 15 5月, 2012 3 次提交
    • D
      xfs: move xfsagino_t to xfs_types.h · 60a34607
      Dave Chinner 提交于
      Untangle the header file includes a bit by moving the definition of
      xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
      xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
      xfs_ag.h.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      60a34607
    • D
      xfs: pass shutdown method into xfs_trans_ail_delete_bulk · 04913fdd
      Dave Chinner 提交于
      xfs_trans_ail_delete_bulk() can be called from different contexts so
      if the item is not in the AIL we need different shutdown for each
      context.  Pass in the shutdown method needed so the correct action
      can be taken.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      04913fdd
    • C
      xfs: on-stack delayed write buffer lists · 43ff2122
      Christoph Hellwig 提交于
      Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
      and write back the buffers per-process instead of by waking up xfsbufd.
      
      This is now easily doable given that we have very few places left that write
      delwri buffers:
      
       - log recovery:
      	Only done at mount time, and already forcing out the buffers
      	synchronously using xfs_flush_buftarg
      
       - quotacheck:
      	Same story.
      
       - dquot reclaim:
      	Writes out dirty dquots on the LRU under memory pressure.  We might
      	want to look into doing more of this via xfsaild, but it's already
      	more optimal than the synchronous inode reclaim that writes each
      	buffer synchronously.
      
       - xfsaild:
      	This is the main beneficiary of the change.  By keeping a local list
      	of buffers to write we reduce latency of writing out buffers, and
      	more importably we can remove all the delwri list promotions which
      	were hitting the buffer cache hard under sustained metadata loads.
      
      The implementation is very straight forward - xfs_buf_delwri_queue now gets
      a new list_head pointer that it adds the delwri buffers to, and all callers
      need to eventually submit the list using xfs_buf_delwi_submit or
      xfs_buf_delwi_submit_nowait.  Buffers that already are on a delwri list are
      skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
      list.  The biggest change to pass down the buffer list was done to the AIL
      pushing. Now that we operate on buffers the trylock, push and pushbuf log
      item methods are merged into a single push routine, which tries to lock the
      item, and if possible add the buffer that needs writeback to the buffer list.
      This leads to much simpler code than the previous split but requires the
      individual IOP_PUSH instances to unlock and reacquire the AIL around calls
      to blocking routines.
      
      Given that xfsailds now also handle writing out buffers, the conditions for
      log forcing and the sleep times needed some small changes.  The most
      important one is that we consider an AIL busy as long we still have buffers
      to push, and the other one is that we do increment the pushed LSN for
      buffers that are under flushing at this moment, but still count them towards
      the stuck items for restart purposes.  Without this we could hammer on stuck
      items without ever forcing the log and not make progress under heavy random
      delete workloads on fast flash storage devices.
      
      [ Dave Chinner:
      	- rebase on previous patches.
      	- improved comments for XBF_DELWRI_Q handling
      	- fix XBF_ASYNC handling in queue submission (test 106 failure)
      	- rename delwri submit function buffer list parameters for clarity
      	- xfs_efd_item_push() should return XFS_ITEM_PINNED ]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      43ff2122
  10. 09 11月, 2011 1 次提交
  11. 28 1月, 2011 1 次提交
    • D
      xfs: fix efi item leak on forced shutdown · e34a314c
      Dave Chinner 提交于
      After test 139, kmemleak shows:
      
      unreferenced object 0xffff880078b405d8 (size 400):
        comm "xfs_io", pid 4904, jiffies 4294909383 (age 1186.728s)
        hex dump (first 32 bytes):
          60 c1 17 79 00 88 ff ff 60 c1 17 79 00 88 ff ff  `..y....`..y....
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
          [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
          [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
          [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
          [<ffffffff8147cd6b>] xfs_efi_init+0x4b/0xb0
          [<ffffffff814a4ee8>] xfs_trans_get_efi+0x58/0x90
          [<ffffffff81455fab>] xfs_bmap_finish+0x8b/0x1d0
          [<ffffffff814851b4>] xfs_itruncate_finish+0x2c4/0x5d0
          [<ffffffff814a970f>] xfs_setattr+0x8df/0xa70
          [<ffffffff814b5c7b>] xfs_vn_setattr+0x1b/0x20
          [<ffffffff8117dc00>] notify_change+0x170/0x2e0
          [<ffffffff81163bf6>] do_truncate+0x66/0xa0
          [<ffffffff81163d0b>] sys_ftruncate+0xdb/0xe0
          [<ffffffff8103a002>] system_call_fastpath+0x16/0x1b
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      The cause of the leak is that the "remove" parameter of IOP_UNPIN()
      is never set when a CIL push is aborted. This means that the EFI
      item is never freed if it was in the push being cancelled. The
      problem is specific to delayed logging, but has uncovered a couple
      of problems with the handling of IOP_UNPIN(remove).
      
      Firstly, we cannot safely call xfs_trans_del_item() from IOP_UNPIN()
      in the CIL commit failure path or the iclog write failure path
      because for delayed loging we have no transaction context. Hence we
      must only call xfs_trans_del_item() if the log item being unpinned
      has an active log item descriptor.
      
      Secondly, xfs_trans_uncommit() does not handle log item descriptor
      freeing during the traversal of log items on a transaction. It can
      reference a freed log item descriptor when unpinning an EFI item.
      Hence it needs to use a safe list traversal method to allow items to
      be removed from the transaction during IOP_UNPIN().
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NAlex Elder <aelder@sgi.com>
      e34a314c
  12. 20 12月, 2010 2 次提交
    • D
      xfs: Pull EFI/EFD handling out from under the AIL lock · b199c8a4
      Dave Chinner 提交于
      EFI/EFD interactions are protected from races by the AIL lock. They
      are the only type of log items that require the the AIL lock to
      serialise internal state, so they need to be separated from the AIL
      lock before we can do bulk insert operations on the AIL.
      
      To acheive this, convert the counter of the number of extents in the
      EFI to an atomic so it can be safely manipulated by EFD processing
      without locks. Also, convert the EFI state flag manipulations to use
      atomic bit operations so no locks are needed to record state
      changes. Finally, use the state bits to determine when it is safe to
      free the EFI and clean up the code to do this neatly.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      b199c8a4
    • D
      xfs: fix EFI transaction cancellation. · 9c5f8414
      Dave Chinner 提交于
      XFS_EFI_CANCELED has not been set in the code base since
      xfs_efi_cancel() was removed back in 2006 by commit
      065d312e ("[XFS] Remove unused
      iop_abort log item operation), and even then xfs_efi_cancel() was
      never called. I haven't tracked it back further than that (beyond
      git history), but it indicates that the handling of EFIs in
      cancelled transactions has been broken for a long time.
      
      Basically, when we get an IOP_UNPIN(lip, 1); call from
      xfs_trans_uncommit() (i.e. remove == 1), if we don't free the log
      item descriptor we leak it. Fix the behviour to be correct and kill
      the XFS_EFI_CANCELED flag.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      9c5f8414
  13. 27 7月, 2010 5 次提交
    • C
      xfs: fix the xfs_log_iovec i_addr type · 4e0d5f92
      Christoph Hellwig 提交于
      By making this member a void pointer we can get rid of a lot of pointless
      casts.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      4e0d5f92
    • C
      xfs: give xfs_item_ops methods the correct prototypes · 7bfa31d8
      Christoph Hellwig 提交于
      Stop the function pointer casting madness and give all the xfs_item_ops the
      correct prototypes.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      7bfa31d8
    • C
      xfs: merge iop_unpin_remove into iop_unpin · 9412e318
      Christoph Hellwig 提交于
      The unpin_remove item operation instances always share most of the
      implementation with the respective unpin implementation.  So instead
      of keeping two different entry points add a remove flag to the unpin
      operation and share the code more easily.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      9412e318
    • C
      xfs: simplify log item descriptor tracking · e98c414f
      Christoph Hellwig 提交于
      Currently we track log item descriptor belonging to a transaction using a
      complex opencoded chunk allocator.  This code has been there since day one
      and seems to work around the lack of an efficient slab allocator.
      
      This patch replaces it with dynamically allocated log item descriptors
      from a dedicated slab pool, linked to the transaction by a linked list.
      
      This allows to greatly simplify the log item descriptor tracking to the
      point where it's just a couple hundred lines in xfs_trans.c instead of
      a separate file.  The external API has also been simplified while we're
      at it - the xfs_trans_add_item and xfs_trans_del_item functions to add/
      delete items from a transaction have been simplified to the bare minium,
      and the xfs_trans_find_item function is replaced with a direct dereference
      of the li_desc field.  All debug code walking the list of log items in
      a transaction is down to a simple list_for_each_entry.
      
      Note that we could easily use a singly linked list here instead of the
      double linked list from list.h as the fastpath only does deletion from
      sequential traversal.  But given that we don't have one available as
      a library function yet I use the list.h functions for simplicity.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      e98c414f
    • C
      xfs: drop dmapi hooks · 288699fe
      Christoph Hellwig 提交于
      Dmapi support was never merged upstream, but we still have a lot of hooks
      bloating XFS for it, all over the fast pathes of the filesystem.
      
      This patch drops over 700 lines of dmapi overhead.  If we'll ever get HSM
      support in mainline at least the namespace events can be done much saner
      in the VFS instead of the individual filesystem, so it's not like this
      is much help for future work.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      288699fe
  14. 19 5月, 2010 2 次提交
  15. 22 1月, 2010 1 次提交
  16. 30 10月, 2008 3 次提交
  17. 28 7月, 2008 1 次提交
  18. 07 2月, 2008 1 次提交
  19. 15 10月, 2007 1 次提交
    • D
      [XFS] Radix tree based inode caching · da353b0d
      David Chinner 提交于
      One of the perpetual scaling problems XFS has is indexing it's incore
      inodes. We currently uses hashes and the default hash sizes chosen can
      only ever be a tradeoff between memory consumption and the maximum
      realistic size of the cache.
      
      As a result, anyone who has millions of inodes cached on a filesystem
      needs to tunes the size of the cache via the ihashsize mount option to
      allow decent scalability with inode cache operations.
      
      A further problem is the separate inode cluster hash, whose size is based
      on the ihashsize but is smaller, and so under certain conditions (sparse
      cluster cache population) this can become a limitation long before the
      inode hash is causing issues.
      
      The following patchset removes the inode hash and cluster hash and
      replaces them with radix trees to avoid the scalability limitations of the
      hashes. It also reduces the size of the inodes by 3 pointers....
      
      SGI-PV: 969561
      SGI-Modid: xfs-linux-melb:xfs-kern:29481a
      Signed-off-by: NDavid Chinner <dgc@sgi.com>
      Signed-off-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NTim Shimmin <tes@sgi.com>
      da353b0d
  20. 10 2月, 2007 1 次提交
    • D
      [XFS] Keep stack usage down for 4k stacks by using noinline. · 7989cb8e
      David Chinner 提交于
      gcc-4.1 and more recent aggressively inline static functions which
      increases XFS stack usage by ~15% in critical paths. Prevent this from
      occurring by adding noinline to the STATIC definition.
      
      Also uninline some functions that are too large to be inlined and were
      causing problems with CONFIG_FORCED_INLINING=y.
      
      Finally, clean up all the different users of inline, __inline and
      __inline__ and put them under one STATIC_INLINE macro. For debug kernels
      the STATIC_INLINE macro uninlines those functions.
      
      SGI-PV: 957159
      SGI-Modid: xfs-linux-melb:xfs-kern:27585a
      Signed-off-by: NDavid Chinner <dgc@sgi.com>
      Signed-off-by: NDavid Chatterton <chatz@sgi.com>
      Signed-off-by: NTim Shimmin <tes@sgi.com>
      7989cb8e
  21. 28 9月, 2006 1 次提交
  22. 20 6月, 2006 1 次提交
  23. 09 6月, 2006 1 次提交
  24. 02 11月, 2005 2 次提交
  25. 02 9月, 2005 1 次提交
  26. 21 6月, 2005 2 次提交