1. 18 12月, 2012 2 次提交
  2. 16 11月, 2012 2 次提交
    • D
      xfs: convert buffer verifiers to an ops structure. · 1813dd64
      Dave Chinner 提交于
      To separate the verifiers from iodone functions and associate read
      and write verifiers at the same time, introduce a buffer verifier
      operations structure to the xfs_buf.
      
      This avoids the need for assigning the write verifier, clearing the
      iodone function and re-running ioend processing in the read
      verifier, and gets rid of the nasty "b_pre_io" name for the write
      verifier function pointer. If we ever need to, it will also be
      easier to add further content specific callbacks to a buffer with an
      ops structure in place.
      
      We also avoid needing to export verifier functions, instead we
      can simply export the ops structures for those that are needed
      outside the function they are defined in.
      
      This patch also fixes a directory block readahead verifier issue
      it exposed.
      
      This patch also adds ops callbacks to the inode/alloc btree blocks
      initialised by growfs. These will need more work before they will
      work with CRCs.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NPhil White <pwhite@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      1813dd64
    • D
      xfs: make buffer read verication an IO completion function · c3f8fc73
      Dave Chinner 提交于
      Add a verifier function callback capability to the buffer read
      interfaces.  This will be used by the callers to supply a function
      that verifies the contents of the buffer when it is read from disk.
      This patch does not provide callback functions, but simply modifies
      the interfaces to allow them to be called.
      
      The reason for adding this to the read interfaces is that it is very
      difficult to tell fom the outside is a buffer was just read from
      disk or whether we just pulled it out of cache. Supplying a callbck
      allows the buffer cache to use it's internal knowledge of the buffer
      to execute it only when the buffer is read from disk.
      
      It is intended that the verifier functions will mark the buffer with
      an EFSCORRUPTED error when verification fails. This allows the
      reading context to distinguish a verification error from an IO
      error, and potentially take further actions on the buffer (e.g.
      attempt repair) based on the error reported.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NPhil White <pwhite@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      c3f8fc73
  3. 02 7月, 2012 1 次提交
  4. 15 5月, 2012 10 次提交
    • D
      xfs: make XBF_MAPPED the default behaviour · 611c9946
      Dave Chinner 提交于
      Rather than specifying XBF_MAPPED for almost all buffers, introduce
      XBF_UNMAPPED for the couple of users that use unmapped buffers.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      611c9946
    • D
      xfs: clean up xfs_bit.h includes · ad1e95c5
      Dave Chinner 提交于
      With the removal of xfs_rw.h and other changes over time, xfs_bit.h
      is being included in many files that don't actually need it. Clean
      up the includes as necessary.
      
      Also move the only-used-once xfs_ialloc_find_free() static inline
      function out of a header file that is widely included to reduce
      the number of needless dependencies on xfs_bit.h.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      ad1e95c5
    • D
      xfs: move xfs_get_extsz_hint() and kill xfs_rw.h · 2a0ec1d9
      Dave Chinner 提交于
      The only thing left in xfs_rw.h is a function prototype for an inode
      function.  Move that to xfs_inode.h, and kill xfs_rw.h.
      
      Also move the function implementing the prototype from xfs_rw.c to
      xfs_inode.c so we only have one function left in xfs_rw.c
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      2a0ec1d9
    • D
      xfs: move xfsagino_t to xfs_types.h · 60a34607
      Dave Chinner 提交于
      Untangle the header file includes a bit by moving the definition of
      xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
      xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
      xfs_ag.h.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      60a34607
    • D
      xfs: kill XBF_DONTBLOCK · aa5c158e
      Dave Chinner 提交于
      Just about all callers of xfs_buf_read() and xfs_buf_get() use XBF_DONTBLOCK.
      This is used to make memory allocation use GFP_NOFS rather than GFP_KERNEL to
      avoid recursion through memory reclaim back into the filesystem.
      
      All the blocking get calls in growfs occur inside a transaction, even though
      they are no part of the transaction, so all allocation will be GFP_NOFS due to
      the task flag PF_TRANS being set. The blocking read calls occur during log
      recovery, so they will probably be unaffected by converting to GFP_NOFS
      allocations.
      
      Hence make XBF_DONTBLOCK behaviour always occur for buffers and kill the flag.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      aa5c158e
    • D
      xfs: kill xfs_read_buf() · 7ca790a5
      Dave Chinner 提交于
      xfs_read_buf() is effectively the same as xfs_trans_read_buf() when called
      outside a transaction context. The error handling is slightly different in that
      xfs_read_buf stales the errored buffer it gets back, but there is probably good
      reason for xfs_trans_read_buf() for doing this.
      
      Hence update xfs_trans_read_buf() to the same error handling as xfs_read_buf(),
      and convert all the callers of xfs_read_buf() to use the former function. We can
      then remove xfs_read_buf().
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      7ca790a5
    • D
      xfs: kill XBF_LOCK · a8acad70
      Dave Chinner 提交于
      Buffers are always returned locked from the lookup routines. Hence
      we don't need to tell the lookup routines to return locked buffers,
      on to try and lock them. Remove XBF_LOCK from all the callers and
      from internal buffer cache usage.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      a8acad70
    • D
      xfs: use blocks for storing the desired IO size · aa0e8833
      Dave Chinner 提交于
      Now that we pass block counts everywhere, and index buffers by block
      number and length in units of blocks, convert the desired IO size
      into block counts rather than bytes. Convert the code to use block
      counts, and those that need byte counts get converted at the time of
      use.
      
      Rename the b_desired_count variable to something closer to it's
      purpose - b_io_length - as it is only used to specify the length of
      an IO for a subset of the buffer.  The only time this is used is for
      log IO - both writing iclogs and during log recovery. In all other
      cases, the b_io_length matches b_length, and hence a lot of code
      confuses the two. e.g. the buf item code uses the io count
      exclusively when it should be using the buffer length. Fix these
      apprpriately as they are found.
      
      Also, remove the XFS_BUF_{SET_}COUNT() macros that are just wrappers
      around the desired IO length. They only serve to make the code
      shouty loud, don't actually add any real value, and are often used
      incorrectly.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      aa0e8833
    • C
      xfs: on-stack delayed write buffer lists · 43ff2122
      Christoph Hellwig 提交于
      Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
      and write back the buffers per-process instead of by waking up xfsbufd.
      
      This is now easily doable given that we have very few places left that write
      delwri buffers:
      
       - log recovery:
      	Only done at mount time, and already forcing out the buffers
      	synchronously using xfs_flush_buftarg
      
       - quotacheck:
      	Same story.
      
       - dquot reclaim:
      	Writes out dirty dquots on the LRU under memory pressure.  We might
      	want to look into doing more of this via xfsaild, but it's already
      	more optimal than the synchronous inode reclaim that writes each
      	buffer synchronously.
      
       - xfsaild:
      	This is the main beneficiary of the change.  By keeping a local list
      	of buffers to write we reduce latency of writing out buffers, and
      	more importably we can remove all the delwri list promotions which
      	were hitting the buffer cache hard under sustained metadata loads.
      
      The implementation is very straight forward - xfs_buf_delwri_queue now gets
      a new list_head pointer that it adds the delwri buffers to, and all callers
      need to eventually submit the list using xfs_buf_delwi_submit or
      xfs_buf_delwi_submit_nowait.  Buffers that already are on a delwri list are
      skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
      list.  The biggest change to pass down the buffer list was done to the AIL
      pushing. Now that we operate on buffers the trylock, push and pushbuf log
      item methods are merged into a single push routine, which tries to lock the
      item, and if possible add the buffer that needs writeback to the buffer list.
      This leads to much simpler code than the previous split but requires the
      individual IOP_PUSH instances to unlock and reacquire the AIL around calls
      to blocking routines.
      
      Given that xfsailds now also handle writing out buffers, the conditions for
      log forcing and the sleep times needed some small changes.  The most
      important one is that we consider an AIL busy as long we still have buffers
      to push, and the other one is that we do increment the pushed LSN for
      buffers that are under flushing at this moment, but still count them towards
      the stuck items for restart purposes.  Without this we could hammer on stuck
      items without ever forcing the log and not make progress under heavy random
      delete workloads on fast flash storage devices.
      
      [ Dave Chinner:
      	- rebase on previous patches.
      	- improved comments for XBF_DELWRI_Q handling
      	- fix XBF_ASYNC handling in queue submission (test 106 failure)
      	- rename delwri submit function buffer list parameters for clarity
      	- xfs_efd_item_push() should return XFS_ITEM_PINNED ]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      43ff2122
    • C
      xfs: do not add buffers to the delwri queue until pushed · 960c60af
      Christoph Hellwig 提交于
      Instead of adding buffers to the delwri list as soon as they are logged,
      even if they can't be written until commited because they are pinned
      defer adding them to the delwri list until xfsaild pushes them.  This
      makes the code more similar to other log items and prepares for writing
      buffers directly from xfsaild.
      
      The complication here is that we need to fail buffers that were added
      but not logged yet in xfs_buf_item_unpin, borrowing code from
      xfs_bioerror.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NMark Tinguely <tinguely@sgi.com>
      Signed-off-by: NBen Myers <bpm@sgi.com>
      960c60af
  5. 23 2月, 2012 1 次提交
  6. 12 10月, 2011 4 次提交
  7. 26 7月, 2011 4 次提交
  8. 13 7月, 2011 3 次提交
  9. 08 7月, 2011 1 次提交
  10. 26 3月, 2011 1 次提交
    • D
      xfs: xfs_trans_read_buf() should return an error on failure · 7401aafd
      Dave Chinner 提交于
      When inside a transaction and we fail to read a buffer,
      xfs_trans_read_buf returns a null buffer pointer and no error.
      xfs_do_da_buf() checks the error return, but not the buffer, and as
      a result this read failure condition causes a panic when it attempts
      to dereference the non-existant buffer.
      
      Make xfs_trans_read_buf() return the same error for this situation
      regardless of whether it is in a transaction or not. This means
      every caller does not need to check both the error return and the
      buffer before proceeding to use the buffer.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NAlex Elder <aelder@sgi.com>
      7401aafd
  11. 07 3月, 2011 1 次提交
  12. 19 10月, 2010 1 次提交
  13. 27 7月, 2010 4 次提交
    • C
      xfs: give li_cb callbacks the correct prototype · ca30b2a7
      Christoph Hellwig 提交于
      Stop the function pointer casting madness and give all the li_cb instances
      correct prototype.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      ca30b2a7
    • C
      xfs: simplify log item descriptor tracking · e98c414f
      Christoph Hellwig 提交于
      Currently we track log item descriptor belonging to a transaction using a
      complex opencoded chunk allocator.  This code has been there since day one
      and seems to work around the lack of an efficient slab allocator.
      
      This patch replaces it with dynamically allocated log item descriptors
      from a dedicated slab pool, linked to the transaction by a linked list.
      
      This allows to greatly simplify the log item descriptor tracking to the
      point where it's just a couple hundred lines in xfs_trans.c instead of
      a separate file.  The external API has also been simplified while we're
      at it - the xfs_trans_add_item and xfs_trans_del_item functions to add/
      delete items from a transaction have been simplified to the bare minium,
      and the xfs_trans_find_item function is replaced with a direct dereference
      of the li_desc field.  All debug code walking the list of log items in
      a transaction is down to a simple list_for_each_entry.
      
      Note that we could easily use a singly linked list here instead of the
      double linked list from list.h as the fastpath only does deletion from
      sequential traversal.  But given that we don't have one available as
      a library function yet I use the list.h functions for simplicity.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      e98c414f
    • C
      xfs: remove unneeded #include statements · 3400777f
      Christoph Hellwig 提交于
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <david@fromorbit.com>
      3400777f
    • C
      xfs: drop dmapi hooks · 288699fe
      Christoph Hellwig 提交于
      Dmapi support was never merged upstream, but we still have a lot of hooks
      bloating XFS for it, all over the fast pathes of the filesystem.
      
      This patch drops over 700 lines of dmapi overhead.  If we'll ever get HSM
      support in mainline at least the namespace events can be done much saner
      in the VFS instead of the individual filesystem, so it's not like this
      is much help for future work.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      288699fe
  14. 24 5月, 2010 2 次提交
    • D
      xfs: Ensure inode allocation buffers are fully replayed · ccf7c23f
      Dave Chinner 提交于
      With delayed logging, we can get inode allocation buffers in the
      same transaction inode unlink buffers. We don't currently mark inode
      allocation buffers in the log, so inode unlink buffers take
      precedence over allocation buffers.
      
      The result is that when they are combined into the same checkpoint,
      only the unlinked inode chain fields are replayed, resulting in
      uninitialised inode buffers being detected when the next inode
      modification is replayed.
      
      To fix this, we need to ensure that we do not set the inode buffer
      flag in the buffer log item format flags if the inode allocation has
      not already hit the log. To avoid requiring a change to log
      recovery, we really need to make this a modification that relies
      only on in-memory sate.
      
      We can do this by checking during buffer log formatting (while the
      CIL cannot be flushed) if we are still in the same sequence when we
      commit the unlink transaction as the inode allocation transaction.
      If we are, then we do not add the inode buffer flag to the buffer
      log format item flags. This means the entire buffer will be
      replayed, not just the unlinked fields. We do this while
      CIL flusheѕ are locked out to ensure that we don't race with the
      sequence numbers changing and hence fail to put the inode buffer
      flag in the buffer format flags when we really need to.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      ccf7c23f
    • D
      xfs: Clean up XFS_BLI_* flag namespace · c1155410
      Dave Chinner 提交于
      Clean up the buffer log format (XFS_BLI_*) flags because they have a
      polluted namespace. They XFS_BLI_ prefix is used for both in-memory
      and on-disk flag feilds, but have overlapping values for different
      flags. Rename the buffer log format flags to use the XFS_BLF_*
      prefix to avoid confusing them with the in-memory XFS_BLI_* prefixed
      flags.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      c1155410
  15. 19 5月, 2010 2 次提交
    • C
      xfs: simplify buffer to transaction matching · 4a5224d7
      Christoph Hellwig 提交于
      We currenly have a routine xfs_trans_buf_item_match_all which checks
      if any log item in a transaction contains a given buffer, and a
      second one that only does this check for the first, embedded chunk
      of log items.  We only use the second routine if we know we only
      have that log item chunk, so get rid of the limited routine and
      always use the more complete one.
      
      Also rename the old xfs_trans_buf_item_match_all to
      xfs_trans_buf_item_match and update various surrounding comments,
      and move the remaining xfs_trans_buf_item_match on top of the file
      to avoid a forward prototype.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAlex Elder <aelder@sgi.com>
      4a5224d7
    • D
      xfs: remove stale parameter from ->iop_unpin method · 8e123850
      Dave Chinner 提交于
      The staleness of a object being unpinned can be directly derived
      from the object itself - there is no need to extract it from the
      object then pass it as a parameter into IOP_UNPIN().
      
      This means we can kill the XFS_LID_BUF_STALE flag - it is set,
      checked and cleared in the same places XFS_BLI_STALE flag in the
      xfs_buf_log_item so it is now redundant and hence safe to remove.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      8e123850
  16. 02 3月, 2010 1 次提交