1. 22 Jun, 2015 3 commits
  2. 04 Jun, 2015 1 commit
  3. 22 Jan, 2015 1 commit
    • xfs: consolidate superblock logging functions · 61e63ecb
      Dave Chinner committed
      We now have several superblock logging functions that are identical
      except for the transaction reservation and whether it should be a
      synchronous transaction or not. Consolidate these all into a single
      function, a single reservation and a sync flag and call it
      xfs_sync_sb().
      
      Also, xfs_mod_sb() is not really a modification function - it's the
      operation of logging the superblock buffer. Hence change the name of
      it to reflect this.
      
      Note that we have to change the mp->m_update_flags that are passed
      around at mount time to a boolean simply to indicate a superblock
      update is needed.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
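The consolidation described in this commit can be pictured with a small userspace sketch. The types and names here (`struct mount`, `struct trans`, `log_sb`, `sync_sb`) are hypothetical stand-ins, not the real XFS structures or signatures; the point is only the shape: one function, one reservation, and a flag that selects a synchronous commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real xfs_mount/xfs_trans are far richer. */
struct trans {
	bool sync;		/* commit must be stable before returning */
	bool sb_logged;		/* superblock buffer was logged in this trans */
};

struct mount {
	int sb_commits;		/* number of superblock commits */
	int sb_sync_commits;	/* how many of those were synchronous */
};

/* Log the superblock buffer into the transaction. */
static void log_sb(struct trans *tp)
{
	tp->sb_logged = true;
}

/* Single entry point: same reservation for all callers, sync by flag. */
static int sync_sb(struct mount *mp, bool wait)
{
	struct trans tp = { .sync = wait, .sb_logged = false };

	log_sb(&tp);
	mp->sb_commits++;
	if (tp.sync)
		mp->sb_sync_commits++;
	return 0;		/* 0 on success, negative errno on failure */
}
```

Callers that previously picked between near-identical helpers would, under this assumption, differ only in the boolean they pass.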
  4. 24 Dec, 2014 2 commits
  5. 04 Dec, 2014 1 commit
    • xfs: split metadata and log buffer completion to separate workqueues · b29c70f5
      Brian Foster committed
      XFS traditionally sends all buffer I/O completion work to a single
      workqueue. This includes metadata buffer completion and log buffer
      completion. The log buffer completion requires a high priority queue to
      prevent stalls due to log forces getting stuck behind other queued work.
      
      Rather than continue to prioritize all buffer I/O completion due to the
      needs of log completion, split log buffer completion off to
      m_log_workqueue and move the high priority flag from m_buf_workqueue to
      m_log_workqueue.
      
      Add a b_ioend_wq wq pointer to xfs_buf to allow completion workqueue
      customization on a per-buffer basis. Initialize b_ioend_wq to
      m_buf_workqueue by default in the generic buffer I/O submission path.
      Finally, override the default wq with the high priority m_log_workqueue
      in the log buffer I/O submission path.
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
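A minimal sketch of the per-buffer completion queue idea, using hypothetical simplified structures (`struct workqueue`, `struct mount`, `struct buf`) rather than the kernel's. The ordering shown, where the log path sets its override and the generic path only fills in a default when none is set, is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct workqueue {
	const char *name;
	bool high_priority;
};

struct mount {
	struct workqueue buf_wq;	/* metadata buffer completion */
	struct workqueue log_wq;	/* log buffer completion, high priority */
};

struct buf {
	struct mount *mp;
	struct workqueue *ioend_wq;	/* per-buffer completion queue */
};

/* Generic submission path: fall back to the metadata completion queue. */
static void buf_submit(struct buf *bp)
{
	if (bp->ioend_wq == NULL)
		bp->ioend_wq = &bp->mp->buf_wq;
}

/* Log submission path: choose the high priority queue, then submit. */
static void log_buf_submit(struct buf *bp)
{
	bp->ioend_wq = &bp->mp->log_wq;
	buf_submit(bp);
}
```

With this split, only log buffers pay for (and benefit from) the high priority queue.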
  6. 28 Nov, 2014 3 commits
  7. 02 Oct, 2014 3 commits
    • xfs: introduce xfs_buf_submit[_wait] · 595bff75
      Dave Chinner committed
      There is a lot of cookie-cutter code that looks like:
      
      	if (shutdown)
      		handle buffer error
      	xfs_buf_iorequest(bp)
      	error = xfs_buf_iowait(bp)
      	if (error)
      		handle buffer error
      
      spread through XFS. There's significant complexity now in
      xfs_buf_iorequest() to specifically handle this sort of synchronous
      IO pattern, but there's all sorts of nasty surprises in different
      error handling code dependent on who owns the buffer references and
      the locks.
      
      Pull this pattern into a single helper, where we can hide all the
      synchronous IO warts and hence make the error handling for all the
      callers much saner. This removes the need for a special extra
      reference to protect IO completion processing, as we can now hold a
      single reference across dispatch and waiting, simplifying the sync
      IO semantics and error handling.
      
      In doing this, also rename xfs_buf_iorequest to xfs_buf_submit and
      make it explicitly handle only asynchronous IO. This forces all users
      to be switched specifically to one interface or the other and
      removes any ambiguity between how the interfaces are to be used. It
      also means that xfs_buf_iowait() goes away.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
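The cookie-cutter pattern quoted in this commit message can be folded into one helper roughly as follows. This is a userspace sketch with a hypothetical `struct buf` and a simulated IO (`dispatch_and_wait`); it is not the real xfs_buf_submit_wait(), only an illustration of holding a single reference across dispatch and wait.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer; not the real struct xfs_buf. */
struct buf {
	int refcount;		/* references held on the buffer */
	int io_error;		/* result the simulated IO will report */
	int error;		/* error recorded on the buffer */
	bool fs_shutdown;	/* filesystem already shut down? */
};

/* Simulated dispatch plus completion wait; a real version would block. */
static int dispatch_and_wait(struct buf *bp)
{
	return bp->io_error;
}

/*
 * One helper for synchronous IO: shutdown check, dispatch, wait and
 * error reporting, all under a single reference taken by the helper.
 */
static int buf_submit_wait(struct buf *bp)
{
	int error;

	if (bp->fs_shutdown) {
		bp->error = -EIO;
		return -EIO;
	}

	bp->refcount++;			/* held across dispatch and wait */
	error = dispatch_and_wait(bp);
	bp->error = error;
	bp->refcount--;
	return error;
}
```

Callers then check one return value instead of repeating the shutdown test and the wait at every call site.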
    • xfs: xfs_buf_ioend and xfs_buf_iodone_work duplicate functionality · e8aaba9a
      Dave Chinner committed
      We do some work in xfs_buf_ioend, and some work in
      xfs_buf_iodone_work, but much of that functionality is the same.
      This work can all be done in a single function, leaving
      xfs_buf_iodone just a wrapper to determine if we should execute it
      by workqueue or directly. Hence rename xfs_buf_iodone_work to
      xfs_buf_ioend(), and add a new xfs_buf_ioend_async() for places that
      need async processing.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • xfs: force the log before shutting down · a870fe6d
      Dave Chinner committed
      When we have marked the filesystem for shutdown, we want to prevent
      any further buffer IO from being submitted. However, we currently
      force the log after marking the filesystem as shut down, hence
      allowing IO to the log *after* we have marked both the filesystem
      and the log as in an error state.
      
      Clean this up by forcing the log before we mark the filesystem with
      an error. This replaces the pure CIL flush that we currently have
      which works around this same issue (i.e. the CIL can't be flushed
      once the shutdown flags are set) and hence enables us to clean up
      the logic substantially.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  8. 04 Aug, 2014 1 commit
    • xfs: catch buffers written without verifiers attached · 400b9d88
      Dave Chinner committed
      We recently had a bug where buffers were slipping through log
      recovery without any verifier attached to them. This was resulting
      in on-disk CRC mismatches for valid data. Add some warning code to
      catch this occurrence so that we catch such bugs during development
      rather than not being aware they exist.
      
      Note that we cannot do this verification unconditionally as non-CRC
      filesystems don't always attach verifiers to the buffers being
      written. e.g. during log recovery we cannot identify all the
      different types of buffers correctly on non-CRC filesystems, so we
      can't attach the correct verifiers in all cases and so we don't
      attach any. Hence we don't want to warn on non-CRC filesystems, to
      avoid spamming the logs with false indications.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
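The conditional warning described here reduces to a small predicate. The sketch below uses hypothetical stand-in types (`struct buf_ops`, `struct buf`) and a hypothetical function name; it shows only the decision, not the kernel's actual warning code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a buffer and its verifier operations. */
struct buf_ops {
	const char *name;
};

struct buf {
	const struct buf_ops *ops;	/* NULL when no verifier is attached */
};

/*
 * Warn only on CRC-enabled filesystems: non-CRC filesystems legitimately
 * write some buffers without a verifier, e.g. during log recovery.
 */
static bool should_warn_no_verifier(const struct buf *bp, bool fs_has_crc)
{
	return bp->ops == NULL && fs_has_crc;
}
```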
  9. 15 Jul, 2014 1 commit
  10. 25 Jun, 2014 1 commit
    • xfs: global error sign conversion · 2451337d
      Dave Chinner committed
      Convert all the errors in the core XFS code to negative error signs
      like the rest of the kernel and remove all the sign conversion we
      do in the interface layers.
      
      Errors for conversion (and comparison) found via searches like:
      
      $ git grep " E" fs/xfs
      $ git grep "return E" fs/xfs
      $ git grep " E[A-Z].*;$" fs/xfs
      
      Negation points found via searches like:
      
      $ git grep "= -[a-z,A-Z]" fs/xfs
      $ git grep "return -[a-z,A-D,F-Z]" fs/xfs
      $ git grep " -[a-z].*;" fs/xfs
      
      [ with some bits I missed from Brian Foster ]
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
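To make the sign conversion concrete, here is a before-and-after sketch. The function names are hypothetical and do not correspond to real XFS functions; only the convention matters: positive errors plus a negating wrapper before, negative errors passed straight through after.

```c
#include <assert.h>
#include <errno.h>

/* Before: core code returned positive errors (hypothetical example). */
static int core_lookup_old(int found)
{
	return found ? 0 : ENOENT;
}

/* ... so every interface-layer wrapper had to flip the sign. */
static int vfs_lookup_old(int found)
{
	return -core_lookup_old(found);
}

/* After: core code returns negative errors like the rest of the kernel. */
static int core_lookup_new(int found)
{
	return found ? 0 : -ENOENT;
}

/* The wrapper now passes the value straight through. */
static int vfs_lookup_new(int found)
{
	return core_lookup_new(found);
}
```

Both wrappers return the same value to their callers; the difference is that the second has no conversion step to forget.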
  11. 22 Jun, 2014 1 commit
  12. 06 Jun, 2014 1 commit
  13. 07 May, 2014 1 commit
    • xfs: don't sleep in xlog_cil_force_lsn on shutdown · ac983517
      Dave Chinner committed
      Reports of a shutdown hang when fsyncing a directory have surfaced,
      such as this:
      
      [ 3663.394472] Call Trace:
      [ 3663.397199]  [<ffffffff815f1889>] schedule+0x29/0x70
      [ 3663.402743]  [<ffffffffa01feda5>] xlog_cil_force_lsn+0x185/0x1a0 [xfs]
      [ 3663.416249]  [<ffffffffa01fd3af>] _xfs_log_force_lsn+0x6f/0x2f0 [xfs]
      [ 3663.429271]  [<ffffffffa01a339d>] xfs_dir_fsync+0x7d/0xe0 [xfs]
      [ 3663.435873]  [<ffffffff811df8c5>] do_fsync+0x65/0xa0
      [ 3663.441408]  [<ffffffff811dfbc0>] SyS_fsync+0x10/0x20
      [ 3663.447043]  [<ffffffff815fc7d9>] system_call_fastpath+0x16/0x1b
      
      If we trigger a shutdown in xlog_cil_push() from xlog_write(), we
      will never wake waiters on the current push sequence number, so
      anything waiting in xlog_cil_force_lsn() for that push sequence
      number to come up will not get woken and hence stall the shutdown.
      
      Fix this by ensuring we call wake_up_all(&cil->xc_commit_wait) in
      the push abort handling, in the log shutdown code when waking all
      waiters, and adding a shutdown check in the sequence completion wait
      loops to ensure they abort when a wakeup due to a shutdown occurs.
      Reported-by: Boris Ranto <branto@redhat.com>
      Reported-by: Eric Sandeen <esandeen@redhat.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  14. 05 May, 2014 1 commit
    • xfs: fully support v5 format filesystems · c99d609a
      Dave Chinner committed
      We have had this code in the kernel for over a year now and have
      shaken all the known issues out of the code over the past few
      releases. It's now time to remove the experimental warnings during
      mount and fully support the new filesystem format in production
      systems.
      
      Remove the experimental warning, and add a version number to the
      initial "mounting filesystem" message to tell us what type of
      filesystem is being mounted. Also, remove the temporary inode
      cluster size output at mount time now we know that this code works
      fine.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  15. 17 Apr, 2014 1 commit
    • xfs: unmount does not wait for shutdown during unmount · 9c23eccc
      Dave Chinner committed
      An interesting situation can occur if a log IO error occurs during
      the unmount of a filesystem. The cases reported have the same
      signature - the update of the superblock counters fails due to a log
      write IO error:
      
      XFS (dm-16): xfs_do_force_shutdown(0x2) called from line 1170 of file fs/xfs/xfs_log.c.  Return address = 0xffffffffa08a44a1
      XFS (dm-16): Log I/O Error Detected.  Shutting down filesystem
      XFS (dm-16): Unable to update superblock counters. Freespace may not be correct on next mount.
      XFS (dm-16): xfs_log_force: error 5 returned.
      XFS (¿-¿¿¿): Please umount the filesystem and rectify the problem(s)
      
      It can be seen that the last line of output contains a corrupt
      device name - this is because the log and xfs_mount structures have
      already been freed by the time this message is printed. A kernel
      oops closely follows.
      
      The issue is that the shutdown is occurring in a separate IO
      completion thread to the unmount. Once the shutdown processing has
      started and all the iclogs are marked with XLOG_STATE_IOERROR, the
      log shutdown code wakes anyone waiting on a log force so they can
      process the shutdown error. This wakes up the unmount code that
      is doing a synchronous transaction to update the superblock
      counters.
      
      The unmount path now sees all the iclogs are marked with
      XLOG_STATE_IOERROR and so never waits on them again, knowing that if
      it does, there will not be a wakeup trigger for it and we will hang
      the unmount if we do. Hence the unmount runs through all the
      remaining code and frees all the filesystem structures while the
      xlog_iodone() is still processing the shutdown. When the log
      shutdown processing completes, xfs_do_force_shutdown() emits the
      "Please umount the filesystem and rectify the problem(s)" message,
      and xlog_iodone() then aborts all the objects attached to the iclog.
      An iclog that has already been freed....
      
      The real issue here is that there is no serialisation point between
      the log IO and the unmount. We have serialisation points for log
      writes, log forces, reservations, etc, but we don't actually have
      any code that waits for log IO to fully complete. We do that for all
      other types of object, so why not iclogbufs?
      
      Well, it turns out that we can easily do this. We've got xfs_buf
      handles, and that's what everyone else uses for IO serialisation.
      i.e. bp->b_sema. So, let's hold iclogbufs locked over IO, and only
      release the lock in xlog_iodone() when we are finished with the
      buffer. That way before we tear down the iclog, we can lock and
      unlock the buffer to ensure IO completion has finished completely
      before we tear it down.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Tested-by: Mike Snitzer <snitzer@redhat.com>
      Tested-by: Bob Mastors <bob.mastors@solidfire.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  16. 07 Nov, 2013 1 commit
  17. 31 Oct, 2013 1 commit
  18. 24 Oct, 2013 3 commits
    • xfs: decouple inode and bmap btree header files · a4fbe6ab
      Dave Chinner committed
      Currently the xfs_inode.h header has a dependency on the definition
      of the BMAP btree records as the inode fork includes an array of
      xfs_bmbt_rec_host_t objects in its definition.
      
      Move all the btree format definitions from xfs_btree.h,
      xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
      xfs_format.h to continue the process of centralising the on-disk
      format definitions. With this done, the xfs inode definitions are no
      longer dependent on btree header files.
      
      This enables a massive culling of unnecessary includes, with close to
      200 #include directives removed from the XFS kernel code base.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: decouple log and transaction headers · 239880ef
      Dave Chinner committed
      xfs_trans.h has a dependency on xfs_log.h for a couple of
      structures. Most code that does transactions doesn't need to know
      anything about the log, but this dependency means that they have to
      include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
      files and clean up the includes to be in dependency order.
      
      In doing this, remove the direct include of xfs_trans_reserve.h from
      xfs_trans.h so that we remove the dependency between xfs_trans.h and
      xfs_mount.h. Hence the xfs_trans.h include can be moved to
      indicate the actual dependencies other header files have on it.
      
      Note that these are kernel only header files, so this does not
      translate to any userspace changes at all.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
    • xfs: create a shared header file for format-related information · 70a9883c
      Dave Chinner committed
      All of the buffer operations structures are needed to be exported
      for xfs_db, so move them all to a common location rather than
      spreading them all over the place. They are verifying the on-disk
      format, so while xfs_format.h might be a good place, it is not part
      of the on disk format.
      
      Hence we need to create a new header file in which we centralise these
      related definitions. Start by moving the buffer operations
      structures, and then also move all the other definitions that have
      crept into xfs_log_format.h and xfs_format.h as there was no other
      shared header file to put them in.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  19. 18 Oct, 2013 1 commit
  20. 17 Oct, 2013 1 commit
    • xfs: prevent deadlock trying to cover an active log · 2c6e24ce
      Dave Chinner committed
      Recent analysis of a deadlocked XFS filesystem from a kernel
      crash dump indicated that the filesystem was stuck waiting for log
      space. The short story of the hang on the RHEL6 kernel is this:
      
      	- the tail of the log is pinned by an inode
      	- the inode has been pushed by the xfsaild
      	- the inode has been flushed to its backing buffer and is
      	  currently flush locked and hence waiting for backing
      	  buffer IO to complete and remove it from the AIL
      	- the backing buffer is marked for write - it is on the
      	  delayed write queue
      	- the inode buffer has been modified directly and logged
      	  recently due to unlinked inode list modification
      	- the backing buffer is pinned in memory as it is in the
      	  active CIL context.
      	- the xfsbufd won't start buffer writeback because it is
      	  pinned
      	- xfssyncd won't force the log because it sees the log as
      	  needing to be covered and hence wants to issue a dummy
      	  transaction to move the log covering state machine along.
      
      Hence there is no trigger to force the CIL to the log and hence
      unpin the inode buffer and therefore complete the inode IO, remove
      it from the AIL and hence move the tail of the log along, allowing
      transactions to start again.
      
      Mainline kernels also have the same deadlock, though the signature
      is slightly different - the inode buffer never reaches the delayed
      write lists because xfs_buf_item_push() sees that it is pinned and
      hence never adds it to the delayed write list that the xfsaild
      flushes.
      
      There are two possible solutions here. The first is to simply force
      the log before trying to cover the log and so ensure that the CIL is
      emptied before we try to reserve space for the dummy transaction in
      the xfs_log_worker(). While this might work most of the time, it is
      still racy and is no guarantee that we don't get stuck in
      xfs_trans_reserve waiting for log space to come free. Hence it's not
      the best way to solve the problem.
      
      The second solution is to modify xfs_log_need_covered() to be aware
      of the CIL. We only should be attempting to cover the log if there
      is no current activity in the log - covering the log is the process
      of ensuring that the head and tail in the log on disk are identical
      (i.e. the log is clean and at idle). Hence, by definition, if there
      are items in the CIL then the log is not at idle and so we don't
      need to attempt to cover it.
      
      When we don't need to cover the log because it is active or idle, we
      issue a log force from xfs_log_worker() - if the log is idle, then
      this does nothing.  However, if the log is active due to there being
      items in the CIL, it will force the items in the CIL to the log and
      unpin them.
      
      In the case of the above deadlock scenario, instead of
      xfs_log_worker() getting stuck in xfs_trans_reserve() attempting to
      cover the log, it will instead force the log, thereby unpinning the
      inode buffer, allowing IO to be issued and complete and hence
      removing the inode that was pinning the tail of the log from the
      AIL. At that point, everything will start moving along again. i.e.
      the xfs_log_worker turns back into a watchdog that can alleviate
      deadlocks based around pinned items that prevent the tail of the log
      from being moved...
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
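The second solution described in this commit can be modelled in a few lines. This is a deliberately simplified sketch: `struct log` and its fields are hypothetical, and the real covering state machine has more states than a single `covered` flag. It shows only the decision the message describes, namely that a non-empty CIL means the log is active, so the worker forces the log instead of trying to cover it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the log state. */
struct log {
	int cil_items;		/* items sitting in the CIL */
	bool covered;		/* on-disk head and tail already identical */
	int forces;		/* log forces issued */
	int dummy_trans;	/* dummy covering transactions issued */
};

/* Covering is only attempted when nothing is pending in the CIL. */
static bool log_need_covered(const struct log *log)
{
	if (log->cil_items > 0)
		return false;	/* log is active, so it is not idle */
	return !log->covered;
}

/* Periodic worker: cover an idle log, otherwise force it. */
static void log_worker(struct log *log)
{
	if (log_need_covered(log)) {
		log->dummy_trans++;
		log->covered = true;
		return;
	}
	log->forces++;		/* does nothing useful on an idle log */
	log->cil_items = 0;	/* a force pushes the CIL and unpins items */
}
```

In the deadlock scenario above, the first worker pass would take the force branch, which is what unpins the inode buffer and lets the tail of the log move.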
  21. 10 Sep, 2013 1 commit
  22. 21 Aug, 2013 2 commits
  23. 13 Aug, 2013 2 commits
  24. 23 Jul, 2013 1 commit
    • xfs: Fix a deadlock in xfs_log_commit_cil() code path · 297aa637
      Chandra Seetharaman committed
      While testing and rearranging pquota/gquota code, I stumbled
      on an xfs_shutdown() during a mount. But the mount just hung.
      
      Debugged and found that there is a deadlock involving
      &log->l_cilp->xc_ctx_lock.
      
      It is in a code path where &log->l_cilp->xc_ctx_lock is first
      acquired in read mode and some levels down the same semaphore
      is being acquired in write mode causing a deadlock.
      
      This is the stack:
      xfs_log_commit_cil -> acquires &log->l_cilp->xc_ctx_lock in read mode
        xlog_print_tic_res
          xfs_force_shutdown
            xfs_log_force_umount
              xlog_cil_force
                xlog_cil_force_lsn
                  xlog_cil_push_foreground
                    xlog_cil_push - tries to acquire same semaphore in write mode
      
      This patch fixes the deadlock by changing the reason code for
      xfs_force_shutdown in xlog_print_tic_res() to SHUTDOWN_LOG_IO_ERROR.
      
      SHUTDOWN_LOG_IO_ERROR is the right reason code to be set since
      we are in the log path.
      
      Thanks to Dave for suggesting this solution.
      Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
  25. 28 Jun, 2013 1 commit
    • xfs: Introduce ordered log vector support · fd63875c
      Dave Chinner committed
      An "ordered log vector" is a log vector that is used for
      tracking a log item through the CIL and into the AIL as part of the
      log checkpointing. These ordered log vectors are special in that
      they are not written to the journal in any way, and are not accounted
      to the checkpoint being written.
      
      The reason for this behaviour is to allow operations to attach items
      to transactions and have them follow the normal transactional
      lifecycle without actually having to write them to the journal. This
      allows logging of items that track high level logical changes and
      writing them to the log, while the physical items being modified
      pass through into the AIL and pin the tail of the log (and therefore
      the logical item in the log) until all the modified items are
      physically written to disk.
      
      IOWs, it allows us to write metadata without physically logging
      every individual change but still maintain the full transactional
      integrity guarantees we currently have w.r.t. crash recovery.
      
      This change modifies some of the CIL item insertion loops, as
      ordered log vectors introduce some new constraints as they don't
      track any data. One advantage of this change is that it combines
      two log vector chain walks into a single pass, so there is less
      overhead in the transaction commit pass as well. It also kills some
      unused code in the log vector walk loop when committing the CIL.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
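A small sketch of how an ordered log vector differs from a normal one when a checkpoint is accounted. The `struct log_vec` layout and the `LV_ORDERED` marker are hypothetical simplifications chosen for illustration: an ordered vector is still walked and tracked as an item, but contributes no bytes to the journal write.

```c
#include <assert.h>
#include <stddef.h>

#define LV_ORDERED (-1)	/* hypothetical marker: vector carries no data */

/* Hypothetical, simplified log vector; the real structure differs. */
struct log_vec {
	struct log_vec *next;
	int buf_len;		/* bytes to write, or LV_ORDERED */
};

/* Bytes a checkpoint writes to the journal: ordered vectors add nothing. */
static int checkpoint_bytes(const struct log_vec *lv)
{
	int bytes = 0;

	for (; lv != NULL; lv = lv->next) {
		if (lv->buf_len == LV_ORDERED)
			continue;	/* tracked through CIL and AIL only */
		bytes += lv->buf_len;
	}
	return bytes;
}

/* Every vector, ordered or not, still tracks one item towards the AIL. */
static int checkpoint_items(const struct log_vec *lv)
{
	int items = 0;

	for (; lv != NULL; lv = lv->next)
		items++;
	return items;
}
```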
  26. 08 Mar, 2013 1 commit
  27. 19 Jan, 2013 1 commit
  28. 18 Jan, 2013 1 commit
  29. 04 Dec, 2012 1 commit
    • xfs: fix sparse reported log CRC endian issue · f9668a09
      Dave Chinner committed
      Not a bug as such, just warning noise from the xlog_cksum()
      returning a __be32 type when it should be returning a __le32 type.
      
      On Wed, Nov 28, 2012 at 08:30:59AM -0500, Christoph Hellwig wrote:
      > But why are we storing the crc field little endian while all other on
      > disk formats are big endian? (And yes I realize it might as well have
      > been me who did that back in the day, but I still have no idea why)
      
      Because the CRC always returns the calculation in LE format, even on BE
      systems. So rather than always having to byte swap it everywhere and
      have all the force casts and annotations for sparse, it seems simpler to
      just make it a __le32 everywhere....
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
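What a fixed little-endian field means in practice can be shown in portable userspace C. The helpers below (`put_le32`, `get_le32`) are hypothetical names written for this illustration; in the kernel the same job is done by the `__le32` type together with `cpu_to_le32()` and `le32_to_cpu()`, which sparse can check.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read the same field back; the result is identical on LE and BE hosts. */
static uint32_t get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

Because the byte layout is fixed by the field's declared type, no call site needs a forced cast to move the CRC between memory and disk.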