1. 12 Jan, 2011 1 commit
    • D
      xfs: prevent NMI timeouts in cmn_err · 73efe4a4
      Dave Chinner committed
      We currently have a global error message buffer in cmn_err that is
      protected by a spin lock that disables interrupts.  Recently there
      have been reports of NMI timeouts occurring when the console is
      being flooded by SCSI error reports due to cmn_err() getting stuck
      trying to print to the console while holding this lock (i.e. with
      interrupts disabled). The NMI watchdog is seeing this CPU as
      non-responding and so is triggering a panic.  While the trigger for
      the reported case is SCSI errors, pretty much anything that spams
      the kernel log could cause this to occur.
      
      Realistically the only reason that we have the intermediate message
      buffer is to prepend the correct kernel log level prefix to the log
      message. The only reason we have the lock is to protect the global
      message buffer and the only reason the message buffer is global is
      to keep it off the stack. Hence if we can avoid needing a global
      message buffer we avoid needing the lock, and we can do this with a
      small amount of cleanup and some preprocessor tricks:
      
      	1. clean up xfs_cmn_err() panic mask functionality to avoid
      	   needing debug code in xfs_cmn_err()
      	2. remove the couple of "!" message prefixes that still exist that
      	   the existing cmn_err() code steps over.
      	3. redefine CE_* levels directly to KERN_*
      	4. redefine cmn_err() and friends to use printk() directly
      	   via variable argument length macros.
      
      By doing this, we can completely remove the cmn_err() code and the
      lock that is causing the problems, and rely solely on printk()
      serialisation to ensure that we don't get garbled messages.
      
      A series of followup patches is really needed to clean up all the
      cmn_err() calls and related messages properly, but that results in a
      series that is not easily back portable to enterprise kernels. Hence
      this initial fix is only to address the direct problem in the lowest
      impact way possible.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      73efe4a4
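Steps 3 and 4 of the list in this commit message can be illustrated with a small userspace sketch. It is not the code from the patch: the KERN_* values are stand-ins for the kernel's own definitions, the "XFS: " prefix and the sketch_cmn_err() name are assumptions, and snprintf() replaces printk() so the result can be inspected. The point it shows is that the level prefix is glued on by compile-time string concatenation, so no shared buffer and no lock are involved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins for the kernel's KERN_* string prefixes. */
#define KERN_ALERT   "<1>"
#define KERN_WARNING "<4>"
#define KERN_NOTICE  "<5>"
#define KERN_DEBUG   "<7>"

/* Step 3: the CE_* levels become the printk prefixes themselves. */
#define CE_DEBUG KERN_DEBUG
#define CE_NOTE  KERN_NOTICE
#define CE_WARN  KERN_WARNING
#define CE_ALERT KERN_ALERT

/*
 * Step 4: a variadic macro. Adjacent string literals are concatenated at
 * compile time, so the level is prepended without an intermediate buffer.
 * snprintf() stands in for printk(); "##__VA_ARGS__" is a GNU extension
 * accepted by gcc and clang.
 */
#define sketch_cmn_err(buf, len, lvl, fmt, ...) \
	snprintf((buf), (len), lvl "XFS: " fmt "\n", ##__VA_ARGS__)

/* Formats one message into a caller-owned stack buffer and checks it. */
int sketch_cmn_err_demo(void)
{
	char buf[64];

	sketch_cmn_err(buf, sizeof(buf), CE_WARN, "bad block %d", 7);
	assert(strcmp(buf, "<4>XFS: bad block 7\n") == 0);
	return strcmp(buf, "<4>XFS: bad block 7\n") == 0;
}
```

Calling sketch_cmn_err_demo() returns 1 when the formatted message carries the expected level prefix.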
  2. 21 Dec, 2010 1 commit
    • D
      xfs: convert l_tail_lsn to an atomic variable. · 1c3cb9ec
      Dave Chinner committed
      log->l_tail_lsn is currently protected by the log grant lock. The
      lock is only needed for serialising readers against writers, so we
      don't really need the lock if we make the l_tail_lsn variable an
      atomic. Converting the l_tail_lsn variable to an atomic64_t means we
      can start to peel back the grant lock from various operations.
      
      Also, provide functions to safely crack an atomic LSN variable into
      its component pieces and to recombine the components into an
      atomic variable. Use them where appropriate.
      
      This also removes the need for explicitly holding a spinlock to read
      the l_tail_lsn on 32 bit platforms.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      
      1c3cb9ec
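The "crack" and "recombine" helpers mentioned in this commit message can be sketched in userspace with C11 atomics. This is an illustration only: the function names are hypothetical, and the layout (cycle number in the high 32 bits, block number in the low 32 bits) is an assumption of the sketch. The idea is that a single atomic 64-bit load yields a cycle/block pair that is always consistent, which is why no spinlock is needed even on 32 bit platforms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: cycle number in the high 32 bits, block in the low 32. */
static inline uint64_t sketch_pack_lsn(uint32_t cycle, uint32_t block)
{
	return ((uint64_t)cycle << 32) | block;
}

/* One atomic load, then split the private copy: both halves always match. */
void sketch_crack_atomic_lsn(_Atomic uint64_t *lsn,
			     uint32_t *cycle, uint32_t *block)
{
	uint64_t val = atomic_load(lsn);

	*cycle = (uint32_t)(val >> 32);
	*block = (uint32_t)(val & 0xffffffffu);
}

/* Combine the two components and publish them with one atomic store. */
void sketch_assign_atomic_lsn(_Atomic uint64_t *lsn,
			      uint32_t cycle, uint32_t block)
{
	atomic_store(lsn, sketch_pack_lsn(cycle, block));
}

/* Returns 1 if a value survives an assign/crack round trip unchanged. */
int sketch_lsn_roundtrip(uint32_t cycle, uint32_t block)
{
	_Atomic uint64_t lsn;
	uint32_t c, b;

	atomic_init(&lsn, 0);
	sketch_assign_atomic_lsn(&lsn, cycle, block);
	sketch_crack_atomic_lsn(&lsn, &c, &b);
	return c == cycle && b == block;
}
```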
  3. 03 Dec, 2010 1 commit
    • D
      xfs: convert l_last_sync_lsn to an atomic variable · 84f3c683
      Dave Chinner committed
      log->l_last_sync_lsn is updated in only one critical spot - log
      buffer IO completion - and is protected by the grant lock here. This
      requires the grant lock to be taken for every log buffer IO
      completion. Converting the l_last_sync_lsn variable to an atomic64_t
      means that we do not need to take the grant lock in log buffer IO
      completion to update it.
      
      This also removes the need for explicitly holding a spinlock to read
      the l_last_sync_lsn on 32 bit platforms.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      84f3c683
  4. 21 Dec, 2010 1 commit
  5. 20 Dec, 2010 2 commits
    • D
      xfs: use AIL bulk update function to implement single updates · e6059949
      Dave Chinner committed
      We now have two copies of AIL insert operations that are mostly
      duplicate functionality. The single log item updates can be
      implemented via the bulk updates by turning xfs_trans_ail_update()
      into a simple wrapper. This removes all the duplicate insert
      functionality and associated helpers.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      e6059949
    • D
      xfs: Pull EFI/EFD handling out from under the AIL lock · b199c8a4
      Dave Chinner committed
      EFI/EFD interactions are protected from races by the AIL lock. They
      are the only type of log items that require the AIL lock to
      serialise internal state, so they need to be separated from the AIL
      lock before we can do bulk insert operations on the AIL.
      
      To achieve this, convert the counter of the number of extents in the
      EFI to an atomic so it can be safely manipulated by EFD processing
      without locks. Also, convert the EFI state flag manipulations to use
      atomic bit operations so no locks are needed to record state
      changes. Finally, use the state bits to determine when it is safe to
      free the EFI and clean up the code to do this neatly.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      b199c8a4
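The lock-free pattern this commit message describes (an atomic extent counter plus atomic state bits that decide who frees the item) can be sketched with C11 atomics. Everything here is hypothetical: the structure, the field names, the single state bit and the "second arrival frees" rule are assumptions chosen to illustrate the technique, not the logic of the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_EFI_HALF_DONE 0x1u /* hypothetical state bit */

struct sketch_efi {
	atomic_uint extents_left; /* extents not yet released by an EFD */
	atomic_uint flags;        /* state bits, changed only atomically */
	int freed;                /* stands in for actually freeing the item */
};

/*
 * Called once from each of two independent paths (for example "item
 * committed" and "all extents released"). The path that finds the bit
 * already set arrived second, so it alone may free the item.
 */
void sketch_efi_put(struct sketch_efi *efi)
{
	unsigned int old = atomic_fetch_or(&efi->flags, SKETCH_EFI_HALF_DONE);

	if (old & SKETCH_EFI_HALF_DONE)
		efi->freed = 1;
}

/* EFD processing: drop nextents from the counter without taking a lock. */
void sketch_efi_release(struct sketch_efi *efi, unsigned int nextents)
{
	unsigned int before = atomic_fetch_sub(&efi->extents_left, nextents);

	assert(before >= nextents);
	if (before == nextents) /* that was the last outstanding extent */
		sketch_efi_put(efi);
}

/* Walks one possible ordering of events; returns 1 if the item was freed
 * exactly when the second path finished. */
int sketch_efi_demo(void)
{
	struct sketch_efi efi;

	atomic_init(&efi.extents_left, 3);
	atomic_init(&efi.flags, 0);
	efi.freed = 0;

	sketch_efi_release(&efi, 2); /* one extent still outstanding */
	sketch_efi_put(&efi);        /* commit path arrives first */
	if (efi.freed)
		return 0;
	sketch_efi_release(&efi, 1); /* last extent: second arrival frees */
	return efi.freed;
}
```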
  6. 17 Dec, 2010 4 commits
  7. 19 Oct, 2010 3 commits
  8. 27 Jul, 2010 3 commits
  9. 24 Jun, 2010 1 commit
  10. 29 May, 2010 1 commit
  11. 24 May, 2010 1 commit
  12. 19 May, 2010 12 commits
    • A
      xfs: kill off l_sectbb_mask · 48389ef1
      Alex Elder committed
      There remains only one user of the l_sectbb_mask field in the log
      structure.  Just kill it off and compute the mask where needed from
      the power-of-2 sector size.
      
      (Only update from last post is to accommodate the changes in the
      previous patch in the series.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      48389ef1
    • A
      xfs: record log sector size rather than log2(that) · 69ce58f0
      Alex Elder committed
      Change struct log so it keeps track of the size (in basic blocks) of
      a log sector in l_sectBBsize rather than the log-base-2 of that
      value (previously, l_sectbb_log).  The name was chosen for
      consistency with the other fields in the structure that represent
      a number of basic blocks.
      
      (Updated so that a variable used in computing and verifying a log's
      sector size is named "log2_size".  Also added the "BB" to the
      structure field name, based on feedback from Eric Sandeen.  Also
      dropped some superfluous parentheses.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      69ce58f0
    • C
      xfs: remove dead XFS_LOUD_RECOVERY code · 1414a604
      Christoph Hellwig committed
      This can't be enabled through the build system and has been dead for
      ages.  Note that the CRC patches add back log checksumming, but the
      code is quite different from the version removed here anyway.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      1414a604
    • A
      xfs: minor odds and ends in xfs_log_recover.c · 3f943d85
      Alex Elder committed
      Odds and ends in "xfs_log_recover.c".  This patch just contains some
      minor things that didn't seem to warrant their own individual
      patches:
      - In xlog_bread_noalign(), drop an assertion that a pointer is
        non-null (the crash will tell us it was a bad pointer).
      - Add a more descriptive header comment for xlog_find_verify_cycle().
      - Make a few additions to the comments in xlog_find_head().  Also
        rearrange some expressions in a few spots to produce the same
        result, but in a way that seems more clear what's being computed.
      
      (Updated in response to Dave's review comments.  Note I did not
      split this patch like I said I would.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      3f943d85
    • A
      xfs: avoid repeated pointer dereferences · e3bb2e30
      Alex Elder committed
      In xlog_find_cycle_start() use a local variable for some repeated
      operations rather than constantly accessing the memory location
      whose address is passed in.
      
      (This version drops an assertion that a pointer is non-null.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      e3bb2e30
    • A
      xfs: change a few labels in xfs_log_recover.c · 9db127ed
      Alex Elder committed
      Rename a label used in xlog_find_head() that I thought was poorly
      chosen.  Also combine two adjacent labels in xlog_find_tail() into a
      single label, and give it a more generic name.
      
      (Now using Dave's suggested "validate_head" name for first label.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      9db127ed
    • A
      xfs: nothing special about 1-block log sector · 36adecff
      Alex Elder committed
      There are a number of places where a log sector size of 1 uses
      special case code.  The round_up() and round_down() macros
      produce the correct result even when the log sector size is 1, and
      this eliminates the need for treating this as a special case.
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      36adecff
    • A
      xfs: encapsulate bbcount validity checking · ff30a622
      Alex Elder committed
      Define a function that encapsulates checking the validity of a log
      block count.
      
      (Updated from previous version--no longer includes error reporting in the
      encapsulated validation function.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      ff30a622
    • A
      xfs: kill XLOG_SECTOR_ROUND*() · 5c17f533
      Alex Elder committed
      XLOG_SECTOR_ROUNDUP_BBCOUNT() and XLOG_SECTOR_ROUNDDOWN_BLKNO()
      are now fairly simple macro translations.  Just get rid of them in
      favor of the round_up() and round_down() macro calls they represent.
      
      Also, in spots in xlog_get_bp() and xlog_write_log_records(),
      round_up() was being called with value 1, which just evaluates
      to the macro's second argument; so just use that instead.
      In the latter case, make use of that value, as long as it's
      already been computed.
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      5c17f533
    • A
      xfs: simplify XLOG_SECTOR_ROUND*() · 8511998b
      Alex Elder committed
      XLOG_SECTOR_ROUNDUP_BBCOUNT() is defined in "fs/xfs/xfs_log_recover.c"
      in an overly-complicated way.  It is basically roundup(), but that
      is not at all clear from its definition.  (Actually, there is
      another macro round_up() that applies for power-of-two-based masks
      which I'll be using here.)
      
      The operands in XLOG_SECTOR_ROUNDUP_BBCOUNT() are basically the
      block number (bbs) and the log sector basic block mask
      (log->l_sectbb_mask).  I'll call them B and M for this discussion.
      
      The macro computes its value this way:
      	M && (B & M) ? (B + M + 1) & ~M : B
      
      Put another way, we can break it into 3 cases:
      	1)  ! M          -> B			# 0 mask, no effect
      	2)  ! (B & M)    -> B			# sector aligned
      	3)  M && (B & M) -> (B + M + 1) & ~M	# round up otherwise
      
      The round_up() macro is cleverly defined using a value, v, and a
      power-of-2, p, and the result is the nearest multiple of p greater
      than or equal to v.  Its value is computed something like this:
      	((v - 1) | (p - 1)) + 1
      Let's consider using this in the context of the 3 cases above.
      
      When p = 2^0 = 1, the result boils down to ((v - 1) | 0) + 1, so it
      just translates any value v to itself.  That handles case (1) above.
      
      When p = 2^n, n > 0, we know that (p - 1) will be a mask with all n
      bits 0..n-1 set.  The condition in this case occurs when none of
      those mask bits is set in the value v provided.  If that is the
      case, subtracting 1 from v will have 1's in all those lower bits (at
      least).  Therefore, OR-ing the mask with that decremented value has
      no effect, so adding the 1 back again will just translate the v to
      itself.  This handles case (2).
      
      Otherwise, the value v is greater than some multiple of p, and
      decrementing it will produce a result greater than or equal to that
      multiple.  OR-ing in the mask will produce a value 1 less than the
      next multiple of p, so finally adding 1 back will result in the
      desired rounded-up value.  This handles case (3).
      
      Hopefully this is convincing.
      
      While I was at it, I converted XLOG_SECTOR_ROUNDDOWN_BLKNO() to use
      the round_down() macro.
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      8511998b
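The three-case argument in this commit message can also be checked by brute force. The sketch below transcribes the two expressions quoted in the message into plain C functions (the function names are hypothetical) and counts the inputs on which they disagree, for every power-of-2 sector size from 1 to 64 basic blocks. The upper limits of the loops are arbitrary choices of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* The original macro as quoted above: B is the block count, M the mask. */
uint32_t sketch_old_roundup(uint32_t b, uint32_t m)
{
	return (m && (b & m)) ? ((b + m + 1) & ~m) : b;
}

/* The round_up() expression as quoted above, for a power-of-2 p. */
uint32_t sketch_round_up(uint32_t v, uint32_t p)
{
	assert(p != 0 && (p & (p - 1)) == 0); /* p must be a power of 2 */
	return ((v - 1) | (p - 1)) + 1;
}

/*
 * Compare both forms for every b in [0, limit) and every sector size
 * p = 1, 2, 4, ..., 64, using M = p - 1 as the mask for the old form.
 * Returns the number of inputs on which the two forms differ.
 */
unsigned int sketch_count_mismatches(uint32_t limit)
{
	unsigned int bad = 0;

	for (unsigned int shift = 0; shift <= 6; shift++) {
		uint32_t p = 1u << shift;

		for (uint32_t b = 0; b < limit; b++)
			if (sketch_old_roundup(b, p - 1) != sketch_round_up(b, p))
				bad++;
	}
	return bad;
}
```

Note that v = 0 relies on unsigned wraparound: (0 - 1) is all ones, OR-ing the mask changes nothing, and adding 1 wraps back to 0, which matches case (2).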
    • A
      xfs: fix min bufsize bugs in two places · 6881a229
      Alex Elder committed
      This fixes a bug in two places that I found by inspection.  In
      xlog_find_verify_cycle() and xlog_write_log_records(), the code
      attempts to allocate a buffer to hold as many blocks as possible.
      It gives up if the number of blocks to be allocated gets too small.
      Right now it uses log->l_sectbb_log as that lower bound, but I'm
      sure it's supposed to be the actual log sector size instead.  That
      is, the lower bound should be (1 << log->l_sectbb_log).
      
      Also define a simple macro xlog_sectbb(log) to represent the number
      of basic blocks in a sector for the given log.
      
      (No change from original submission; I have implemented Christoph's
      suggestion about storing l_sectsize rather than l_sectbb_log in
      a new, separate patch in this series.)
      Signed-off-by: Alex Elder <aelder@sgi.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      6881a229
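The bug described in this commit message is easy to see in a reduced form of the allocation loop. The sketch below is a userspace model, not the kernel code: the allocator is replaced by a hypothetical predicate that succeeds only up to a given size, and the function names are made up. The lower bound is computed as (1 << sectbb_log), as the message says it should be.

```c
#include <assert.h>

/* Hypothetical allocator stand-in: succeeds only up to max_ok blocks. */
int sketch_can_alloc(unsigned int nblks, unsigned int max_ok)
{
	return nblks <= max_ok;
}

/*
 * Halve the request until it can be satisfied. Returns the block count
 * that was obtained, or 0 if not even one log sector can be allocated.
 * sectbb_log is log2 of the sector size in basic blocks.
 */
unsigned int sketch_pick_bufblks(unsigned int want, unsigned int sectbb_log,
				 unsigned int max_ok)
{
	unsigned int min_blks;
	unsigned int bufblks = want;

	assert(sectbb_log < 32);
	min_blks = 1u << sectbb_log; /* the sector size, not its log2 */

	while (!sketch_can_alloc(bufblks, max_ok)) {
		bufblks >>= 1;
		if (bufblks < min_blks)
			return 0;
	}
	return bufblks;
}
```

With sectbb_log = 3 (an 8-block sector) and an allocator that can only provide 4 blocks, the loop gives up. Using sectbb_log itself as the bound would instead have accepted a 4-block buffer, which is smaller than one sector.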
    • D
      xfs: add log item recovery tracing · 9abbc539
      Dave Chinner committed
      Currently there is no tracing in log recovery, so it is difficult to
      determine what is going on when something goes wrong.
      
      Add tracing for log item recovery to provide visibility into the log
      recovery process. The tracing added shows regions being extracted
      from the log transactions and added to the transaction hash forming
      recovery items, followed by the reordering, cancelling and finally
      recovery of the items.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      9abbc539
  13. 22 Jan, 2010 2 commits
  14. 16 Jan, 2010 2 commits
  15. 16 Dec, 2009 1 commit
  16. 15 Dec, 2009 1 commit
    • C
      xfs: event tracing support · 0b1b213f
      Christoph Hellwig committed
      Convert the old xfs tracing support that could only be used with the
      out of tree kdb and xfsidbg patches to use the generic event tracer.
      
      To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
      all xfs trace channels by:
      
         echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
      
      or alternatively enable single events by just doing the same in one
      event subdirectory, e.g.
      
         echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable
      
      or set more complex filters, etc. In Documentation/trace/events.txt
      all this is described in more detail.  To read the events do a
      
         cat /sys/kernel/debug/tracing/trace
      
      Compared to the last posting this patch converts the tracing mostly to
      the one tracepoint per callsite model that other users of the new
      tracing facility also employ.  This allows a very fine-grained control
      of the tracing, a cleaner output of the traces and also enables the
      perf tool to use each tracepoint as a virtual performance counter,
      allowing us to e.g. count how often certain workloads hit various
      spots in XFS.  Take a look at
      
          http://lwn.net/Articles/346470/
      
      for some examples.
      
      Also the btree tracing isn't included at all yet, as it will require
      additional core tracing features not in mainline yet, I plan to
      deliver it later.
      
      And the really nice thing about this patch is that it actually removes
      many lines of code while adding this nice functionality:
      
       fs/xfs/Makefile                |    8
       fs/xfs/linux-2.6/xfs_acl.c     |    1
       fs/xfs/linux-2.6/xfs_aops.c    |   52 -
       fs/xfs/linux-2.6/xfs_aops.h    |    2
       fs/xfs/linux-2.6/xfs_buf.c     |  117 +--
       fs/xfs/linux-2.6/xfs_buf.h     |   33
       fs/xfs/linux-2.6/xfs_fs_subr.c |    3
       fs/xfs/linux-2.6/xfs_ioctl.c   |    1
       fs/xfs/linux-2.6/xfs_ioctl32.c |    1
       fs/xfs/linux-2.6/xfs_iops.c    |    1
       fs/xfs/linux-2.6/xfs_linux.h   |    1
       fs/xfs/linux-2.6/xfs_lrw.c     |   87 --
       fs/xfs/linux-2.6/xfs_lrw.h     |   45 -
       fs/xfs/linux-2.6/xfs_super.c   |  104 ---
       fs/xfs/linux-2.6/xfs_super.h   |    7
       fs/xfs/linux-2.6/xfs_sync.c    |    1
       fs/xfs/linux-2.6/xfs_trace.c   |   75 ++
       fs/xfs/linux-2.6/xfs_trace.h   | 1369 +++++++++++++++++++++++++++++++++++++++++
       fs/xfs/linux-2.6/xfs_vnode.h   |    4
       fs/xfs/quota/xfs_dquot.c       |  110 ---
       fs/xfs/quota/xfs_dquot.h       |   21
       fs/xfs/quota/xfs_qm.c          |   40 -
       fs/xfs/quota/xfs_qm_syscalls.c |    4
       fs/xfs/support/ktrace.c        |  323 ---------
       fs/xfs/support/ktrace.h        |   85 --
       fs/xfs/xfs.h                   |   16
       fs/xfs/xfs_ag.h                |   14
       fs/xfs/xfs_alloc.c             |  230 +-----
       fs/xfs/xfs_alloc.h             |   27
       fs/xfs/xfs_alloc_btree.c       |    1
       fs/xfs/xfs_attr.c              |  107 ---
       fs/xfs/xfs_attr.h              |   10
       fs/xfs/xfs_attr_leaf.c         |   14
       fs/xfs/xfs_attr_sf.h           |   40 -
       fs/xfs/xfs_bmap.c              |  507 +++------------
       fs/xfs/xfs_bmap.h              |   49 -
       fs/xfs/xfs_bmap_btree.c        |    6
       fs/xfs/xfs_btree.c             |    5
       fs/xfs/xfs_btree_trace.h       |   17
       fs/xfs/xfs_buf_item.c          |   87 --
       fs/xfs/xfs_buf_item.h          |   20
       fs/xfs/xfs_da_btree.c          |    3
       fs/xfs/xfs_da_btree.h          |    7
       fs/xfs/xfs_dfrag.c             |    2
       fs/xfs/xfs_dir2.c              |    8
       fs/xfs/xfs_dir2_block.c        |   20
       fs/xfs/xfs_dir2_leaf.c         |   21
       fs/xfs/xfs_dir2_node.c         |   27
       fs/xfs/xfs_dir2_sf.c           |   26
       fs/xfs/xfs_dir2_trace.c        |  216 ------
       fs/xfs/xfs_dir2_trace.h        |   72 --
       fs/xfs/xfs_filestream.c        |    8
       fs/xfs/xfs_fsops.c             |    2
       fs/xfs/xfs_iget.c              |  111 ---
       fs/xfs/xfs_inode.c             |   67 --
       fs/xfs/xfs_inode.h             |   76 --
       fs/xfs/xfs_inode_item.c        |    5
       fs/xfs/xfs_iomap.c             |   85 --
       fs/xfs/xfs_iomap.h             |    8
       fs/xfs/xfs_log.c               |  181 +----
       fs/xfs/xfs_log_priv.h          |   20
       fs/xfs/xfs_log_recover.c       |    1
       fs/xfs/xfs_mount.c             |    2
       fs/xfs/xfs_quota.h             |    8
       fs/xfs/xfs_rename.c            |    1
       fs/xfs/xfs_rtalloc.c           |    1
       fs/xfs/xfs_rw.c                |    3
       fs/xfs/xfs_trans.h             |   47 +
       fs/xfs/xfs_trans_buf.c         |   62 -
       fs/xfs/xfs_vnodeops.c          |    8
       70 files changed, 2151 insertions(+), 2592 deletions(-)
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      0b1b213f
  17. 12 Dec, 2009 2 commits
    • C
      xfs: simplify xfs_buf_get / xfs_buf_read interfaces · 6ad112bf
      Christoph Hellwig committed
      Currently the low-level buffer cache interfaces are highly confusing
      as we have a _flags variant of each that does actually respect the
      flags, and one without _flags which has a flags argument that gets
      ignored and overridden with a default set.  Given that very few places
      use the default arguments get rid of the duplication and convert all
      callers to pass the flags explicitly.  Also remove the now confusing
      _flags postfix.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      6ad112bf
    • A
      xfs: Wrapped journal record corruption on read at recovery · fc5bc4c8
      Andy Poling committed
      Summary of problem:
      
      If a journal record wraps at the physical end of the journal, it has to be
      read in two parts in xlog_do_recovery_pass(): a read at the physical end and a
      read at the physical beginning.  If xlog_bread() has to re-align the first
      read, the second read request does not take that re-alignment into account.
      If the first read was re-aligned, the second read over-writes the end of the
      data from the first read, effectively corrupting it.  This can happen either
      when reading the record header or reading the record data.
      
      The first sanity check in xlog_recover_process_data() is to check for a valid
      clientid, so that is the error reported.
      
      Summary of fix:
      
      If there was a first read at the physical end, XFS_BUF_PTR() returns where the
      data was requested to begin.  Conversely, because it is the result of
      xlog_align(), offset indicates where the requested data for the first read
      actually begins - whether or not xlog_bread() has re-aligned it.
      
      Using offset as the base for the calculation of where to place the second read
      data ensures that it will be correctly placed immediately following the data
      from the first read instead of sometimes over-writing the end of it.
      
      The attached patch has resolved the reported problem of occasional inability
      to recover the journal (reporting "bad clientid").
      Signed-off-by: Andy Poling <andy@realbig.com>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Alex Elder <aelder@sgi.com>
      fc5bc4c8
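The fix described in this commit message comes down to choosing the right base address for the second half of a wrapped read. The sketch below models it with a byte array as the circular log and memcpy() as the read. It is not the code from xlog_do_recovery_pass(): the function and parameter names are hypothetical, and first_dst plays the role that "offset" plays in the message, that is, where the first read's data really begins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a record of len bytes that starts at position start in a circular
 * log of log_size bytes. first_dst is where the first read placed its
 * data; it may lie past the start of the buffer if that read was
 * re-aligned.
 */
void sketch_read_wrapped(const unsigned char *log, size_t log_size,
			 size_t start, size_t len, unsigned char *first_dst)
{
	size_t split;

	assert(start < log_size && len <= log_size);
	split = log_size - start; /* bytes before the physical end */

	if (len <= split) { /* record does not wrap */
		memcpy(first_dst, log + start, len);
		return;
	}
	memcpy(first_dst, log + start, split);
	/* Second read: based on where the first part really is, so it
	 * lands directly after it instead of over-writing its tail. */
	memcpy(first_dst + split, log, len - split);
}

/* Reads "abcdef", which wraps from the end of an 8-byte log to its start. */
int sketch_wrapped_demo(void)
{
	const unsigned char log[8] = { 'c', 'd', 'e', 'f', 'X', 'X', 'a', 'b' };
	unsigned char buf[16] = { 0 };

	/* Pretend the first read was re-aligned 3 bytes into the buffer. */
	sketch_read_wrapped(log, sizeof(log), 6, 6, buf + 3);
	return memcmp(buf + 3, "abcdef", 6) == 0;
}
```

If the second copy were based on the start of the buffer rather than on first_dst, it would land at buf + 2 and clobber the bytes that the first read had just stored, which is the corruption the message reports as "bad clientid".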
  18. 18 Nov, 2009 1 commit