1. 06 May, 2009 3 commits
    •
      tracing: export stats of ring buffers to userspace · c8d77183
      Steven Rostedt committed
      This patch adds stats to the ftrace ring buffers:
      
       # cat /debugfs/tracing/per_cpu/cpu0/stats
       entries: 42360
       overrun: 30509326
       commit overrun: 0
       nmi dropped: 0
      
      Where entries are the total number of data entries in the buffer.
      
      overrun is the number of entries that were not consumed before being
      overwritten by the writer.
      
      commit overrun is the number of entries dropped due to nested writers
      wrapping the buffer before the initial writer finished the commit.
      
      nmi dropped is the number of entries dropped due to the ring buffer
      lock being held when an nmi was going to write to the ring buffer.
      Note, this field will be meaningless and will go away when the ring
      buffer becomes lockless.
      
      [ Impact: let userspace know what is happening in the ring buffers ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      ring-buffer: add counters for commit overrun and nmi dropped entries · f0d2c681
      Steven Rostedt committed
      The WARN_ON in the ring buffer when a commit is preempted and the
      buffer is filled by preceding writes can happen in normal operations.
      The WARN_ON makes it look like a bug, not to mention, because
      it does not stop tracing and calls printk which can also recurse, this
      is prone to deadlock (the WARN_ON is not in a position to recurse).
      
      This patch removes the WARN_ON and replaces it with a counter that
      can be retrieved by a tracer. This counter is called commit_overrun.
      
      While at it, I added a nmi_dropped counter to count any time an NMI entry
      is dropped because the NMI could not take the spinlock.
      
      [ Impact: prevent deadlock caused by printing a warning in a normal case ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      ring-buffer: export symbols · d6ce96da
      Steven Rostedt committed
      I'm adding a module to do a series of tests on the ring buffer as well
      as benchmarks. This module needs to have more of the ring buffer API
      exported. There's nothing wrong with reading the ring buffer from a
      module.
      
      [ Impact: allow modules to read pages from the ring buffer ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 29 Apr, 2009 7 commits
    •
      tracing/filters: a better event parser · 8b372562
      Tom Zanussi committed
      Replace the current event parser hack with a better one.  Filters are
      no longer specified predicate by predicate, but all at once and can
      use parens and any of the following operators:
      
      numeric fields:
      
      ==, !=, <, <=, >, >=
      
      string fields:
      
      ==, !=
      
      predicates can be combined with the logical operators:
      
      &&, ||
      
      examples:
      
      "common_preempt_count > 4" > filter
      
      "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
      
      If there was an error, the erroneous string along with an error
      message can be seen by looking at the filter e.g.:
      
      ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
      ^
      parse_error: Field not found
      
      Currently the caret for an error always appears at the beginning of
      the filter; a real position should be used, but the error message
      should be useful even without it.
      
      To clear a filter, '0' can be written to the filter file.
      
      Filters can also be set or cleared for a complete subsystem by writing
      the same filter as would be written to an individual event to the
      filter file at the root of the subsystem.  Note however, that if any
      event in the subsystem lacks a field specified in the filter being
      set, the set will fail and all filters in the subsystem are
      automatically cleared.  This change from the previous version was made
      because using only the fields that happen to exist for a given event
      would most likely result in a meaningless filter.
      
      Because the logical operators are now implemented as predicates, the
      maximum number of predicates in a filter was increased from 8 to 16.
      
      [ Impact: add new, extended trace-filter implementation ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905899.6416.121.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
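The remark in the commit above that logical operators are themselves predicates (hence the limit of 16) can be illustrated with a small sketch. This is not the kernel's filter code: every name here (`struct pred`, `filter_match`, the `OP_*` values) is hypothetical, the predicates are assumed to be stored in postfix order, and only a single numeric field is handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's filter code. */
enum pred_op { OP_EQ, OP_NE, OP_LT, OP_LE, OP_GT, OP_GE, OP_AND, OP_OR };

#define MAX_PREDS 16   /* limit quoted in the commit message */

struct pred {
    enum pred_op op;
    long val;          /* right-hand side; ignored for OP_AND / OP_OR */
};

/* Compare one numeric field value against a predicate's constant. */
static bool compare(enum pred_op op, long field, long val)
{
    switch (op) {
    case OP_EQ: return field == val;
    case OP_NE: return field != val;
    case OP_LT: return field <  val;
    case OP_LE: return field <= val;
    case OP_GT: return field >  val;
    case OP_GE: return field >= val;
    default:    return false;
    }
}

/*
 * Evaluate predicates stored in postfix order against a single field.
 * Returns -1 for a malformed filter, otherwise 0 (no match) or 1 (match).
 */
static int filter_match(const struct pred *preds, size_t n, long field)
{
    bool stack[MAX_PREDS];
    size_t top = 0;

    if (n == 0 || n > MAX_PREDS)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (preds[i].op == OP_AND || preds[i].op == OP_OR) {
            /* A logical operator consumes the two most recent results. */
            if (top < 2)
                return -1;
            bool rhs = stack[--top];
            bool lhs = stack[--top];
            stack[top++] = (preds[i].op == OP_AND) ? (lhs && rhs)
                                                   : (lhs || rhs);
        } else {
            stack[top++] = compare(preds[i].op, field, preds[i].val);
        }
    }
    return (top == 1) ? (int)stack[0] : -1;
}
```

Under these assumptions, the numeric part of the second example, `(sig >= 10 && sig < 15) || sig == 17`, takes five slots: three comparisons plus the `&&` and `||` entries. That is why counting operators as predicates makes a limit of 8 tight.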
    •
      tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi committed
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
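A compile-time signedness test of the kind described in the commit above can be sketched as follows. The macro body is one common way to write such a check and is an assumption here, not a quote from the patch. `struct field_desc`, `FIELD_DESC()` and `example_fields` are hypothetical illustrations.

```c
#include <stdbool.h>

/*
 * Casting -1 to the type yields a negative value for signed integer types
 * and the largest representable value for unsigned ones.
 */
#define is_signed_type(type) (((type)(-1)) < (type)0)

/* Hypothetical field descriptor, loosely modelled on the description above. */
struct field_desc {
    const char *name;
    int size;
    bool is_signed;
};

/* Record name, size and signedness of a field from its declared type. */
#define FIELD_DESC(type, item) \
    { #item, (int)sizeof(type), is_signed_type(type) }

static const struct field_desc example_fields[] = {
    FIELD_DESC(int, sig),
    FIELD_DESC(unsigned char, common_preempt_count),
    FIELD_DESC(long, offset),
    FIELD_DESC(unsigned long, ip),
};
```

A cast-based check like this only works for scalar types. It cannot be applied to arrays or structures, which is presumably the kind of case where an explicit variant such as TRACE_FIELD_SIGN() is needed.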
    •
      tracing/filters: move preds into event_filter object · 30e673b2
      Tom Zanussi committed
      Create a new event_filter object, and move the pred-related members
      out of the call and subsystem objects and into the filter object - the
      details of the filter implementation don't need to be exposed in the
      call and subsystem in any case, and it will also help make the new
      parser implementation a little cleaner.
      
      [ Impact: refactor trace-filter code to prepare for new features ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905887.6416.119.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      ring-buffer: fix printk output · 7d7d2b80
      Steven Rostedt committed
      The warning output in trace_recursive_lock uses %d for a long when
      it should be %ld.
      
      [ Impact: fix compile warning ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: have splice only copy full pages · f2957f1f
      Steven Rostedt committed
      Splice works with pages, so it is much more efficient to use an entire
      page than to copy bits over several pages.
      
      Using logdev to trace the internals of the splice mechanism, I was
      able to see that splice can be very aggressive. When tracing is
      occurring, and the reader caught up to the writer, and the writer
      is on the reader page, the reader will copy what is there into the
      splice page. Splice may iterate over several pages and if the
      writer is still writing to the page, the reader will keep copying
      bits to new pages to pass to userspace.
      
      This patch changes it to only pass data to userspace if the page
      is full (the writer has left the page). This has a small side effect
      that splice can not read a partial page, and must wait for the
      page to fill. This should not be an issue. If tracing has stopped,
      then a use of "read" will still read all of the page.
      
      [ Impact: better performance for ring buffer splice code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: only add splice page if entries exist · 93459c6c
      Steven Rostedt committed
      The splice code allocates a page even when the ring buffer is empty.
      It detects the ring buffer being empty when it fails to copy
      anything from the ring buffer into the page.
      
      This patch adds a check to see if there is anything in the ring buffer
      before allocating a page.
      
      Thanks to logdev for letting me trace the tracer to find this.
      
      [ Impact: speed up due to removing unnecessary allocation ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: fix ref count in splice pages · 5beae6ef
      Steven Rostedt committed
      The pages allocated for the splice binary buffer did not initialize
      the ref count correctly. This caused pages not to be freed, resulting
      in a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 28 Apr, 2009 1 commit
    •
      tracing: convert ftrace_dump spinlocks to raw · cd891ae0
      Steven Rostedt committed
      ftrace_dump is used for printing out the contents of the ftrace ring buffer
      to the console on failure. Currently it uses a spinlock to synchronize
      the output from multiple failures on different CPUs. This spin lock
      currently is a normal spinlock and can cause issues with lockdep and
      lock tracing.
      
      This patch converts it to raw since it is for error handling only.
      The lock is local to the ftrace_dump and is not used by any other
      infrastructure.
      
      [ Impact: prevent ftrace_dump from locking up by internal tracing ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 26 Apr, 2009 1 commit
    •
      tracing/events: make modules have their own file_operations structure · 701970b3
      Steven Rostedt committed
      For proper module reference counting, the file_operations that modules use
      must have the "owner" field set to the module. Unfortunately, the trace events
      use shared file_operations. The same file_operations are used by both the
      kernel core and all modules.
      
      This patch makes the modules allocate their own file_operations and
      copies the functions from the core kernel. This allows those file
      operations to be owned by the module.
      
      Care is taken to free this code on module unload.
      
      Thanks to Greg KH for reminding me that file_operations must be owned
      by the module to have reference counting take place.
      
      [ Impact: fix modular tracepoints / potential crash ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
  5. 25 Apr, 2009 1 commit
    •
      tracing/events: reuse trace event ids after overflow · 060fa5c8
      Steven Rostedt committed
      With modules being able to add trace events, and the max trace event
      counter being 16 bits (65536), we can overflow the counter easily
      with a simple while loop adding and removing modules that contain
      trace events.
      
      This patch links together the registered trace events and on overflow
      searches for available trace event ids. It will still fail if
      over 65536 events are registered, but considering that a typical
      kernel only has 22000 functions, 65000 events should be sufficient.
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
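The approach described in the commit above (link the registered events together and, once the counter has overflowed, search for a free id) can be sketched roughly as follows. This is a minimal illustration, not the kernel implementation: `struct registry`, `register_event()` and the other names are hypothetical, and reserving id 0 is an assumption made for the sketch.

```c
#include <stddef.h>

#define EVENT_TYPE_MAX   65535  /* largest value a 16-bit type field can hold */
#define FIRST_DYNAMIC_ID 1      /* assumption: id 0 is reserved */

struct event {
    int id;
    struct event *next;         /* list is kept sorted by ascending id */
};

struct registry {
    struct event *head;
    int next_id;                /* next never-used id */
};

static void registry_init(struct registry *r)
{
    r->head = NULL;
    r->next_id = FIRST_DYNAMIC_ID;
}

/* Link ev into the sorted list under the given id. */
static void insert_sorted(struct registry *r, struct event *ev, int id)
{
    struct event **pp = &r->head;

    while (*pp && (*pp)->id < id)
        pp = &(*pp)->next;
    ev->id = id;
    ev->next = *pp;
    *pp = ev;
}

/* Returns the assigned id, or 0 if every id is in use. */
static int register_event(struct registry *r, struct event *ev)
{
    int id;

    if (r->next_id <= EVENT_TYPE_MAX) {
        id = r->next_id++;          /* fast path: counter not exhausted */
    } else {
        /* Counter overflowed: walk the sorted list to find the first gap. */
        id = FIRST_DYNAMIC_ID;
        for (struct event *e = r->head; e; e = e->next) {
            if (e->id != id)
                break;
            id++;
        }
        if (id > EVENT_TYPE_MAX)
            return 0;
    }
    insert_sorted(r, ev, id);
    return id;
}

/* Unlink ev so that its id becomes available again. */
static void unregister_event(struct registry *r, struct event *ev)
{
    struct event **pp = &r->head;

    while (*pp && *pp != ev)
        pp = &(*pp)->next;
    if (*pp)
        *pp = ev->next;
}
```

Keeping the list sorted makes the gap search a single linear walk. The walk runs only after the counter is exhausted, so the common registration path stays cheap.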
  6. 24 Apr, 2009 4 commits
    •
      ring_buffer: compressed event header · 334d4169
      Lai Jiangshan committed
      RB_MAX_SMALL_DATA = 28 bytes is too small for most tracers; it wastes
      a 'u32' to save the actual length for events whose data size is > 28.
      
      This fix uses compressed event header and enlarges RB_MAX_SMALL_DATA.
      
      [ Impact: saves about 0%-12.5%(depends on tracer) memory in ring_buffer ]
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49F13189.3090000@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: add size checks for exported ftrace internal structures · 75db37d2
      Steven Rostedt committed
      The events exported by TRACE_EVENT are automated and are guaranteed
      to be correct when used.
      
      The internal ftrace structures on the other hand are more manually
      exported. These require the ftrace maintainer to make sure they
      are up to date.
      
      This patch adds a size check to help flag when a type changes in
      an internal ftrace data structure, and the update needs to be reflected
      in the export.
      
      If an export is incorrect, then the only harm is that the user space
      tools will not know how to correctly read the internal structures of
      ftrace.
      
      [ Impact: help prevent inconsistent ftrace format print outs ]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    •
      tracing: increase size of number of possible events · 89ec0dee
      Steven Rostedt committed
      With the new event tracing registration, we must increase the number
      of events that can be registered. Currently the type field is only
      one byte, which leaves us only 256 possible events.
      
      Since we do not save the CPU number in the tracer anymore (it is determined
      by the per cpu ring buffer that is used) we have an extra byte to use.
      
      This patch increases the size of type from 1 byte (256 events) to
      2 bytes (65,536 events).
      
      It also adds a WARN_ON_ONCE if we exceed that limit.
      
      [ Impact: allow more than 255 events ]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    •
      tracing/wakeup: move access to wakeup_cpu into spinlock · 9be24414
      Steven Rostedt committed
      The code had the following outside the lock:
      
              if (next != wakeup_task)
                      return;
      
              pc = preempt_count();
      
              /* The task we are waiting for is waking up */
              data = wakeup_trace->data[wakeup_cpu];
      
      On initialization, wakeup_task is NULL and wakeup_cpu -1. This code
      is not under a lock. If wakeup_task is set on another CPU as that
      task is waking up, we can see the wakeup_task before wakeup_cpu is
      set. If we read wakeup_cpu while it is still -1 then we will have
      a bad data pointer.
      
      This patch moves the reading of wakeup_cpu within the protection of
      the spinlock used to protect the writing of wakeup_cpu and wakeup_task.
      
      [ Impact: remove possible race causing invalid pointer dereference ]
      Reported-by: Maneesh Soni <maneesh@in.ibm.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  7. 22 Apr, 2009 1 commit
    •
      tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Li Zefan committed
      struct trace_entry->type is unsigned char, while a trace event's id is
      of int type. Thus for an event with id >= 256, its entry->type is
      truncated to (id % 256), and we can't see the trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
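The truncation described in the commit above is easy to reproduce in isolation. The two structures below are hypothetical stand-ins for the entry layout before and after the change, and the sketch assumes an 8-bit `unsigned char`.

```c
/* Hypothetical stand-ins for the entry layout before and after the change. */
struct entry_before { unsigned char type; };
struct entry_after  { int type; };

/* Store an event id in the narrow field and read it back. */
static int stored_type_before(int id)
{
    struct entry_before e;

    e.type = (unsigned char)id;   /* keeps only the low 8 bits of the id */
    return e.type;
}

/* Store an event id in the widened field and read it back. */
static int stored_type_after(int id)
{
    struct entry_after e;

    e.type = id;                  /* full id survives */
    return e.type;
}
```

With the narrow field, an id of 256 reads back as 0, which matches the "Unknown type 0" lines in the trace_pipe output quoted above.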
  8. 21 Apr, 2009 8 commits
    •
      ring-buffer: only warn on wrap if buffer is bigger than two pages · 3554228d
      Steven Rostedt committed
      On boot up, to save memory, ftrace allocates the minimum buffer
      which is two pages. Ftrace also goes through a series of tests
      (when configured) on boot up. These tests can fill up a page within
      a single interrupt.
      
      The ring buffer also has a WARN_ON when it detects that the buffer was
      completely filled within a single commit (other commits are allowed to
      be nested).
      
      Combining the small buffer on start up with the tests that can fill more
      than a single page within an interrupt can trigger the WARN_ON.
      
      This patch makes the WARN_ON only happen when the ring buffer consists
      of more than two pages.
      
      [ Impact: prevent false WARN_ON in ftrace startup tests ]
      Reported-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <20090421094616.GA14561@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      tracing/filters: allow user-input to be integer-like string · f66578a7
      Li Zefan committed
      Suppose we would like to trace all tasks named '123', but this
      will fail:
      
       # echo 'parent_comm == 123' > events/sched/sched_process_fork/filter
       bash: echo: write error: Invalid argument
      
      Don't guess the type of the filter pred in filter_parse(), but instead
      we check it in __filter_add_pred().
      
      [ Impact: extend allowed filter field string values ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49ED8DEB.6000700@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      tracing/filters: don't remove old filters when failed to write subsys->filter · e8082f3f
      Li Zefan committed
      If writing subsys->filter returns EINVAL or ENOSPC, the original
      filters in subsys/ and subsys/events/ will be removed. This is
      definitely wrong.
      
      [ Impact: fix filter setting semantics on error condition ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49ED8DD2.2070700@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    •
      tracing: use nowakeup version of commit for function event trace tests · cb4764a6
      Steven Rostedt committed
      The startup tests for the event tracer also run with the function
      tracer enabled. The "wakeup" version of the trace commit was used
      which can grab spinlocks. If a task was preempted by an NMI
      that called a function being traced, it could deadlock due to the
      function tracer trying to grab the same lock.
      
      Thanks to Frederic Weisbecker for pointing out where the bug was.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: use recursive counter over irq level · aa18efb2
      Steven Rostedt committed
      Although using the irq level (hardirq_count, softirq_count and in_nmi)
      was nice to detect bad recursion right away, the counters are not
      atomically updated with respect to the interrupts, so the function tracer
      might trigger the test from an interrupt handler before the hardirq_count
      is updated. This will trigger a false warning.
      
      This patch converts the recursive detection to a simple counter.
      If the depth is greater than 16 then the recursive detection will trigger.
      16 is more than enough for any nested interrupts.
      
      [ Impact: fix false positive trace recursion detection ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
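A depth counter of the kind the commit above describes might look like the following sketch. The names (`struct task_ctx`, `recursion_enter()`, `recursion_exit()`) are hypothetical, and the real code additionally warns and stops tracing when the limit is hit, which is left out here.

```c
#include <stdbool.h>

#define MAX_TRACE_DEPTH 16   /* limit quoted in the commit message */

/* Hypothetical per-task state; the kernel keeps its counter in the task. */
struct task_ctx {
    int depth;
};

/*
 * Called before reserving space in the buffer.
 * Returns false when the nesting would exceed MAX_TRACE_DEPTH.
 */
static bool recursion_enter(struct task_ctx *t)
{
    if (t->depth >= MAX_TRACE_DEPTH)
        return false;        /* deeper than any sane interrupt nesting */
    t->depth++;
    return true;
}

/* Called after the commit; undoes one successful recursion_enter(). */
static void recursion_exit(struct task_ctx *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

Unlike a check derived from the irq counts, this counter depends only on the tracer's own enter and exit calls. It therefore cannot be confused by an interrupt arriving before the hardirq count has been updated.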
    •
      tracing: remove recursive test from ring_buffer_event_discard · e395898e
      Steven Rostedt committed
      The ring_buffer_event_discard is not tied to ring_buffer_lock_reserve.
      It can be called inside or outside the reserve/commit. Even if it
      is called inside the reserve/commit the commit part must also be called.
      
      Only ring_buffer_discard_commit can be used as a replacement for
      ring_buffer_unlock_commit.
      
      This patch removes the trace_recursive_unlock from ring_buffer_event_discard
      since it would be the wrong place to do so.
      
      [ Impact: prevent breakage in trace recursive testing ]
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: fix recursive test level calculation · 17487bfe
      Steven Rostedt committed
      The recursive tests to detect same level recursion in the ring buffers
      did not account for the hard/softirq_counts being shifted. Thus the
      numbers could be larger than the mask to be tested.
      
      This patch includes the shift for the calculation of the irq depth.
      
      [ Impact: stop false positives in trace recursion detection ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing/events: call the correct event trace selftest init function · 28d20e2d
      Steven Rostedt committed
      The late_initcall calls a helper function instead of the proper
      init event selftest function.
      
      This update may have been lost due to conflicting merges.
      
      [ Impact: fix compiler warning and call extended event trace self tests ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  9. 20 Apr, 2009 5 commits
    •
      tracing: rename EVENT_TRACER config to ENABLE_EVENT_TRACING · a7abe97f
      Steven Rostedt committed
      Currently we have two configs: EVENT_TRACING and EVENT_TRACER.
      All tracers enable EVENT_TRACING. The EVENT_TRACER is only a
      convenience to enable the EVENT_TRACING when no other tracers
      are enabled.
      
      The names EVENT_TRACER and EVENT_TRACING are too similar and confusing.
      This patch renames EVENT_TRACER to ENABLE_EVENT_TRACING to be more
      appropriate to what it actually does, as well as add a comment in
      the help menu to explain the option's purpose.
      
      [ Impact: rename config option to reduce confusion ]
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: create menuconfig for tracing infrastructure · 4ed9f071
      Steven Rostedt committed
      During testing we often use randconfig to test various kernels.
      The current configuration set up does not give an easy way to disable
      all tracing with a single config. The case where randconfig would
      test all tracing disabled is very unlikely.
      
      This patch adds a config option to enable or disable all tracing.
      It is hooked into the tracing menu just like other submenus are done.
      
      [ Impact: allow randconfig to easily produce all traces disabled ]
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: change branch profiling to a choice selection · 9ae5b879
      Steven Rostedt committed
      This patch makes the branch profiling into a choice selection:
      
        None               - no branch profiling
        likely/unlikely    - only profile likely/unlikely branches
        all                - profile all branches
      
      The all profiler will also enable the likely/unlikely branches.
      
      This does not change the way the profiler works or the dependencies
      between the profilers.
      
      What this patch does, is keep the branch profiling from being selected
      by an allyesconfig make. The branch profiler is very intrusive and
      it is known to break various architecture builds when selected as an
      allyesconfig.
      
      [ Impact: prevent branch profiler from being selected in allyesconfig ]
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing/ring-buffer: Add unlock recursion protection on discard · f3b9aae1
      Frederic Weisbecker committed
      The pair of helpers trace_recursive_lock() and trace_recursive_unlock()
      have been introduced recently to provide generic tracing recursion
      protection.
      
      They are used in a symmetric way:
      
       - trace_recursive_lock() on buffer reserve
       - trace_recursive_unlock() on buffer commit
      
      However sometimes, we don't commit but discard an entry
      in the buffer, i.e. in case of filter checking.
      
      Then we must also unlock the recursion protection on discard time,
      otherwise the tracing gets permanently deactivated and a warning
      is raised spuriously, such as:
      
      [  111.119821] ------------[ cut here ]------------
      [  111.119829] WARNING: at kernel/trace/ring_buffer.c:1498 ring_buffer_lock_reserve+0x1b7/0x1d0()
      [  111.119835] Hardware name: AMILO Li 2727
      [  111.119839] Modules linked in:
      [  111.119846] Pid: 5731, comm: Xorg Tainted: G        W  2.6.30-rc1 #69
      [  111.119851] Call Trace:
      [  111.119863]  [<ffffffff8025ce68>] warn_slowpath+0xd8/0x130
      [  111.119873]  [<ffffffff8028a30f>] ? __lock_acquire+0x19f/0x1ae0
      [  111.119882]  [<ffffffff8028a30f>] ? __lock_acquire+0x19f/0x1ae0
      [  111.119891]  [<ffffffff802199b0>] ? native_sched_clock+0x20/0x70
      [  111.119899]  [<ffffffff80286dee>] ? put_lock_stats+0xe/0x30
      [  111.119906]  [<ffffffff80286eb8>] ? lock_release_holdtime+0xa8/0x150
      [  111.119913]  [<ffffffff802c8ae7>] ring_buffer_lock_reserve+0x1b7/0x1d0
      [  111.119921]  [<ffffffff802cd110>] trace_buffer_lock_reserve+0x30/0x70
      [  111.119930]  [<ffffffff802ce000>] trace_current_buffer_lock_reserve+0x20/0x30
      [  111.119939]  [<ffffffff802474e8>] ftrace_raw_event_sched_switch+0x58/0x100
      [  111.119948]  [<ffffffff808103b7>] __schedule+0x3a7/0x4cd
      [  111.119957]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.119964]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.119971]  [<ffffffff80810c08>] schedule+0x18/0x40
      [  111.119977]  [<ffffffff80810e09>] preempt_schedule+0x39/0x60
      [  111.119985]  [<ffffffff80813bd3>] _read_unlock+0x53/0x60
      [  111.119993]  [<ffffffff807259d2>] sock_def_readable+0x72/0x80
      [  111.120002]  [<ffffffff807ad5ed>] unix_stream_sendmsg+0x24d/0x3d0
      [  111.120011]  [<ffffffff807219a3>] sock_aio_write+0x143/0x160
      [  111.120019]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.120026]  [<ffffffff80721860>] ? sock_aio_write+0x0/0x160
      [  111.120033]  [<ffffffff80721860>] ? sock_aio_write+0x0/0x160
      [  111.120042]  [<ffffffff8031c283>] do_sync_readv_writev+0xf3/0x140
      [  111.120049]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.120057]  [<ffffffff80276ff0>] ? autoremove_wake_function+0x0/0x40
      [  111.120067]  [<ffffffff8045d489>] ? cap_file_permission+0x9/0x10
      [  111.120074]  [<ffffffff8045c1e6>] ? security_file_permission+0x16/0x20
      [  111.120082]  [<ffffffff8031cab4>] do_readv_writev+0xd4/0x1f0
      [  111.120089]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.120097]  [<ffffffff80211b56>] ? ftrace_call+0x5/0x2b
      [  111.120105]  [<ffffffff8031cc18>] vfs_writev+0x48/0x70
      [  111.120111]  [<ffffffff8031cd65>] sys_writev+0x55/0xc0
      [  111.120119]  [<ffffffff80211e32>] system_call_fastpath+0x16/0x1b
      [  111.120125] ---[ end trace 15605f4e98d5ccb5 ]---
      
      [ Impact: fix spurious warning triggering tracing shutdown ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    •
      tracing/core: Add current context on tracing recursion warning · e057a5e5
      Frederic Weisbecker committed
      In case of tracing recursion detection, we only get the stacktrace.
      But the current context may be very useful to debug the issue.
      
      This patch adds the softirq/hardirq/nmi context with the warning
      using lockdep context display to have a familiar output.
      
      v2: Use printk_once()
      v3: drop {hardirq,softirq}_context which depend on lockdep,
          only keep what is part of current->trace_recursion,
          sufficient to debug the warning source.
      
      [ Impact: print context necessary to debug recursion ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  10. 18 Apr, 2009 5 commits
    •
      tracing: protect trace_printk from recursion · 3189cdb3
      Steven Rostedt committed
      trace_printk can be called from any context, including NMIs.
      If this happens, then we must test for recursion before
      grabbing any spinlocks.
      
      This patch prevents trace_printk from being called recursively.
      
      [ Impact: prevent hard lockup in lockdep event tracer ]
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    •
      tracing: add same level recursion detection · 261842b7
      Steven Rostedt committed
      The tracing infrastructure allows for recursion. That is, an interrupt
      may interrupt the act of tracing an event, and that interrupt may very well
      perform its own trace. This is a recursive trace, and is fine to do.
      
      The problem arises when there is a bug, and the utility doing the trace
      calls something that recurses back into the tracer. This recursion is not
      caused by an external event like an interrupt, but by code that is not
      expected to recurse. The result could be a lockup.
      
      This patch adds a bitmask to the task structure that keeps track
      of the trace recursion. To find the interrupt depth, the following
      algorithm is used:
      
        level = hardirq_count() + softirq_count() + in_nmi;
      
      Here, level will be the depth of interrupts and softirqs, and even handles
      the nmi. Then the corresponding bit is set in the recursion bitmask.
      If the bit was already set, we know we had a recursion at the same level
      and we warn about it and fail the writing to the buffer.
      
      After the data has been committed to the buffer, we clear the bit.
      No atomics are needed. The only races are with interrupts and they reset
      the bitmask before returning anyway.
      
      [ Impact: detect same irq level trace recursion ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
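The per-level bitmask from the commit above can be sketched as below. The caller passes in the level (in the commit it is derived from `hardirq_count()`, `softirq_count()` and `in_nmi`). `struct task_ctx`, `level_enter()` and `level_exit()` are hypothetical names, and the warning and shutdown logic of the real code is left out.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical per-task state holding one bit per interrupt nesting level. */
struct task_ctx {
    unsigned long recursion_mask;
};

#define MASK_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Called before writing to the buffer at the given level.
 * Returns false if a trace is already in progress at the same level,
 * i.e. the tracer recursed into itself without an interrupt in between.
 */
static bool level_enter(struct task_ctx *t, unsigned int level)
{
    unsigned long bit;

    if (level >= MASK_BITS)
        return false;            /* level does not fit in the mask */

    bit = 1UL << level;
    if (t->recursion_mask & bit)
        return false;            /* same-level recursion detected */

    t->recursion_mask |= bit;
    return true;
}

/* Called after the data has been committed; clears the level's bit. */
static void level_exit(struct task_ctx *t, unsigned int level)
{
    if (level < MASK_BITS)
        t->recursion_mask &= ~(1UL << level);
}
```

An interrupt that arrives mid-trace runs at a higher level and therefore uses a different bit, so legitimate nesting passes. Re-entering at the same level finds its bit already set and is rejected.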
    •
      tracing: add EXPORT_SYMBOL_GPL for trace commits · 12acd473
      Steven Rostedt committed
      Not all the necessary symbols were exported to allow for tracing
      by modules. This patch adds them in.
      
      [ Impact: allow modules to commit data to the ring buffer ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      12acd473
    • T
      tracing/filters: add filter_mutex to protect filter predicates · ac1adc55
      Tom Zanussi authored
      This patch adds a filter_mutex to prevent the filter predicates from
      being accessed concurrently by various external functions.
      
      It's based on a previous patch by Li Zefan:
              "[PATCH 7/7] tracing/filters: make filter preds RCU safe"
      
      v2 changes:
      
      - fixed the wrong value returned in an add_subsystem_pred() failure case,
        noticed by Li Zefan.
      
      [ Impact: fix trace filter corruption/crashes on parallel access ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
      Tested-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1239946028.6639.13.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ac1adc55
    • L
      tracing: fix file mode of trace and README · 339ae5d3
      Li Zefan authored
      trace is read-write and README is read-only.
      
      [ Impact: fix /debug/tracing/ file permissions. ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <49E7EAB6.4070605@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      339ae5d3
  11. 17 Apr, 2009 4 commits
    • S
      tracing/events: perform function tracing in event selftests · 9ea21c1e
      Steven Rostedt authored
      We can find some bugs in the trace events if we stress the writes as well.
      The function tracer is a good way to stress the events.
      
      [ Impact: extend scope of event tracer self-tests ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090416161746.604786131@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9ea21c1e
    • A
      tracing: add saved_cmdlines file to show cached task comms · 69abe6a5
      Avadh Patel authored
      Export the cached task comms to userspace. This allows user apps to translate
      the pids from a trace into their respective task command lines.
      
      [ Impact: let userspace apps reading binary buffer know comm's of pids ]
      Signed-off-by: Avadh Patel <avadh4all@gmail.com>
      [ added error checking and use of buf pointer to index file_buf ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      69abe6a5
    • S
      tracing/events/ring-buffer: expose format of ring buffer headers to users · d1b182a8
      Steven Rostedt authored
      Currently, everything needed to read the binary output from the
      ring buffers is available, with the exception of the way the ring
      buffer lays out its pages and events internally.
      
      This patch creates two special files in the debugfs/tracing/events
      directory:
      
       # cat /debug/tracing/events/header_page
              field: u64 timestamp;   offset:0;       size:8;
              field: local_t commit;  offset:8;       size:8;
              field: char data;       offset:16;      size:4080;
      
       # cat /debug/tracing/events/header_event
              type        :    2 bits
              len         :    3 bits
              time_delta  :   27 bits
              array       :   32 bits
      
              padding     : type == 0
              time_extend : type == 1
              data        : type == 3
      
      This is to allow a userspace app to see if the ring buffer format changes
      or not.
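
      As an illustration of how a userspace reader might use these two
      descriptions, here is a minimal C sketch. It rests on assumptions that the
      files above do not state: that the type field occupies the two lowest bits
      of the first 32-bit word, followed by len and then time_delta, and that
      local_t is 8 bytes wide as on a 64-bit kernel. The names
      rb_page_layout, rb_event_header and decode_event_header() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Page layout as printed by header_page (assumes an 8-byte local_t). */
struct rb_page_layout {
	uint64_t timestamp;	/* offset 0,  size 8    */
	uint64_t commit;	/* offset 8,  size 8    */
	char data[4080];	/* offset 16, size 4080 */
};

/* Decoded form of the first 32-bit word of an event (hypothetical type). */
struct rb_event_header {
	unsigned int type;	/* 2 bits: 0 = padding, 1 = time_extend, 3 = data */
	unsigned int len;	/* 3 bits */
	uint32_t time_delta;	/* 27 bits */
};

/*
 * Split a 32-bit header word into its fields with explicit shifts.
 * ASSUMPTION: type is in the lowest bits; the real bit order depends on
 * how the kernel's bitfields are laid out on the target architecture.
 */
static struct rb_event_header decode_event_header(uint32_t word)
{
	struct rb_event_header h;

	h.type = word & 0x3u;
	h.len = (word >> 2) & 0x7u;
	h.time_delta = word >> 5;
	return h;
}
```

      Explicit shifts are used instead of C bitfields because bitfield ordering
      is implementation-defined, so a reader that re-parses header_event at
      startup can adjust the shift amounts if the reported widths ever change.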
      
      [ Impact: allow userspace apps to know of ringbuffer format changes ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      d1b182a8
    • S
      tracing/events: add startup tests for events · e6187007
      Steven Rostedt authored
      As events become popular and become the new way to add tracing
      infrastructure to ftrace, it is important to catch any problems
      caused by a mistake in the TRACE_EVENT macro.
      
      This patch introduces a startup self-test on the registered trace
      events. Note that it can only do a generic test; any testing that
      needs more involvement must be implemented by the tracepoint
      creators.
      
      The test walks the events one by one, enabling each trace point and
      running some random tasks (random in the sense that I just made them up).
      Those tasks create threads, grab mutexes and spinlocks,
      and use workqueues.
      
      After testing each event individually, it does the same test after
      enabling each system of trace points, such as sched, irq, and lockdep.
      
      Then finally it enables all tracepoints and performs the tasks again.
      The output to the console on bootup will look like this when everything
      works:
      
      Running tests on trace events:
      Testing event kfree_skb: OK
      Testing event kmalloc: OK
      Testing event kmem_cache_alloc: OK
      Testing event kmalloc_node: OK
      Testing event kmem_cache_alloc_node: OK
      Testing event kfree: OK
      Testing event kmem_cache_free: OK
      Testing event irq_handler_exit: OK
      Testing event irq_handler_entry: OK
      Testing event softirq_entry: OK
      Testing event softirq_exit: OK
      Testing event lock_acquire: OK
      Testing event lock_release: OK
      Testing event sched_kthread_stop: OK
      Testing event sched_kthread_stop_ret: OK
      Testing event sched_wait_task: OK
      Testing event sched_wakeup: OK
      Testing event sched_wakeup_new: OK
      Testing event sched_switch: OK
      Testing event sched_migrate_task: OK
      Testing event sched_process_free: OK
      Testing event sched_process_exit: OK
      Testing event sched_process_wait: OK
      Testing event sched_process_fork: OK
      Testing event sched_signal_send: OK
      Running tests on trace event systems:
      Testing event system skb: OK
      Testing event system kmem: OK
      Testing event system irq: OK
      Testing event system lockdep: OK
      Testing event system sched: OK
      Running tests on all trace events:
      Testing all events: OK
      
      [ folded in:
      
        tracing: add #include <linux/delay.h> to fix build failure in test_work()
      
        This build failure occurred on a few rare configs:
      
         kernel/trace/trace_events.c: In function ‘test_work’:
         kernel/trace/trace_events.c:975: error: implicit declaration of function ‘udelay’
         kernel/trace/trace_events.c:980: error: implicit declaration of function ‘msleep’
      
        delay.h is included in way too many other headers, hiding cases
        where new usage is added without header inclusion.
      
        [ Impact: build fix ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ]
      
      [ Impact: add event tracer self-tests ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e6187007