1. 13 3月, 2010 1 次提交
    • S
      tracing: Use same local variable when resetting the ring buffer · 283740c6
      Steven Rostedt 提交于
      In the ftrace code that resets the ring buffer it references the
      buffer with a local variable, but then uses the tr->buffer as the
      parameter to reset. If the wakeup tracer is running, which can
      switch the tr->buffer with the max saved buffer, this can break
      the requirement of disabling the buffer before the reset.
      
         buffer = tr->buffer;
         ring_buffer_record_disable(buffer);
         synchronize_sched();
         __tracing_reset(tr->buffer, cpu);
      
      If the tr->buffer is swapped, then the reset is not happening to the
      buffer that was disabled. This will cause the ring buffer to fail.
      
      Found with Li Zefan's ftrace_stress_test.
      
      Cc: stable@kernel.org
      Reported-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      283740c6
  2. 06 3月, 2010 2 次提交
    • T
      function-graph: Add tracing_thresh support to function_graph tracer · 0e950173
      Tim Bird 提交于
      Add support for tracing_thresh to the function_graph tracer.  This
      version of this feature isolates the checks into new entry and
      return functions, to avoid adding more conditional code into the
      main function_graph paths.
      
      When the tracing_thresh is set and the function graph tracer is
      enabled, only the functions that took longer than the time in
      microseconds that was set in tracing_thresh are recorded. To do this
      efficiently, only the function exits are recorded:
      
       [tracing]# echo 100 > tracing_thresh
       [tracing]# echo function_graph > current_tracer
       [tracing]# cat trace
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
        1) ! 119.214 us  |  } /* smp_apic_timer_interrupt */
        1)   <========== |
        0) ! 101.527 us  |              } /* __rcu_process_callbacks */
        0) ! 126.461 us  |            } /* rcu_process_callbacks */
        0) ! 145.111 us  |          } /* __do_softirq */
        0) ! 149.667 us  |        } /* do_softirq */
        0) ! 168.817 us  |      } /* irq_exit */
        0) ! 248.254 us  |    } /* smp_apic_timer_interrupt */
      
      Also, add support for specifying tracing_thresh on the kernel
      command line.  When used like so: "tracing_thresh=200 ftrace=function_graph"
      this can be used to analyse system startup.  It is important to disable
      tracing soon after boot, in order to avoid losing the trace data.
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NTim Bird <tim.bird@am.sony.com>
      LKML-Reference: <4B87098B.4040308@am.sony.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0e950173
    • A
      tracing: Update the comm field in the right variable in update_max_tr · 1acaa1b2
      Arnaldo Carvalho de Melo 提交于
      The latency output showed:
      
       #    | task: -3 (uid:0 nice:0 policy:1 rt_prio:99)
      
      The comm is missing in the "task:" and it looks like a minus 3 is
      the output. The correct display should be:
      
       #    | task: migration/0-3 (uid:0 nice:0 policy:1 rt_prio:99)
      
      The problem is that the comm is being stored in the wrong data
      structure. The max_tr.data[cpu] is what stores the comm, not the
      tr->data[cpu].
      
      Before this patch the max_tr.data[cpu]->comm was zeroed and the /debug/trace
      ended up showing just the '-' sign followed by the pid.
      
      Also remove a needless initialization of max_data.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1267824230-23861-1-git-send-email-acme@infradead.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1acaa1b2
  3. 03 3月, 2010 1 次提交
    • L
      tracing: Fix warning in s_next of trace file ops · ac91d854
      Lai Jiangshan 提交于
      This warning in s_next() can be triggered by lseek():
       [<c018b3f7>] ? s_next+0x77/0x80
       [<c013e3c1>] warn_slowpath_common+0x81/0xa0
       [<c018b3f7>] ? s_next+0x77/0x80
       [<c013e3fa>] warn_slowpath_null+0x1a/0x20
       [<c018b3f7>] s_next+0x77/0x80
       [<c01efa77>] traverse+0x117/0x200
       [<c01eff13>] seq_lseek+0xa3/0x120
       [<c01efe70>] ? seq_lseek+0x0/0x120
       [<c01d7081>] vfs_llseek+0x41/0x50
       [<c01d8116>] sys_llseek+0x66/0xa0
       [<c0102bd0>] sysenter_do_call+0x12/0x26
      
      The iterator "leftover" variable is zeroed in the opening of the trace
      file. But lseek can call s_start() which will call s_next() without
      reseting the "leftover" variable back to zero, which might trigger
      the WARN_ON_ONCE(iter->leftover) that is in s_next().
      
      Cc: stable@kernel.org
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B8CE06A.9090207@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ac91d854
  4. 26 1月, 2010 1 次提交
    • S
      tracing: Prevent kernel oops with corrupted buffer · 74bf4076
      Steven Rostedt 提交于
      If the contents of the ftrace ring buffer gets corrupted and the trace
      file is read, it could create a kernel oops (usualy just killing the user
      task thread). This is caused by the checking of the pid in the buffer.
      If the pid is negative, it still references the cmdline cache array,
      which could point to an invalid address.
      
      The simple fix is to test for negative PIDs.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      74bf4076
  5. 07 1月, 2010 2 次提交
    • S
      tracing: Add stack dump to trace_printk if stacktrace option is set · d931369b
      Steven Rostedt 提交于
      If the ftrace stacktrace option is set, then add the stack dumps to
      trace_printk.
      Requested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d931369b
    • L
      tracing: Consolidate protection of reader access to the ring buffer · 7e53bd42
      Lai Jiangshan 提交于
      At the beginning, access to the ring buffer was fully serialized
      by trace_types_lock. Patch d7350c3f gives more freedom to readers,
      and patch b04cc6b1 adds code to protect trace_pipe and cpu#/trace_pipe.
      
      But actually it is not enough, ring buffer readers are not always
      read-only, they may consume data.
      
      This patch makes accesses to trace, trace_pipe, trace_pipe_raw
      cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw serialized.
      And removes tracing_reader_cpumask which is used to protect trace_pipe.
      
      Details:
      
      Ring buffer serializes readers, but it is low level protection.
      The validity of the events (which returns by ring_buffer_peek() ..etc)
      are not protected by ring buffer.
      
      The content of events may become garbage if we allow another process to consume
      these events concurrently:
        A) the page of the consumed events may become a normal page
           (not reader page) in ring buffer, and this page will be rewritten
           by the events producer.
        B) The page of the consumed events may become a page for splice_read,
           and this page will be returned to system.
      
      This patch adds trace_access_lock() and trace_access_unlock() primitives.
      
      These primitives allow multi process access to different cpu ring buffers
      concurrently.
      
      These primitives don't distinguish read-only and read-consume access.
      Multi read-only access is also serialized.
      
      And we don't use these primitives when we open files,
      we only use them when we read files.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7e53bd42
  6. 22 12月, 2009 1 次提交
  7. 16 12月, 2009 1 次提交
  8. 15 12月, 2009 4 次提交
  9. 14 12月, 2009 4 次提交
  10. 12 12月, 2009 1 次提交
  11. 11 12月, 2009 1 次提交
    • S
      tracing: Add trace_dump_stack() · 03889384
      Steven Rostedt 提交于
      I've been asked a few times about how to find out what is calling
      some location in the kernel. One way is to use dynamic function tracing
      and implement the func_stack_trace. But this only finds out who is
      calling a particular function. It does not tell you who is calling
      that function and entering a specific if conditional.
      
      I have myself implemented a quick version of trace_dump_stack() for
      this purpose a few times, and just needed it now. This is when I realized
      that this would be a good tool to have in the kernel like trace_printk().
      
      Using trace_dump_stack() is similar to dump_stack() except that it
      writes to the trace buffer instead and can be used in critical locations.
      
      For example:
      
      @@ -5485,8 +5485,12 @@ need_resched_nonpreemptible:
       	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
       		if (unlikely(signal_pending_state(prev->state, prev)))
       			prev->state = TASK_RUNNING;
      -		else
      +		else {
       			deactivate_task(rq, prev, 1);
      +			trace_printk("Deactivating task %s:%d\n",
      +				     prev->comm, prev->pid);
      +			trace_dump_stack();
      +		}
       		switch_count = &prev->nvcsw;
       	}
      
      Produces:
      
                 <...>-3249  [001]   296.105269: schedule: Deactivating task ntpd:3249
                 <...>-3249  [001]   296.105270: <stack trace>
       => schedule
       => schedule_hrtimeout_range
       => poll_schedule_timeout
       => do_select
       => core_sys_select
       => sys_select
       => system_call_fastpath
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      03889384
  12. 10 12月, 2009 3 次提交
    • C
      tracing: Remove comparing of NULL to va_list in trace_array_vprintk() · f2942487
      Carsten Emde 提交于
      Olof Johansson stated the following:
      
        Comparing a va_list with NULL is bogus. It's supposed to be treated like
        an opaque type and only be manipulated with va_* accessors.
      
      Olof noticed that this code broke the ARM builds:
      
          kernel/trace/trace.c: In function 'trace_array_vprintk':
          kernel/trace/trace.c:1364: error: invalid operands to binary == (have 'va_list' and 'void *')
          kernel/trace/trace.c: In function 'tracing_mark_write':
          kernel/trace/trace.c:3349: error: incompatible type for argument 3 of 'trace_vprintk'
      
      This patch partly reverts c13d2f7c and
      re-installs the original mark_printk() mechanism.
      Reported-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NCarsten Emde <C.Emde@osadl.org>
      LKML-Reference: <4B1BAB74.104@osadl.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f2942487
    • S
      tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3
      Steven Rostedt 提交于
      If the seq_read fills the buffer it will call s_start again on the next
      itertation with the same position. This causes a problem with the
      function_graph tracer because it consumes the iteration in order to
      determine leaf functions.
      
      What happens is that the iterator stores the entry, and the function
      graph plugin will look at the next entry. If that next entry is a return
      of the same function and task, then the function is a leaf and the
      function_graph plugin calls ring_buffer_read which moves the ring buffer
      iterator forward (the trace iterator still points to the function start
      entry).
      
      The copying of the trace_seq to the seq_file buffer will fail if the
      seq_file buffer is full. The seq_read will not show this entry.
      The next read by userspace will cause seq_read to again call s_start
      which will reuse the trace iterator entry (the function start entry).
      But the function return entry was already consumed. The function graph
      plugin will think that this entry is a nested function and not a leaf.
      
      To solve this, the trace code now checks the return status of the
      seq_printf (trace_print_seq). If the writing to the seq_file buffer
      fails, we set a flag in the iterator (leftover) and we do not reset
      the trace_seq buffer. On the next call to s_start, we check the leftover
      flag, and if it is set, we just reuse the trace_seq buffer and do not
      call into the plugin print functions.
      
      Before this patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)               |          __fsnotify_parent() {
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      After the patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)   0.548 us    |          __fsnotify_parent();
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      [
        Updated the patch to fix a missing return 0 from the trace_print_seq()
        stub when CONFIG_TRACING is disabled.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      ]
      Reported-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a63ce5b3
    • S
      tracing: Only call pipe_close if pipe_close is defined · 29bf4a5e
      Steven Rostedt 提交于
      This fixes a cut and paste error that had pipe_close get called
      if pipe_open was defined (not pipe_close).
      Reported-by: NKosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20091209153204.F4CD.A69D9226@jp.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      29bf4a5e
  13. 08 12月, 2009 1 次提交
    • S
      tracing: Add pipe_close interface · c521efd1
      Steven Rostedt 提交于
      An ftrace plugin can add a pipe_open interface when the user opens
      trace_pipe. But if the plugin allocates something within the pipe_open
      it can not free it because there exists no pipe_close. The hook to
      the trace file open has a corresponding close. The closing of the
      trace_pipe file should also have a corresponding close.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c521efd1
  14. 17 11月, 2009 1 次提交
  15. 12 11月, 2009 1 次提交
  16. 29 10月, 2009 1 次提交
    • T
      percpu: make percpu symbols in tracer unique · 9705f69e
      Tejun Heo 提交于
      This patch updates percpu related symbols in kernel tracer such that
      percpu symbols are unique and don't clash with local symbols.  This
      serves two purposes of decreasing the possibility of global percpu
      symbol collision and allowing dropping per_cpu__ prefix from percpu
      symbols.
      
      * kernel/trace/trace.c: s/max_data/max_tr_data/
      * kernel/trace/trace_hw_branches: s/tracer/hwb_tracer/, s/buffer/hwb_buffer/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      9705f69e
  17. 24 10月, 2009 1 次提交
  18. 15 10月, 2009 1 次提交
  19. 13 10月, 2009 1 次提交
  20. 12 10月, 2009 1 次提交
  21. 09 10月, 2009 1 次提交
  22. 24 9月, 2009 1 次提交
  23. 23 9月, 2009 1 次提交
  24. 22 9月, 2009 1 次提交
  25. 21 9月, 2009 1 次提交
  26. 19 9月, 2009 1 次提交
  27. 13 9月, 2009 2 次提交
  28. 12 9月, 2009 2 次提交
    • J
      tracing: create generic trace parser · b63f39ea
      jolsa@redhat.com 提交于
      Create a "trace_parser" that can parse the user space input for
      separate words.
      
      struct trace_parser is the descriptor.
      
      Generic "trace_get_user" function that can be a helper to read multiple
      words passed in by user space.
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      LKML-Reference: <1252682969-3366-2-git-send-email-jolsa@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      b63f39ea
    • S
      tracing: add lock depth to entries · 637e7e86
      Steven Rostedt 提交于
      This patch adds the lock depth of the big kernel lock to the generic
      entry header. This way we can see the depth of the lock and help
      in removing the BKL.
      
      Example:
      
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
         <idle>-0       2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
         <idle>-0       2.N..3 5902255253us+: lock_release: rcu_read_lock
         <idle>-0       2dN..3 5902255257us+: lock_acquire: xtime_lock
         <idle>-0       2dN..4 5902255259us : lock_acquire: clocksource_lock
         <idle>-0       2dN..4 5902255261us+: lock_release: clocksource_lock
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      637e7e86