1. 26 1月, 2009 1 次提交
    • A
      blktrace: add ftrace plugin · c71a8961
      Arnaldo Carvalho de Melo 提交于
      Impact: New way of using the blktrace infrastructure
      
      This drops the requirement of userspace utilities to use the blktrace
      facility.
      
      Configuration is done thru sysfs, adding a "trace" directory to the
      partition directory where blktrace can be enabled for the associated
      request_queue.
      
      The same filters present in the IOCTL interface are present as sysfs
      device attributes.
      
      The /sys/block/sdX/sdXN/trace/enable file allows tracing without any
      filters.
      
      The other files in this directory: pid, act_mask, start_lba and end_lba
      can be used with the same meaning as with the IOCTL interface.
      
      Using the sysfs interface will only setup the request_queue->blk_trace
      fields, tracing will only take place when the "blk" tracer is selected
      via the ftrace interface, as in the following example:
      
      To see the trace, one can use the /d/tracing/trace file or the
      /d/tracign/trace_pipe file, with semantics defined in the ftrace
      documentation in Documentation/ftrace.txt.
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3046.491224:   8,1    A WBS 6367 + 8 <- (8,1) 6304
             kjournald-305   [000]  3046.491227:   8,1    Q   R 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491236:   8,1    G  RB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491239:   8,1    P  NS [kjournald]
             kjournald-305   [000]  3046.491242:   8,1    I RBS 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491251:   8,1    D  WB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491610:   8,1    U  WS [kjournald] 1
                <idle>-0     [000]  3046.511914:   8,1    C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      The default line context (prefix) format is the one described in the ftrace
      documentation, with the blktrace specific bits using its existing format,
      described in blkparse(8).
      
      If one wants to have the classic blktrace formatting, this is possible by
      using:
      
      [root@f10-1 ~]# echo blk_classic > /t/trace_options
      [root@f10-1 ~]# cat /t/trace
        8,1    0  3046.491224   305  A WBS 6367 + 8 <- (8,1) 6304
        8,1    0  3046.491227   305  Q   R 6367 + 8 [kjournald]
        8,1    0  3046.491236   305  G  RB 6367 + 8 [kjournald]
        8,1    0  3046.491239   305  P  NS [kjournald]
        8,1    0  3046.491242   305  I RBS 6367 + 8 [kjournald]
        8,1    0  3046.491251   305  D  WB 6367 + 8 [kjournald]
        8,1    0  3046.491610   305  U  WS [kjournald] 1
        8,1    0  3046.511914     0  C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      Using the ftrace standard format allows more flexibility, such
      as the ability of asking for backtraces via trace_options:
      
      [root@f10-1 ~]# echo noblk_classic > /t/trace_options
      [root@f10-1 ~]# echo stacktrace > /t/trace_options
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3318.826779:   8,1    A WBS 6375 + 8 <- (8,1) 6312
             kjournald-305   [000]  3318.826782:
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
       <= child_rip
             kjournald-305   [000]  3318.826836:   8,1    Q   R 6375 + 8 [kjournald]
             kjournald-305   [000]  3318.826837:
       <= generic_make_request
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
      
      Please read the ftrace documentation to use aditional, standardized
      tracing filters such as /d/tracing/trace_cpumask, etc.
      
      See also /d/tracing/trace_mark to add comments in the trace stream,
      that is equivalent to the /d/block/sdaN/msg interface.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c71a8961
  2. 20 1月, 2009 1 次提交
  3. 16 1月, 2009 2 次提交
    • S
      ftrace: remove static from function tracer functions · a225cdd2
      Steven Rostedt 提交于
      Impact: clean up
      
      After reorganizing the functions in trace.c and trace_function.c,
      they no longer need to be in global context. This patch makes the
      functions and one variable into static.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a225cdd2
    • S
      ftrace: add stack trace to function tracer · 53614991
      Steven Rostedt 提交于
      Impact: new feature to stack trace any function
      
      Chris Mason asked about being able to pick and choose a function
      and get a stack trace from it. This feature enables his request.
      
       # echo io_schedule > /debug/tracing/set_ftrace_filter
       # echo function > /debug/tracing/current_tracer
       # echo func_stack_trace > /debug/tracing/trace_options
      
      Produces the following in /debug/tracing/trace:
      
             kjournald-702   [001]   135.673060: io_schedule <-sync_buffer
             kjournald-702   [002]   135.673671:
       <= sync_buffer
       <= __wait_on_bit
       <= out_of_line_wait_on_bit
       <= __wait_on_buffer
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
      
      Note, be careful about turning this on without filtering the functions.
      You may find that you have a 10 second lag between typing and seeing
      what you typed. This is why the stack trace for the function tracer
      does not use the same stack_trace flag as the other tracers use.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      53614991
  4. 14 1月, 2009 1 次提交
  5. 11 1月, 2009 1 次提交
    • F
      tracing/ftrace: handle more than one stat file per tracer · 034939b6
      Frederic Weisbecker 提交于
      Impact: new API for tracers
      
      Make the stat tracing API reentrant. And also provide the new directory
      /debugfs/tracing/trace_stat which will contain all the stat files for the
      current active tracer.
      
      Now a tracer will, if desired, want to provide a zero terminated array of
      tracer_stat structures.
      Each one contains the callbacks necessary for one stat file.
      It have to provide at least a name for its stat file, an iterator with
      stat_start/start_next callback and an output callback for one stat entry.
      
      Also adapt the branch tracer to this new API.
      We create two files "all" and "annotated" inside the /debugfs/tracing/trace_stat
      directory, making the both stats simultaneously available instead of needing
      to change an option to switch from one stat file to another.
      
      The output of these stats haven't changed.
      
      Changes in v2:
      
      _ Apply the previous memory leak fix (rebase against tip/master)
      
      Changes in v3:
      
      _ Merge the patch that adapted the branch tracer to this Api in this patch to
        not break the kernel build.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      034939b6
  6. 01 1月, 2009 1 次提交
  7. 30 12月, 2008 1 次提交
    • F
      tracing/kmemtrace: normalize the raw tracer event to the unified tracing API · 36994e58
      Frederic Weisbecker 提交于
      Impact: new tracer plugin
      
      This patch adapts kmemtrace raw events tracing to the unified tracing API.
      
      To enable and use this tracer, just do the following:
      
       echo kmemtrace > /debugfs/tracing/current_tracer
       cat /debugfs/tracing/trace
      
      You will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
      type_id 1 call_site 18446744071565527833 ptr 18446612134395152256
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      
      That was to stay backward compatible with the format output produced in
      inux/tracepoint.h.
      
      This is the default ouput, but note that I tried something else.
      
      If you change an option:
      
      echo kmem_minimalistic > /debugfs/trace_options
      
      and then cat /debugfs/trace, you will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
         -      C                            0xffff88007c088780          file_free_rcu
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc780     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc870     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc960     -1   d_alloc
         +      K   1304   1312   000000d0   0xffff8800791d7340     -1   reiserfs_alloc_inode
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K    992   1000   000000d0   0xffff880079045b58     -1   alloc_inode
         +      K    768   1024   000080d0   0xffff88007c096400     -1   alloc_pipe_info
         +      K    240    240   000000d0   0xffff8800790dca50     -1   d_alloc
         +      K    272    320   000080d0   0xffff88007c088780     -1   get_empty_filp
         +      K    272    320   000080d0   0xffff88007c088000     -1   get_empty_filp
      
      Yeah I shall confess kmem_minimalistic should be: kmem_alternative.
      
      Whatever, I find it more readable but this a personal opinion of course.
      We can drop it if you want.
      
      On the ALLOC/FREE column, + means an allocation and - a free.
      
      On the type column, you have K = kmalloc, C = cache, P = page
      
      I would like the flags to be GFP_* strings but that would not be easy to not
      break the column with strings....
      
      About the node...it seems to always be -1. I don't know why but that shouldn't
      be difficult to find.
      
      I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would
      be more easy to find the tracer headers if they are all in their common
      directory.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      36994e58
  8. 29 12月, 2008 4 次提交
    • I
      tracing/ftrace: make trace_find_cmdline() generally available · f7d48cbd
      Ingo Molnar 提交于
      Impact: build fix
      
      On !CONFIG_CONTEXT_SWITCH_TRACER trace_find_cmdline() is not defined:
      
       kernel/trace/trace_output.c: In function 'trace_ctxwake_print':
       kernel/trace/trace_output.c:499: error: implicit declaration of function 'trace_find_cmdline'
       kernel/trace/trace_output.c:499: warning: assignment makes pointer from integer without a cast
      
      Move it to the generic section in trace.h.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f7d48cbd
    • F
      tracing/ftrace: provide the base infrastructure for histogram tracing · dbd0b4b3
      Frederic Weisbecker 提交于
      Impact: extend the tracing API
      
      The goal of this patch is to normalize and make more easy the
      implementation of statistical (histogram) tracing.
      
      It implements a trace_stat file into the /debugfs/tracing directory where
      one can print a one-shot output of statistics/histogram entries.
      
      A tracer has to provide two basic iterator callbacks:
      
        stat_start() => the first entry
        stat_next(prev, idx) => the next one.
      
      Note that it is adapted for arrays or hash tables or lists.... since it
      provides a pointer to the previous entry and the current index of the
      iterator.
      
      These two callbacks are called to get a snapshot of the statistics at each
      opening of the trace_stat file because. The values are so updated between
      two "cat trace_stat". And the tracer is free to lock its datas during the
      iteration to keep consistent values.
      
      Since it is almost always interesting to sort statisticals values to
      address the problems by priority, this infrastructure provides a "sorting"
      of the stat entries too if desired. A tracer has just to provide a
      stat_cmp callback to compare two entries and the stat tracing
      infrastructure will build a sorted list of the given entries.
      
      A last callback, called stat_headers, can be implemented by a tracer to
      output headers on its trace.
      
      If one of these callbacks is changed on runtime, it just have to signal it
      to the stat tracing API by calling the init_tracer_stat() helper.
      
      Changes in V2:
      
      - Fix a memory leak if the user opens multiple times the trace_stat file
        without closing it. Now we always free our list before rebuilding it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dbd0b4b3
    • S
      ftrace: set up trace event hash infrastructure · f0868d1e
      Steven Rostedt 提交于
      Impact: simplify/generalize/refactor trace.c
      
      The trace.c file is becoming more difficult to maintain due to the
      growing number of events. There is several formats that an event may
      be printed. This patch sets up the infrastructure of an event hash to
      allow for events to register how they should be printed.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f0868d1e
    • S
      ftrace: remove obsolete print continue functionality · c47956d9
      Steven Rostedt 提交于
      Impact: cleanup, remove obsolete code
      
      Now that the ring buffer used by ftrace allows for variable length
      entries, we do not need the 'cont' feature of the buffer.  This code
      makes other parts of ftrace more complex and by removing this it
      simplifies the ftrace code.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c47956d9
  9. 19 12月, 2008 1 次提交
  10. 17 12月, 2008 1 次提交
    • F
      tracing/ftrace: add the printk-msg-only option · 66896a85
      Frederic Weisbecker 提交于
      Impact: display ftrace_printk messages "as is"
      
      By default, ftrace_printk() messages find their output with some other
      informations like pid, caller, ...
      Sometimes a developer just want to have the ftrace_printk left "as is", without
      other information.
      
      This is done by providing a default-off option called printk-msg-only.
      To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`
      
      Before the patch:
      
                 <...>-2739  [000]   145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
                 <...>-2739  [000]   145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep
      
      After the patch and the printk-msg-only option enabled:
      
      I'm an ftrace_printk msg in __might_sleep
      I'm another ftrace_printk msg in __might_sleep
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      66896a85
  11. 12 12月, 2008 1 次提交
  12. 05 12月, 2008 1 次提交
    • F
      tracing/ftrace: fix the check of ftrace_trace_task · 77d683f3
      Frederic Weisbecker 提交于
      Impact: fix default empty traces on function-graph-tracer
      
      The actual ftrace_trace_task() checks if ftrace_pid_trace is allocated
      and return 1 if it is true.
      If it is NULL, it will check the bit of pid tracing flag for the current
      task (which are not set by default).
      So by default, a task is not traced.
      Actually all tasks should be traced by default and filter_by_pid when
      ftrace_pid_trace is allocated.
      
      The appropriate condition should be to return 1 if filter_by_pid is
      set.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acke-dby: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      77d683f3
  13. 04 12月, 2008 5 次提交
    • F
      tracing/function-graph-tracer: handle ftrace_printk entries · 1fd8f2a3
      Frederic Weisbecker 提交于
      Handle the TRACE_PRINT entries from the function grapg tracer
      and output them as a C comment just below the function that called
      it, as if it was a comment inside this function.
      
      Example with an ftrace_printk inside might_sleep() function:
      
      void __might_sleep(char *file, int line)
      {
      	static unsigned long prev_jiffy;	/* ratelimiting */
      
      	ftrace_printk("Hi I'm a comment in might_sleep() :-)");
      
      A chunk of a resulting trace:
      
       0)               |        _reiserfs_free_block() {
       0)               |          reiserfs_read_bitmap_block() {
       0)               |            __bread() {
       0)               |              __getblk() {
       0)               |                __find_get_block() {
       0)   0.698 us    |                  mark_page_accessed();
       0)   2.267 us    |                }
       0)               |                __might_sleep() {
       0)               |                  /* Hi I'm a comment in might_sleep() :-) */
       0)   1.321 us    |                }
       0)   5.872 us    |              }
       0)   7.313 us    |            }
       0)   8.718 us    |          }
      
      And this patch brings two minor fixes:
      
      - The newline after a switch-out task has disappeared
      - The "|" sign just before the cpu number on task-switch has been deleted.
      
       0)   0.616 us    |                pick_next_task_rt();
       0)   1.457 us    |                _spin_trylock();
       0)   0.653 us    |                _spin_unlock();
       0)   0.728 us    |                _spin_trylock();
       0)   0.631 us    |                _spin_unlock();
       0)   0.729 us    |                native_load_sp0();
       0)   0.593 us    |                native_load_tls();
       ------------------------------------------
       0)    cat-2834    =>   migrati-3
       ------------------------------------------
      
       0)               |    finish_task_switch() {
       0)   0.841 us    |      _spin_unlock_irq();
       0)   0.616 us    |      post_schedule_rt();
       0)   3.882 us    |    }
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1fd8f2a3
    • I
      tracing: fix typo and missing inline function · 6b253930
      Ingo Molnar 提交于
      Impact: fix build bugs
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6b253930
    • S
      ftrace: use struct pid · 978f3a45
      Steven Rostedt 提交于
      Impact: clean up, extend PID filtering to PID namespaces
      
      Eric Biederman suggested using the struct pid for filtering on
      pids in the kernel. This patch is based off of a demonstration
      of an implementation that Eric sent me in an email.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      978f3a45
    • S
      ftrace: trace single pid for function graph tracer · 804a6851
      Steven Rostedt 提交于
      Impact: New feature
      
      This patch makes the changes to set_ftrace_pid apply to the function
      graph tracer.
      
        # echo $$ > /debugfs/tracing/set_ftrace_pid
        # echo function_graph > /debugfs/tracing/current_tracer
      
      Will cause only the current task to be traced. Note, the trace flags are
      also inherited by child processes, so the children of the shell
      will also be traced.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      804a6851
    • S
      ftrace: graph of a single function · ea4e2bc4
      Steven Rostedt 提交于
      This patch adds the file:
      
         /debugfs/tracing/set_graph_function
      
      which can be used along with the function graph tracer.
      
      When this file is empty, the function graph tracer will act as
      usual. When the file has a function in it, the function graph
      tracer will only trace that function.
      
      For example:
      
       # echo blk_unplug > /debugfs/tracing/set_graph_function
       # cat /debugfs/tracing/trace
       [...]
       ------------------------------------------
       | 2)  make-19003  =>  kjournald-2219
       ------------------------------------------
      
       2)               |  blk_unplug() {
       2)               |    dm_unplug_all() {
       2)               |      dm_get_table() {
       2)      1.381 us |        _read_lock();
       2)      0.911 us |        dm_table_get();
       2)      1. 76 us |        _read_unlock();
       2) +   12.912 us |      }
       2)               |      dm_table_unplug_all() {
       2)               |        blk_unplug() {
       2)      0.778 us |          generic_unplug_device();
       2)      2.409 us |        }
       2)      5.992 us |      }
       2)      0.813 us |      dm_table_put();
       2) +   29. 90 us |    }
       2) +   34.532 us |  }
      
      You can add up to 32 functions into this file. Currently we limit it
      to 32, but this may change with later improvements.
      
      To add another function, use the append '>>':
      
        # echo sys_read >> /debugfs/tracing/set_graph_function
        # cat /debugfs/tracing/set_graph_function
        blk_unplug
        sys_read
      
      Using the '>' will clear out the function and write anew:
      
        # echo sys_write > /debug/tracing/set_graph_function
        # cat /debug/tracing/set_graph_function
        sys_write
      
      Note, if you have function graph running while doing this, the small
      time between clearing it and updating it will cause the graph to
      record all functions. This should not be an issue because after
      it sets the filter, only those functions will be recorded from then on.
      If you need to only record a particular function then set this
      file first before starting the function graph tracer. In the future
      this side effect may be corrected.
      
      The set_graph_function file is similar to the set_ftrace_filter but
      it does not take wild cards nor does it allow for more than one
      function to be set with a single write. There is no technical reason why
      this is the case, I just do not have the time yet to implement that.
      
      Note, dynamic ftrace must be enabled for this to appear because it
      uses the dynamic ftrace records to match the name to the mcount
      call sites.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea4e2bc4
  14. 03 12月, 2008 1 次提交
  15. 26 11月, 2008 6 次提交
    • A
      tracing: add "power-tracer": C/P state tracer to help power optimization · f3f47a67
      Arjan van de Ven 提交于
      Impact: new "power-tracer" ftrace plugin
      
      This patch adds a C/P-state ftrace plugin that will generate
      detailed statistics about the C/P-states that are being used,
      so that we can look at detailed decisions that the C/P-state
      code is making, rather than the too high level "average"
      that we have today.
      
      An example way of using this is:
      
       mount -t debugfs none /sys/kernel/debug
       echo cstate > /sys/kernel/debug/tracing/current_tracer
       echo 1 > /sys/kernel/debug/tracing/tracing_enabled
       sleep 1
       echo 0 > /sys/kernel/debug/tracing/tracing_enabled
       cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f3f47a67
    • S
      ftrace: add thread comm to function graph tracer · 660c7f9b
      Steven Rostedt 提交于
      Impact: enhancement to function graph tracer
      
      Export the trace_find_cmdline so the function graph tracer can
      use it to print the comms of the threads.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      660c7f9b
    • F
      tracing/function-return-tracer: set a more human readable output · 287b6e68
      Frederic Weisbecker 提交于
      Impact: feature
      
      This patch sets a C-like output for the function graph tracing.
      For this aim, we now call two handler for each function: one on the entry
      and one other on return. This way we can draw a well-ordered call stack.
      
      The pid of the previous trace is loosely stored to be compared against
      the one of the current trace to see if there were a context switch.
      
      Without this little feature, the call tree would seem broken at
      some locations.
      We could use the sched_tracer to capture these sched_events but this
      way of processing is much more simpler.
      
      2 spaces have been chosen for indentation to fit the screen while deep
      calls. The time of execution in nanosecs is printed just after closed
      braces, it seems more easy this way to find the corresponding function.
      If the time was printed as a first column, it would be not so easy to
      find the corresponding function if it is called on a deep depth.
      
      I plan to output the return value but on 32 bits CPU, the return value
      can be 32 or 64, and its difficult to guess on which case we are.
      I don't know what would be the better solution on X86-32: only print
      eax (low-part) or even edx (high-part).
      
      Actually it's thee same problem when a function return a 8 bits value, the
      high part of eax could contain junk values...
      
      Here is an example of trace:
      
      sys_read() {
        fget_light() {
        } 526
        vfs_read() {
          rw_verify_area() {
            security_file_permission() {
              cap_file_permission() {
              } 519
            } 1564
          } 2640
          do_sync_read() {
            pipe_read() {
              __might_sleep() {
              } 511
              pipe_wait() {
                prepare_to_wait() {
                } 760
                deactivate_task() {
                  dequeue_task() {
                    dequeue_task_fair() {
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 504
                        } 1587
                        clear_buddies() {
                        } 512
                        add_cfs_task_weight() {
                        } 519
                        update_min_vruntime() {
                        } 511
                      } 5602
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 496
                        } 1631
                        clear_buddies() {
                        } 496
                        update_min_vruntime() {
                        } 527
                      } 4580
                      hrtick_update() {
                        hrtick_start_fair() {
                        } 488
                      } 1489
                    } 13700
                  } 14949
                } 16016
                msecs_to_jiffies() {
                } 496
                put_prev_task_fair() {
                } 504
                pick_next_task_fair() {
                } 489
                pick_next_task_rt() {
                } 496
                pick_next_task_fair() {
                } 489
                pick_next_task_idle() {
                } 489
      
      ------------8<---------- thread 4 ------------8<----------
      
      finish_task_switch() {
      } 1203
      do_softirq() {
        __do_softirq() {
          __local_bh_disable() {
          } 669
          rcu_process_callbacks() {
            __rcu_process_callbacks() {
              cpu_quiet() {
                rcu_start_batch() {
                } 503
              } 1647
            } 3128
            __rcu_process_callbacks() {
            } 542
          } 5362
          _local_bh_enable() {
          } 587
        } 8880
      } 9986
      kthread_should_stop() {
      } 669
      deactivate_task() {
        dequeue_task() {
          dequeue_task_fair() {
            dequeue_entity() {
              update_curr() {
                calc_delta_mine() {
                } 511
                update_min_vruntime() {
                } 511
              } 2813
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      287b6e68
    • F
      tracing/function-return-tracer: change the name into function-graph-tracer · fb52607a
      Frederic Weisbecker 提交于
      Impact: cleanup
      
      This patch changes the name of the "return function tracer" into
      function-graph-tracer which is a more suitable name for a tracing
      which makes one able to retrieve the ordered call stack during
      the code flow.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fb52607a
    • M
      x86, bts, ftrace: a BTS ftrace plug-in prototype · 1e9b51c2
      Markus Metzger 提交于
      Impact: add new ftrace plugin
      
      A prototype for a BTS ftrace plug-in.
      
      The tracer collects branch trace in a cyclic buffer for each cpu.
      
      The tracer is not configurable and the trace for each snapshot is
      appended when doing cat /debug/tracing/trace.
      
      This is a proof of concept that will be extended with future patches
      to become a (hopefully) useful tool.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1e9b51c2
    • M
      x86, ftrace: call trace->open() before stopping tracing; add trace->print_header() · 8bba1bf5
      Markus Metzger 提交于
      Add a callback to allow an ftrace plug-in to write its own header.
      
      Move the call to trace->open() up a few lines.
      
      The changes are required by the BTS ftrace plug-in.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8bba1bf5
  16. 23 11月, 2008 2 次提交
    • T
      tracing: identify which executable object the userspace address belongs to · b54d3de9
      Török Edwin 提交于
      Impact: modify+improve the userstacktrace tracing visualization feature
      
      Store thread group leader id, and use it to lookup the address in the
      process's map. We could have looked up the address on thread's map,
      but the thread might not exist by the time we are called. The process
      might not exist either, but if you are reading trace_pipe, that is
      unlikely.
      
      Example usage:
      
       mount -t debugfs nodev /sys/kernel/debug
       cd /sys/kernel/debug/tracing
       echo userstacktrace >iter_ctrl
       echo sym-userobj >iter_ctrl
       echo sched_switch >current_tracer
       echo 1 >tracing_enabled
       cat trace_pipe >/tmp/trace&
       .... run application ...
       echo 0 >tracing_enabled
       cat /tmp/trace
      
      You'll see stack entries like:
      
         /lib/libpthread-2.7.so[+0xd370]
      
      You can convert them to function/line using:
      
         addr2line -fie /lib/libpthread-2.7.so 0xd370
      
      Or:
      
         addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370
      
      For non-PIC/PIE executables this won't work:
      
         a.out[+0x73b]
      
      You need to run the following: addr2line -fie a.out 0x40073b
      (where 0x400000 is the default load address of a.out)
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b54d3de9
    • T
      tracing: add support for userspace stacktraces in tracing/iter_ctrl · 02b67518
      Török Edwin 提交于
      Impact: add new (default-off) tracing visualization feature
      
      Usage example:
      
       mount -t debugfs nodev /sys/kernel/debug
       cd /sys/kernel/debug/tracing
       echo userstacktrace >iter_ctrl
       echo sched_switch >current_tracer
       echo 1 >tracing_enabled
       .... run application ...
       echo 0 >tracing_enabled
      
      Then read one of 'trace','latency_trace','trace_pipe'.
      
      To get the best output you can compile your userspace programs with
      frame pointers (at least glibc + the app you are tracing).
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      02b67518
  17. 18 11月, 2008 2 次提交
    • F
      tracing/function-return-tracer: add the overrun field · 0231022c
      Frederic Weisbecker 提交于
      Impact: help to find the better depth of trace
      
      We decided to arbitrary define the depth of function return trace as
      "20". Perhaps this is not enough. To help finding an optimal depth, we
      measure now the overrun: the number of functions that have been missed
      for the current thread. By default this is not displayed, we have to
      do set a particular flag on the return tracer: echo overrun >
      /debug/tracing/trace_options And the overrun will be printed on the
      right.
      
      As the trace shows below, the current 20 depth is not enough.
      
      update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
      update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
      do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
      tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
      tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
      vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
      vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
      set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
      con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
      release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
      release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
      con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
      con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
      n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
      irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
      ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
      ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
      ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
      hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0231022c
    • F
      tracing/ftrace: implement a set_flag callback for tracers · adf9f195
      Frederic Weisbecker 提交于
      Impact: give a way to send specific messages to tracers
      
      The current implementation of tracing uses some flags to control the
      output of general tracers. But we have no way to implement custom
      flags handling for a specific tracer. This patch proposes a new
      callback for the struct tracer which called set_flag and a structure
      that represents a 32 bits variable flag.
      
      A tracer can implement a struct tracer_flags on which it puts the
      initial value of the flag integer. Than it can place a range of flags
      with their name and their flag mask on the flag integer. The structure
      that implement a single flag is called struct tracer_opt.
      
      These custom flags will be available through the trace_options file
      like the general tracing flags. Changing their value is done like the
      other general flags. For example if you have a flag that calls "foo",
      you can activate it by writing "foo" or "nofoo" on trace_options.
      
      Note that the set_flag callback is optional and is only needed if you
      want the flags changing to be signaled to your tracer and let it to
      accept or refuse their assignment.
      
      V2: Some arrangements in coding style....
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      adf9f195
  18. 17 11月, 2008 1 次提交
  19. 16 11月, 2008 1 次提交
  20. 13 11月, 2008 4 次提交
    • S
      ftrace: CPU buffer start annotation clean ups · 12ef7d44
      Steven Rostedt 提交于
      Impact: better handling of CPU buffer start annotation
      
      Because of the confusion with the per CPU buffers wrapping where
      one CPU might be more active at the end of the trace than the other
      CPUs causing that one CPU to have a shorter history. Kernel
      developers were confused by the "missing" data of that one CPU
      at the beginning of the trace output. An annotation was added to
      the trace output to show that the buffer had started:
      
       # tracer: function
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
       ##### CPU 3 buffer started ####
                <idle>-0     [003]   158.192959: smp_apic_timer_interrupt
       [...]
                 <idle>-0     [003]   161.556520: default_idle
       ##### CPU 1 buffer started ####
                 <idle>-0     [001]   161.592494: hrtimer_force_reprogram
       [etc]
      
      But this annotation gets a bit messy when tracers do not fill the
      buffers. This patch does a couple of things:
      
       One) it adds a flag to trace_options to disable these annotations
      
       Two) it does not annotate if the tracer did not overflow its buffer.
      
      This makes the output much cleaner.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      12ef7d44
    • S
      ftrace: add tracer called branch · 80e5ea45
      Steven Rostedt 提交于
      Impact: added new branch tracer
      
      Currently the tracing of branch profiling (unlikelys and likelys hit)
      is only activated by the iter_ctrl. This patch adds a tracer called
      "branch" that will just trace the branch profiling. The advantage
      of adding this tracer is that it can be added to the ftrace selftests
      on startup.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      80e5ea45
    • S
      ftrace: rename unlikely iter_ctrl to branch · 9f029e83
      Steven Rostedt 提交于
      Impact: rename of iter_ctrl unlikely to branch
      
      The unlikely name is ugly. This patch converts the iter_ctrl command
      "unlikely" and "nounlikely" to "branch" and "nobranch" respectively.
      
      It also renames a lot of internal functions to use "branch" instead
      of "unlikely".
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f029e83
    • S
      trace: rename unlikely profiler to branch profiler · 2ed84eeb
      Steven Rostedt 提交于
      Impact: name change of unlikely tracer and profiler
      
      Ingo Molnar suggested changing the config from UNLIKELY_PROFILE
      to BRANCH_PROFILING. I never did like the "unlikely" name so I
      went one step farther, and renamed all the unlikely configurations
      to a "BRANCH" variant.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2ed84eeb
  21. 12 11月, 2008 2 次提交
    • S
      tracing: likely/unlikely branch annotation tracer · 52f232cb
      Steven Rostedt 提交于
      Impact: new likely/unlikely branch tracer
      
      This patch adds a way to record the instances of the likely() and unlikely()
      branch condition annotations.
      
      When "unlikely" is set in /debugfs/tracing/iter_ctrl the unlikely conditions
      will be added to any of the ftrace tracers. The change takes effect when
      a new tracer is passed into the current_tracer file.
      
      For example:
      
       bash-3471  [003]   357.014755: [INCORRECT] sched_info_dequeued:sched_stats.h:177
       bash-3471  [003]   357.014756: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014758: [correct] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014759: [correct] account_group_exec_runtime:sched_stats.h:356
       bash-3471  [003]   357.014761: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014763: [INCORRECT] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014765: [correct] calc_delta_mine:sched.c:1279
      
      Which shows the normal tracer heading, as well as whether the condition was
      correct "[correct]" or was mistaken "[INCORRECT]", followed by the function,
      file name and line number.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      52f232cb
    • F
      tracing/fastboot: Use the ring-buffer timestamp for initcall entries · 74239072
      Frederic Weisbecker 提交于
      Impact: Split the boot tracer entries in two parts: call and return
      
      Now that we are using the sched tracer from the boot tracer, we want
      to use the same timestamp than the ring-buffer to have consistent time
      captures between sched events and initcall events.
      
      So we get rid of the old time capture by the boot tracer and split the
      initcall events in two parts: call and return. This way we have the
      ring buffer timestamp of both.
      
      An example trace:
      
      [   27.904149584] calling  net_ns_init+0x0/0x1c0 @ 1
      [   27.904429624] initcall net_ns_init+0x0/0x1c0 returned 0 after 0 msecs
      [   27.904575926] calling  reboot_init+0x0/0x20 @ 1
      [   27.904655399] initcall reboot_init+0x0/0x20 returned 0 after 0 msecs
      [   27.904800228] calling  sysctl_init+0x0/0x30 @ 1
      [   27.905142914] initcall sysctl_init+0x0/0x30 returned 0 after 0 msecs
      [   27.905287211] calling  ksysfs_init+0x0/0xb0 @ 1
       ##### CPU 0 buffer started ####
                  init-1     [000]    27.905395:      1:120:R   + [001]    11:115:S
       ##### CPU 1 buffer started ####
                <idle>-0     [001]    27.905425:      0:140:R ==> [001]    11:115:R
                  init-1     [000]    27.905426:      1:120:D ==> [000]     0:140:R
                <idle>-0     [000]    27.905431:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905451:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905456:      4:115:S ==> [000]     0:140:R
                 udevd-11    [001]    27.905458:     11:115:R   + [001]    14:115:R
                <idle>-0     [000]    27.905459:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905462:      0:140:R ==> [000]     4:115:R
                 udevd-11    [001]    27.905462:     11:115:R ==> [001]    14:115:R
           ksoftirqd/0-4     [000]    27.905467:      4:115:S ==> [000]     0:140:R
                <idle>-0     [000]    27.905470:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905473:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905476:      4:115:S ==> [000]     0:140:R
                <idle>-0     [000]    27.905479:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905482:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905486:      4:115:S ==> [000]     0:140:R
                 udevd-14    [001]    27.905499:     14:120:X ==> [001]    11:115:R
                 udevd-11    [001]    27.905506:     11:115:R   + [000]     1:120:D
                <idle>-0     [000]    27.905515:      0:140:R ==> [000]     1:120:R
                 udevd-11    [001]    27.905517:     11:115:S ==> [001]     0:140:R
      [   27.905557107] initcall ksysfs_init+0x0/0xb0 returned 0 after 3906 msecs
      [   27.905705736] calling  init_jiffies_clocksource+0x0/0x10 @ 1
      [   27.905779239] initcall init_jiffies_clocksource+0x0/0x10 returned 0 after 0 msecs
      [   27.906769814] calling  pm_init+0x0/0x30 @ 1
      [   27.906853627] initcall pm_init+0x0/0x30 returned 0 after 0 msecs
      [   27.906997803] calling  pm_disk_init+0x0/0x20 @ 1
      [   27.907076946] initcall pm_disk_init+0x0/0x20 returned 0 after 0 msecs
      [   27.907222556] calling  swsusp_header_init+0x0/0x30 @ 1
      [   27.907294325] initcall swsusp_header_init+0x0/0x30 returned 0 after 0 msecs
      [   27.907439620] calling  stop_machine_init+0x0/0x50 @ 1
                  init-1     [000]    27.907485:      1:120:R   + [000]     2:115:S
                  init-1     [000]    27.907490:      1:120:D ==> [000]     2:115:R
              kthreadd-2     [000]    27.907507:      2:115:R   + [001]    15:115:R
                <idle>-0     [001]    27.907517:      0:140:R ==> [001]    15:115:R
              kthreadd-2     [000]    27.907517:      2:115:D ==> [000]     0:140:R
                <idle>-0     [000]    27.907521:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.907524:      0:140:R ==> [000]     4:115:R
                 udevd-15    [001]    27.907527:     15:115:D   + [000]     2:115:D
           ksoftirqd/0-4     [000]    27.907537:      4:115:S ==> [000]     2:115:R
                 udevd-15    [001]    27.907537:     15:115:D ==> [001]     0:140:R
              kthreadd-2     [000]    27.907546:      2:115:R   + [000]     1:120:D
              kthreadd-2     [000]    27.907550:      2:115:S ==> [000]     1:120:R
                  init-1     [000]    27.907584:      1:120:R   + [000]    15:  0:D
                  init-1     [000]    27.907589:      1:120:R   + [000]     2:115:S
                  init-1     [000]    27.907593:      1:120:D ==> [000]    15:  0:R
                 udevd-15    [000]    27.907601:     15:  0:S ==> [000]     2:115:R
       ##### CPU 0 buffer started ####
              kthreadd-2     [000]    27.907616:      2:115:R   + [001]    16:115:R
       ##### CPU 1 buffer started ####
                <idle>-0     [001]    27.907620:      0:140:R ==> [001]    16:115:R
              kthreadd-2     [000]    27.907621:      2:115:D ==> [000]     0:140:R
                 udevd-16    [001]    27.907625:     16:115:D   + [000]     2:115:D
                <idle>-0     [000]    27.907628:      0:140:R   + [000]     4:115:S
                 udevd-16    [001]    27.907629:     16:115:D ==> [001]     0:140:R
                <idle>-0     [000]    27.907631:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.907636:      4:115:S ==> [000]     2:115:R
              kthreadd-2     [000]    27.907644:      2:115:R   + [000]     1:120:D
              kthreadd-2     [000]    27.907647:      2:115:S ==> [000]     1:120:R
                  init-1     [000]    27.907657:      1:120:R   + [001]    16:  0:D
                <idle>-0     [001]    27.907666:      0:140:R ==> [001]    16:  0:R
      [   27.907703862] initcall stop_machine_init+0x0/0x50 returned 0 after 0 msecs
      [   27.907850704] calling  filelock_init+0x0/0x30 @ 1
      [   27.907926573] initcall filelock_init+0x0/0x30 returned 0 after 0 msecs
      [   27.908071327] calling  init_script_binfmt+0x0/0x10 @ 1
      [   27.908165195] initcall init_script_binfmt+0x0/0x10 returned 0 after 0 msecs
      [   27.908309461] calling  init_elf_binfmt+0x0/0x10 @ 1
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      74239072