1. 04 12月, 2008 1 次提交
    • S
      ftrace: graph of a single function · ea4e2bc4
      Steven Rostedt 提交于
      This patch adds the file:
      
         /debugfs/tracing/set_graph_function
      
      which can be used along with the function graph tracer.
      
      When this file is empty, the function graph tracer will act as
      usual. When the file has a function in it, the function graph
      tracer will only trace that function.
      
      For example:
      
       # echo blk_unplug > /debugfs/tracing/set_graph_function
       # cat /debugfs/tracing/trace
       [...]
       ------------------------------------------
       | 2)  make-19003  =>  kjournald-2219
       ------------------------------------------
      
       2)               |  blk_unplug() {
       2)               |    dm_unplug_all() {
       2)               |      dm_get_table() {
       2)      1.381 us |        _read_lock();
       2)      0.911 us |        dm_table_get();
       2)      1. 76 us |        _read_unlock();
       2) +   12.912 us |      }
       2)               |      dm_table_unplug_all() {
       2)               |        blk_unplug() {
       2)      0.778 us |          generic_unplug_device();
       2)      2.409 us |        }
       2)      5.992 us |      }
       2)      0.813 us |      dm_table_put();
       2) +   29. 90 us |    }
       2) +   34.532 us |  }
      
      You can add up to 32 functions into this file. Currently we limit it
      to 32, but this may change with later improvements.
      
      To add another function, use the append '>>':
      
        # echo sys_read >> /debugfs/tracing/set_graph_function
        # cat /debugfs/tracing/set_graph_function
        blk_unplug
        sys_read
      
      Using the '>' will clear out the function and write anew:
      
        # echo sys_write > /debug/tracing/set_graph_function
        # cat /debug/tracing/set_graph_function
        sys_write
      
      Note, if you have function graph running while doing this, the small
      time between clearing it and updating it will cause the graph to
      record all functions. This should not be an issue because after
      it sets the filter, only those functions will be recorded from then on.
      If you need to only record a particular function then set this
      file first before starting the function graph tracer. In the future
      this side effect may be corrected.
      
      The set_graph_function file is similar to the set_ftrace_filter but
      it does not take wild cards nor does it allow for more than one
      function to be set with a single write. There is no technical reason why
      this is the case, I just do not have the time yet to implement that.
      
      Note, dynamic ftrace must be enabled for this to appear because it
      uses the dynamic ftrace records to match the name to the mcount
      call sites.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ea4e2bc4
  2. 03 12月, 2008 1 次提交
  3. 26 11月, 2008 6 次提交
    • A
      tracing: add "power-tracer": C/P state tracer to help power optimization · f3f47a67
      Arjan van de Ven 提交于
      Impact: new "power-tracer" ftrace plugin
      
      This patch adds a C/P-state ftrace plugin that will generate
      detailed statistics about the C/P-states that are being used,
      so that we can look at detailed decisions that the C/P-state
      code is making, rather than the too high level "average"
      that we have today.
      
      An example way of using this is:
      
       mount -t debugfs none /sys/kernel/debug
       echo cstate > /sys/kernel/debug/tracing/current_tracer
       echo 1 > /sys/kernel/debug/tracing/tracing_enabled
       sleep 1
       echo 0 > /sys/kernel/debug/tracing/tracing_enabled
       cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg
      Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f3f47a67
    • S
      ftrace: add thread comm to function graph tracer · 660c7f9b
      Steven Rostedt 提交于
      Impact: enhancement to function graph tracer
      
      Export the trace_find_cmdline so the function graph tracer can
      use it to print the comms of the threads.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      660c7f9b
    • F
      tracing/function-return-tracer: set a more human readable output · 287b6e68
      Frederic Weisbecker 提交于
      Impact: feature
      
      This patch sets a C-like output for the function graph tracing.
      For this aim, we now call two handler for each function: one on the entry
      and one other on return. This way we can draw a well-ordered call stack.
      
      The pid of the previous trace is loosely stored to be compared against
      the one of the current trace to see if there were a context switch.
      
      Without this little feature, the call tree would seem broken at
      some locations.
      We could use the sched_tracer to capture these sched_events but this
      way of processing is much more simpler.
      
      2 spaces have been chosen for indentation to fit the screen while deep
      calls. The time of execution in nanosecs is printed just after closed
      braces, it seems more easy this way to find the corresponding function.
      If the time was printed as a first column, it would be not so easy to
      find the corresponding function if it is called on a deep depth.
      
      I plan to output the return value but on 32 bits CPU, the return value
      can be 32 or 64, and its difficult to guess on which case we are.
      I don't know what would be the better solution on X86-32: only print
      eax (low-part) or even edx (high-part).
      
      Actually it's thee same problem when a function return a 8 bits value, the
      high part of eax could contain junk values...
      
      Here is an example of trace:
      
      sys_read() {
        fget_light() {
        } 526
        vfs_read() {
          rw_verify_area() {
            security_file_permission() {
              cap_file_permission() {
              } 519
            } 1564
          } 2640
          do_sync_read() {
            pipe_read() {
              __might_sleep() {
              } 511
              pipe_wait() {
                prepare_to_wait() {
                } 760
                deactivate_task() {
                  dequeue_task() {
                    dequeue_task_fair() {
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 504
                        } 1587
                        clear_buddies() {
                        } 512
                        add_cfs_task_weight() {
                        } 519
                        update_min_vruntime() {
                        } 511
                      } 5602
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 496
                        } 1631
                        clear_buddies() {
                        } 496
                        update_min_vruntime() {
                        } 527
                      } 4580
                      hrtick_update() {
                        hrtick_start_fair() {
                        } 488
                      } 1489
                    } 13700
                  } 14949
                } 16016
                msecs_to_jiffies() {
                } 496
                put_prev_task_fair() {
                } 504
                pick_next_task_fair() {
                } 489
                pick_next_task_rt() {
                } 496
                pick_next_task_fair() {
                } 489
                pick_next_task_idle() {
                } 489
      
      ------------8<---------- thread 4 ------------8<----------
      
      finish_task_switch() {
      } 1203
      do_softirq() {
        __do_softirq() {
          __local_bh_disable() {
          } 669
          rcu_process_callbacks() {
            __rcu_process_callbacks() {
              cpu_quiet() {
                rcu_start_batch() {
                } 503
              } 1647
            } 3128
            __rcu_process_callbacks() {
            } 542
          } 5362
          _local_bh_enable() {
          } 587
        } 8880
      } 9986
      kthread_should_stop() {
      } 669
      deactivate_task() {
        dequeue_task() {
          dequeue_task_fair() {
            dequeue_entity() {
              update_curr() {
                calc_delta_mine() {
                } 511
                update_min_vruntime() {
                } 511
              } 2813
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      287b6e68
    • F
      tracing/function-return-tracer: change the name into function-graph-tracer · fb52607a
      Frederic Weisbecker 提交于
      Impact: cleanup
      
      This patch changes the name of the "return function tracer" into
      function-graph-tracer which is a more suitable name for a tracing
      which makes one able to retrieve the ordered call stack during
      the code flow.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fb52607a
    • M
      x86, bts, ftrace: a BTS ftrace plug-in prototype · 1e9b51c2
      Markus Metzger 提交于
      Impact: add new ftrace plugin
      
      A prototype for a BTS ftrace plug-in.
      
      The tracer collects branch trace in a cyclic buffer for each cpu.
      
      The tracer is not configurable and the trace for each snapshot is
      appended when doing cat /debug/tracing/trace.
      
      This is a proof of concept that will be extended with future patches
      to become a (hopefully) useful tool.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1e9b51c2
    • M
      x86, ftrace: call trace->open() before stopping tracing; add trace->print_header() · 8bba1bf5
      Markus Metzger 提交于
      Add a callback to allow an ftrace plug-in to write its own header.
      
      Move the call to trace->open() up a few lines.
      
      The changes are required by the BTS ftrace plug-in.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8bba1bf5
  4. 23 11月, 2008 2 次提交
    • T
      tracing: identify which executable object the userspace address belongs to · b54d3de9
      Török Edwin 提交于
      Impact: modify+improve the userstacktrace tracing visualization feature
      
      Store thread group leader id, and use it to lookup the address in the
      process's map. We could have looked up the address on thread's map,
      but the thread might not exist by the time we are called. The process
      might not exist either, but if you are reading trace_pipe, that is
      unlikely.
      
      Example usage:
      
       mount -t debugfs nodev /sys/kernel/debug
       cd /sys/kernel/debug/tracing
       echo userstacktrace >iter_ctrl
       echo sym-userobj >iter_ctrl
       echo sched_switch >current_tracer
       echo 1 >tracing_enabled
       cat trace_pipe >/tmp/trace&
       .... run application ...
       echo 0 >tracing_enabled
       cat /tmp/trace
      
      You'll see stack entries like:
      
         /lib/libpthread-2.7.so[+0xd370]
      
      You can convert them to function/line using:
      
         addr2line -fie /lib/libpthread-2.7.so 0xd370
      
      Or:
      
         addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370
      
      For non-PIC/PIE executables this won't work:
      
         a.out[+0x73b]
      
      You need to run the following: addr2line -fie a.out 0x40073b
      (where 0x400000 is the default load address of a.out)
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b54d3de9
    • T
      tracing: add support for userspace stacktraces in tracing/iter_ctrl · 02b67518
      Török Edwin 提交于
      Impact: add new (default-off) tracing visualization feature
      
      Usage example:
      
       mount -t debugfs nodev /sys/kernel/debug
       cd /sys/kernel/debug/tracing
       echo userstacktrace >iter_ctrl
       echo sched_switch >current_tracer
       echo 1 >tracing_enabled
       .... run application ...
       echo 0 >tracing_enabled
      
      Then read one of 'trace','latency_trace','trace_pipe'.
      
      To get the best output you can compile your userspace programs with
      frame pointers (at least glibc + the app you are tracing).
      Signed-off-by: NTörök Edwin <edwintorok@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      02b67518
  5. 18 11月, 2008 2 次提交
    • F
      tracing/function-return-tracer: add the overrun field · 0231022c
      Frederic Weisbecker 提交于
      Impact: help to find the better depth of trace
      
      We decided to arbitrary define the depth of function return trace as
      "20". Perhaps this is not enough. To help finding an optimal depth, we
      measure now the overrun: the number of functions that have been missed
      for the current thread. By default this is not displayed, we have to
      do set a particular flag on the return tracer: echo overrun >
      /debug/tracing/trace_options And the overrun will be printed on the
      right.
      
      As the trace shows below, the current 20 depth is not enough.
      
      update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
      update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
      do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
      tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
      tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
      vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
      vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
      set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
      con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
      release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
      release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
      con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
      con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
      n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
      irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
      ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
      ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
      ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
      hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0231022c
    • F
      tracing/ftrace: implement a set_flag callback for tracers · adf9f195
      Frederic Weisbecker 提交于
      Impact: give a way to send specific messages to tracers
      
      The current implementation of tracing uses some flags to control the
      output of general tracers. But we have no way to implement custom
      flags handling for a specific tracer. This patch proposes a new
      callback for the struct tracer which called set_flag and a structure
      that represents a 32 bits variable flag.
      
      A tracer can implement a struct tracer_flags on which it puts the
      initial value of the flag integer. Than it can place a range of flags
      with their name and their flag mask on the flag integer. The structure
      that implement a single flag is called struct tracer_opt.
      
      These custom flags will be available through the trace_options file
      like the general tracing flags. Changing their value is done like the
      other general flags. For example if you have a flag that calls "foo",
      you can activate it by writing "foo" or "nofoo" on trace_options.
      
      Note that the set_flag callback is optional and is only needed if you
      want the flags changing to be signaled to your tracer and let it to
      accept or refuse their assignment.
      
      V2: Some arrangements in coding style....
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      adf9f195
  6. 17 11月, 2008 1 次提交
  7. 16 11月, 2008 1 次提交
  8. 13 11月, 2008 4 次提交
    • S
      ftrace: CPU buffer start annotation clean ups · 12ef7d44
      Steven Rostedt 提交于
      Impact: better handling of CPU buffer start annotation
      
      Because of the confusion with the per CPU buffers wrapping where
      one CPU might be more active at the end of the trace than the other
      CPUs causing that one CPU to have a shorter history. Kernel
      developers were confused by the "missing" data of that one CPU
      at the beginning of the trace output. An annotation was added to
      the trace output to show that the buffer had started:
      
       # tracer: function
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
       ##### CPU 3 buffer started ####
                <idle>-0     [003]   158.192959: smp_apic_timer_interrupt
       [...]
                 <idle>-0     [003]   161.556520: default_idle
       ##### CPU 1 buffer started ####
                 <idle>-0     [001]   161.592494: hrtimer_force_reprogram
       [etc]
      
      But this annotation gets a bit messy when tracers do not fill the
      buffers. This patch does a couple of things:
      
       One) it adds a flag to trace_options to disable these annotations
      
       Two) it does not annotate if the tracer did not overflow its buffer.
      
      This makes the output much cleaner.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      12ef7d44
    • S
      ftrace: add tracer called branch · 80e5ea45
      Steven Rostedt 提交于
      Impact: added new branch tracer
      
      Currently the tracing of branch profiling (unlikelys and likelys hit)
      is only activated by the iter_ctrl. This patch adds a tracer called
      "branch" that will just trace the branch profiling. The advantage
      of adding this tracer is that it can be added to the ftrace selftests
      on startup.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      80e5ea45
    • S
      ftrace: rename unlikely iter_ctrl to branch · 9f029e83
      Steven Rostedt 提交于
      Impact: rename of iter_ctrl unlikely to branch
      
      The unlikely name is ugly. This patch converts the iter_ctrl command
      "unlikely" and "nounlikely" to "branch" and "nobranch" respectively.
      
      It also renames a lot of internal functions to use "branch" instead
      of "unlikely".
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f029e83
    • S
      trace: rename unlikely profiler to branch profiler · 2ed84eeb
      Steven Rostedt 提交于
      Impact: name change of unlikely tracer and profiler
      
      Ingo Molnar suggested changing the config from UNLIKELY_PROFILE
      to BRANCH_PROFILING. I never did like the "unlikely" name so I
      went one step farther, and renamed all the unlikely configurations
      to a "BRANCH" variant.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2ed84eeb
  9. 12 11月, 2008 3 次提交
    • S
      tracing: likely/unlikely branch annotation tracer · 52f232cb
      Steven Rostedt 提交于
      Impact: new likely/unlikely branch tracer
      
      This patch adds a way to record the instances of the likely() and unlikely()
      branch condition annotations.
      
      When "unlikely" is set in /debugfs/tracing/iter_ctrl the unlikely conditions
      will be added to any of the ftrace tracers. The change takes effect when
      a new tracer is passed into the current_tracer file.
      
      For example:
      
       bash-3471  [003]   357.014755: [INCORRECT] sched_info_dequeued:sched_stats.h:177
       bash-3471  [003]   357.014756: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014758: [correct] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014759: [correct] account_group_exec_runtime:sched_stats.h:356
       bash-3471  [003]   357.014761: [correct] update_curr:sched_fair.c:489
       bash-3471  [003]   357.014763: [INCORRECT] calc_delta_fair:sched_fair.c:411
       bash-3471  [003]   357.014765: [correct] calc_delta_mine:sched.c:1279
      
      Which shows the normal tracer heading, as well as whether the condition was
      correct "[correct]" or was mistaken "[INCORRECT]", followed by the function,
      file name and line number.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      52f232cb
    • F
      tracing/fastboot: Use the ring-buffer timestamp for initcall entries · 74239072
      Frederic Weisbecker 提交于
      Impact: Split the boot tracer entries in two parts: call and return
      
      Now that we are using the sched tracer from the boot tracer, we want
      to use the same timestamp than the ring-buffer to have consistent time
      captures between sched events and initcall events.
      
      So we get rid of the old time capture by the boot tracer and split the
      initcall events in two parts: call and return. This way we have the
      ring buffer timestamp of both.
      
      An example trace:
      
      [   27.904149584] calling  net_ns_init+0x0/0x1c0 @ 1
      [   27.904429624] initcall net_ns_init+0x0/0x1c0 returned 0 after 0 msecs
      [   27.904575926] calling  reboot_init+0x0/0x20 @ 1
      [   27.904655399] initcall reboot_init+0x0/0x20 returned 0 after 0 msecs
      [   27.904800228] calling  sysctl_init+0x0/0x30 @ 1
      [   27.905142914] initcall sysctl_init+0x0/0x30 returned 0 after 0 msecs
      [   27.905287211] calling  ksysfs_init+0x0/0xb0 @ 1
       ##### CPU 0 buffer started ####
                  init-1     [000]    27.905395:      1:120:R   + [001]    11:115:S
       ##### CPU 1 buffer started ####
                <idle>-0     [001]    27.905425:      0:140:R ==> [001]    11:115:R
                  init-1     [000]    27.905426:      1:120:D ==> [000]     0:140:R
                <idle>-0     [000]    27.905431:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905451:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905456:      4:115:S ==> [000]     0:140:R
                 udevd-11    [001]    27.905458:     11:115:R   + [001]    14:115:R
                <idle>-0     [000]    27.905459:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905462:      0:140:R ==> [000]     4:115:R
                 udevd-11    [001]    27.905462:     11:115:R ==> [001]    14:115:R
           ksoftirqd/0-4     [000]    27.905467:      4:115:S ==> [000]     0:140:R
                <idle>-0     [000]    27.905470:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905473:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905476:      4:115:S ==> [000]     0:140:R
                <idle>-0     [000]    27.905479:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.905482:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.905486:      4:115:S ==> [000]     0:140:R
                 udevd-14    [001]    27.905499:     14:120:X ==> [001]    11:115:R
                 udevd-11    [001]    27.905506:     11:115:R   + [000]     1:120:D
                <idle>-0     [000]    27.905515:      0:140:R ==> [000]     1:120:R
                 udevd-11    [001]    27.905517:     11:115:S ==> [001]     0:140:R
      [   27.905557107] initcall ksysfs_init+0x0/0xb0 returned 0 after 3906 msecs
      [   27.905705736] calling  init_jiffies_clocksource+0x0/0x10 @ 1
      [   27.905779239] initcall init_jiffies_clocksource+0x0/0x10 returned 0 after 0 msecs
      [   27.906769814] calling  pm_init+0x0/0x30 @ 1
      [   27.906853627] initcall pm_init+0x0/0x30 returned 0 after 0 msecs
      [   27.906997803] calling  pm_disk_init+0x0/0x20 @ 1
      [   27.907076946] initcall pm_disk_init+0x0/0x20 returned 0 after 0 msecs
      [   27.907222556] calling  swsusp_header_init+0x0/0x30 @ 1
      [   27.907294325] initcall swsusp_header_init+0x0/0x30 returned 0 after 0 msecs
      [   27.907439620] calling  stop_machine_init+0x0/0x50 @ 1
                  init-1     [000]    27.907485:      1:120:R   + [000]     2:115:S
                  init-1     [000]    27.907490:      1:120:D ==> [000]     2:115:R
              kthreadd-2     [000]    27.907507:      2:115:R   + [001]    15:115:R
                <idle>-0     [001]    27.907517:      0:140:R ==> [001]    15:115:R
              kthreadd-2     [000]    27.907517:      2:115:D ==> [000]     0:140:R
                <idle>-0     [000]    27.907521:      0:140:R   + [000]     4:115:S
                <idle>-0     [000]    27.907524:      0:140:R ==> [000]     4:115:R
                 udevd-15    [001]    27.907527:     15:115:D   + [000]     2:115:D
           ksoftirqd/0-4     [000]    27.907537:      4:115:S ==> [000]     2:115:R
                 udevd-15    [001]    27.907537:     15:115:D ==> [001]     0:140:R
              kthreadd-2     [000]    27.907546:      2:115:R   + [000]     1:120:D
              kthreadd-2     [000]    27.907550:      2:115:S ==> [000]     1:120:R
                  init-1     [000]    27.907584:      1:120:R   + [000]    15:  0:D
                  init-1     [000]    27.907589:      1:120:R   + [000]     2:115:S
                  init-1     [000]    27.907593:      1:120:D ==> [000]    15:  0:R
                 udevd-15    [000]    27.907601:     15:  0:S ==> [000]     2:115:R
       ##### CPU 0 buffer started ####
              kthreadd-2     [000]    27.907616:      2:115:R   + [001]    16:115:R
       ##### CPU 1 buffer started ####
                <idle>-0     [001]    27.907620:      0:140:R ==> [001]    16:115:R
              kthreadd-2     [000]    27.907621:      2:115:D ==> [000]     0:140:R
                 udevd-16    [001]    27.907625:     16:115:D   + [000]     2:115:D
                <idle>-0     [000]    27.907628:      0:140:R   + [000]     4:115:S
                 udevd-16    [001]    27.907629:     16:115:D ==> [001]     0:140:R
                <idle>-0     [000]    27.907631:      0:140:R ==> [000]     4:115:R
           ksoftirqd/0-4     [000]    27.907636:      4:115:S ==> [000]     2:115:R
              kthreadd-2     [000]    27.907644:      2:115:R   + [000]     1:120:D
              kthreadd-2     [000]    27.907647:      2:115:S ==> [000]     1:120:R
                  init-1     [000]    27.907657:      1:120:R   + [001]    16:  0:D
                <idle>-0     [001]    27.907666:      0:140:R ==> [001]    16:  0:R
      [   27.907703862] initcall stop_machine_init+0x0/0x50 returned 0 after 0 msecs
      [   27.907850704] calling  filelock_init+0x0/0x30 @ 1
      [   27.907926573] initcall filelock_init+0x0/0x30 returned 0 after 0 msecs
      [   27.908071327] calling  init_script_binfmt+0x0/0x10 @ 1
      [   27.908165195] initcall init_script_binfmt+0x0/0x10 returned 0 after 0 msecs
      [   27.908309461] calling  init_elf_binfmt+0x0/0x10 @ 1
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      74239072
    • F
      tracing/fastboot: move boot tracer structs and funcs into their own header. · 3f5ec136
      Frederic Weisbecker 提交于
      Impact: Cleanups on the boot tracer and ftrace
      
      This patch bring some cleanups about the boot tracer headers. The
      functions and structures of this tracer have nothing related to ftrace
      and should have so their own header file.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3f5ec136
  10. 11 11月, 2008 1 次提交
    • F
      tracing: add a tracer to catch execution time of kernel functions · 15e6cb36
      Frederic Weisbecker 提交于
      Impact: add new tracing plugin which can trace full (entry+exit) function calls
      
      This tracer uses the low level function return ftrace plugin to
      measure the execution time of the kernel functions.
      
      The first field is the caller of the function, the second is the
      measured function, and the last one is the execution time in
      nanoseconds.
      
      - v3:
      
      - HAVE_FUNCTION_RET_TRACER have been added. Each arch that support ftrace return
        should enable it.
      - ftrace_return_stub becomes ftrace_stub.
      - CONFIG_FUNCTION_RET_TRACER depends now on CONFIG_FUNCTION_TRACER
      - Return traces printing can be used for other tracers on trace.c
      - Adapt to the new tracing API (no more ctrl_update callback)
      - Correct the check of "disabled" during insertion.
      - Minor changes...
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      15e6cb36
  11. 08 11月, 2008 5 次提交
    • S
      ftrace: display start of CPU buffer in trace output · a309720c
      Steven Rostedt 提交于
      Impact: change in trace output
      
      Because the trace buffers are per cpu ring buffers, the start of
      the trace can be confusing. If one CPU is very active at the
      end of the trace, its history will not go as far back as the
      other CPU traces.  This means that output for a particular CPU
      may not appear for the first part of a trace.
      
      To help annotate what is happening, and to prevent any more
      confusion, this patch adds a line that annotates the start of
      a CPU buffer output.
      
      For example:
      
             automount-3495  [001]   184.596443: dnotify_parent <-vfs_write
      [...]
             automount-3495  [001]   184.596449: dput <-path_put
             automount-3496  [002]   184.596450: down_read_trylock <-do_page_fault
      [...]
                 sshd-3497  [001]   184.597069: up_read <-do_page_fault
                <idle>-0     [000]   184.597074: __exit_idle <-exit_idle
      [...]
             automount-3496  [002]   184.597257: filemap_fault <-__do_fault
                <idle>-0     [003]   184.597261: exit_idle <-smp_apic_timer_interrupt
      
      Note, parsers of a trace output should always ignore any lines that
      start with a '#'.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a309720c
    • S
      ftrace: remove trace array ctrl · c76f0694
      Steven Rostedt 提交于
      Impact: remove obsolete variable in trace_array structure
      
      With the new start / stop method of ftrace, the ctrl variable
      in the trace_array structure is now obsolete. Remove it.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c76f0694
    • S
      ftrace: remove ctrl_update method · bbf5b1a0
      Steven Rostedt 提交于
      Impact: Remove the ctrl_update tracer method
      
      With the new quick start/stop method of tracing, the ctrl_update
      method is out of date.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bbf5b1a0
    • S
      ftrace: fix sched_switch API · e168e051
      Steven Rostedt 提交于
      Impact: fix for sched_switch that broke dynamic ftrace startup
      
      The commit: tracing/fastboot: use sched switch tracer from boot tracer
      broke the API of the sched_switch trace. The use of the
      tracing_start/stop_cmdline record is for only recording the cmdline,
      NOT recording the schedule switches themselves.
      
      Seeing that the boot tracer broke the API to do something that it
      wanted, this patch adds a new interface for the API while
      puting back the original interface of the old API.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e168e051
    • S
      ftrace: fix boot trace sched startup · 75f5c47d
      Steven Rostedt 提交于
      Impact: boot tracer startup modified
      
      The boot tracer calls into some of the schedule tracing private functions
      that should not be exported. This patch cleans it up, and makes
      way for further changes in the ftrace infrastructure.
      
      This patch adds a api to assign a tracer array to the schedule
      context switch tracer.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      75f5c47d
  12. 06 11月, 2008 1 次提交
    • S
      ftrace: restructure tracing start/stop infrastructure · 9036990d
      Steven Rostedt 提交于
      Impact: change where tracing is started up and stopped
      
      Currently, when a new tracer is selected via echo'ing a tracer name into
      the current_tracer file, the startup is only done if tracing_enabled is
      set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
      the tracing_enabled file) a full shutdown is performed.
      
      The full startup and shutdown of a tracer can be expensive and the
      user can lose out traces when echo'ing in 0 to the tracing_enabled file,
      because the process takes too long. There can also be places that
      the user would like to start and stop the tracer several times and
      doing the full startup and shutdown of a tracer might be too expensive.
      
      This patch performs the full startup and shutdown when a tracer is
      selected. It also adds a way to do a quick start or stop of a tracer.
      The quick version is just a flag that prevents the tracing from
      taking place, but the overhead of the code is still there.
      
      For example, the startup of a tracer may enable tracepoints, or enable
      the function tracer.  The stop and start will just set a flag to
      have the tracer ignore the calls when the tracepoint or function trace
      is called.  The overhead of the tracer may still be present when
      the tracer is stopped, but no tracing will occur. Setting the tracer
      to the 'nop' tracer (or any other tracer) will perform the shutdown
      of the tracer which will disable the tracepoint or disable the
      function tracer.
      
      The tracing_enabled file will simply start or stop tracing.
      
      This change is all internal. The end result for the user should be the same
      as before. If tracing_enabled is not set, no trace will happen.
      If tracing_enabled is set, then the trace will happen. The tracing_enabled
      variable is static between tracers. Enabling  tracing_enabled and
      going to another tracer will keep tracing_enabled enabled. Same
      is true with disabling tracing_enabled.
      
      This patch will now provide a fast start/stop method to the users
      for enabling or disabling tracing.
      
      Note: There were two methods to the struct tracer that were never
       used: The methods start and stop. These were to be used as a hook
       to the reading of the trace output, but ended up not being
       necessary. These two methods are now used to enable the start
       and stop of each tracer, in case the tracer needs to do more than
       just not write into the buffer. For example, the irqsoff tracer
       must stop recording max latencies when tracing is stopped.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9036990d
  13. 05 11月, 2008 1 次提交
  14. 04 11月, 2008 2 次提交
    • S
      ftrace: function tracer with irqs disabled · b2a866f9
      Steven Rostedt 提交于
      Impact: disable interrupts during trace entry creation (as opposed to preempt)
      
      To help with performance, I set the ftracer to not disable interrupts,
      and only to disable preemption. If an interrupt occurred, it would not
      be traced, because the function tracer protects itself from recursion.
      This may be faster, but the trace output might miss some traces.
      
      This patch makes the fuction trace disable interrupts, but it also
      adds a runtime feature to disable preemption instead. It does this by
      having two different tracer functions. When the function tracer is
      enabled, it will check to see which version is requested (irqs disabled
      or preemption disabled). Then it will use the corresponding function
      as the tracer.
      
      Irq disabling is the default behaviour, but if the user wants better
      performance, with the chance of missing traces, then they can choose
      the preempt disabled version.
      
      Running hackbench 3 times with the irqs disabled and 3 times with
      the preempt disabled function tracer yielded:
      
      tracing type       times            entries recorded
      ------------      --------          ----------------
      irq disabled      43.393            166433066
                        43.282            166172618
                        43.298            166256704
      
      preempt disabled  38.969            159871710
                        38.943            159972935
                        39.325            161056510
      
      Average:
      
         irqs disabled:  43.324           166287462
      preempt disabled:  39.079           160300385
      
       preempt is 10.8 percent faster than irqs disabled.
      
      I wrote a patch to count function trace recursion and reran hackbench.
      
      With irq disabled: 1,150 times the function tracer did not trace due to
        recursion.
      with preempt disabled: 5,117,718 times.
      
      The thousand times with irq disabled could be due to NMIs, or simply a case
      where it called a function that was not protected by notrace.
      
      But we also see that a large amount of the trace is lost with the
      preempt version.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b2a866f9
    • S
      ftrace: introduce ftrace_preempt_disable()/enable() · 8f0a056f
      Steven Rostedt 提交于
      Impact: add new ftrace-plugin internal APIs
      
      Parts of the tracer needs to be careful about schedule recursion.
      If the NEED_RESCHED flag is set, a preempt_enable will call schedule.
      Inside the schedule function, the NEED_RESCHED flag is cleared.
      
      The problem arises when a trace happens in the schedule function but before
      NEED_RESCHED is cleared. The race is as follows:
      
      schedule()
        >> tracer called
      
          trace_function()
             preempt_disable()
             [ record trace ]
             preempt_enable()  <<- here's the issue.
      
               [check NEED_RESCHED]
                schedule()
                [ Repeat the above, over and over again ]
      
      The naive approach is simply to use preempt_enable_no_schedule instead.
      The problem with that approach is that, although we solve the schedule
      recursion issue, we now might lose a preemption check when not in the
      schedule function.
      
        trace_function()
          preempt_disable()
          [ record trace ]
          [Interrupt comes in and sets NEED_RESCHED]
          preempt_enable_no_resched()
          [continue without scheduling]
      
      The way ftrace handles this problem is with the following approach:
      
      	int resched;
      
      	resched = need_resched();
      	preempt_disable_notrace();
      	[record trace]
      	if (resched)
      		preempt_enable_no_sched_notrace();
      	else
      		preempt_enable_notrace();
      
      This may seem like the opposite of what we want. If resched is set
      then we call the "no_sched" version??  The reason we do this is because
      if NEED_RESCHED is set before we disable preemption, there's two reasons
      for that:
      
        1) we are in an atomic code path
        2) we are already on our way to the schedule function, and maybe even
           in the schedule function, but have yet to clear the flag.
      
      Both the above cases we do not want to schedule.
      
      This solution has already been implemented within the ftrace infrastructure.
      But the problem is that it has been implemented several times. This patch
      encapsulates this code to two nice functions.
      
        resched = ftrace_preempt_disable();
        [ record trace]
        ftrace_preempt_enable(resched);
      
      This way the tracers do not need to worry about getting it right.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8f0a056f
  15. 31 10月, 2008 1 次提交
    • S
      ftrace: handle archs that do not support irqs_disabled_flags · 9244489a
      Steven Rostedt 提交于
      Impact: build fix on non-lockdep architectures
      
      Some architectures do not support a way to read the irq flags that
      is set from "local_irq_save(flags)" to determine if interrupts were
      disabled or enabled. Ftrace uses this information to display to the user
      if the trace occurred with interrupts enabled or disabled.
      
      Besides the fact that those archs that do not support this will fail to
      compile, unless they fix it, we do not want to have the trace simply
      say interrupts were not disabled or they were enabled, without knowing
      the real answer.
      
      This patch adds a 'X' in the output to let the user know that the
      architecture they are running on does not support a way for the tracer
      to determine if interrupts were enabled or disabled. It also lets those
      same archs compile with tracing enabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9244489a
  16. 21 10月, 2008 1 次提交
  17. 14 10月, 2008 7 次提交
    • S
      ftrace: preempt disable over interrupt disable · 38697053
      Steven Rostedt 提交于
      With the new ring buffer infrastructure in ftrace, I'm trying to make
      ftrace a little more light weight.
      
      This patch converts a lot of the local_irq_save/restore into
      preempt_disable/enable.  The original preempt count in a lot of cases
      has to be sent in as a parameter so that it can be recorded correctly.
      Some places were recording it incorrectly before anyway.
      
      This is also laying the ground work to make ftrace a little bit
      more reentrant, and remove all locking. The function tracers must
      still protect from reentrancy.
      
      Note: All the function tracers must be careful when using preempt_disable.
        It must do the following:
      
        resched = need_resched();
        preempt_disable_notrace();
        [...]
        if (resched)
      	preempt_enable_no_resched_notrace();
        else
      	preempt_enable_notrace();
      
      The reason is that if this function traces schedule() itself, the
      preempt_enable_notrace() will cause a schedule, which will lead
      us into a recursive failure.
      
      If we needed to reschedule before calling preempt_disable, we
      should have already scheduled. Since we did not, this is most
      likely that we should not and are probably inside a schedule
      function.
      
      If resched was not set, we still need to catch the need resched
      flag being set when preemption was off and the if case at the
      end will catch that for us.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      38697053
    • S
      ftrace: type cast filter+verifier · 7104f300
      Steven Rostedt 提交于
      The mmiotrace map had a bug that would typecast the entry from
      the trace to the wrong type. That is a known danger of C typecasts,
      there's absolutely zero checking done on them.
      
      Help that problem a bit by using a GCC extension to implement a
      type filter that restricts the types that a trace record can be
      cast into, and by adding a dynamic check (in debug mode) to verify
      the type of the entry.
      
      This patch adds a macro to assign all entries of ftrace using the type
      of the variable and checking the entry id. The typecasts are now done
      in the macro for only those types that it knows about, which should
      be all the types that are allowed to be read from the tracer.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7104f300
    • F
      tracing/ftrace: change the type of the print_line callback · 2c4f035f
      Frederic Weisbecker 提交于
      We need a kind of disambiguation when a print_line callback
      returns 0.
      
      _There is not enough space to print all the entry.
       Please flush the seq and retry.
      _I can't handle this type of entry
      
      This patch changes the type of this callback for better information.
      
      Also some changes have been made in this V2.
      
      _ Only relay to default functions after the print_line callback fails.
      _ This patch doesn't fix the issue with the broken pipe (see patch 2/4 for that)
      
      Some things are still in discussion:
      
      _ Find better names for the enum print_line_t values
      _ Change the type of print_trace_line into boolean.
      
      Patches to change that can be sent later.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPekka Paalanen <pq@iki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2c4f035f
    • S
      ftrace: take advantage of variable length entries · 777e208d
      Steven Rostedt 提交于
      Now that the underlining ring buffer for ftrace now hold variable length
      entries, we can take advantage of this by only storing the size of the
      actual event into the buffer. This happens to increase the number of
      entries in the buffer dramatically.
      
      We can also get rid of the "trace_cont" operation, but I'm keeping that
      until we have no more users. Some of the ftrace tracers can now change
      their code to adapt to this new feature.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      777e208d
    • S
      ftrace: make work with new ring buffer · 3928a8a2
      Steven Rostedt 提交于
      This patch ports ftrace over to the new ring buffer.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3928a8a2
    • F
      tracing/ftrace: add the boot tracer · d13744cd
      Frédéric Weisbecker 提交于
      Add the boot/initcall tracer.
      
      It's primary purpose is to be able to trace the initcalls.
      
      It is intended to be used with scripts/bootgraph.pl after some small
      improvements.
      
      Note that it is not active after its init. To avoid tracing (and so
      crashing) before the whole tracing engine init, you have to explicitly
      call start_boot_trace() after do_pre_smp_initcalls() to enable it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d13744cd
    • F
      tracing/ftrace: replace none tracer by nop tracer · 43a15386
      Frédéric Weisbecker 提交于
      Replace "none" tracer by the recently created "nop" tracer.
      Both are pretty similar except that nop accepts TRACE_PRINT
      or TRACE_SPECIAL entries.
      
      And as a consequence, changing the size of the ring buffer now
      requires that tracing has already been disabled.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Steven Noonan <steven@uplinklabs.net>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      43a15386