1. 15 Mar 2013, 1 commit
  2. 14 Nov 2012, 1 commit
    • D
      tracing: Format non-nanosec times from tsc clock without a decimal point. · 8be0709f
      Authored by David Sharp
      With the addition of the "tsc" clock, formatting timestamps to look like
      fractional seconds is misleading. Mark clocks as either in nanoseconds or
      not, and format non-nanosecond timestamps as plain decimal integers (a
      sketch of this approach follows this entry).
      
      Tested:
      $ cd /sys/kernel/debug/tracing/
      $ cat trace_clock
      [local] global tsc
      $ echo sched_switch > set_event
      $ echo 1 > tracing_on ; sleep 0.0005 ; echo 0 > tracing_on
      $ cat trace
                <idle>-0     [000]  6330.555552: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=bash next_pid=29964 next_prio=120
                 sleep-29964 [000]  6330.555628: sched_switch: prev_comm=bash prev_pid=29964 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120
        ...
      $ echo 1 > options/latency-format
      $ cat trace
        <idle>-0       0 4104553247us+: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=bash next_pid=29964 next_prio=120
         sleep-29964   0 4104553322us+: sched_switch: prev_comm=bash prev_pid=29964 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120
        ...
      $ echo tsc > trace_clock
      $ cat trace
      $ echo 1 > tracing_on ; sleep 0.0005 ; echo 0 > tracing_on
      $ echo 0 > options/latency-format
      $ cat trace
                <idle>-0     [000] 16490053398357: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=bash next_pid=31128 next_prio=120
                 sleep-31128 [000] 16490053588518: sched_switch: prev_comm=bash prev_pid=31128 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120
        ...
      $ echo 1 > options/latency-format
      $ cat trace
        <idle>-0       0 91557653238+: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=bash next_pid=31128 next_prio=120
         sleep-31128   0 91557843399+: sched_switch: prev_comm=bash prev_pid=31128 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120
        ...
      
      v2:
      Move arch-specific bits out of generic code.
      v4:
      Fix x86_32 build due to 64-bit division.
      
      Google-Bug-Id: 6980623
      Link: http://lkml.kernel.org/r/1352837903-32191-2-git-send-email-dhsharp@google.com
      
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: David Sharp <dhsharp@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      8be0709f
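
      A minimal sketch of the idea in this entry, assuming a clock table that
      carries an in_ns flag; the table layout, helper name and the "tsc" clock
      callback name are illustrative, not the exact code from the patch:

        /* Sketch only: in_ns marks clocks whose timestamps are nanoseconds. */
        struct tracing_clock {
                const char      *name;
                u64             (*func)(void);
                int             in_ns;
        };

        static struct tracing_clock clocks[] = {
                { "local",  trace_clock_local,  1 },
                { "global", trace_clock_global, 1 },
                { "tsc",    trace_clock_tsc,    0 },   /* raw cycle counter; name hypothetical */
        };

        /* Print sec.usec only for nanosecond clocks; otherwise print the raw
         * counter as a plain decimal integer. do_div() keeps the 64-bit
         * division buildable on 32-bit (the v4 note about x86_32). */
        static void print_timestamp(struct trace_seq *s, u64 ts, int in_ns)
        {
                if (in_ns) {
                        unsigned long usecs = do_div(ts, NSEC_PER_SEC) / NSEC_PER_USEC;

                        trace_seq_printf(s, "%5llu.%06lu: ", ts, usecs);
                } else {
                        trace_seq_printf(s, "%12llu: ", ts);
                }
        }

      The flag then only needs to be consulted in the output paths, which is
      why both the default and latency formats change in the test output above.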
  3. 15 Jun 2012, 1 commit
    • S
      tracing: Register the ftrace internal events during early boot · 7374e827
      Authored by Steven Rostedt
      All trace events, including ftrace internal events (like trace_printk
      and function tracing), register functions that describe how to print
      their output. The events may be recorded as soon as the ring buffer
      is allocated, but they are just raw binary in the buffer. The mapping
      of event ids to how to print them is held within a structure that
      is registered on system boot.
      
      If a crash happens in boot up before these functions are registered
      then their output (via ftrace_dump_on_oops) will be useless:
      
      Dumping ftrace buffer:
      ---------------------------------
         <...>-1       0.... 319705us : Unknown type 6
      ---------------------------------
      
      This can be quite frustrating for a kernel developer trying to see
      what is going wrong.
      
      There's no reason to register them so late in the boot-up process.
      They can be registered by early_initcall() instead (a sketch follows
      this entry).
      Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      7374e827
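
      A hedged sketch of early registration as described above; the event list
      and the init function name are illustrative (register_ftrace_event() and
      early_initcall() are the real interfaces of that era):

        /* Sketch: register ftrace's internal print handlers early so that
         * ftrace_dump_on_oops can decode records logged during early boot. */
        static struct trace_event *ftrace_internal_events[] __initdata = {
                &trace_fn_event,        /* illustrative event names */
                &trace_print_event,
                &trace_stack_event,
        };

        static __init int init_ftrace_internal_events(void)
        {
                int i;

                for (i = 0; i < ARRAY_SIZE(ftrace_internal_events); i++) {
                        /* register_ftrace_event() returns 0 on failure */
                        if (!register_ftrace_event(ftrace_internal_events[i]))
                                pr_warn("ftrace event %d failed to register\n",
                                        ftrace_internal_events[i]->type);
                }
                return 0;
        }
        early_initcall(init_ftrace_internal_events);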
  4. 20 Apr 2012, 1 commit
    • S
      tracing: Fix stacktrace of latency tracers (irqsoff and friends) · db4c75cb
      Authored by Steven Rostedt
      While debugging a latency with someone on IRC (mirage335) on #linux-rt (OFTC),
      we discovered that the stacktrace output of the latency tracers
      (preemptirqsoff) was empty.
      
      This bug was caused by the creation of the dynamic length stack trace
      again (like commit 12b5da34 "tracing: Fix ent_size in trace output" was).
      
      This bug is caused by the latency tracers requiring the next event
      to determine the time between the current event and the next. But by
      grabbing the next event, iter->ent_size is set to the size of the next
      event instead of the current one. As the stacktrace event is the last event,
      this makes the ent_size zero and causes nothing to be printed for
      the stack trace. The dynamic stacktrace uses the ent_size to determine
      how much of the stack can be printed; an ent_size of zero means
      no stack.
      
      The simple fix is to save iter->ent_size before finding the next event
      (a sketch follows this entry).
      
      Note, mirage335 asked to remain anonymous from LKML and git, so I will
      not add the Reported-by and Tested-by tags, even though he did report
      the issue and tested the fix.
      
      Cc: stable@vger.kernel.org # 3.1+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      db4c75cb
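
      A minimal sketch of the fix described above; the wrapper name is made up
      and the real change lives inside the existing latency-output path:

        /* Sketch: looking up the next entry (to compute the inter-event delta
         * for the latency format) clobbers iter->ent_size as a side effect,
         * so save and restore it around the lookup. */
        static void peek_next_ts(struct trace_iterator *iter, u64 *next_ts)
        {
                unsigned long saved_ent_size = iter->ent_size;

                /* Only the timestamp of the next event is wanted here. */
                trace_find_next_entry(iter, NULL, next_ts);

                /* Restore the *current* event's size so the dynamic
                 * stacktrace printer still knows how many frames it has. */
                iter->ent_size = saved_ent_size;
        }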
  5. 21 Mar 2012, 1 commit
  6. 22 Feb 2012, 1 commit
  7. 21 Feb 2012, 1 commit
  8. 17 Nov 2011, 1 commit
    • S
      tracing: Add irq, preempt-count and need resched info to default trace output · 77271ce4
      Authored by Steven Rostedt
      People keep asking how to get the preempt count, irq, and need-resched info,
      and we keep telling them to enable the latency format. Some developers think
      that traces without this info are completely useless, and for a lot of tasks
      they are.
      
      The first option was to enable the latency trace as the default format, but
      the header for the latency format is pretty useless for most tracers, and it
      also reports the timestamp in straight microseconds from the time the trace
      started. This is sometimes more difficult to read, as the default trace shows
      seconds from the start of boot up.
      
      Latency format:
      
       # tracer: nop
       #
       # nop latency trace v1.1.5 on 3.2.0-rc1-test+
       # --------------------------------------------------------------------
       # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
       #    -----------------
       #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
       #    -----------------
       #
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /     delay
       #  cmd     pid   ||||| time  |   caller
       #     \   /      |||||  \    |   /
       migratio-6       0...2 41778231us+: rcu_note_context_switch <-__schedule
       migratio-6       0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
       migratio-6       0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
       migratio-6       0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
       migratio-6       0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
       migratio-6       0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
      
      default format:
      
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
            migration/0-6     [000]    50.025810: rcu_note_context_switch <-__schedule
            migration/0-6     [000]    50.025812: trace_rcu_utilization <-rcu_note_context_switch
            migration/0-6     [000]    50.025813: rcu_sched_qs <-rcu_note_context_switch
            migration/0-6     [000]    50.025815: rcu_preempt_qs <-rcu_note_context_switch
            migration/0-6     [000]    50.025817: trace_rcu_utilization <-rcu_note_context_switch
            migration/0-6     [000]    50.025818: debug_lockdep_rcu_enabled <-__schedule
            migration/0-6     [000]    50.025820: debug_lockdep_rcu_enabled <-__schedule
      
      The latency format header has latency information that is pretty meaningless
      for most tracers. Some of the header is useful, though, and we can add that
      to the default format later as well.
      
      What is really useful in the latency format is the irqs-off, need-resched,
      hardirq/softirq context and the preempt count (a sketch of how these columns
      are derived follows this entry).
      
      This commit adds the irq-info option, enabled by default, which adds this
      information:
      
       # tracer: nop
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
                 <idle>-0     [000] d..2    49.309305: cpuidle_get_driver <-cpuidle_idle_call
                 <idle>-0     [000] d..2    49.309307: mwait_idle <-cpu_idle
                 <idle>-0     [000] d..2    49.309309: need_resched <-mwait_idle
                 <idle>-0     [000] d..2    49.309310: test_ti_thread_flag <-need_resched
                 <idle>-0     [000] d..2    49.309312: trace_power_start.constprop.13 <-mwait_idle
                 <idle>-0     [000] d..2    49.309313: trace_cpu_idle <-mwait_idle
                 <idle>-0     [000] d..2    49.309315: need_resched <-mwait_idle
      
      If a user wants the old format, they can disable the 'irq-info' option:
      
       # tracer: nop
       #
       #           TASK-PID   CPU#      TIMESTAMP  FUNCTION
       #              | |       |          |         |
                 <idle>-0     [000]     49.309305: cpuidle_get_driver <-cpuidle_idle_call
                 <idle>-0     [000]     49.309307: mwait_idle <-cpu_idle
                 <idle>-0     [000]     49.309309: need_resched <-mwait_idle
                 <idle>-0     [000]     49.309310: test_ti_thread_flag <-need_resched
                 <idle>-0     [000]     49.309312: trace_power_start.constprop.13 <-mwait_idle
                 <idle>-0     [000]     49.309313: trace_cpu_idle <-mwait_idle
                 <idle>-0     [000]     49.309315: need_resched <-mwait_idle
      Requested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      77271ce4
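
      For reference, a hedged sketch of how the four new columns can be derived
      from a trace entry's recorded state; the TRACE_FLAG_* bits and trace_seq
      helpers are real, but the function itself is illustrative:

        /* Sketch: build the "d..2"-style field (irqs-off, need-resched,
         * hardirq/softirq, preempt depth) from the recorded entry flags. */
        static void print_irq_info(struct trace_seq *s, struct trace_entry *entry)
        {
                char irqs_off = (entry->flags & TRACE_FLAG_IRQS_OFF) ? 'd' : '.';
                char need_resched = (entry->flags & TRACE_FLAG_NEED_RESCHED) ? 'N' : '.';
                char hardsoft_irq;

                if ((entry->flags & TRACE_FLAG_HARDIRQ) &&
                    (entry->flags & TRACE_FLAG_SOFTIRQ))
                        hardsoft_irq = 'H';     /* hardirq inside softirq */
                else if (entry->flags & TRACE_FLAG_HARDIRQ)
                        hardsoft_irq = 'h';
                else if (entry->flags & TRACE_FLAG_SOFTIRQ)
                        hardsoft_irq = 's';
                else
                        hardsoft_irq = '.';

                trace_seq_printf(s, "%c%c%c", irqs_off, need_resched, hardsoft_irq);

                if (entry->preempt_count)
                        trace_seq_printf(s, "%x", entry->preempt_count);
                else
                        trace_seq_putc(s, '.');
        }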
  9. 15 Jul 2011, 1 commit
    • S
      tracing: Have dynamic size event stack traces · 4a9bd3f1
      Authored by Steven Rostedt
      Currently the stack trace per event in ftrace is only 8 frames.
      This can be quite limiting and sometimes useless, especially when
      the "ignore frames" count is wrong and we also use up stack frames for
      the event processing itself.
      
      Change this to be dynamic by adding a percpu buffer that we can
      write a large stack frame into and then copy into the ring buffer.
      
      Interrupts and NMIs that come in while another event is being
      processed will only get to use the 8-frame stack. That should be enough,
      as the task that was interrupted will have the full stack frame anyway
      (a sketch of the per-cpu scheme follows this entry).
      Requested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      4a9bd3f1
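
      A hedged sketch of the per-cpu staging buffer described above; the size,
      names and nesting fallback are illustrative, and stack_trace_save() is
      the modern stack-capture helper rather than the one used in 2011:

        #define STACK_SCRATCH_ENTRIES   512     /* illustrative depth */

        struct stack_scratch {
                unsigned long calls[STACK_SCRATCH_ENTRIES];
        };

        static DEFINE_PER_CPU(struct stack_scratch, stack_scratch);
        static DEFINE_PER_CPU(int, stack_scratch_nesting);

        /* Capture a deep stack into the per-cpu scratch area, then copy only
         * the frames actually captured into the ring-buffer event. */
        static void record_deep_stack(void)
        {
                unsigned long *calls;
                unsigned int nr;

                preempt_disable();
                if (this_cpu_inc_return(stack_scratch_nesting) != 1) {
                        /* An irq/NMI interrupted a capture on this CPU:
                         * fall back to the small fixed 8-frame path. */
                        goto out;
                }

                calls = this_cpu_ptr(&stack_scratch)->calls;
                nr = stack_trace_save(calls, STACK_SCRATCH_ENTRIES, 0);

                /* ...reserve a ring-buffer event of nr * sizeof(long) bytes
                 * and copy the frames into it (omitted)... */
        out:
                this_cpu_dec(stack_scratch_nesting);
                preempt_enable();
        }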
  10. 26 May 2011, 1 commit
  11. 05 Apr 2011, 1 commit
    • J
      tracing: Avoid soft lockup in trace_pipe · ee5e51f5
      Authored by Jiri Olsa
      Running the following commands:
      
        # enable the binary option
        echo 1 > ./options/bin
        # disable context info option
        echo 0 > ./options/context-info
        # tracing only events
        echo 1 > ./events/enable
        cat trace_pipe
      
      while forcing the system to generate many tracing events
      causes a lockup (in NON-preemptive kernels) inside the
      tracing_read_pipe function.
      
      The issue is also easily reproduced by running the LTP stress test
      (ftrace_stress_test.sh).
      
      The reasons are:
       - the bin/hex/raw output functions for events are set to the
         trace_nop_print function, which prints nothing and
         returns the TRACE_TYPE_HANDLED value
       - the LOST EVENT trace does not handle trace_seq overflow
      
      These reasons force the while loop in the tracing_read_pipe
      function never to break.
      
      The attached patch fixes the handling of the lost event trace, and
      changes trace_nop_print to print minimal info, which is needed
      for correct tracing_read_pipe processing (a sketch follows this entry).
      
      v2 changes:
       - omit the cond_resched changes by trace_nop_print changes
       - WARN changed to WARN_ONCE and added info to be able
         to find out the culprit
      
      v3 changes:
       - make more accurate patch comment
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ee5e51f5
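
      A hedged sketch of the two parts of the fix described above;
      trace_nop_print(), TRACE_TYPE_PARTIAL_LINE and the iterator fields are
      real, but the bodies here are condensed:

        /* Part 1: print at least the event type so the read loop always
         * consumes data, and report overflow instead of claiming success. */
        static enum print_line_t trace_nop_print(struct trace_iterator *iter,
                                                 int flags, struct trace_event *event)
        {
                if (!trace_seq_printf(&iter->seq, "type: %d\n", iter->ent->type))
                        return TRACE_TYPE_PARTIAL_LINE; /* trace_seq is full */

                return TRACE_TYPE_HANDLED;
        }

        /* Part 2: the "LOST N EVENTS" banner must propagate overflow the same
         * way, otherwise the while loop in tracing_read_pipe() never breaks.
         * (In the real code this lives in the line-printing path.) */
        static enum print_line_t print_lost_events(struct trace_iterator *iter)
        {
                if (!trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
                                      iter->cpu, iter->lost_events))
                        return TRACE_TYPE_PARTIAL_LINE;

                return TRACE_TYPE_HANDLED;
        }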
  12. 10 Mar 2011, 2 commits
  13. 21 Jul 2010, 1 commit
    • L
      tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Authored by Lai Jiangshan
      __print_flags() and __print_symbolic() use a percpu trace_seq:
      
      1) Its memory is allocated at compile time, so it wastes memory if we don't use tracing.
      2) It is percpu data, so it wastes even more memory on multi-CPU systems.
      3) It disables preemption while it executes its core routine,
         "trace_seq_printf(s, "%s: ", #call);", and introduces latency.
      
      So we move this trace_seq into struct trace_iterator (a sketch follows this entry).
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      bc289ae9
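
      A hedged sketch of the data-structure change described above; the tmp_seq
      field name and its position are best-effort recollection, treat them as
      illustrative:

        /* Before: a per-cpu scratch buffer, allocated whether or not tracing
         * is used, and requiring preemption to be disabled while formatting. */
        /* static DEFINE_PER_CPU(struct trace_seq, ftrace_event_seq); */

        /* After: the scratch buffer lives in the iterator, so it exists only
         * while a reader does and needs no preemption tricks. */
        struct trace_iterator {
                struct trace_array      *tr;
                struct tracer           *trace;
                /* ... */
                struct trace_seq        seq;       /* the line being built */
                struct trace_seq        tmp_seq;   /* scratch for __print_flags()
                                                    * and __print_symbolic() */
                struct trace_entry      *ent;
                int                     cpu;
                /* ... */
        };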
  14. 20 Jul 2010, 1 commit
  15. 15 May 2010, 1 commit
    • S
      tracing: Allow events to share their print functions · a9a57763
      Authored by Steven Rostedt
      Multiple events may use the same method to print their data.
      Instead of having every event hold a pointer to its print functions,
      the trace_event structure now points to a trace_event_functions structure
      that holds the way to print out the event (a sketch of the layout follows
      this entry).
      
      The event itself is now passed to the print function to let the print
      function know what kind of event it should print.
      
      This opens the door to consolidating the way several events print
      their output.
      
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4900382	1048964	 861512	6810858	 67ecea	vmlinux.init
      4900446	1049028	 861512	6810986	 67ed6a	vmlinux.preprint
      
      This change slightly increases the size but is needed for the next change.
      
      v3: Fix the branch tracer events to handle this change.
      
      v2: Fix the new function graph tracer event calls to handle this change.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a9a57763
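
      A hedged sketch of the indirection described above, shaped after the
      commit text; treat the exact members as approximate:

        /* A printer now receives the event, so one set of functions can be
         * shared by every event that prints the same way. */
        typedef enum print_line_t (*trace_print_func)(struct trace_iterator *iter,
                                                      int flags,
                                                      struct trace_event *event);

        struct trace_event_functions {
                trace_print_func        trace;     /* normal text output */
                trace_print_func        raw;
                trace_print_func        hex;
                trace_print_func        binary;
        };

        struct trace_event {
                struct hlist_node               node;
                struct list_head                list;
                int                             type;
                struct trace_event_functions    *funcs; /* shared printer table */
        };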
  16. 06 May 2010, 1 commit
  17. 01 May 2010, 2 commits
  18. 10 Dec 2009, 2 commits
    • J
      tracing: Add full state to trace_seq · d184b31c
      Authored by Johannes Berg
      The trace_seq buffer might fill up, and right now one needs to check the
      return value of each printf into the buffer to detect that.
      
      Instead, have the buffer keep track of whether it is full or not, and
      reject more input if it is full or if it would have overflowed with an
      input that wasn't added (a sketch follows this entry).
      
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      d184b31c
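
      A hedged sketch of the bookkeeping described above; the struct layout
      matches the general shape of trace_seq at the time, the writer helper is
      illustrative:

        struct trace_seq {
                unsigned char   buffer[PAGE_SIZE];
                unsigned int    len;
                unsigned int    readpos;
                int             full;   /* set once an input did not fit */
        };

        /* Illustrative writer: once full, reject everything, so callers can
         * check a single flag instead of every return value. */
        static int trace_seq_puts_sketch(struct trace_seq *s, const char *str)
        {
                unsigned int len = strlen(str);

                if (s->full)
                        return 0;

                if (len > PAGE_SIZE - 1 - s->len) {
                        s->full = 1;    /* would overflow: reject this input
                                         * and everything after it */
                        return 0;
                }

                memcpy(s->buffer + s->len, str, len);
                s->len += len;
                return len;
        }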
    • S
      tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3
      Authored by Steven Rostedt
      If seq_read fills the buffer, it will call s_start again on the next
      iteration with the same position. This causes a problem with the
      function_graph tracer because it consumes the iteration in order to
      determine leaf functions.
      
      What happens is that the iterator stores the entry, and the function
      graph plugin will look at the next entry. If that next entry is a return
      of the same function and task, then the function is a leaf and the
      function_graph plugin calls ring_buffer_read which moves the ring buffer
      iterator forward (the trace iterator still points to the function start
      entry).
      
      The copying of the trace_seq to the seq_file buffer will fail if the
      seq_file buffer is full. The seq_read will not show this entry.
      The next read by userspace will cause seq_read to again call s_start
      which will reuse the trace iterator entry (the function start entry).
      But the function return entry was already consumed. The function graph
      plugin will think that this entry is a nested function and not a leaf.
      
      To solve this, the trace code now checks the return status of the
      seq_printf (trace_print_seq). If writing to the seq_file buffer
      fails, we set a flag in the iterator (leftover) and we do not reset
      the trace_seq buffer. On the next call to s_start, we check the leftover
      flag, and if it is set, we just reuse the trace_seq buffer and do not
      call into the plugin print functions (a condensed sketch follows this entry).
      
      Before this patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)               |          __fsnotify_parent() {
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      After the patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)   0.548 us    |          __fsnotify_parent();
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      [
        Updated the patch to fix a missing return 0 from the trace_print_seq()
        stub when CONFIG_TRACING is disabled.
        Reported-by: Ingo Molnar <mingo@elte.hu>
      ]
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a63ce5b3
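
      A hedged, condensed sketch of the leftover handling described above;
      trace_print_seq(), print_trace_line() and iter->leftover are the names
      from the commit, but the body is simplified:

        /* Sketch of s_show(): if the formatted line does not fit into the
         * seq_file buffer, keep the trace_seq contents and remember that this
         * entry still needs to be emitted on the next read. */
        static int s_show_sketch(struct seq_file *m, void *v)
        {
                struct trace_iterator *iter = v;
                int ret;

                print_trace_line(iter);                 /* fills iter->seq   */
                ret = trace_print_seq(m, &iter->seq);   /* < 0 if m was full */

                if (ret < 0) {
                        iter->leftover = 1;     /* re-emit on next s_start() */
                } else {
                        iter->leftover = 0;
                        trace_seq_init(&iter->seq);     /* consumed: reset   */
                }
                return 0;
        }

        /* And in s_start(): if iter->leftover is set, skip the plugin print
         * functions entirely and just hand back the buffered trace_seq. */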
  19. 24 Oct 2009, 1 commit
    • J
      tracing: Fix trace_seq_printf() return value · 3e69533b
      Authored by Jiri Olsa
      The trace_seq_printf() return value is a little ambiguous. It
      currently returns the length of the space available in the
      buffer. printf usually returns the amount written, but that is not
      adequate here, because:
      
        trace_seq_printf(s, "");
      
      is perfectly legal, and returning 0 would indicate that it
      failed.
      
      We can always see the amount written by looking at the before
      and after values of s->len. This is not quite the same use as
      printf. We only care if the string was successfully written to
      the buffer or not.
      
      Make trace_seq_printf() return 0 if the output would overflow the
      buffer's free space, 1 otherwise (a caller-side sketch follows this entry).
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091023233646.631787612@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3e69533b
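
      With that convention, a caller-side sketch (the event and its field are
      made up; the 0-means-full check is the point):

        static enum print_line_t print_my_event(struct trace_iterator *iter)
        {
                struct trace_seq *s = &iter->seq;

                /* 0 means the buffer ran out of room, not "wrote 0 bytes". */
                if (!trace_seq_printf(s, "my_event: "))
                        return TRACE_TYPE_PARTIAL_LINE;

                if (!trace_seq_printf(s, "pid=%d\n", iter->ent->pid))
                        return TRACE_TYPE_PARTIAL_LINE;

                return TRACE_TYPE_HANDLED;
        }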
  20. 08 Oct 2009, 1 commit
  21. 06 Oct 2009, 1 commit
  22. 12 Sep 2009, 2 commits
    • S
      tracing: consolidate code between trace_output.c and trace_function_graph.c · f81c972d
      Authored by Steven Rostedt
      Both trace_output.c and trace_function_graph.c do basically the same
      thing to handle the printing of the latency-format. This patch moves
      the code into one function that both can use.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f81c972d
    • S
      tracing: add lock depth to entries · 637e7e86
      Authored by Steven Rostedt
      This patch adds the lock depth of the big kernel lock to the generic
      entry header. This way we can see the depth of the lock and help
      in removing the BKL.
      
      Example:
      
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
         <idle>-0       2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
         <idle>-0       2.N..3 5902255253us+: lock_release: rcu_read_lock
         <idle>-0       2dN..3 5902255257us+: lock_acquire: xtime_lock
         <idle>-0       2dN..4 5902255259us : lock_acquire: clocksource_lock
         <idle>-0       2dN..4 5902255261us+: lock_release: clocksource_lock
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      637e7e86
  23. 11 Sep 2009, 1 commit
  24. 02 Jul 2009, 1 commit
  25. 10 Jun 2009, 2 commits
  26. 03 Jun 2009, 4 commits
    • S
      tracing: add annotation to what type of stack trace is recorded · 563af16c
      Authored by Steven Rostedt
      The current method of printing out a stack trace is to add a new line
      and print out the trace:
      
          yum-updatesd-3120  [002]   573.691303:
       => do_softirq
       => irq_exit
       => smp_apic_timer_interrupt
       => apic_timer_interrupt
      
      This looks a bit awkward, and if we have both stack and user stack traces
      running, it would be nice to have a title to tell them apart, although
      it is easy to tell by the output.
      
      This patch adds an annotation to the start of the stack traces:
      
                  init-1     [003]   929.304979: <stack trace>
       => user_path_at
       => vfs_fstatat
       => vfs_stat
       => sys_newstat
       => system_call_fastpath
      
                   cat-3459  [002]  1016.824040: <user stack trace>
       =>  <0000003aae6c0250>
       =>  <00007ffff4b06ae4>
       =>  <69636172742f6775>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      563af16c
    • S
      tracing: fix multiple use of __print_flags and __print_symbolic · 56d8bd3f
      Authored by Steven Whitehouse
      Here is an updated patch that includes the extra call to
      trace_seq_init() as requested. This is against the latest
      -tip tree and fixes the use of multiple __print_flags
      and __print_symbolic in a single tracer. Also tested
      to ensure it's working now:
      
      mount.gfs2-2534  [000]   235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
      mount.gfs2-2534  [000]   235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
      mount.gfs2-2534  [000]   235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
      glock_workqueue-2529  [000]   235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
      glock_workqueue-2529  [000]   235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      56d8bd3f
    • W
      tracing/events: fix output format of user stack · 048dc50c
      Authored by walimis
      According to "events/ftrace/user_stack/format", fix the output of
      the user stack.
      
      before fix:
      
        sh-1073  [000]    31.137561:  <b7f274fe> <-  <0804e33c> <-  <080835c1>
      
      after fix:
      
        sh-1072  [000]    37.039329:
       =>  <b7f8a4fe>
       =>  <0804e33c>
       =>  <080835c1>
      Signed-off-by: walimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-3-git-send-email-walimisdev@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      048dc50c
    • W
      tracing/events: fix output format of kernel stack · f11b3f4e
      Authored by walimis
      According to "events/ftrace/kernel_stack/format", the output format of
      the kernel stack should use "=>" instead of "<=".
      
      The second problem is that we shouldn't skip the first entry in the stack:
      it seems to be duplicated when used in the "function" tracer, but events
      also use it. If we skip the first one, we will drop the topmost
      entry of the stack.
      
      The last problem is that if the last entry is ULONG_MAX (0xffffffff), we
      should drop it, otherwise it will print a NULL name line.
      
      before fix:
      
            sh-1072  [000]   26.957239: sched_process_fork: parent sh:1072 child sh:1073
            sh-1072  [000]   26.957262:
       <= syscall_call
       <=
            sh-1072  [000]   26.957744: sched_switch: task sh:1072 [120] (R) ==> sh:1073 [120]
            sh-1072  [000]   26.957752:
       <= preempt_schedule
       <= wake_up_new_task
       <= do_fork
       <= sys_clone
       <= syscall_call
       <=
      
      After fix:
      
            sh-1075  [000]    39.791848: sched_process_fork: parent sh:1075  child sh:1076
            sh-1075  [000]    39.791871:
       => sys_clone
       => syscall_call
            sh-1075  [000]    39.792713: sched_switch: task sh:1075 [120] (R) ==> sh:1076 [120]
            sh-1075  [000]    39.792722:
       => schedule
       => preempt_schedule
       => wake_up_new_task
       => do_fork
       => sys_clone
       => syscall_call
      Signed-off-by: walimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-2-git-send-email-walimisdev@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f11b3f4e
  27. 02 Jun 2009, 1 commit
  28. 27 May 2009, 2 commits
    • S
      tracing: add __print_symbolic to trace events · 0f4fc29d
      Authored by Steven Rostedt
      This patch adds __print_symbolic, which is similar to __print_flags but
      works for an enumeration type instead. That is, there is only a one-to-one
      mapping between the values and the symbols. When a match is made, it is
      printed; otherwise the hex value is output (a usage sketch follows this
      entry).
      
      [ Impact: add interface for showing symbol names in events ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      0f4fc29d
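
      A usage sketch of __print_symbolic() inside a made-up TRACE_EVENT
      definition (the event, its fields and the state names are invented; the
      macro usage itself follows the documented pattern):

        TRACE_EVENT(sample_state_change,
                TP_PROTO(int id, int state),
                TP_ARGS(id, state),
                TP_STRUCT__entry(
                        __field(int, id)
                        __field(int, state)
                ),
                TP_fast_assign(
                        __entry->id = id;
                        __entry->state = state;
                ),
                /* one-to-one value/name mapping; unmatched values print as hex */
                TP_printk("id=%d state=%s", __entry->id,
                        __print_symbolic(__entry->state,
                                { 0, "IDLE" },
                                { 1, "RUNNING" },
                                { 2, "BLOCKED" }))
        );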
    • S
      tracing: add __print_flags for events · be74b73a
      Authored by Steven Rostedt
      Developers have been asking for the ability in the ftrace event tracer
      to display names of bits in a flags variable.
      
      Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
      assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
      
      Some examples where this would be useful are the state flags in a context
      switch, kmalloc flags, and even permission flags when accessing files
      (a usage sketch follows this entry).
      
      [
        v2 changes include:
      
        Frederic Weisbecker's idea of using a mask instead of bits,
        so we can output GFP_KERNEL instead of GFP_WAIT|GFP_IO|GFP_FS.
      
        Li Zefan's idea of allowing the caller of __print_flags to add their
        own delimiter (or no delimiter) where we can get for file permissions
        rwx instead of r|w|x.
      ]
      
      [
        v3 changes:
      
         Christoph Hellwig's idea of using an array instead of va_args.
      ]
      
      [ Impact: better displaying of flags in trace output ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      be74b73a
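
      A usage sketch of __print_flags() in a made-up event, showing both a "|"
      delimiter and the empty-delimiter form for rwx-style output (flag values
      and names are invented):

        #define MYF_FOO 0x02    /* bit 1 */
        #define MYF_BAR 0x40    /* bit 6 */
        #define MYF_GOO 0x80    /* bit 7 */

        TRACE_EVENT(sample_flags,
                TP_PROTO(unsigned long flags, unsigned int mode),
                TP_ARGS(flags, mode),
                TP_STRUCT__entry(
                        __field(unsigned long, flags)
                        __field(unsigned int, mode)
                ),
                TP_fast_assign(
                        __entry->flags = flags;
                        __entry->mode = mode;
                ),
                /* masks (not bit numbers) are matched, so 0xc2 prints as
                 * FOO|BAR|GOO; an empty delimiter gives "rwx"-style output */
                TP_printk("flags=%s perm=%s",
                        __print_flags(__entry->flags, "|",
                                { MYF_FOO, "FOO" },
                                { MYF_BAR, "BAR" },
                                { MYF_GOO, "GOO" }),
                        __print_flags(__entry->mode & 07, "",
                                { 04, "r" }, { 02, "w" }, { 01, "x" }))
        );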
  29. 26 May 2009, 1 commit
    • L
      tracing: add trace_event_read_lock() · 4f535968
      Authored by Lai Jiangshan
      I found that there is nothing to protect event_hash in
      ftrace_find_event(). RCU protects the event hashlist
      but not the event itself while we use it after its extraction
      through ftrace_find_event().
      
      This lack of proper locking opens a race
      window between any event dereference and module removal.
      
      Eg:
      
      --Task A--
      
      print_trace_line(trace) {
        event = find_ftrace_event(trace)
      
      --Task B--
      
      trace_module_remove_events(mod) {
        list_trace_events_module(ev, mod) {
          unregister_ftrace_event(ev->event) {
            hlist_del(ev->event->node)
              list_del(....)
          }
        }
      }
      |--> module removed, the event has been dropped
      
      --Task A--
      
        event->print(trace); // Dereferencing freed memory
      
      If the event retrieved belongs to a module and this module
      is concurrently removed, we may end up dereferencing data
      from a freed module.
      
      RCU could solve this, but it would add latency to the kernel and
      forbid tracer output callbacks from calling any sleepable code.
      So this fix converts 'trace_event_mutex' to a read/write semaphore
      and adds trace_event_read_lock() to protect ftrace_find_event()
      (a sketch of the locking pattern follows this entry).
      
      [ Impact: fix possible freed memory dereference in ftrace ]
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4A114806.7090302@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      4f535968
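
      A hedged sketch of the locking pattern described above; whether the rwsem
      kept the trace_event_mutex name is not visible from this log, so the
      identifier is illustrative:

        static DECLARE_RWSEM(trace_event_sem);  /* was 'trace_event_mutex' */

        void trace_event_read_lock(void)
        {
                down_read(&trace_event_sem);
        }

        void trace_event_read_unlock(void)
        {
                up_read(&trace_event_sem);
        }

        /* Reader side (print path): hold the read lock across both the lookup
         * and every use of the returned event, so a module unregistering its
         * events (which takes the write side) must wait:
         *
         *      trace_event_read_lock();
         *      event = ftrace_find_event(trace->type);
         *      if (event)
         *              event->print(trace);    -- no longer a freed pointer
         *      trace_event_read_unlock();
         */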
  30. 15 May 2009, 1 commit
  31. 06 May 2009, 1 commit