  1. 19 Jun, 2009 1 commit
    • S
      function-graph: add stack frame test · 71e308a2
      Steven Rostedt committed
      In case gcc does something funny with the stack frames, or the return
      from function code, we would like to detect that.
      
      An arch may implement passing of a variable that is unique to the
      function and can be saved on entering a function and can be tested
      when exiting the function. Usually the frame pointer can be used for
      this purpose.
      
      This patch also implements this for x86, where it passes in the stack
      frame of the parent function and tests that frame on exit.
      
      There was a case in x86_32 with optimize for size (-Os) where, for a
      few functions, gcc would align the stack frame and place a copy of the
      return address into it. The function graph tracer modified the copy and
      not the actual return address. On return from the function, it did not go
      to the tracer hook, but returned to the parent. This broke the function
      graph tracer, because the return of the parent (where gcc did not do
      this funky manipulation) returned to the location that the child function
      was supposed to. This caused strange kernel crashes.
      
      This test detected the problem and pointed out where the issue was.
      
      This modifies the parameters of one of the functions that the arch
      specific code calls, so it includes changes to arch code to accommodate
      the new prototype.
      
      Note, I noticed that the parisc arch implements its own push_return_trace.
      This is now a generic function and ftrace_push_return_trace should be
      used instead. This patch does not touch that code.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      71e308a2
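The frame test described in the commit above can be pictured with a small user-space sketch. It is not the kernel code: the struct layout, the function names `push_return` and `pop_return`, and the stack depth are assumptions made for illustration. The idea is only that a value unique to the call (here a frame pointer) is saved on entry and compared on exit.

```c
#include <stdbool.h>
#include <stddef.h>

#define RET_STACK_DEPTH 50  /* illustrative depth, not the kernel's value */

/* One saved entry per traced function call (hypothetical layout). */
struct ret_entry {
    unsigned long ret;  /* original return address */
    unsigned long fp;   /* frame pointer recorded on entry */
};

struct ret_stack {
    struct ret_entry entries[RET_STACK_DEPTH];
    int top;            /* index of the last used slot, -1 when empty */
};

void ret_stack_init(struct ret_stack *s)
{
    s->top = -1;
}

/* On function entry: remember the return address and the frame pointer. */
bool push_return(struct ret_stack *s, unsigned long ret, unsigned long fp)
{
    if (s->top + 1 >= RET_STACK_DEPTH)
        return false;   /* stack full, this call is not traced */
    s->top++;
    s->entries[s->top].ret = ret;
    s->entries[s->top].fp = fp;
    return true;
}

/*
 * On function exit: hand back the saved return address through *ret.
 * Returns false if the stack is empty or if the frame pointer seen now
 * differs from the one recorded on entry (the "funny stack frame" case).
 */
bool pop_return(struct ret_stack *s, unsigned long fp, unsigned long *ret)
{
    struct ret_entry *e;

    if (s->top < 0)
        return false;
    e = &s->entries[s->top--];
    *ret = e->ret;
    return e->fp == fp;
}
```

A mismatch reported by `pop_return` corresponds to the situation the commit wants to detect; what the kernel does on detection is not shown here.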
  2. 03 Jun, 2009 1 commit
    • S
      function-graph: enable the stack after initialization of other variables · 82310a32
      Steven Rostedt committed
      The function graph tracer checks if the task_struct has ret_stack defined
      to know if it is OK or not to use it. The initialization is done for
      all tasks by one process, but the idle tasks use the same initialization
      used by new tasks.
      
      If an interrupt happens on an idle task that just had the ret_stack
      created, but before the rest of the initialization took place, then
      we can corrupt the return address of the functions.
      
      This patch moves the setting of the task_struct's ret_stack to after
      the other variables have been initialized.
      
      [ Impact: prevent kernel panic on idle task when starting function graph ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      82310a32
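The ordering fix in the commit above follows a general "publish the pointer last" pattern. The sketch below shows that pattern in portable C11; the struct, its field names other than `ret_stack`, and the use of release/acquire atomics are assumptions for illustration, not the kernel's actual code or barriers.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define DEPTH 50  /* illustrative stack depth */

struct entry {
    unsigned long ret;
};

/* Hypothetical stand-in for the tracing fields of task_struct. */
struct task {
    int curr_ret_stack;                 /* index of top entry, -1 when empty */
    int trace_overrun;                  /* assumed counter of dropped calls */
    _Atomic(struct entry *) ret_stack;  /* NULL means "do not trace this task" */
};

/* Initialize every other field first, then publish ret_stack last. */
int task_graph_init(struct task *t)
{
    struct entry *stack = calloc(DEPTH, sizeof(*stack));

    if (!stack)
        return -1;
    t->curr_ret_stack = -1;
    t->trace_overrun = 0;
    /* Release store: the fields above are visible before the pointer is. */
    atomic_store_explicit(&t->ret_stack, stack, memory_order_release);
    return 0;
}

/* Tracer-side check: only use the stack once it has been published. */
int task_can_trace(struct task *t)
{
    return atomic_load_explicit(&t->ret_stack, memory_order_acquire) != NULL;
}
```

Because the tracer treats a non-NULL `ret_stack` as permission to trace, anything that runs between allocation and publication still sees NULL and leaves the task alone.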
  3. 25 Mar, 2009 2 commits
    • S
      function-graph: add option to calculate graph time or not · a2a16d6a
      Steven Rostedt committed
      Graph time is the time a function spends executing the functions it
      calls. Thus if function A calls B and graph-time is set, the time for
      A includes B. This is the default behavior. But if graph-time is off,
      then the time spent executing B is subtracted from A.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      a2a16d6a
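The graph-time rule in the commit above comes down to one subtraction. A minimal sketch, assuming a hypothetical per-call record that already holds the time spent in callees (the struct and function names are not from the kernel):

```c
#include <stdbool.h>

/* Hypothetical per-call record used only for this illustration. */
struct call_time {
    unsigned long long start;    /* timestamp at entry */
    unsigned long long end;      /* timestamp at exit */
    unsigned long long subtime;  /* total time spent in called functions */
};

/* Time charged to the function, with or without its callees. */
unsigned long long profiled_time(const struct call_time *c, bool graph_time)
{
    unsigned long long total = c->end - c->start;

    if (graph_time)
        return total;  /* default: A includes the time spent in B */
    /* graph-time off: subtract the callees, never going below zero */
    return total > c->subtime ? total - c->subtime : 0;
}
```

For example, if A runs for 100 time units and spends 60 of them inside B, A is charged 100 with graph-time set and 40 with it off.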
    • S
      tracing: adding function timings to function profiler · 0706f1c4
      Steven Rostedt committed
      If the function graph trace is enabled, the function profiler will
      use it to take the timing of the functions.
      
       cat /debug/tracing/trace_stat/functions
      
        Function                               Hit    Time
        --------                               ---    ----
        mwait_idle                             127    183028.4 us
        schedule                                26    151997.7 us
        __schedule                              31    151975.1 us
        sys_wait4                                2    74080.53 us
        do_wait                                  2    74077.80 us
        sys_newlstat                           138    39929.16 us
        do_path_lookup                         179    39845.79 us
        vfs_lstat_fd                           138    39761.97 us
        user_path_at                           153    39469.58 us
        path_walk                              179    39435.76 us
        __link_path_walk                       189    39143.73 us
      [...]
      
      Note the times are skewed due to the function graph tracer not taking
      into account schedules.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      0706f1c4
  4. 24 Mar, 2009 1 commit
  5. 20 Mar, 2009 2 commits
    • S
      function-graph: show binary events as comments · 5087f8d2
      Steven Rostedt committed
      With the added TRACE_EVENT macro, the events no longer appear in
      the function graph tracer. This was because the function graph
      did not know how to display the entries. The graph tracer was
      only aware of its own entries and the printk entries.
      
      By using the event call back feature, the graph tracer can now display
      the events.
      
       # echo irq > /debug/tracing/set_event
      
      Which can show:
      
       0)               |          handle_IRQ_event() {
       0)               |            /* irq_handler_entry: irq=48 handler=eth0 */
       0)               |            e1000_intr() {
       0)   0.926 us    |              __napi_schedule();
       0)   3.888 us    |            }
       0)               |            /* irq_handler_exit: irq=48 return=handled */
       0)   0.655 us    |            runqueue_is_locked();
       0)               |            __wake_up() {
       0)   0.831 us    |              _spin_lock_irqsave();
      
      The irq entry and exit events show up as comments.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      5087f8d2
    • S
      function-graph: calculate function depth within function graph tracer · 2fbcdb35
      Steven Rostedt committed
      Currently, the function graph tracer depends on the trace_printk
      to record the depth. All the information is already there in the trace
      to calculate function depth, with the exception of having the printk
      be the first item. But as soon as an entry or exit is reached, then
      we know the depth.
      
      This patch changes the iter->private data from recording a per cpu
      last_pid, to a structure that holds both the last_pid and the current
      depth. This data is used to determine the function depth for the
      printks.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      2fbcdb35
  6. 19 Mar, 2009 1 commit
  7. 17 Mar, 2009 1 commit
    • S
      tracing: protect reader of cmdline output · 4ca53085
      Steven Rostedt committed
      Impact: fix to one cause of incorrect comm outputs in trace
      
      The spinlock only protected the creation of a comm <=> pid pair.
      But it was possible that a reader could look up a pid, and get the
      wrong comm because it had no locking.
      
      This also required changing trace_find_cmdline to copy the comm cache
      and not just send back a pointer to it.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      4ca53085
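The two changes in the commit above (readers take the lock too, and the lookup copies the comm instead of returning a pointer into the cache) can be sketched in user space as follows. The cache layout, the slot count, the function names and the `atomic_flag` spinlock are assumptions for illustration; only the 16-byte comm length reflects the kernel's TASK_COMM_LEN.

```c
#include <stdatomic.h>
#include <string.h>

#define COMM_LEN 16       /* same size as the kernel's TASK_COMM_LEN */
#define CACHE_SLOTS 128   /* illustrative cache size */

static struct {
    int pid;
    char comm[COMM_LEN];
} cache[CACHE_SLOTS];

static atomic_flag cache_lock = ATOMIC_FLAG_INIT;

static void lock(void)
{
    while (atomic_flag_test_and_set_explicit(&cache_lock, memory_order_acquire))
        ;  /* spin until the flag is released */
}

static void unlock(void)
{
    atomic_flag_clear_explicit(&cache_lock, memory_order_release);
}

/* Writer: record a comm <=> pid pair under the lock. */
void save_cmdline(int pid, const char *comm)
{
    unsigned int idx = (unsigned int)pid % CACHE_SLOTS;

    lock();
    cache[idx].pid = pid;
    strncpy(cache[idx].comm, comm, COMM_LEN - 1);
    cache[idx].comm[COMM_LEN - 1] = '\0';
    unlock();
}

/*
 * Reader: copy the comm into the caller's buffer while holding the lock,
 * so the caller never looks at a slot that a writer may be changing.
 */
void find_cmdline(int pid, char comm[COMM_LEN])
{
    unsigned int idx = (unsigned int)pid % CACHE_SLOTS;

    lock();
    if (cache[idx].pid == pid && cache[idx].comm[0] != '\0')
        strcpy(comm, cache[idx].comm);
    else
        strcpy(comm, "<...>");  /* placeholder for an unknown pid */
    unlock();
}
```

Returning a pointer would let the caller read the slot after the lock is dropped, which is why the copy is part of the fix.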
  8. 13 Mar, 2009 1 commit
    • F
      tracing/core: bring back raw trace_printk for dynamic formats strings · 48ead020
      Frederic Weisbecker committed
      Impact: fix callsites with dynamic format strings
      
      Since its new binary implementation, trace_printk() internally uses static
      containers for the format strings at each callsite. But the value is
      assigned once at build time, which means that it can't take dynamic
      formats.
      
      So this patch unearths the raw trace_printk implementation for the callers
      that need trace_printk to be able to carry these dynamic format
      strings. The trace_printk() macro will use the appropriate implementation
      for each callsite. Most of the time however, the binary implementation will
      still be used.
      
      The other impact of this patch is that mmiotrace_printk() will use the old
      implementation because it calls the low level trace_vprintk and we can't
      guess here whether the format passed in it is dynamic or not.
      
      Some parts of this patch have been written by Steven Rostedt (most notably
      the part that chooses the appropriate implementation for each callsite).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      48ead020
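One way a macro can pick an implementation per callsite, as the commit above describes, is the GCC/Clang builtin `__builtin_constant_p`. The sketch below only illustrates that selection. The names `my_trace_printk`, `binary_printk` and `raw_printk` are hypothetical stand-ins, and the real kernel macro does more (for instance, it keeps the format string in a dedicated section).

```c
#include <stdarg.h>
#include <stdio.h>

enum impl { IMPL_BINARY, IMPL_RAW };

enum impl last_impl;  /* records which path the last call took */

/* Stand-in for the binary path: would store the format pointer and raw args. */
int binary_printk(const char *fmt, ...)
{
    (void)fmt;
    last_impl = IMPL_BINARY;
    return 0;
}

/* Stand-in for the raw path: formats the string at call time. */
int raw_printk(const char *fmt, ...)
{
    va_list ap;
    int n;

    last_impl = IMPL_RAW;
    va_start(ap, fmt);
    n = vsnprintf(NULL, 0, fmt, ap);  /* only measures the formatted length */
    va_end(ap);
    return n;
}

/* Constant format: binary path. Format known only at run time: raw path. */
#define my_trace_printk(fmt, ...)                   \
    (__builtin_constant_p(fmt)                      \
        ? binary_printk(fmt, ##__VA_ARGS__)         \
        : raw_printk(fmt, ##__VA_ARGS__))
```

A string literal at the callsite takes the binary path, while a format read from a variable whose value the compiler cannot prove constant takes the raw path.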
  9. 11 Mar, 2009 1 commit
  10. 07 Mar, 2009 1 commit
    • F
      tracing/core: drop the old trace_printk() implementation in favour of trace_bprintk() · 769b0441
      Frederic Weisbecker committed
      Impact: faster and lighter tracing
      
      Now that we have trace_bprintk(), which is faster and consumes less
      memory than trace_printk() and has the same purpose, we can drop
      the old implementation in favour of the binary one from trace_bprintk().
      This means we move all the implementation of trace_bprintk() to
      trace_printk(), so the API doesn't change except that we must now use
      trace_seq_bprintk() to print the TRACE_PRINT entries.
      
      Some changes result from this:
      
      - Previously, trace_bprintk depended on a single tracer and couldn't
        work without it. This tracer has been dropped and the whole implementation
        of trace_printk() (like the module formats management) is now integrated
        in the tracing core (comes with CONFIG_TRACING), though we keep the file
        trace_printk.c (previously trace_bprintk.c) where we can find the module
        management. Thus we don't overflow trace.c
      
      - Changes some parts to use trace_seq_bprintk() to print TRACE_PRINT entries.
      
      - Changes the trace_printk/trace_vprintk macros a bit to support non-builtin
        format constants, and fixes 'const' qualifier warnings. But this is all
        transparent for developers.
      
      - etc...
      
      V2:
      
      - Rebase against last changes
      - Fix misspelling in the changelog
      
      V3:
      
      - Rebase against last changes (moving trace_printk() to kernel.h)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1236356510-8381-5-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      769b0441
  11. 05 Mar, 2009 1 commit
  12. 19 Feb, 2009 3 commits
  13. 18 Feb, 2009 1 commit
    • F
      tracing/core: use appropriate waiting on trace_pipe · 6eaaa5d5
      Frederic Weisbecker committed
      Impact: api and pipe waiting change
      
      Currently, the waiting used in tracing_read_pipe() is done through a
      100 msecs schedule_timeout() loop which periodically checks if there
      are traces on the buffer.
      
      This can cause small latencies for programs which are reading the incoming
      events.
      
      This patch makes the reader wait on the trace_wait waitqueue, except
      for a few tracers such as the sched and function tracers, which might
      already hold the runqueue lock while waking up the reader.
      
      This is performed through a new callback wait_pipe() on struct tracer.
      If none is implemented on a specific tracer, the default waiting on the
      trace_wait queue is attached.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6eaaa5d5
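The "optional callback with a default" arrangement from the commit above can be sketched as below. Only `wait_pipe` and `struct tracer` are named in the commit. The reduced struct, the two waiting functions, the counters and `register_tracer_sketch` are assumptions for illustration.

```c
#include <stddef.h>

struct trace_iter;  /* opaque in this sketch */

/* Reduced, hypothetical version of struct tracer with the new callback. */
struct tracer {
    const char *name;
    void (*wait_pipe)(struct trace_iter *iter);
};

int waitqueue_waits;  /* counts calls to the default waiter */
int poll_waits;       /* counts calls to the polling waiter */

/* Default: would sleep on the trace_wait waitqueue until data arrives. */
void default_wait_pipe(struct trace_iter *iter)
{
    (void)iter;
    waitqueue_waits++;
}

/* Alternative for tracers that cannot be woken safely: would poll with a timeout. */
void poll_wait_pipe(struct trace_iter *iter)
{
    (void)iter;
    poll_waits++;
}

/* At registration, attach the default when the tracer supplies nothing. */
void register_tracer_sketch(struct tracer *t)
{
    if (!t->wait_pipe)
        t->wait_pipe = default_wait_pipe;
}
```

The reader path can then call `t->wait_pipe(iter)` unconditionally, without checking which tracer is active.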
  14. 11 Feb, 2009 1 commit
  15. 09 Feb, 2009 2 commits
  16. 06 Feb, 2009 1 commit
  17. 29 Jan, 2009 1 commit
  18. 23 Jan, 2009 1 commit
    • F
      tracing/function-graph-tracer: various fixes and features · 9005f3eb
      Frederic Weisbecker committed
      This patch brings various bugfixes:
      
      - Drop the first irrelevant task switch at the very beginning of a trace.
      
      - Drop the OVERHEAD word from the headers, the DURATION word is sufficient
        and will not overlap other columns.
      
      - Make the headers fit their respective columns well, whatever the
        selected options.
      
      E.g., with the default options:
      
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
      
        1)   0.646 us    |                    }
        1)               |                    mem_cgroup_del_lru_list() {
        1)   0.624 us    |                      lookup_page_cgroup();
        1)   1.970 us    |                    }
      
       echo funcgraph-proc > trace_options
      
       # tracer: function_graph
       #
       # CPU  TASK/PID        DURATION                  FUNCTION CALLS
       # |    |    |           |   |                     |   |   |   |
      
        0)   bash-2937    |   0.895 us    |                }
        0)   bash-2937    |   0.888 us    |                __rcu_read_unlock();
        0)   bash-2937    |   0.864 us    |                conv_uni_to_pc();
        0)   bash-2937    |   1.015 us    |                __rcu_read_lock();
      
       echo nofuncgraph-cpu > trace_options
       echo nofuncgraph-proc > trace_options
      
       # tracer: function_graph
       #
       #   DURATION                  FUNCTION CALLS
       #    |   |                     |   |   |   |
      
         3.752 us    |                  native_pud_val();
         0.616 us    |                  native_pud_val();
         0.624 us    |                  native_pmd_val();
      
      As for features, one can now disable the duration (this will hide the
      overhead too, for convenience and because one doesn't need the
      overhead without the duration):
      
       echo nofuncgraph-duration > trace_options
      
       # tracer: function_graph
       #
       #                FUNCTION CALLS
       #                |   |   |   |
      
                 cap_vm_enough_memory() {
                   __vm_enough_memory() {
                     vm_acct_memory();
                   }
                 }
               }
      
      And lastly, an option to print the absolute time:
      
       //Restart from default options
       echo funcgraph-abstime > trace_options
      
       # tracer: function_graph
       #
       #      TIME       CPU  DURATION                  FUNCTION CALLS
       #       |         |     |   |                     |   |   |   |
      
         261.339774 |   1) + 42.823 us   |    }
         261.339775 |   1)   1.045 us    |    _spin_lock_irq();
         261.339777 |   1)   0.940 us    |    _spin_lock_irqsave();
         261.339778 |   1)   0.752 us    |    _spin_unlock_irqrestore();
         261.339780 |   1)   0.857 us    |    _spin_unlock_irq();
         261.339782 |   1)               |    flush_to_ldisc() {
         261.339783 |   1)               |      tty_ldisc_ref() {
         261.339783 |   1)               |        tty_ldisc_try() {
         261.339784 |   1)   1.075 us    |          _spin_lock_irqsave();
         261.339786 |   1)   0.842 us    |          _spin_unlock_irqrestore();
         261.339788 |   1)   4.211 us    |        }
         261.339788 |   1)   5.662 us    |      }
      
      The format is seconds.usecs.
      
      I guess no one needs nanosecond precision here. The main goal is to have
      an overview of the general timings of events, and to see the place where
      the trace switches from one cpu to another.
      
      E.g.:
      
         274.874760 |   1)   0.676 us    |      _spin_unlock();
         274.874762 |   1)   0.609 us    |      native_load_sp0();
         274.874763 |   1)   0.602 us    |      native_load_tls();
         274.878739 |   0)   0.722 us    |                  }
         274.878740 |   0)   0.714 us    |                  native_pmd_val();
         274.878741 |   0)   0.730 us    |                  native_pmd_val();
      
      Here there is a difference of about 4000 usecs (274.878739 - 274.874763 =
      3976 usecs) when we switch the cpu.
      
      Changes in V2:
      
      - Completely fix the first pointless task switch.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9005f3eb
  19. 01 Jan, 2009 1 commit
  20. 29 Dec, 2008 2 commits
  21. 26 Dec, 2008 1 commit
  22. 12 Dec, 2008 1 commit
    • F
      tracing/function-graph-tracer: Output arrows signal on hardirq call/return · f8b755ac
      Frederic Weisbecker committed
      Impact: make the hardirq calls more obvious in the output
      
      When a hardirq is triggered inside the code flow, the output now has
      two arrows that indicate the entry and return of the hardirq.
      
       0)               |          bit_waitqueue() {
       0)   0.880 us    |            __phys_addr();
       0)   2.699 us    |          }
       0)               |          __wake_up_bit() {
       0)   ==========> |          smp_apic_timer_interrupt() {
       0)   0.797 us    |            native_apic_mem_write();
       0)   0.715 us    |            exit_idle();
       0)               |            irq_enter() {
       0)   0.722 us    |              idle_cpu();
       0)   5.519 us    |            }
       0)               |            hrtimer_interrupt() {
       0)               |              ktime_get() {
       0)               |                ktime_get_ts() {
       0)   0.805 us    |                  getnstimeofday();
      
       [...]
      
       0) ! 108.528 us  |            }
       0)               |            irq_exit() {
       0)               |              do_softirq() {
       0)               |                __do_softirq() {
       0)   0.895 us    |                  __local_bh_disable();
       0)               |                  run_timer_softirq() {
       0)   0.827 us    |                    hrtimer_run_pending();
       0)   1.226 us    |                    _spin_lock_irq();
       0)               |                    _spin_unlock_irq() {
       0)   6.550 us    |                  }
       0)   0.924 us    |                  _local_bh_enable();
       0) + 12.129 us   |                }
       0) + 13.911 us   |              }
       0)   0.707 us    |              idle_cpu();
       0) + 17.009 us   |            }
       0) ! 137.419 us  |          }
       0)   <========== |
       0)   1.045 us    |          }
       0) ! 148.908 us  |        }
       0) ! 151.022 us  |      }
       0) ! 153.022 us  |    }
       0)   0.963 us    |    journal_mark_dirty();
       0)   0.925 us    |    __brelse();
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f8b755ac
  23. 08 Dec, 2008 1 commit
  24. 04 Dec, 2008 1 commit
    • F
      tracing/function-graph-tracer: handle ftrace_printk entries · 1fd8f2a3
      Frederic Weisbecker committed
      Handle the TRACE_PRINT entries from the function graph tracer
      and output them as a C comment just below the function that called
      it, as if it was a comment inside this function.
      
      Example with an ftrace_printk inside might_sleep() function:
      
      void __might_sleep(char *file, int line)
      {
      	static unsigned long prev_jiffy;	/* ratelimiting */
      
      	ftrace_printk("Hi I'm a comment in might_sleep() :-)");
      
      A chunk of a resulting trace:
      
       0)               |        _reiserfs_free_block() {
       0)               |          reiserfs_read_bitmap_block() {
       0)               |            __bread() {
       0)               |              __getblk() {
       0)               |                __find_get_block() {
       0)   0.698 us    |                  mark_page_accessed();
       0)   2.267 us    |                }
       0)               |                __might_sleep() {
       0)               |                  /* Hi I'm a comment in might_sleep() :-) */
       0)   1.321 us    |                }
       0)   5.872 us    |              }
       0)   7.313 us    |            }
       0)   8.718 us    |          }
      
      And this patch brings two minor fixes:
      
      - The newline after a switch-out task has disappeared
      - The "|" sign just before the cpu number on task-switch has been deleted.
      
       0)   0.616 us    |                pick_next_task_rt();
       0)   1.457 us    |                _spin_trylock();
       0)   0.653 us    |                _spin_unlock();
       0)   0.728 us    |                _spin_trylock();
       0)   0.631 us    |                _spin_unlock();
       0)   0.729 us    |                native_load_sp0();
       0)   0.593 us    |                native_load_tls();
       ------------------------------------------
       0)    cat-2834    =>   migrati-3
       ------------------------------------------
      
       0)               |    finish_task_switch() {
       0)   0.841 us    |      _spin_unlock_irq();
       0)   0.616 us    |      post_schedule_rt();
       0)   3.882 us    |    }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1fd8f2a3
  25. 03 Dec, 2008 2 commits
    • F
      tracing/function-graph-tracer: improve duration output · 166d3c79
      Frederic Weisbecker committed
      Impact: better trace output of duration for long calls
      
      The old duration output didn't exceed 9999.999 us to fit the column
      and the nanosecs were always 3 digits. As Ingo suggested, it's better
      to have the whole microseconds elapsed time and shift the nanosecs precision
      if needed to fit the maximum of 7 digits. If the usecs need more digits, the
      case should be rare and important enough to break the column alignment a bit
      to show it.
      
      So, depending on the duration value, we now have these patterns:
      
          u.nnn us
         uu.nnn us
        uuu.nnn us
       uuuu.nnn us
       uuuuu.nn us
       uuuuuu.n us
       uuuuuuuu..... us
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      166d3c79
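The duration patterns listed in the commit above follow from a simple rule: print all microsecond digits, then as many nanosecond digits as still fit within seven digits in total. A minimal sketch of that rule, assuming the duration is given in nanoseconds (the function name and the buffer interface are hypothetical, and column padding is left out):

```c
#include <stdio.h>

#define DURATION_DIGITS 7  /* maximum number of digits shown, per the commit text */

/* Format a duration in nanoseconds as "<usecs>[.<fraction>] us". */
int format_duration(unsigned long long ns, char *buf, size_t size)
{
    unsigned long long usecs = ns / 1000;
    unsigned int rem = (unsigned int)(ns % 1000);  /* leftover nanoseconds */
    char whole[32];
    int len, frac;

    len = snprintf(whole, sizeof(whole), "%llu", usecs);
    frac = DURATION_DIGITS - len;  /* digits left for the fraction */
    if (frac > 3)
        frac = 3;
    if (frac <= 0)
        return snprintf(buf, size, "%s us", whole);

    /* Drop the low-order nanosecond digits that no longer fit. */
    if (frac == 2)
        rem /= 10;
    else if (frac == 1)
        rem /= 100;
    return snprintf(buf, size, "%s.%0*u us", whole, frac, rem);
}
```

With this rule 1234567 ns is shown as `1234.567 us`, 12345678 ns as `12345.67 us`, and 1234567890 ns as `1234567 us` with no fraction.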
    • F
      tracing/function-graph-tracer: display unified style cmdline and pid · 11e84acc
      Frederic Weisbecker committed
      Impact: extend function-graph output: let one know which thread called a function
      
      This patch implements a helper function to print the cmdline/pid pair.
      Its output is provided during task switching and on each row if the new
      "funcgraph-proc" default-off option is set through the trace_options file.
      
      The output is center aligned and never exceeds 14 characters. The cmdline
      is truncated to 7 characters.
      But note that if the pid exceeds 6 characters, the column will overflow (but
      the situation is abnormal).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      11e84acc
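The column rule from the commit above (cmdline cut to 7 characters, a dash, the pid, centered in 14 characters) can be sketched as follows. The function name and buffer interface are assumptions. Putting the odd leftover space on the right is also a choice made here, not something the commit states.

```c
#include <stdio.h>

#define PROC_WIDTH 14  /* column width stated in the commit */
#define COMM_MAX 7     /* cmdline is cut to this many characters */

/* Write "comm-pid" centered in a PROC_WIDTH column; buf should hold 32 bytes. */
void format_proc(const char *comm, int pid, char *buf, size_t size)
{
    char label[32];
    int len, spaces, left;

    /* "%.*s" truncates the command name to COMM_MAX characters. */
    len = snprintf(label, sizeof(label), "%.*s-%d", COMM_MAX, comm, pid);
    spaces = len < PROC_WIDTH ? PROC_WIDTH - len : 0;
    left = spaces / 2;

    /* Pad on both sides; an over-long label overflows the column unpadded. */
    snprintf(buf, size, "%*s%s%*s", left, "", label, spaces - left, "");
}
```

A 7-character cmdline plus the dash leaves 6 characters for the pid, which matches the overflow note in the commit.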
  26. 28 Nov, 2008 2 commits
    • I
      tracing/function-graph-tracer: more output tweaks · d51090b3
      Ingo Molnar committed
      Impact: prettify the output some more
      
      Before:
      
      0)           |     sys_read() {
      0)      0.796 us |   fget_light();
      0)           |       vfs_read() {
      0)           |         rw_verify_area() {
      0)           |           security_file_permission() {
      ------------8<---------- thread sshd-1755 ------------8<----------
      
      After:
      
       0)               |  sys_read() {
       0)      0.796 us |    fget_light();
       0)               |    vfs_read() {
       0)               |      rw_verify_area() {
       0)               |        security_file_permission() {
       ------------------------------------------
       | 1)  migration/0--1  =>  sshd-1755
       ------------------------------------------
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d51090b3
    • F
      tracing/function-graph-tracer: adjustments of the trace informations · 1a056155
      Frederic Weisbecker committed
      Impact: improve the visual quality of the call-graph-tracer output
      
      This patch applies various trace output formatting changes:
      
       - CPU is now a decimal number, followed by a parenthesis.
      
       - Overhead is now in the second column (gives good visibility)
      
       - Cost is now in the third column, can't exceed 9999.99 us. It is
         followed by a virtual line based on a "|" character.
      
       - Function calls are now the last column on the right. This way, there
         is no dynamic column (whose flow is harder to follow) to its right.
      
       - CPU and Overhead have their own option flag. They are default-on but you
         can disable them easily:
      
            echo nofuncgraph-cpu > trace_options
            echo nofuncgraph-overhead > trace_options
      
      TODO:
      
      _ Refactoring of the thread switch output.
      _ Give a default-off option to output the thread and its pid on each row.
      _ Provide headers
      _ ....
      
      Here is an example of the new trace style:
      
      0)           |             mutex_unlock() {
      0)      0.639 us |           __mutex_unlock_slowpath();
      0)      1.607 us |         }
      0)           |             remove_wait_queue() {
      0)      0.616 us |           _spin_lock_irqsave();
      0)      0.616 us |           _spin_unlock_irqrestore();
      0)      2.779 us |         }
      0)      0.495 us |         n_tty_set_room();
      0) ! 9999.999 us |       }
      0)           |           tty_ldisc_deref() {
      0)      0.615 us |         _spin_lock_irqsave();
      0)      0.616 us |         _spin_unlock_irqrestore();
      0)      2.793 us |       }
      0)           |           current_fs_time() {
      0)      0.488 us |         current_kernel_time();
      0)      0.495 us |         timespec_trunc();
      0)      2.486 us |       }
      0) ! 9999.999 us |     }
      0) ! 9999.999 us |   }
      0) ! 9999.999 us | }
      0)           |     sys_read() {
      0)      0.796 us |   fget_light();
      0)           |       vfs_read() {
      0)           |         rw_verify_area() {
      0)           |           security_file_permission() {
      0)      0.488 us |         cap_file_permission();
      0)      1.720 us |       }
      0)      3.  4 us |     }
      0)           |         tty_read() {
      0)      0.488 us |       tty_paranoia_check();
      0)           |           tty_ldisc_ref_wait() {
      0)           |             tty_ldisc_try() {
      0)      0.615 us |           _spin_lock_irqsave();
      0)      0.615 us |           _spin_unlock_irqrestore();
      0)      5.436 us |         }
      0)      6.427 us |       }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1a056155
  27. 27 Nov, 2008 1 commit
    • F
      tracing/function-graph-tracer: enhancements for the trace output · 83a8df61
      Frederic Weisbecker committed
      Impact: enhance the output of the graph-tracer
      
      This patch applies some ideas of Ingo Molnar and Steven Rostedt.
      
      * Output leaf functions in one line with parentheses, semicolon and duration
        output.
      
      * Add a second column (after cpu) for an overhead sign.
        if duration > 100 us, "!"
        if duration > 10 us, "+"
        else " "
      
      * Print output in us with remaining nanosec: u.n
      
      * Print duration on the right end, following the indentation of the functions.
        Also use visual cues: "-" on entry call (no duration to output) and "+" on
        return (duration output).
      
      The name of the tracer has been fixed as well: function-branch becomes
      function_branch.
      
      Here is an example of the new output:
      
      CPU[000]           dequeue_entity() {                    -
      CPU[000]             update_curr() {                    -
      CPU[000]               update_min_vruntime();                    + 0.512 us
      CPU[000]             }                                + 1.504 us
      CPU[000]             clear_buddies();                    + 0.481 us
      CPU[000]             update_min_vruntime();                    + 0.504 us
      CPU[000]           }                                + 4.557 us
      CPU[000]           hrtick_update() {                    -
      CPU[000]             hrtick_start_fair();                    + 0.489 us
      CPU[000]           }                                + 1.443 us
      CPU[000] +       }                                + 14.655 us
      CPU[000] +     }                                + 15.678 us
      CPU[000] +   }                                + 16.686 us
      CPU[000]     msecs_to_jiffies();                    + 0.481 us
      CPU[000]     put_prev_task_fair();                    + 0.504 us
      CPU[000]     pick_next_task_fair();                    + 0.482 us
      CPU[000]     pick_next_task_rt();                    + 0.504 us
      CPU[000]     pick_next_task_fair();                    + 0.481 us
      CPU[000]     pick_next_task_idle();                    + 0.489 us
      CPU[000]     _spin_trylock();                    + 0.655 us
      CPU[000]     _spin_unlock();                    + 0.609 us
      
      CPU[000]  ------------8<---------- thread bash-2794 ------------8<----------
      
      CPU[000]               finish_task_switch() {                    -
      CPU[000]                 _spin_unlock_irq();                    + 0.722 us
      CPU[000]               }                                + 2.369 us
      CPU[000] !           }                                + 501972.605 us
      CPU[000] !         }                                + 501973.763 us
      CPU[000]           copy_from_read_buf() {                    -
      CPU[000]             _spin_lock_irqsave();                    + 0.670 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.699 us
      CPU[000]             copy_to_user() {                    -
      CPU[000]               might_fault() {                    -
      CPU[000]                 __might_sleep();                    + 0.503 us
      CPU[000]               }                                + 1.632 us
      CPU[000]               __copy_to_user_ll();                    + 0.542 us
      CPU[000]             }                                + 3.858 us
      CPU[000]             tty_audit_add_data() {                    -
      CPU[000]               _spin_lock_irq();                    + 0.609 us
      CPU[000]               _spin_unlock_irq();                    + 0.624 us
      CPU[000]             }                                + 3.196 us
      CPU[000]             _spin_lock_irqsave();                    + 0.624 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.625 us
      CPU[000] +         }                                + 13.611 us
      CPU[000]           copy_from_read_buf() {                    -
      CPU[000]             _spin_lock_irqsave();                    + 0.624 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.616 us
      CPU[000]           }                                + 2.820 us
      CPU[000]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      83a8df61
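The overhead-sign rule from the commit above maps directly onto a small function. A minimal sketch, assuming durations are held in nanoseconds (the function name is hypothetical):

```c
/* Thresholds in nanoseconds, taken from the rule in the commit message. */
#define OVERHEAD_BANG_NS 100000ULL  /* more than 100 us */
#define OVERHEAD_PLUS_NS 10000ULL   /* more than 10 us */

/* Return the overhead marker printed in the second column. */
char overhead_sign(unsigned long long duration_ns)
{
    if (duration_ns > OVERHEAD_BANG_NS)
        return '!';
    if (duration_ns > OVERHEAD_PLUS_NS)
        return '+';
    return ' ';
}
```

This agrees with the sample trace: 14.655 us is marked `+`, 501972.605 us is marked `!`, and 4.557 us has no marker.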
  28. 26 Nov, 2008 4 commits
    • S
      ftrace: add cpu annotation for function graph tracer · 437f24fb
      Steven Rostedt committed
      Impact: enhancement for function graph tracer
      
      When run on an SMP box, the function graph tracer is confusing because
      output from the different CPUs is interleaved in the trace.
      
      This patch adds the annotation of 'CPU[###]' where ### is a three digit
      number. The output will look similar to this:
      
      CPU[001]     dput() {
      CPU[000] } 726
      CPU[001]     } 487
      CPU[000] do_softirq() {
      CPU[001]   } 2221
      CPU[000]   __do_softirq() {
      CPU[000]     __local_bh_disable() {
      CPU[001]   unroll_tree_refs() {
      CPU[000]     } 569
      CPU[001]   } 501
      CPU[000]     rcu_process_callbacks() {
      CPU[001]   kfree() {
      
      What makes this nice is that now you can grep the file and produce
      a readable format for a particular CPU.
      
       # cat /debug/tracing/trace > /tmp/trace
       # grep '^CPU\[000\]' /tmp/trace > /tmp/trace0
       # grep '^CPU\[001\]' /tmp/trace > /tmp/trace1
      
      Will give you:
      
       # head /tmp/trace0
      CPU[000] ------------8<---------- thread sshd-3899 ------------8<----------
      CPU[000]     inotify_dentry_parent_queue_event() {
      CPU[000]     } 2531
      CPU[000]     inotify_inode_queue_event() {
      CPU[000]     } 505
      CPU[000]   } 69626
      CPU[000] } 73089
      CPU[000] audit_syscall_exit() {
      CPU[000]   path_put() {
      CPU[000]     dput() {
      
       # head /tmp/trace1
      CPU[001] ------------8<---------- thread pcscd-3446 ------------8<----------
      CPU[001]               } 4186
      CPU[001]               dput() {
      CPU[001]               } 543
      CPU[001]               vfs_permission() {
      CPU[001]                 inode_permission() {
      CPU[001]                   shmem_permission() {
      CPU[001]                     generic_permission() {
      CPU[001]                     } 501
      CPU[001]                   } 2205
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      437f24fb
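The per-CPU grep step from the commit above can also be done in one pass. The following is a minimal sketch, not part of the patch: it assumes trace lines carry the `CPU[###]` prefix exactly as shown in the commit message, and the helper name `split_by_cpu` is hypothetical.

```python
import re
from collections import defaultdict

# Matches the three-digit 'CPU[###]' annotation at the start of a line.
CPU_PREFIX = re.compile(r"^CPU\[(\d{3})\]")


def split_by_cpu(lines):
    """Group trace lines by CPU number, preserving order within each CPU."""
    per_cpu = defaultdict(list)
    for line in lines:
        match = CPU_PREFIX.match(line)
        if match:  # lines without the annotation are skipped, as grep would
            per_cpu[int(match.group(1))].append(line)
    return dict(per_cpu)


# Sample lines copied from the commit message above.
sample = [
    "CPU[001]     dput() {",
    "CPU[000] } 726",
    "CPU[001]     } 487",
    "CPU[000] do_softirq() {",
]
traces = split_by_cpu(sample)
for cpu in sorted(traces):
    print(cpu, traces[cpu])
```

Unlike running grep once per CPU, this reads the trace a single time and does not need to know in advance how many CPUs appear in it.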
    • S
      ftrace: add thread comm to function graph tracer · 660c7f9b
      Steven Rostedt committed
      Impact: enhancement to function graph tracer
      
      Export the trace_find_cmdline so the function graph tracer can
      use it to print the comms of the threads.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      660c7f9b
    • F
      tracing/function-return-tracer: set a more human readable output · 287b6e68
      Frederic Weisbecker committed
      Impact: feature
      
      This patch sets a C-like output for the function graph tracing.
      For this aim, we now call two handlers for each function: one on entry
      and another on return. This way we can draw a well-ordered call stack.
      
      The pid of the previous trace is loosely stored to be compared against
      the one of the current trace to see if there was a context switch.
      
      Without this little feature, the call tree would seem broken at
      some locations.
      We could use the sched_tracer to capture these sched_events but this
      way of processing is much simpler.
      
      Two spaces have been chosen for indentation so that deep calls still
      fit the screen. The execution time in nanoseconds is printed just after
      the closing brace; it seems easier this way to find the corresponding
      function. If the time were printed as a first column, it would not be
      so easy to find the corresponding function when it is called at a
      great depth.
      
      I plan to output the return value, but on 32-bit CPUs the return value
      can be 32 or 64 bits wide, and it is difficult to guess which case
      applies. I don't know what the better solution would be on x86-32:
      print only eax (low part) or also edx (high part).
      
      Actually it's the same problem when a function returns an 8-bit value:
      the high part of eax could contain junk values...
      
      Here is an example of trace:
      
      sys_read() {
        fget_light() {
        } 526
        vfs_read() {
          rw_verify_area() {
            security_file_permission() {
              cap_file_permission() {
              } 519
            } 1564
          } 2640
          do_sync_read() {
            pipe_read() {
              __might_sleep() {
              } 511
              pipe_wait() {
                prepare_to_wait() {
                } 760
                deactivate_task() {
                  dequeue_task() {
                    dequeue_task_fair() {
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 504
                        } 1587
                        clear_buddies() {
                        } 512
                        add_cfs_task_weight() {
                        } 519
                        update_min_vruntime() {
                        } 511
                      } 5602
                      dequeue_entity() {
                        update_curr() {
                          update_min_vruntime() {
                          } 496
                        } 1631
                        clear_buddies() {
                        } 496
                        update_min_vruntime() {
                        } 527
                      } 4580
                      hrtick_update() {
                        hrtick_start_fair() {
                        } 488
                      } 1489
                    } 13700
                  } 14949
                } 16016
                msecs_to_jiffies() {
                } 496
                put_prev_task_fair() {
                } 504
                pick_next_task_fair() {
                } 489
                pick_next_task_rt() {
                } 496
                pick_next_task_fair() {
                } 489
                pick_next_task_idle() {
                } 489
      
      ------------8<---------- thread 4 ------------8<----------
      
      finish_task_switch() {
      } 1203
      do_softirq() {
        __do_softirq() {
          __local_bh_disable() {
          } 669
          rcu_process_callbacks() {
            __rcu_process_callbacks() {
              cpu_quiet() {
                rcu_start_batch() {
                } 503
              } 1647
            } 3128
            __rcu_process_callbacks() {
            } 542
          } 5362
          _local_bh_enable() {
          } 587
        } 8880
      } 9986
      kthread_should_stop() {
      } 669
      deactivate_task() {
        dequeue_task() {
          dequeue_task_fair() {
            dequeue_entity() {
              update_curr() {
                calc_delta_mine() {
                } 511
                update_min_vruntime() {
                } 511
              } 2813
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      287b6e68
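The entry/return handler scheme and the pid comparison described in the commit above can be modelled outside the kernel. The following is a minimal sketch under stated assumptions, not the kernel implementation: the event tuples, the per-pid depth table, and the function name `render` are all hypothetical, chosen only to reproduce the output format shown in the example trace.

```python
from collections import defaultdict


def render(events, indent=2):
    """Render (pid, kind, name, duration_ns) events as a C-like call tree.

    kind is "entry" or "return"; duration_ns is only used on return.
    """
    out = []
    depth = defaultdict(int)  # assumption: each thread tracks its own depth
    last_pid = None
    for pid, kind, name, duration_ns in events:
        # A pid change between two consecutive events marks a context switch.
        if last_pid is not None and pid != last_pid:
            out.append(
                f"------------8<---------- thread {pid} ------------8<----------"
            )
        last_pid = pid
        if kind == "entry":
            out.append(" " * (indent * depth[pid]) + f"{name}() {{")
            depth[pid] += 1
        else:
            depth[pid] = max(depth[pid] - 1, 0)
            out.append(" " * (indent * depth[pid]) + f"}} {duration_ns}")
    return out


# Hypothetical event stream; names and durations are taken from the trace above.
events = [
    (1, "entry", "sys_read", None),
    (1, "entry", "fget_light", None),
    (1, "return", "fget_light", 526),
    (4, "entry", "finish_task_switch", None),
    (4, "return", "finish_task_switch", 1203),
]
rendered = render(events)
print("\n".join(rendered))
```

Printing the duration on the closing brace, at the same indentation as the opening line, is what lets a reader pair each time with its function even in deeply nested output.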
    • F
      tracing/function-return-tracer: change the name into function-graph-tracer · fb52607a
      Frederic Weisbecker committed
      Impact: cleanup
      
      This patch changes the name of the "return function tracer" into
      function-graph-tracer, which is a more suitable name for a tracer
      that makes one able to retrieve the ordered call stack during
      the code flow.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      fb52607a
  29. 18 Nov, 2008 1 commit
    • F
      tracing/function-return-tracer: add the overrun field · 0231022c
      Frederic Weisbecker committed
      Impact: help to find a better trace depth
      
      We arbitrarily defined the depth of the function return trace as
      "20". Perhaps this is not enough. To help find an optimal depth, we
      now measure the overrun: the number of functions that have been missed
      for the current thread. By default this is not displayed; a particular
      flag has to be set on the return tracer:
      
       # echo overrun > /debug/tracing/trace_options
      
      The overrun will then be printed on the right.
      
      As the trace below shows, the current depth of 20 is not enough.
      
      update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838)
      update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838)
      do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838)
      tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838)
      tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838)
      vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274)
      vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274)
      set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274)
      con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274)
      release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274)
      release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274)
      con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274)
      con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274)
      n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274)
      irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274)
      smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274)
      ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274)
      ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274)
      ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274)
      hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0231022c
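The overrun counter described in the commit above can be illustrated with a small model. The following is a sketch under stated assumptions, not the kernel code: the class `ReturnStack`, its methods, and the helper `traced_call` are hypothetical, and the only details taken from the commit are the fixed depth of 20 and the definition of overrun as the number of functions missed for a thread.

```python
class ReturnStack:
    """Fixed-depth stack of pending function returns for one thread."""

    def __init__(self, max_depth=20):
        self.max_depth = max_depth
        self.frames = []
        self.overrun = 0  # functions that could not be recorded

    def push(self, func):
        """Record a function entry; count an overrun when the stack is full."""
        if len(self.frames) >= self.max_depth:
            self.overrun += 1
            return False
        self.frames.append(func)
        return True

    def pop(self):
        """Remove and return the most recent pending entry, if any."""
        return self.frames.pop() if self.frames else None


def traced_call(stack, depth_remaining):
    """Simulate a call chain of the given depth with entry/return hooks."""
    hooked = stack.push(f"f{depth_remaining}")
    if depth_remaining > 1:
        traced_call(stack, depth_remaining - 1)
    if hooked:  # a missed entry has no matching return to pop
        stack.pop()


stack = ReturnStack(max_depth=20)
traced_call(stack, 25)
print(stack.overrun)
```

In this model a call chain 25 levels deep misses the 5 innermost functions. A large overrun count, as in the trace above, suggests that the chosen depth is too small for the workload.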