- 28 11月, 2008 2 次提交
-
-
由 Ingo Molnar 提交于
Impact: prettify the output some more Before: 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { ------------8<---------- thread sshd-1755 ------------8<---------- After: 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { ------------------------------------------ | 1) migration/0--1 => sshd-1755 ------------------------------------------ Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: increase the visual qualities of the call-graph-tracer output This patch applies various trace output formatting changes: - CPU is now a decimal number, followed by a parenthesis. - Overhead is now on the second column (gives a good visibility) - Cost is now on the third column, can't exceed 9999.99 us. It is followed by a virtual line based on a "|" character. - Functions calls are now the last column on the right. This way, we haven't dynamic column (which flow is harder to follow) on its right. - CPU and Overhead have their own option flag. They are default-on but you can disable them easily: echo nofuncgraph-cpu > trace_options echo nofuncgraph-overhead > trace_options TODO: _ Refactoring of the thread switch output. _ Give a default-off option to output the thread and its pid on each row. _ Provide headers _ .... Here is an example of the new trace style: 0) | mutex_unlock() { 0) 0.639 us | __mutex_unlock_slowpath(); 0) 1.607 us | } 0) | remove_wait_queue() { 0) 0.616 us | _spin_lock_irqsave(); 0) 0.616 us | _spin_unlock_irqrestore(); 0) 2.779 us | } 0) 0.495 us | n_tty_set_room(); 0) ! 9999.999 us | } 0) | tty_ldisc_deref() { 0) 0.615 us | _spin_lock_irqsave(); 0) 0.616 us | _spin_unlock_irqrestore(); 0) 2.793 us | } 0) | current_fs_time() { 0) 0.488 us | current_kernel_time(); 0) 0.495 us | timespec_trunc(); 0) 2.486 us | } 0) ! 9999.999 us | } 0) ! 9999.999 us | } 0) ! 9999.999 us | } 0) | sys_read() { 0) 0.796 us | fget_light(); 0) | vfs_read() { 0) | rw_verify_area() { 0) | security_file_permission() { 0) 0.488 us | cap_file_permission(); 0) 1.720 us | } 0) 3. 4 us | } 0) | tty_read() { 0) 0.488 us | tty_paranoia_check(); 0) | tty_ldisc_ref_wait() { 0) | tty_ldisc_try() { 0) 0.615 us | _spin_lock_irqsave(); 0) 0.615 us | _spin_unlock_irqrestore(); 0) 5.436 us | } 0) 6.427 us | } Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 27 11月, 2008 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: enhance the output of the graph-tracer This patch applies some ideas of Ingo Molnar and Steven Rostedt. * Output leaf functions in one line with parenthesis, semicolon and duration output. * Add a second column (after cpu) for an overhead sign. if duration > 100 us, "!" if duration > 10 us, "+" else " " * Print output in us with remaining nanosec: u.n * Print duration on the right end, following the indentation of the functions. Use also visual clues: "-" on entry call (no duration to output) and "+" on return (duration output). The name of the tracer has been fixed as well: function-branch becomes function_branch. Here is an example of the new output: CPU[000] dequeue_entity() { - CPU[000] update_curr() { - CPU[000] update_min_vruntime(); + 0.512 us CPU[000] } + 1.504 us CPU[000] clear_buddies(); + 0.481 us CPU[000] update_min_vruntime(); + 0.504 us CPU[000] } + 4.557 us CPU[000] hrtick_update() { - CPU[000] hrtick_start_fair(); + 0.489 us CPU[000] } + 1.443 us CPU[000] + } + 14.655 us CPU[000] + } + 15.678 us CPU[000] + } + 16.686 us CPU[000] msecs_to_jiffies(); + 0.481 us CPU[000] put_prev_task_fair(); + 0.504 us CPU[000] pick_next_task_fair(); + 0.482 us CPU[000] pick_next_task_rt(); + 0.504 us CPU[000] pick_next_task_fair(); + 0.481 us CPU[000] pick_next_task_idle(); + 0.489 us CPU[000] _spin_trylock(); + 0.655 us CPU[000] _spin_unlock(); + 0.609 us CPU[000] ------------8<---------- thread bash-2794 ------------8<---------- CPU[000] finish_task_switch() { - CPU[000] _spin_unlock_irq(); + 0.722 us CPU[000] } + 2.369 us CPU[000] ! } + 501972.605 us CPU[000] ! } + 501973.763 us CPU[000] copy_from_read_buf() { - CPU[000] _spin_lock_irqsave(); + 0.670 us CPU[000] _spin_unlock_irqrestore(); + 0.699 us CPU[000] copy_to_user() { - CPU[000] might_fault() { - CPU[000] __might_sleep(); + 0.503 us CPU[000] } + 1.632 us CPU[000] __copy_to_user_ll(); + 0.542 us CPU[000] } + 3.858 us CPU[000] tty_audit_add_data() { - CPU[000] _spin_lock_irq(); + 0.609 us CPU[000] _spin_unlock_irq(); + 0.624 us CPU[000] } + 3.196 us CPU[000] _spin_lock_irqsave(); + 0.624 us CPU[000] _spin_unlock_irqrestore(); + 0.625 us CPU[000] + } + 13.611 us CPU[000] copy_from_read_buf() { - CPU[000] _spin_lock_irqsave(); + 0.624 us CPU[000] _spin_unlock_irqrestore(); + 0.616 us CPU[000] } + 2.820 us CPU[000] Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 11月, 2008 4 次提交
-
-
由 Steven Rostedt 提交于
Impact: enhancement for function graph tracer When run on a SMP box, the function graph tracer is confusing because it shows the different CPUS as changes in the trace. This patch adds the annotation of 'CPU[###]' where ### is a three digit number. The output will look similar to this: CPU[001] dput() { CPU[000] } 726 CPU[001] } 487 CPU[000] do_softirq() { CPU[001] } 2221 CPU[000] __do_softirq() { CPU[000] __local_bh_disable() { CPU[001] unroll_tree_refs() { CPU[000] } 569 CPU[001] } 501 CPU[000] rcu_process_callbacks() { CPU[001] kfree() { What makes this nice is that now you can grep the file and produce readable format for a particular CPU. # cat /debug/tracing/trace > /tmp/trace # grep '^CPU\[000\]' /tmp/trace > /tmp/trace0 # grep '^CPU\[001\]' /tmp/trace > /tmp/trace1 Will give you: # head /tmp/trace0 CPU[000] ------------8<---------- thread sshd-3899 ------------8<---------- CPU[000] inotify_dentry_parent_queue_event() { CPU[000] } 2531 CPU[000] inotify_inode_queue_event() { CPU[000] } 505 CPU[000] } 69626 CPU[000] } 73089 CPU[000] audit_syscall_exit() { CPU[000] path_put() { CPU[000] dput() { # head /tmp/trace1 CPU[001] ------------8<---------- thread pcscd-3446 ------------8<---------- CPU[001] } 4186 CPU[001] dput() { CPU[001] } 543 CPU[001] vfs_permission() { CPU[001] inode_permission() { CPU[001] shmem_permission() { CPU[001] generic_permission() { CPU[001] } 501 CPU[001] } 2205 Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Steven Rostedt 提交于
Impact: enhancement to function graph tracer Export the trace_find_cmdline so the function graph tracer can use it to print the comms of the threads. Signed-off-by: NSteven Rostedt <srostedt@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: feature This patch sets a C-like output for the function graph tracing. For this aim, we now call two handler for each function: one on the entry and one other on return. This way we can draw a well-ordered call stack. The pid of the previous trace is loosely stored to be compared against the one of the current trace to see if there were a context switch. Without this little feature, the call tree would seem broken at some locations. We could use the sched_tracer to capture these sched_events but this way of processing is much more simpler. 2 spaces have been chosen for indentation to fit the screen while deep calls. The time of execution in nanosecs is printed just after closed braces, it seems more easy this way to find the corresponding function. If the time was printed as a first column, it would be not so easy to find the corresponding function if it is called on a deep depth. I plan to output the return value but on 32 bits CPU, the return value can be 32 or 64, and its difficult to guess on which case we are. I don't know what would be the better solution on X86-32: only print eax (low-part) or even edx (high-part). Actually it's thee same problem when a function return a 8 bits value, the high part of eax could contain junk values... Here is an example of trace: sys_read() { fget_light() { } 526 vfs_read() { rw_verify_area() { security_file_permission() { cap_file_permission() { } 519 } 1564 } 2640 do_sync_read() { pipe_read() { __might_sleep() { } 511 pipe_wait() { prepare_to_wait() { } 760 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { update_min_vruntime() { } 504 } 1587 clear_buddies() { } 512 add_cfs_task_weight() { } 519 update_min_vruntime() { } 511 } 5602 dequeue_entity() { update_curr() { update_min_vruntime() { } 496 } 1631 clear_buddies() { } 496 update_min_vruntime() { } 527 } 4580 hrtick_update() { hrtick_start_fair() { } 488 } 1489 } 13700 } 14949 } 16016 msecs_to_jiffies() { } 496 put_prev_task_fair() { } 504 pick_next_task_fair() { } 489 pick_next_task_rt() { } 496 pick_next_task_fair() { } 489 pick_next_task_idle() { } 489 ------------8<---------- thread 4 ------------8<---------- finish_task_switch() { } 1203 do_softirq() { __do_softirq() { __local_bh_disable() { } 669 rcu_process_callbacks() { __rcu_process_callbacks() { cpu_quiet() { rcu_start_batch() { } 503 } 1647 } 3128 __rcu_process_callbacks() { } 542 } 5362 _local_bh_enable() { } 587 } 8880 } 9986 kthread_should_stop() { } 669 deactivate_task() { dequeue_task() { dequeue_task_fair() { dequeue_entity() { update_curr() { calc_delta_mine() { } 511 update_min_vruntime() { } 511 } 2813 Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: cleanup This patch changes the name of the "return function tracer" into function-graph-tracer which is a more suitable name for a tracing which makes one able to retrieve the ordered call stack during the code flow. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 11月, 2008 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: help to find the better depth of trace We decided to arbitrary define the depth of function return trace as "20". Perhaps this is not enough. To help finding an optimal depth, we measure now the overrun: the number of functions that have been missed for the current thread. By default this is not displayed, we have to do set a particular flag on the return tracer: echo overrun > /debug/tracing/trace_options And the overrun will be printed on the right. As the trace shows below, the current 20 depth is not enough. update_wall_time+0x37f/0x8c0 -> update_xtime_cache (345 ns) (Overruns: 2838) update_wall_time+0x384/0x8c0 -> clocksource_get_next (1141 ns) (Overruns: 2838) do_timer+0x23/0x100 -> update_wall_time (3882 ns) (Overruns: 2838) tick_do_update_jiffies64+0xbf/0x160 -> do_timer (5339 ns) (Overruns: 2838) tick_sched_timer+0x6a/0xf0 -> tick_do_update_jiffies64 (7209 ns) (Overruns: 2838) vgacon_set_cursor_size+0x98/0x120 -> native_io_delay (2613 ns) (Overruns: 274) vgacon_cursor+0x16e/0x1d0 -> vgacon_set_cursor_size (33151 ns) (Overruns: 274) set_cursor+0x5f/0x80 -> vgacon_cursor (36432 ns) (Overruns: 274) con_flush_chars+0x34/0x40 -> set_cursor (38790 ns) (Overruns: 274) release_console_sem+0x1ec/0x230 -> up (721 ns) (Overruns: 274) release_console_sem+0x225/0x230 -> wake_up_klogd (316 ns) (Overruns: 274) con_flush_chars+0x39/0x40 -> release_console_sem (2996 ns) (Overruns: 274) con_write+0x22/0x30 -> con_flush_chars (46067 ns) (Overruns: 274) n_tty_write+0x1cc/0x360 -> con_write (292670 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2a/0x90 -> native_apic_mem_write (330 ns) (Overruns: 274) irq_enter+0x17/0x70 -> idle_cpu (413 ns) (Overruns: 274) smp_apic_timer_interrupt+0x2f/0x90 -> irq_enter (1525 ns) (Overruns: 274) ktime_get_ts+0x40/0x70 -> getnstimeofday (465 ns) (Overruns: 274) ktime_get_ts+0x60/0x70 -> set_normalized_timespec (436 ns) (Overruns: 274) ktime_get+0x16/0x30 -> ktime_get_ts (2501 ns) (Overruns: 274) hrtimer_interrupt+0x77/0x1a0 -> ktime_get (3439 ns) (Overruns: 274) Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 11月, 2008 2 次提交
-
-
由 Frederic Weisbecker 提交于
This patch adds the support for dynamic tracing on the function return tracer. The whole difference with normal dynamic function tracing is that we don't need to hook on a particular callback. The only pro that we want is to nop or set dynamically the calls to ftrace_caller (which is ftrace_return_caller here). Some security checks ensure that we are not trying to launch dynamic tracing for return tracing while normal function tracing is already running. An example of trace with getnstimeofday set as a filter: ktime_get_ts+0x22/0x50 -> getnstimeofday (2283 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1396 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1825 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1426 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1524 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1382 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1434 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1464 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1502 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1404 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1397 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1051 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1314 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1344 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1163 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1390 ns) ktime_get_ts+0x22/0x50 -> getnstimeofday (1374 ns) Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Impact: extend the ->init() method with the ability to fail This bring a way to know if the initialization of a tracer successed. A tracer must return 0 on success and a traditional error (ie: -ENOMEM) if it fails. If a tracer fails to init, it is free to print a detailed warn. The tracing api will not and switch to a new tracer will just return the error from the init callback. Note: this will be used for the return tracer. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 11月, 2008 1 次提交
-
-
由 Frederic Weisbecker 提交于
Impact: add new tracing plugin which can trace full (entry+exit) function calls This tracer uses the low level function return ftrace plugin to measure the execution time of the kernel functions. The first field is the caller of the function, the second is the measured function, and the last one is the execution time in nanoseconds. - v3: - HAVE_FUNCTION_RET_TRACER have been added. Each arch that support ftrace return should enable it. - ftrace_return_stub becomes ftrace_stub. - CONFIG_FUNCTION_RET_TRACER depends now on CONFIG_FUNCTION_TRACER - Return traces printing can be used for other tracers on trace.c - Adapt to the new tracing API (no more ctrl_update callback) - Correct the check of "disabled" during insertion. - Minor changes... Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-