- 28 3月, 2012 1 次提交
-
-
由 Steven Rostedt 提交于
When reading the trace file, the records of each of the per_cpu buffers are examined to find the next event to print out. At the point of looking at the event, the size of the event is recorded. But if the first event is chosen, the other events in the other CPU buffers will reset the event size that is stored in the iterator descriptor, causing the event size passed to the output functions to be incorrect. In most cases this is not a problem, but for the case of stack traces, it is. With the change to the stack tracing to record a dynamic number of back traces, the output depends on the size of the entry instead of the fixed 8 back traces. When the entry size is not correct, the back traces would not be fully printed. Note, reading from the per-cpu trace files were not affected. Reported-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 02 3月, 2012 1 次提交
-
-
由 Steven Rostedt 提交于
As ftrace_dump() (called by ftrace_dump_on_oops) disables interrupts as it dumps its output to the console, it can keep interrupts disabled for long periods of time. This is likely to trigger the NMI watchdog, and it can disrupt the output of critical data. Add a touch_nmi_watchdog() to each event that is written to the screen to keep the NMI watchdog from affecting the output. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 23 2月, 2012 1 次提交
-
-
由 Steven Rostedt 提交于
As the ring-buffer code is being used by other facilities in the kernel, having tracing_on file disable *all* buffers is not a desired affect. It should only disable the ftrace buffers that are being used. Move the code into the trace.c file and use the buffer disabling for tracing_on() and tracing_off(). This way only the ftrace buffers will be affected by them and other kernel utilities will not be confused to why their output suddenly stopped. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 13 2月, 2012 1 次提交
-
-
由 Geunsik Lim 提交于
Actually, sched_switch function tracer is merged into wakeup/wakeup_rt Update 'mini-HOWTO' for ftrace(Kernel function tracer). If we want to trace "sched:sched_switch" to trace sched_switch func, We may utilize event option.(e.g: trace-cmd list -e | grep sched) This patch is based on Linux-3.3.rc2-SMP-PREEMPT Link: http://lkml.kernel.org/r/1328695537-15081-1-git-send-email-geunsik.lim@gmail.com Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: NGeunsik Lim <geunsik.lim@samsung.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 04 1月, 2012 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 12 12月, 2011 1 次提交
-
-
由 Paul E. McKenney 提交于
Add an EXPORT_SYMBOL_GPL() so that rcutorture can dump the trace buffer upon detection of an RCU error. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
-
- 18 11月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
Knowing the number of event entries in the ring buffer compared to the total number that were written is useful information. The latency format gives this information and there's no reason that the default format does not. This information is now added to the default header, along with the number of online CPUs: # tracer: nop # # entries-in-buffer/entries-written: 159836/64690869 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] ...2 49.442971: local_touch_nmi <-cpu_idle <idle>-0 [000] d..2 49.442973: enter_idle <-cpu_idle <idle>-0 [000] d..2 49.442974: atomic_notifier_call_chain <-enter_idle <idle>-0 [000] d..2 49.442976: __atomic_notifier_call_chain <-atomic_notifier The above shows that the trace contains 159836 entries, but 64690869 were written. One could figure out that there were 64531033 entries that were dropped. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 17 11月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
People keep asking how to get the preempt count, irq, and need resched info and we keep telling them to enable the latency format. Some developers think that traces without this info is completely useless, and for a lot of tasks it is useless. The first option was to enable the latency trace as the default format, but the header for the latency format is pretty useless for most tracers and it also does the timestamp in straight microseconds from the time the trace started. This is sometimes more difficult to read as the default trace is seconds from the start of boot up. Latency format: # tracer: nop # # nop latency trace v1.1.5 on 3.2.0-rc1-test+ # -------------------------------------------------------------------- # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule default format: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule The latency format header has latency information that is pretty meaningless for most tracers. Although some of the header is useful, and we can add that later to the default format as well. What is really useful with the latency format is the irqs-off, need-resched hard/softirq context and the preempt count. This commit adds the option irq-info which is on by default that adds this information: # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle <idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle If a user wants the old format, they can disable the 'irq-info' option: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] 49.309309: need_resched <-mwait_idle <idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] 49.309315: need_resched <-mwait_idle Requested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 08 11月, 2011 1 次提交
-
-
由 Jiri Olsa 提交于
In case the the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the function tracer (CONFIG_FUNCTION_TRACER) are not set, the latency tracers do not display proper latency header. The involved/fixed latency tracers are: wakeup_rt wakeup preemptirqsoff preemptoff irqsoff The patch adds proper handling of tracer configuration options for latency tracers, and displaying correct header info accordingly. * The current output (for wakeup tracer) with both graph and function tracers disabled is: # tracer: wakeup # <idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0 <idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up ... * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0 cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up * The current output (for wakeup tracer) with only function tracer enabled is: # tracer: wakeup # cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0 cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / <idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1 <idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 14 10月, 2011 2 次提交
-
-
由 Steven Rostedt 提交于
The trace_pipe_raw handler holds a cached page from the time the file is opened to the time it is closed. The cached page is used to handle the case of the user space buffer being smaller than what was read from the ring buffer. The left over buffer is held in the cache so that the next read will continue where the data left off. After EOF is returned (no more data in the buffer), the index of the cached page is set to zero. If a user app reads the page again after EOF, the check in the buffer will see that the cached page is less than page size and will return the cached page again. This will cause reading the trace_pipe_raw again after EOF to return duplicate data, making the output look like the time went backwards but instead data is just repeated. The fix is to not reset the index right after all data is read from the cache, but to reset it after all data is read and more data exists in the ring buffer. Cc: stable <stable@kernel.org> Reported-by: NJeremy Eder <jeder@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Geunsik Lim 提交于
tracing_enabled option is deprecated. To start/stop tracing, write to /sys/kernel/debug/tracing/tracing_on without tracing_enabled. This patch is based on Linux 3.1.0-rc1 Signed-off-by: NGeunsik Lim <geunsik.lim@samsung.com> Link: http://lkml.kernel.org/r/1313127022-23830-1-git-send-email-leemgs1@gmail.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 11 10月, 2011 2 次提交
-
-
由 Steven Rostedt 提交于
When doing intense tracing, the kmalloc inside trace_marker can introduce side effects to what is being traced. As trace_marker() is used by userspace to inject data into the kernel ring buffer, it needs to do so with the least amount of intrusion to the operations of the kernel or the user space application. As the ring buffer is designed to write directly into the buffer without the need to make a temporary buffer, and userspace already went through the hassle of knowing how big the write will be, we can simply pin the userspace pages and write the data directly into the buffer. This improves the impact of tracing via trace_marker tremendously! Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the use of get_user_pages_fast() and kmap_atomic(). Suggested-by: NThomas Gleixner <tglx@linutronix.de> Suggested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
As the function tracer is very intrusive, lots of self checks are performed on the tracer and if something is found to be strange it will shut itself down keeping it from corrupting the rest of the kernel. This shutdown may still allow functions to be traced, as the tracing only stops new modifications from happening. Trying to stop the function tracer itself can cause more harm as it requires code modification. Although a WARN_ON() is executed, a user may not notice it. To help the user see that something isn't right with the tracing of the system a big warning is added to the output of the tracer that lets the user know that their data may be incomplete. Reported-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 19 9月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
When debugging tight race conditions, it can be helpful to have a synchronized tracing method. Although in most cases the global clock provides this functionality, if timings is not the issue, it is more comforting to know that the order of events really happened in a precise order. Instead of using a clock, add a "counter" that is simply an incrementing atomic 64bit counter that orders the events as they are perceived to happen. The trace_clock_counter() is added from the attempt by Peter Zijlstra trying to convert the trace_clock_global() to it. I took Peter's counter code and made trace_clock_counter() instead, and added it to the choice of clocks. Just echo counter > /debug/tracing/trace_clock to activate it. Requested-by: NThomas Gleixner <tglx@linutronix.de> Requested-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-By: NValdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 13 9月, 2011 1 次提交
-
-
由 Thomas Gleixner 提交于
The tracing locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 8月, 2011 2 次提交
-
-
由 Vaibhav Nagarnaik 提交于
The stats file under per_cpu folder provides the number of entries, overruns and other statistics about the CPU ring buffer. However, the numbers do not provide any indication of how full the ring buffer is in bytes compared to the overall size in bytes. Also, it is helpful to know the rate at which the cpu buffer is filling up. This patch adds an entry "bytes: " in printed stats for per_cpu ring buffer which provides the actual bytes consumed in the ring buffer. This field includes the number of bytes used by recorded events and the padding bytes added when moving the tail pointer to next page. It also adds the following time stamps: "oldest event ts:" - the oldest timestamp in the ring buffer "now ts:" - the timestamp at the time of reading The field "now ts" provides a consistent time snapshot to the userspace when being read. This is read from the same trace clock used by tracing event timestamps. Together, these values provide the rate at which the buffer is filling up, from the formula: bytes / (now_ts - oldest_event_ts) Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1313531179-9323-3-git-send-email-vnagarnaik@google.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Vaibhav Nagarnaik 提交于
The current file "buffer_size_kb" reports the size of per-cpu buffer and not the overall memory allocated which could be misleading. A new file "buffer_total_size_kb" adds up all the enabled CPU buffer sizes and reports it. This is only a readonly entry. Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1313531179-9323-2-git-send-email-vnagarnaik@google.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 7月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
Currently the stack trace per event in ftace is only 8 frames. This can be quite limiting and sometimes useless. Especially when the "ignore frames" is wrong and we also use up stack frames for the event processing itself. Change this to be dynamic by adding a percpu buffer that we can write a large stack frame into and then copy into the ring buffer. For interrupts and NMIs that come in while another event is being process, will only get to use the 8 frame stack. That should be enough as the task that it interrupted will have the full stack frame anyway. Requested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 6月, 2011 7 次提交
-
-
由 Masami Hiramatsu 提交于
Fix to support kernel stack trace correctly on kprobe-tracer. Since the execution path of kprobe-based dynamic events is different from other tracepoint-based events, normal ftrace_trace_stack() doesn't work correctly. To fix that, this introduces ftrace_trace_stack_regs() which traces stack via pt_regs instead of current stack register. e.g. # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events # echo 1 > /sys/kernel/debug/tracing/options/stacktrace # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable # head -n 20 /sys/kernel/debug/tracing/trace bash-2968 [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca) bash-2968 [000] 10297.050247: <stack trace> => schedule_timeout => n_tty_read => tty_read => vfs_read => sys_read => system_call_fastpath kworker/0:1-2940 [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca) kworker/0:1-2940 [000] 10297.050266: <stack trace> => worker_thread => kthread => kernel_thread_helper sshd-1132 [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca) sshd-1132 [000] 10297.050365: <stack trace> => sysret_careful Note: Even with this fix, the first entry will be skipped if the probe is put on the function entry area before the frame pointer is set up (usually, that is 4 bytes (push %bp; mov %sp %bp) on x86), because stack unwinder depends on the frame pointer. Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: yrl.pp-manager.tt@hitachi.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Peter Huewe 提交于
This patch replaces the code for getting an unsigned long from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: NPeter Huewe <peterhuewe@gmx.de> Link: http://lkml.kernel.org/r/1307476707-14762-1-git-send-email-peterhuewe@gmx.deSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Jiri Olsa 提交于
The header display of function tracer does not follow the context-info option, so field names are displayed even if this option is off. Added check for TRACE_ITER_CONTEXT_INFO trace_flags. With following commands: # echo function > ./current_tracer # echo 0 > options/context-info # cat trace This is what it looked like before: # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | add_preempt_count <-schedule rcu_note_context_switch <-schedule ... This is what it looks like now: # tracer: function # _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel ... Signed-off-by: NJiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1307113131-10045-4-git-send-email-jolsa@redhat.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
Add a trace option to disable tracing on free. When this option is set, a write into the free_buffer file will not only shrink the ring buffer down to zero, but it will also disable tracing. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Vaibhav Nagarnaik 提交于
The proc file entry buffer_size_kb is used to set the size of tracing buffer. The memory to expand the buffer size is kernel memory. Consider a use case where tracing is handled by a user space utility, which acts as a gate keeper for tracing requests. In an OOM condition, tracing is considered a low priority task and if the utility gets killed the ring buffer memory cannot be released back to the kernel. This patch adds a proc file called "free_buffer" whose purpose is to stop tracing and free up the ring buffer when it is closed. The user space process can then set the desired size in buffer_size_kb file and open the fd to the "free_buffer" file. Under OOM condition, if the process gets killed, the kernel closes the file descriptor. The release handler stops the tracing and releases the kernel memory automatically. Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Link: http://lkml.kernel.org/r/1308012717-11148-1-git-send-email-vnagarnaik@google.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Vaibhav Nagarnaik 提交于
The tracing ring buffer is a group of per-cpu ring buffers where allocation and logging is done on a per-cpu basis. The events that are generated on a particular CPU are logged in the corresponding buffer. This is to provide wait-free writes between CPUs and good NUMA node locality while accessing the ring buffer. However, the allocation routines consider NUMA locality only for buffer page metadata and not for the actual buffer page. This causes the pages to be allocated on the NUMA node local to the CPU where the allocation routine is running at the time. This patch fixes the problem by using a NUMA node specific allocation routine so that the pages are allocated from a NUMA node local to the logging CPU. I tested with the getuid_microbench from autotest. It is a simple binary that calls getuid() in a loop and measures the average time for the syscall to complete. The following command was used to test: $ getuid_microbench 1000000 Compared the numbers found on kernel with and without this patch and found that logging latency decreases by 30-50 ns/call. tracing with non-NUMA allocation - 569 ns/call tracing with NUMA allocation - 512 ns/call Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Vaibhav Nagarnaik 提交于
In using syscall tracing by concurrent processes, the wakeup() that is called in the event commit function causes contention on the spin lock of the waitqueue. I enabled sys_enter_getuid and sys_exit_getuid tracepoints, and by running getuid_microbench from autotest in parallel I found that the contention causes exponential latency increase in the tracing path. The autotest binary getuid_microbench calls getuid() in a tight loop for the given number of iterations and measures the average time required to complete a single invocation of syscall. The patch schedules a delayed work after 2 ms once an event commit calls to wake up the trace wait_queue. This removes the delay caused by contention on spin lock in wakeup() and amortizes the wakeup() calls scheduled over the 2 ms period. In the following example, the script enables the sys_enter_getuid and sys_exit_getuid tracepoints and runs the getuid_microbench in parallel with the given number of processes. The output clearly shows the latency increase caused by contentions. $ ~/getuid.sh 1 1000000 calls in 0.720974253 s (720.974253 ns/call) $ ~/getuid.sh 2 1000000 calls in 1.166457554 s (1166.457554 ns/call) 1000000 calls in 1.168933765 s (1168.933765 ns/call) $ ~/getuid.sh 3 1000000 calls in 1.783827516 s (1783.827516 ns/call) 1000000 calls in 1.795553270 s (1795.553270 ns/call) 1000000 calls in 1.796493376 s (1796.493376 ns/call) $ ~/getuid.sh 4 1000000 calls in 4.483041796 s (4483.041796 ns/call) 1000000 calls in 4.484165388 s (4484.165388 ns/call) 1000000 calls in 4.484850762 s (4484.850762 ns/call) 1000000 calls in 4.485643576 s (4485.643576 ns/call) $ ~/getuid.sh 5 1000000 calls in 6.497521653 s (6497.521653 ns/call) 1000000 calls in 6.502000236 s (6502.000236 ns/call) 1000000 calls in 6.501709115 s (6501.709115 ns/call) 1000000 calls in 6.502124100 s (6502.124100 ns/call) 1000000 calls in 6.502936358 s (6502.936358 ns/call) After the patch, the latencies scale better. 1000000 calls in 0.728720455 s (728.720455 ns/call) 1000000 calls in 0.842782857 s (842.782857 ns/call) 1000000 calls in 0.883803135 s (883.803135 ns/call) 1000000 calls in 0.902077764 s (902.077764 ns/call) 1000000 calls in 0.902838202 s (902.838202 ns/call) 1000000 calls in 0.908896885 s (908.896885 ns/call) 1000000 calls in 0.932523515 s (932.523515 ns/call) 1000000 calls in 0.958009672 s (958.009672 ns/call) 1000000 calls in 0.986188020 s (986.188020 ns/call) 1000000 calls in 0.989771102 s (989.771102 ns/call) 1000000 calls in 0.933518391 s (933.518391 ns/call) 1000000 calls in 0.958897947 s (958.897947 ns/call) 1000000 calls in 1.031038897 s (1031.038897 ns/call) 1000000 calls in 1.089516025 s (1089.516025 ns/call) 1000000 calls in 1.141998347 s (1141.998347 ns/call) Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.comSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 07 5月, 2011 1 次提交
-
-
由 Arjan van de Ven 提交于
This partially reverts commit e6e1e259. That commit changed the structure layout of the trace structure, which in turn broke PowerTOP (1.9x generation) quite badly. I appreciate not wanting to expose the variable in question, and PowerTOP was not using it, so I've replaced the variable with just a padding field - that way if in the future a new field is needed it can just use this padding field. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 4月, 2011 1 次提交
-
-
由 Jiri Olsa 提交于
running following commands: # enable the binary option echo 1 > ./options/bin # disable context info option echo 0 > ./options/context-info # tracing only events echo 1 > ./events/enable cat trace_pipe plus forcing system to generate many tracing events, is causing lockup (in NON preemptive kernels) inside tracing_read_pipe function. The issue is also easily reproduced by running ltp stress test. (ftrace_stress_test.sh) The reasons are: - bin/hex/raw output functions for events are set to trace_nop_print function, which prints nothing and returns TRACE_TYPE_HANDLED value - LOST EVENT trace do not handle trace_seq overflow These reasons force the while loop in tracing_read_pipe function never to break. The attached patch fixies handling of lost event trace, and changes trace_nop_print to print minimal info, which is needed for the correct tracing_read_pipe processing. v2 changes: - omit the cond_resched changes by trace_nop_print changes - WARN changed to WARN_ONCE and added info to be able to find out the culprit v3 changes: - make more accurate patch comment Signed-off-by: NJiri Olsa <jolsa@redhat.com> LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 31 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 10 3月, 2011 3 次提交
-
-
由 Steven Rostedt 提交于
If the kernel command line declares a tracer "ftrace=sometracer" and that tracer is either not defined or is enabled after irqsoff, then the irqs off selftest will fail with the following error: Testing tracer irqsoff: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/tra ce.c:713 update_max_tr_single+0xfa/0x11b() Hardware name: Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1 Call Trace: [<c0441d9d>] ? warn_slowpath_common+0x65/0x7a [<c049adb2>] ? update_max_tr_single+0xfa/0x11b [<c0441dc1>] ? warn_slowpath_null+0xf/0x13 [<c049adb2>] ? update_max_tr_single+0xfa/0x11b [<c049e454>] ? stop_critical_timing+0x154/0x204 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049e529>] ? time_hardirqs_on+0x25/0x28 [<c0468bca>] ? trace_hardirqs_on_caller+0x18/0x12f [<c0468cec>] ? trace_hardirqs_on+0xb/0xd [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b6b8>] ? register_tracer+0xf8/0x1a3 [<c14e93fe>] ? init_irqsoff_tracer+0xd/0x11 [<c040115e>] ? do_one_initcall+0x71/0x121 [<c14e93f1>] ? init_irqsoff_tracer+0x0/0x11 [<c14ce3a9>] ? kernel_init+0x13a/0x1b6 [<c14ce26f>] ? kernel_init+0x0/0x1b6 [<c0403842>] ? kernel_thread_helper+0x6/0x10 ---[ end trace e93713a9d40cd06c ]--- .. no entries found ..FAILED! What happens is the "ftrace=..." will expand the ring buffer to its default size (from its minimum size) but it will not expand the max ring buffer (the ring buffer to store maximum latencies). When the irqsoff test runs, it will call the ring buffer swap routine that checks if the max ring buffer is the same size as the normal ring buffer, and will fail if it is not. This causes the test to fail. The solution is to expand the max ring buffer before running the self test if the max ring buffer is used by that tracer and the normal ring buffer is expanded. The max ring buffer should be shrunk again after the test is done to save space. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
The lock_depth field in the event headers was added as a temporary data point for help in removing the BKL. Now that the BKL is pretty much been removed, we can remove this field. This in turn changes the header from 12 bytes to 8 bytes, removing the 4 byte buffer that gcc would insert if the first field in the data load was 8 bytes in size. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 David Sharp 提交于
Add an "overwrite" trace_option for ftrace to control whether the buffer should be overwritten on overflow or not. The default remains to overwrite old events when the buffer is full. This patch adds the option to instead discard newest events when the buffer is full. This is useful to get a snapshot of traces just after enabling traces. Dropping the current event is also a simpler code path. Signed-off-by: NDavid Sharp <dhsharp@google.com> LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 09 2月, 2011 1 次提交
-
-
由 Steven Rostedt 提交于
tracing_enabled should not be used, it is heavy weight and does not do much in helping lower the overhead. tracing_on should be used instead. Warn users to use tracing_on when tracing_enabled is used as it will soon be removed from the tracing directory. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 08 1月, 2011 1 次提交
-
-
由 Li Zefan 提交于
While running my ftrace stress test, this showed up: BUG: sleeping function called from invalid context at mm/mmap.c:233 ... note: cat[3293] exited with preempt_count 1 The bug was introduced by commit 91e86e56 ("tracing: Fix recursive user stack trace") Cc: <stable@kernel.org> Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4D0089AC.1020802@cn.fujitsu.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 01 12月, 2010 1 次提交
-
-
由 Slava Pestov 提交于
The file_ops struct for the "trace" special file defined llseek as seq_lseek(). However, if the file was opened for writing only, seq_open() was not called, and the seek would dereference a null pointer, file->private_data. This patch introduces a new wrapper for seq_lseek() which checks if the file descriptor is opened for reading first. If not, it does nothing. Cc: <stable@kernel.org> Signed-off-by: NSlava Pestov <slavapestov@google.com> LKML-Reference: <1290640396-24179-1-git-send-email-slavapestov@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 18 11月, 2010 1 次提交
-
-
由 Arnd Bergmann 提交于
The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 11月, 2010 1 次提交
-
-
由 Steven Rostedt 提交于
The user stack trace can fault when examining the trace. Which would call the do_page_fault handler, which would trace again, which would do the user stack trace, which would fault and call do_page_fault again ... Thus this is causing a recursive bug. We need to have a recursion detector here. [ Resubmitted by Jiri Olsa ] [ Eric Dumazet recommended using __this_cpu_* instead of __get_cpu_* ] Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NJiri Olsa <jolsa@redhat.com> LKML-Reference: <1289390172-9730-3-git-send-email-jolsa@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 21 10月, 2010 1 次提交
-
-
由 Steven Rostedt 提交于
The tracing per_cpu buffers were limited to 999 CPUs for a mear savings in stack space of a char array. Up the array to 30 characters which is more than enough to hold a 64 bit number. Reported-by: NRobin Holt <holt@sgi.com> Suggested-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 18 10月, 2010 1 次提交
-
-
由 matt mooney 提交于
Unnecessary cast from void* in assignment. Signed-off-by: Nmatt mooney <mfm@muteddisk.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 14 8月, 2010 1 次提交
-
-
由 Marcin Slusarz 提交于
When userspace code writes non-new-line-terminated string to trace_marker file, write handler appends new-line and returns number of bytes written to trace buffer, so write(fd, "abc", 3) will return 4 That's unexpected and unfortunately it confuses glibc's fprintf function. Example: int main() { fprintf(stderr, "abc"); return 0; } $ gcc test.c -o test $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer $ ./test 2>/sys/kernel/debug/tracing/trace_marker results in infinite loop: write(fd, "abc", 3) = 4 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 (...) ...and kernel trace buffer full of empty markers. Fix it by sanitizing write return value. Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com> LKML-Reference: <20100727231801.GB2826@joi.lan> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 05 8月, 2010 1 次提交
-
-
由 Jason Wessel 提交于
Add in a helper function to allow the kdb shell to dump the ftrace buffer. Modify trace.c to expose the capability to iterate over the ftrace buffer in a read only capacity. Signed-off-by: NJason Wessel <jason.wessel@windriver.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com>
-