- 15 June 2011: 2 commits
-
-
Committed by Vaibhav Nagarnaik

The tracing ring buffer is a group of per-cpu ring buffers where allocation and logging are done on a per-cpu basis. The events generated on a particular CPU are logged in the corresponding buffer. This provides wait-free writes between CPUs and good NUMA node locality while accessing the ring buffer.

However, the allocation routines consider NUMA locality only for the buffer page metadata, not for the actual buffer page. This causes the pages to be allocated on the NUMA node local to the CPU where the allocation routine happens to be running at the time. This patch fixes the problem by using a NUMA-node-specific allocation routine so that the pages are allocated from a NUMA node local to the logging CPU.

I tested with getuid_microbench from autotest. It is a simple binary that calls getuid() in a loop and measures the average time for the syscall to complete. The following command was used to test:

    $ getuid_microbench 1000000

Comparing the numbers on kernels with and without this patch shows that logging latency decreases by 30-50 ns/call:

    tracing with non-NUMA allocation - 569 ns/call
    tracing with NUMA allocation     - 512 ns/call

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: David Sharp <dhsharp@google.com>
Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
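A minimal sketch of the idea (the helper name and exact flags here are illustrative, not the patch's actual code): allocate each buffer page with the node-aware allocator, keyed off the CPU that owns the per-cpu buffer.

```c
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/topology.h>

/* Illustrative helper: allocate one ring buffer page on the NUMA node
 * local to the CPU that will log into it. */
static void *rb_alloc_buffer_page(int cpu)
{
	struct page *page;

	/* alloc_pages_node() wants a node id, so translate the CPU first */
	page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
	if (!page)
		return NULL;

	return page_address(page);
}
```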
-
Committed by Vaibhav Nagarnaik

When syscall tracing is used by concurrent processes, the wakeup() called in the event commit function causes contention on the spin lock of the wait queue. I enabled the sys_enter_getuid and sys_exit_getuid tracepoints, and by running getuid_microbench from autotest in parallel I found that the contention causes an exponential latency increase in the tracing path.

The autotest binary getuid_microbench calls getuid() in a tight loop for the given number of iterations and measures the average time required to complete a single invocation of the syscall.

The patch schedules a delayed work with a 2 ms delay when an event commit needs to wake up the trace wait_queue. This removes the delay caused by contention on the spin lock in wakeup() and amortizes the wakeup() calls over the 2 ms period.

In the following example, the script enables the sys_enter_getuid and sys_exit_getuid tracepoints and runs getuid_microbench in parallel with the given number of processes. The output clearly shows the latency increase caused by contention:

    $ ~/getuid.sh 1
    1000000 calls in 0.720974253 s (720.974253 ns/call)

    $ ~/getuid.sh 2
    1000000 calls in 1.166457554 s (1166.457554 ns/call)
    1000000 calls in 1.168933765 s (1168.933765 ns/call)

    $ ~/getuid.sh 3
    1000000 calls in 1.783827516 s (1783.827516 ns/call)
    1000000 calls in 1.795553270 s (1795.553270 ns/call)
    1000000 calls in 1.796493376 s (1796.493376 ns/call)

    $ ~/getuid.sh 4
    1000000 calls in 4.483041796 s (4483.041796 ns/call)
    1000000 calls in 4.484165388 s (4484.165388 ns/call)
    1000000 calls in 4.484850762 s (4484.850762 ns/call)
    1000000 calls in 4.485643576 s (4485.643576 ns/call)

    $ ~/getuid.sh 5
    1000000 calls in 6.497521653 s (6497.521653 ns/call)
    1000000 calls in 6.502000236 s (6502.000236 ns/call)
    1000000 calls in 6.501709115 s (6501.709115 ns/call)
    1000000 calls in 6.502124100 s (6502.124100 ns/call)
    1000000 calls in 6.502936358 s (6502.936358 ns/call)

After the patch, the latencies scale much better:

    1000000 calls in 0.728720455 s (728.720455 ns/call)
    1000000 calls in 0.842782857 s (842.782857 ns/call)
    1000000 calls in 0.883803135 s (883.803135 ns/call)
    1000000 calls in 0.902077764 s (902.077764 ns/call)
    1000000 calls in 0.902838202 s (902.838202 ns/call)
    1000000 calls in 0.908896885 s (908.896885 ns/call)
    1000000 calls in 0.932523515 s (932.523515 ns/call)
    1000000 calls in 0.958009672 s (958.009672 ns/call)
    1000000 calls in 0.986188020 s (986.188020 ns/call)
    1000000 calls in 0.989771102 s (989.771102 ns/call)
    1000000 calls in 0.933518391 s (933.518391 ns/call)
    1000000 calls in 0.958897947 s (958.897947 ns/call)
    1000000 calls in 1.031038897 s (1031.038897 ns/call)
    1000000 calls in 1.089516025 s (1089.516025 ns/call)
    1000000 calls in 1.141998347 s (1141.998347 ns/call)

Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: David Sharp <dhsharp@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
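A sketch of the mechanism described above, assuming the wait queue is called trace_wait (names are illustrative): the commit path schedules a delayed work instead of calling wake_up() directly, and re-arming an already-pending work is a no-op, so concurrent commits no longer pile up on the waitqueue lock.

```c
#include <linux/workqueue.h>
#include <linux/wait.h>
#include <linux/jiffies.h>

static DECLARE_WAIT_QUEUE_HEAD(trace_wait);

static void wakeup_work_handler(struct work_struct *work)
{
	wake_up(&trace_wait);
}

static DECLARE_DELAYED_WORK(wakeup_work, wakeup_work_handler);

/* Called from the event commit path instead of a direct wake_up() */
static void trace_wake_up(void)
{
	/* If the work is already pending this is a no-op, so the
	 * wakeups are amortized over the 2 ms window. */
	schedule_delayed_work(&wakeup_work, msecs_to_jiffies(2));
}
```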
-
- 26 May 2011: 5 commits
-
-
Committed by Steven Rostedt

Witold reported a reboot caused by the self-tests of the dynamic function tracer. He sent me a config and I used ktest to do a config_bisect on it (as my config did not cause the crash). It pointed out that the problem config was CONFIG_PROVE_RCU.

What happened was that if multiple callbacks are attached to the function tracer, we iterate a list of callbacks. Because the list is managed by synchronize_sched() and preempt_disable, the access to the pointers uses rcu_dereference_raw(). When PROVE_RCU is enabled, rcu_dereference_raw() calls some debugging functions, which happen to be traced. The tracing of the debug function would then call rcu_dereference_raw(), which would then call the debug function, and then... well, you get the idea.

I first wrote two different patches to solve this bug:

1) add a __rcu_dereference_raw() that would not do any checks.
2) add notrace to the offending debug functions.

Both of these patches worked. Talking with Paul McKenney on IRC, he suggested adding recursion detection instead. This seemed to be a better solution, so I decided to implement it. As the task_struct already has a trace_recursion field to detect recursion in the ring buffer, and that allows only a very small number of bits, I decided to use that same variable to add flags that can detect recursion inside the infrastructure of the function tracer.

I plan to change it so that the task_struct bit can be checked in mcount, but as that requires changes to all archs, I will hold that off to the next merge window.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
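A simplified sketch of the recursion guard (the real patch uses a dedicated bit in current->trace_recursion; TRACE_INTERNAL_BIT here stands in for that bit, and the list iteration is elided):

```c
#include <linux/sched.h>

#define TRACE_INTERNAL_BIT	(1 << 11)	/* stand-in for the real bit */

static void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip)
{
	/* If this task is already inside the function tracer, bail out
	 * instead of recursing through the PROVE_RCU debug functions. */
	if (current->trace_recursion & TRACE_INTERNAL_BIT)
		return;

	current->trace_recursion |= TRACE_INTERNAL_BIT;

	/* ... iterate the list of registered ops and call each callback ... */

	current->trace_recursion &= ~TRACE_INTERNAL_BIT;
}
```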
-
Committed by liubo

Filesystems like Btrfs have some "ULL" macros, and when these macros are passed to a tracepoint's __print_symbolic(), there are 64->32 truncation warnings when compiling on a 32-bit box.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/4DACE6E0.7000507@cn.fujitsu.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
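The summary does not name the helper; assuming the fix adds a u64-aware variant such as __print_symbolic_u64(), a tracepoint would use it roughly like this (the flag and names are made up for illustration):

```c
/* Hypothetical 64-bit flag of the kind Btrfs defines */
#define EXAMPLE_FLAG_METADUMP	(1ULL << 33)

/* Inside a TRACE_EVENT() definition: print the u64 value through the
 * 64-bit-safe variant so 32-bit builds do not truncate it. */
TP_printk("flags=%s",
	  __print_symbolic_u64(__entry->flags,
			       { EXAMPLE_FLAG_METADUMP, "METADUMP" }))
```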
-
Committed by Steven Rostedt

When dynamic ftrace is not configured, ops->flags still needs to have its FTRACE_OPS_FL_ENABLED bit set in ftrace_startup().

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The self-tests for the event tracer do not check whether function tracing was successfully activated. They need to do so before continuing the tests; otherwise the wrong errors may be reported.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

register_ftrace_function() returns an error code on failure, except when the call to ftrace_startup() fails. Add an error return to ftrace_startup() if it fails to start, allowing register_ftrace_function() to return a proper error value.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
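A sketch of the shape of the fix (simplified; the real functions do considerably more): ftrace_startup() gains an error path that its caller can propagate.

```c
static int ftrace_startup(struct ftrace_ops *ops, int command)
{
	/* ftrace_disabled is the ftrace core's global kill switch */
	if (unlikely(ftrace_disabled))
		return -ENODEV;

	/* ... enable the ops and update the trace sites ... */
	return 0;
}

int register_ftrace_function(struct ftrace_ops *ops)
{
	int ret;

	/* ... locking and bookkeeping elided ... */
	ret = ftrace_startup(ops, 0);

	return ret;
}
```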
-
- 19 May 2011: 13 commits
-
-
Committed by Steven Rostedt

Add some basic sanity tests for multiple users of the function tracer at startup.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Since users of the function tracer can now pick and choose which functions they want to trace, agnostically from other users of the function tracer, we need to pass the ops struct to the ftrace_set_filter() functions. The functions ftrace_set_global_filter() and ftrace_set_global_notrace() are added to keep the old filter functions, which are used to modify the generic function tracers.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
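The resulting API shape looks roughly like this (sketched from the description; exact signatures may differ):

```c
/* Per-ops filtering: each caller passes its own ftrace_ops */
void ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
		       int len, int reset);
void ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
			int len, int reset);

/* Wrappers preserving the old behaviour for the generic function tracers */
void ftrace_set_global_filter(unsigned char *buf, int len, int reset);
void ftrace_set_global_notrace(unsigned char *buf, int len, int reset);
```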
-
Committed by Steven Rostedt

Now that functions may be selected individually, it only makes sense to allow dynamically allocated trace structures to be traced. This will allow perf to allocate an ftrace_ops structure at runtime and use it to pick and choose which functions that structure will trace.

Note, a dynamically allocated ftrace_ops will always be called indirectly instead of being called directly from the mcount in entry.S. This is because there's no safe way to prevent mcount from being preempted before calling the function, unless we modify every entry.S to do so (not likely). Thus, dynamically allocated functions will now be called by the ftrace_ops_list_func() that loops through the allocated ops when there is more than one op allocated at a time. This loop is protected with a preempt_disable.

To determine whether an ftrace_ops structure is allocated or not, a new utility function was added to kernel/extable.c, called core_kernel_data(), which returns 1 if the address is between _sdata and _edata.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
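The description pins down core_kernel_data() precisely, so its body is essentially:

```c
#include <asm/sections.h>	/* _sdata, _edata */

/* Returns 1 if addr points into statically allocated kernel data,
 * i.e. the ftrace_ops was not dynamically allocated. */
int core_kernel_data(unsigned long addr)
{
	if (addr >= (unsigned long)_sdata &&
	    addr < (unsigned long)_edata)
		return 1;
	return 0;
}
```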
-
Committed by Steven Rostedt

ftrace_ops that are registered to trace functions can now be agnostic to each other with respect to what functions they trace. Each ops has its own hash of the functions it wants to trace and a hash of the functions it does not want to trace. An empty hash for the functions to trace denotes that all functions not in the notrace hash should be traced.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
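In pseudo-C, the per-ops decision described above looks like this (a sketch; the helper name is illustrative, field names follow the patch series, and the real check lives inside the ftrace core):

```c
static int ops_traces_ip(struct ftrace_ops *ops, unsigned long ip)
{
	/* An empty filter hash means "trace every function", and the
	 * notrace hash always wins over the filter hash. */
	if ((!ops->filter_hash->count ||
	     ftrace_lookup_ip(ops->filter_hash, ip)) &&
	    !ftrace_lookup_ip(ops->notrace_hash, ip))
		return 1;

	return 0;
}
```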
-
Committed by Steven Rostedt

When a hash is modified and might be in use, we need to perform a sched-RCU synchronization on it, as the hashes will soon be used directly in the function tracer callback.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
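A sketch of the publish-then-free pattern this implies (free_ftrace_hash() is an assumed helper): the callbacks run with preemption disabled, so synchronize_sched() guarantees no callback still sees the old hash before it is freed.

```c
static void replace_hash(struct ftrace_hash **dst, struct ftrace_hash *new)
{
	struct ftrace_hash *old = *dst;

	/* Publish the new hash before reclaiming the old one */
	rcu_assign_pointer(*dst, new);

	/* Callbacks run under preempt_disable, so this waits them out */
	synchronize_sched();

	free_ftrace_hash(old);	/* assumed helper */
}
```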
-
Committed by Steven Rostedt

This is a step towards each ops structure defining its own set of functions to trace. As the current code with pids and such is specific to the global_ops, it is restructured to be used with the global ops.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

In order to allow different ops to enable different functions, the ftrace_startup() and ftrace_shutdown() functions need the ops parameter passed to them.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Add the enabled_functions file, which shows all the functions that have been enabled for tracing as well as their ref counts. This helps in seeing whether any function has been registered and which functions are being traced.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Every function has its own record that stores the instruction pointer and flags for the function to be traced. There are only two flags: enabled and free. The enabled flag states that tracing for the function has been enabled (actively traced), and the free flag states that the record no longer points to a function and can be used by new functions (loaded modules).

These flags are now moved to the most significant bits of the flags field (the top two bits of its 32-bit space). The rest of the bits (30 bits) are now used as a ref counter. Every time a tracer registers functions to trace, those functions will have their counter incremented.

When tracing is enabled, to determine whether a function should be traced, the counter is examined, and if it is non-zero the function is set to trace.

When an ftrace_ops is registered to trace functions, its hashes are examined. If the ftrace_ops filter_hash count is zero, then all functions are set to be traced; otherwise only the functions in the hash are to be traced. The exception to this is if a function is also in the ftrace_ops notrace_hash. Then that function's counter is not incremented for this ftrace_ops.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
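A sketch of the resulting layout (the constants are illustrative of the description: two state bits on top, a 30-bit ref count below):

```c
enum {
	FTRACE_FL_ENABLED	= (1UL << 30),
	FTRACE_FL_FREE		= (1UL << 31),
};

#define FTRACE_FL_MASK		(0x3UL << 30)	/* the two state bits */
#define FTRACE_REF_MAX		((1UL << 30) - 1)

/* How many registered ftrace_ops want this function traced */
static inline unsigned long ftrace_rec_count(struct dyn_ftrace *rec)
{
	return rec->flags & ~FTRACE_FL_MASK;
}
```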
-
Committed by Steven Rostedt

When filtering, allocate a hash to insert the function records into. After the filtering is complete, assign it to the ftrace_ops structure. This allows the ftrace_ops structure to have a much smaller array of hash buckets instead of wasting a lot of memory.

A read-only empty_hash is created to be the minimum size that any ftrace_ops can point to.

When a new hash is created, it goes through the following steps:

  o Allocate a default hash.
  o Walk the function records, assigning the filtered records to the hash.
  o Allocate a new hash with the appropriate number of buckets.
  o Move the entries from the default hash to the new hash.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Combine the filter and notrace hashes so that they are accessed by a single entity, the global_ops. The global_ops is an ftrace_ops structure that is passed to the different functions that can read or modify the filtering of the function tracer.

The ftrace_ops structure was modified to hold a filter hash and a notrace hash so that later patches may allow each ftrace_ops to have its own set of rules for which functions may be filtered.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

When multiple users are allowed to have their own set of functions to trace, having the FTRACE_FL_FILTER flag will not be enough to handle the accounting of those users. Each user will need their own set of functions.

Replace the FTRACE_FL_FILTER flag with a filter_hash instead. This is temporary until the rest of the function-filtering accounting gets in.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

To prepare for the accounting system that will allow multiple users of the function tracer, having FTRACE_FL_NOTRACE as a flag in the dyn_trace record does not make sense. All ftrace_ops will soon have a hash of functions they should trace and not trace. Making a global hash of functions not to trace makes this transition easier.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 07 May 2011: 1 commit
-
-
Committed by Arjan van de Ven

This partially reverts commit e6e1e259. That commit changed the structure layout of the trace structure, which in turn broke PowerTOP (the 1.9x generation) quite badly.

I appreciate not wanting to expose the variable in question, and PowerTOP was not using it, so I've replaced the variable with just a padding field - that way, if a new field is needed in the future, it can just use this padding field.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 April 2011: 9 commits
-
-
Committed by Steven Rostedt

The code used for matching functions is almost identical between normal selection of functions and the :mod: feature of set_ftrace_notrace. Consolidate the two users into one function.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

There are three locations that perform almost identical work in order to update the ftrace_trace_function (the ftrace function variable that gets called by mcount). Consolidate these into a single function called update_ftrace_function().

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The updating of a function record is moved to a single function. This will allow us to add specific changes in one location for both modules and kernel functions.

Later patches will determine whether the function record itself needs to be updated (which enables the mcount caller), or just the ftrace_ops needs the update.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Since we disable all function tracer processing if we detect that a modification of an instruction has failed, we do not need to track that the record has failed. No more ftrace processing is allowed, and the FTRACE_FL_CONVERTED flag is pointless.

The FTRACE_FL_CONVERTED flag was used to denote records that were successfully converted from mcount calls into nops. But if a single record fails, all of ftrace is disabled.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Since we disable all function tracer processing if we detect that a modification of an instruction has failed, we do not need to track that the record has failed. No more ftrace processing is allowed, and the FTRACE_FL_FAILED flag is pointless.

Removing this flag simplifies some of the code, but some ftrace_disabled checks needed to be added or moved around a little.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The failures file in the debugfs tracing directory would list the functions that failed to convert when the old, dead ftrace daemon tried to update code but failed. Since this code is now dead along with the daemon, the failures file is useless. Remove it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The disabling of interrupts around ftrace_update_code() was used to protect against the evil ftrace daemon from years past. But that daemon has long been killed. It is safe to keep interrupts enabled while updating the initial mcount calls into nops. The ftrace_mutex is also held, which keeps other users at bay.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

Let FTRACE_WARN_ON() be used as a standalone statement or inside a conditional:

    if (FTRACE_WARN_ON(x))

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
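A GCC statement-expression macro gives both uses; a sketch consistent with the description (ftrace_kill() is the ftrace core's shutdown routine):

```c
/* Usable standalone:      FTRACE_WARN_ON(cond);
 * or inside a condition:  if (FTRACE_WARN_ON(cond)) ...
 * The statement expression evaluates to the truth value of cond. */
#define FTRACE_WARN_ON(cond)			\
	({					\
		int ___r = !!(cond);		\
		if (WARN_ON(___r))		\
			ftrace_kill();		\
		___r;				\
	})
```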
-
Committed by Steven Rostedt

If function tracing is enabled, a read of the filter files will cause a call to stop_machine to update the function trace sites. It should only call stop_machine on write.

Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 24 April 2011: 1 commit
-
-
Committed by Jonathan Corbet

Neil Brown pointed out that lock_depth somehow escaped the BKL removal work. Let's get rid of it now.

Note that the perf scripting utilities still have a bunch of code for dealing with common_lock_depth in tracepoints; I have left that in place in case anybody wants to use that code with older kernels.

Suggested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 21 April 2011: 1 commit
-
-
Committed by Michal Simek

Microblaze doesn't need/support FRAME_POINTER in order to have a working function tracer. The patch removes a Kconfig warning.

Warning log:

    warning: (LOCKDEP && FAULT_INJECTION_STACKTRACE_FILTER && LATENCYTOP && FUNCTION_TRACER && KMEMCHECK) selects FRAME_POINTER which has unmet direct dependencies (DEBUG_KERNEL && (CRIS || M68K || FRV || UML || AVR32 || SUPERH || BLACKFIN || MN10300) || ARCH_WANT_FRAME_POINTERS)

Signed-off-by: Michal Simek <monstr@monstr.eu>
Link: http://lkml.kernel.org/r/1301908812-8119-2-git-send-email-monstr@monstr.eu
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 16 April 2011: 1 commit
-
-
Committed by Jens Axboe

It's a pretty close match to what we had before - the timer triggering would mean that nobody unplugged the plug in due time; in the new scheme, this matches very closely what the schedule() unplug now is. It's essentially the difference between an explicit unplug (IO unplug) and an implicit unplug (timer unplug; we scheduled with pending IO queued).

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-
- 12 April 2011: 2 commits
-
-
Committed by Jens Axboe

It was removed with the on-stack plugging; re-add it and track the depth of requests added when flushing the plug.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-
Committed by Jens Axboe

We no longer have an unplug timer running, so there is no point in keeping the trace point.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-
- 05 April 2011: 3 commits
-
-
Committed by Jiri Olsa

Running the following commands:

    # enable the binary option
    echo 1 > ./options/bin
    # disable context info option
    echo 0 > ./options/context-info
    # tracing only events
    echo 1 > ./events/enable
    cat trace_pipe

plus forcing the system to generate many tracing events, causes a lockup (in non-preemptive kernels) inside the tracing_read_pipe function. The issue is also easily reproduced by running the ltp stress test (ftrace_stress_test.sh).

The reasons are:

- the bin/hex/raw output functions for events are set to the trace_nop_print function, which prints nothing and returns the TRACE_TYPE_HANDLED value
- the LOST EVENT trace does not handle trace_seq overflow

These reasons force the while loop in the tracing_read_pipe function never to break.

The attached patch fixes the handling of the lost event trace and changes trace_nop_print to print minimal info, which is needed for correct tracing_read_pipe processing.

v2 changes:
- omit the cond_resched changes by the trace_nop_print changes
- WARN changed to WARN_ONCE, with info added to be able to find out the culprit

v3 changes:
- make the patch comment more accurate

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <20110325110518.GC1922@jolsa.brq.redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The file debugfs/tracing/printk_formats maps addresses to the formats used by trace_bprintk(), so that userspace tools reading the ring buffer directly can decode trace_bprintk events and recover the format saved at write time. This is needed because trace_bprintk() does not store the format string in the buffer, only the address of the format, which is hidden in kernel memory.

Currently, however, the file only exports trace_bprintk() formats from the kernel core and not from modules. The modules need their formats exported as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Committed by Steven Rostedt

The trace_printk() formats for modules do not show up in the debugfs/tracing/printk_formats file; only the formats for trace_printk()s in the kernel core do. To facilitate the change that adds trace_printk() formats from modules to that file as well, we need to convert the structure that holds the formats from a char fmt[] array into a const char *fmt pointer, and allocate the strings separately.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
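A sketch of the structural change (the struct layout matches the description; the helper name is illustrative): the format becomes a separately allocated string rather than a flexible array member, so module formats can be added and freed independently.

```c
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/string.h>

struct trace_bprintk_fmt {
	struct list_head list;
	const char *fmt;	/* was: char fmt[]; */
};

/* Saving a format now duplicates the string explicitly */
static int save_bprintk_fmt(struct trace_bprintk_fmt *tb, const char *fmt)
{
	tb->fmt = kstrdup(fmt, GFP_KERNEL);
	return tb->fmt ? 0 : -ENOMEM;
}
```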
-
- 31 March 2011: 1 commit
-
-
Committed by Lucas De Marchi

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 23 March 2011: 1 commit
-
-
Committed by Jiri Olsa

If one or more function probes (like traceon) are enabled and there is no other function filter, the first probe function is skipped (which one depends on its position in the hash).

    $ echo sys_open:traceon sys_close:traceon > ./set_ftrace_filter
    $ cat set_ftrace_filter
    #### all functions enabled ####
    sys_close:traceon:unlimited
    $

The reason was that, in the case of no other function filter, func_pos was not properly updated before calling t_hash_start.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1297874134-7008-1-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-