- 14 5月, 2015 3 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
The term "ftrace" is really the infrastructure of the function hooks, and not the trace events. Rename ftrace_event.h to trace_events.h to represent the trace_event infrastructure and decouple the term ftrace from it. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
The trace_event.h file is for the generic trace event code. Move the perf related code into its own trace header file perf.h Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
The name "ftrace" really refers to the function hook infrastructure. It is not about the TRACE_EVENT() macros. The file trace/ftrace.h was originally written to be mostly focused toward the "ftrace" code (that in kernel/trace/) but ended up being generic and used by perf and others. Rename the file to be less confusing about what infrastructure it belongs to. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 08 4月, 2015 2 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
Several tracepoints use the helper functions __print_symbolic() or __print_flags() and pass in enums that do the mapping between the binary data stored and the value to print. This works well for reading the ASCII trace files, but when the data is read via userspace tools such as perf and trace-cmd, the conversion of the binary value to a human string format is lost if an enum is used, as userspace does not have access to what the ENUM is. For example, the tracepoint trace_tlb_flush() has: __print_symbolic(REC->reason, { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" }, { TLB_REMOTE_SHOOTDOWN, "remote shootdown" }, { TLB_LOCAL_SHOOTDOWN, "local shootdown" }, { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" }) Which maps the enum values to the strings they represent. But perf and trace-cmd do no know what value TLB_LOCAL_MM_SHOOTDOWN is, and would not be able to map it. With TRACE_DEFINE_ENUM(), developers can place these in the event header files and ftrace will convert the enums to their values: By adding: TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH); TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN); $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format [...] __print_symbolic(REC->reason, { 0, "flush on task switch" }, { 1, "remote shootdown" }, { 2, "local shootdown" }, { 3, "local mm shootdown" }) The above is what userspace expects to see, and tools do not need to be modified to parse them. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Guilherme Cox <cox@computer.org> Cc: Tony Luck <tony.luck@gmail.com> Cc: Xie XiuQi <xiexiuqi@huawei.com> Acked-by: NNamhyung Kim <namhyung@kernel.org> Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
Normally the compiler will use the same pointer for a string throughout the file. But there's no guarantee of that happening. Later changes will require that all events have the same pointer to the system string. Name the system string and have all events point to it. Testing this, it did not increases the size of the text, except for the notes section, which should not harm the real size any. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.orgReviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 28 1月, 2015 1 次提交
-
-
由 Dave Martin 提交于
If a trace event contains an array, there is currently no standard way to format this for text output. Drivers are currently hacking around this by a) local hacks that use the trace_seq functionailty directly, or b) just not printing that information. For fixed size arrays, formatting of the elements can be open-coded, but this gets cumbersome for arrays of non-trivial size. These approaches result in non-standard content of the event format description delivered to userspace, so userland tools needs to be taught to understand and parse each array printing method individually. This patch implements a __print_array() helper that tracepoint implementations can use instead of reinventing it. A simple C-style syntax is used to delimit the array and its elements {like,this}. So that the helpers can be used with large static arrays as well as dynamic arrays, they take a pointer and element count: they can be used with __get_dynamic_array() for use with dynamic arrays. Link: http://lkml.kernel.org/r/1422449335-8289-2-git-send-email-javi.merino@arm.com Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: NDave Martin <Dave.Martin@arm.com> Signed-off-by: NJavi Merino <javi.merino@arm.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 14 1月, 2015 1 次提交
-
-
由 Peter Zijlstra (Intel) 提交于
Both Linus (most recent) and Steve (a while ago) reported that perf related callbacks have massive stack bloat. The problem is that software events need a pt_regs in order to properly report the event location and unwind stack. And because we could not assume one was present we allocated one on stack and filled it with minimal bits required for operation. Now, pt_regs is quite large, so this is undesirable. Furthermore it turns out that most sites actually have a pt_regs pointer available, making this even more onerous, as the stack space is pointless waste. This patch addresses the problem by observing that software events have well defined nesting semantics, therefore we can use static per-cpu storage instead of on-stack. Linus made the further observation that all but the scheduler callers of perf_sw_event() have a pt_regs available, so we change the regular perf_sw_event() to require a valid pt_regs (where it used to be optional) and add perf_sw_event_sched() for the scheduler. We have a scheduler specific call instead of a more generic _noregs() like construct because we can assume non-recursion from the scheduler and thereby simplify the code further (_noregs would have to put the recursion context call inline in order to assertain which __perf_regs element to use). One last note on the implementation of perf_trace_buf_prepare(); we allow .regs = NULL for those cases where we already have a pt_regs pointer available and do not need another. Reported-by: NLinus Torvalds <torvalds@linux-foundation.org> Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Javi Merino <javi.merino@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Link: http://lkml.kernel.org/r/20141216115041.GW3337@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 11月, 2014 2 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
If the trace_seq of ftrace_raw_output_prep() is full this function returns TRACE_TYPE_PARTIAL_LINE, otherwise it returns zero. The problem is that TRACE_TYPE_PARTIAL_LINE happens to be zero! The thing is, the caller of ftrace_raw_output_prep() expects a success to be zero. Change that to expect it to be TRACE_TYPE_HANDLED. Link: http://lkml.kernel.org/r/20141114112522.GA2988@dhcp128.suse.czReminded-by: NPetr Mladek <pmladek@suse.cz> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt (Red Hat) 提交于
Adding a trace_seq_has_overflowed() which returns true if the trace_seq had too much written into it allows us to simplify the code. Instead of checking the return value of every call to trace_seq_printf() and friends, they can all be called normally, and at the end we can return !trace_seq_has_overflowed() instead. Several functions also return TRACE_TYPE_PARTIAL_LINE when the trace_seq overflowed and TRACE_TYPE_HANDLED otherwise. Another helper function was created called trace_handle_return() which takes a trace_seq and returns these enums. Using this helper function also simplifies the code. This change also makes it possible to remove the return values of trace_seq_printf() and friends. They should instead just be void functions. Link: http://lkml.kernel.org/r/20141114011410.365183157@goodmis.orgReviewed-by: NPetr Mladek <pmladek@suse.cz> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 21 6月, 2014 1 次提交
-
-
由 Steven Rostedt 提交于
Currently the __field() macro in TRACE_EVENT is only good for primitive values, such as integers and pointers, but it fails on complex data types such as structures or unions. This is because the __field() macro determines if the variable is signed or not with the test of: (((type)(-1)) < (type)1) Unfortunately, that fails when type is a structure. Since trace events should support structures as fields a new macro is created for such a case called __field_struct() which acts exactly the same as __field() does but it does not do the signed type check and just uses a constant false for that answer. Cc: Tony Luck <tony.luck@gmail.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 05 6月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
If a trace event uses a dynamic array for something other than a string then there's currently no way the TP_printk() can figure out what size it is. A __get_dynamic_array_len() is required to know the length. This also simplifies the __get_bitmask() macro which required it as well, but instead just hardcoded it. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 5月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
Being able to show a cpumask of events can be useful as some events may affect only some CPUs. There is no standard way to record the cpumask and converting it to a string is rather expensive during the trace as traces happen in hotpaths. It would be better to record the raw event mask and be able to parse it at print time. The following macros were added for use with the TRACE_EVENT() macro: __bitmask() __assign_bitmask() __get_bitmask() To test this, I added this to the sched_migrate_task event, which looked like this: TRACE_EVENT(sched_migrate_task, TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus), TP_ARGS(p, dest_cpu, cpus), TP_STRUCT__entry( __array( char, comm, TASK_COMM_LEN ) __field( pid_t, pid ) __field( int, prio ) __field( int, orig_cpu ) __field( int, dest_cpu ) __bitmask( cpumask, num_possible_cpus() ) ), TP_fast_assign( memcpy(__entry->comm, p->comm, TASK_COMM_LEN); __entry->pid = p->pid; __entry->prio = p->prio; __entry->orig_cpu = task_cpu(p); __entry->dest_cpu = dest_cpu; __assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus()); ), TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s", __entry->comm, __entry->pid, __entry->prio, __entry->orig_cpu, __entry->dest_cpu, __get_bitmask(cpumask)) ); With the output of: ksmtuned-3613 [003] d..2 485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=00000000,0000000f migration/1-13 [001] d..5 485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=00000000,0000000f awk-3615 [002] d.H5 485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=00000000,000000ff migration/2-18 [002] d..5 485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=00000000,0000000f Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.merino@arm.com Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.homeSuggested-by: NJavi Merino <javi.merino@arm.com> Tested-by: NJavi Merino <javi.merino@arm.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 10 4月, 2014 1 次提交
-
-
由 Mathieu Desnoyers 提交于
gcc <= 4.5.x has significant limitations with respect to initialization of anonymous unions within structures. They need to be surrounded by brackets, _and_ they need to be initialized in the same order in which they appear in the structure declaration. Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676 Link: http://lkml.kernel.org/r/1397077568-3156-1-git-send-email-mathieu.desnoyers@efficios.comSigned-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 09 4月, 2014 1 次提交
-
-
由 Mathieu Desnoyers 提交于
Register/unregister tracepoint probes with struct tracepoint pointer rather than tracepoint name. This change, which vastly simplifies tracepoint.c, has been proposed by Steven Rostedt. It also removes 8.8kB (mostly of text) to the vmlinux size. From this point on, the tracers need to pass a struct tracepoint pointer to probe register/unregister. A probe can now only be connected to a tracepoint that exists. Moreover, tracers are responsible for unregistering the probe before the module containing its associated tracepoint is unloaded. text data bss dec hex filename 10443444 4282528 10391552 25117524 17f4354 vmlinux.orig 10434930 4282848 10391552 25109330 17f2352 vmlinux Link: http://lkml.kernel.org/r/1396992381-23785-2-git-send-email-mathieu.desnoyers@efficios.com CC: Ingo Molnar <mingo@kernel.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frank Ch. Eigler <fche@redhat.com> CC: Johannes Berg <johannes.berg@intel.com> Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> [ SDR - fixed return val in void func in tracepoint_module_going() ] Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 22 3月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
I originally wrote commit 35bb4399 to shrink the size of the overhead of tracepoints by several kilobytes. Later, I received a patch from Vaibhav Nagarnaik that fixed a bug in the same code that this commit touches. Not only did it fix a bug, it also removed code and shrunk the size of the overhead of trace events even more than this commit did. Since this commit is scheduled for 3.15 and Vaibhav's patch is already in mainline, I need to revert this patch in order to keep it from conflicting with Vaibhav's patch. Not to mention, Vaibhav's patch makes this patch obsolete. Link: http://lkml.kernel.org/r/20140320225637.0226041b@gandalf.local.home Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 21 3月, 2014 1 次提交
-
-
由 Vaibhav Nagarnaik 提交于
In event format strings, the array size is reported in two locations. One in array subscript and then via the "size:" attribute. The values reported there have a mismatch. For e.g., in sched:sched_switch the prev_comm and next_comm character arrays have subscript values as [32] where as the actual field size is 16. name: sched_switch ID: 301 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[32]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[32]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; After bisection, the following commit was blamed: 92edca07 tracing: Use direct field, type and system names This commit removes the duplication of strings for field->name and field->type assuming that all the strings passed in __trace_define_field() are immutable. This is not true for arrays, where the type string is created in event_storage variable and field->type for all array fields points to event_storage. Use __stringify() to create a string constant for the type string. Also, get rid of event_storage and event_storage_mutex that are not needed anymore. also, an added benefit is that this reduces the overhead of events a bit more: text data bss dec hex filename 8424787 2036472 1302528 11763787 b3804b vmlinux 8420814 2036408 1302528 11759750 b37086 vmlinux.patched Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com Cc: Laurent Chavey <chavey@google.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: NVaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 07 3月, 2014 5 次提交
-
-
由 Filipe Brandenburger 提交于
Use a temporary variable to store the expansion of the len expression. If the evaluation is expensive, this commit will ensure it is evaluated only once inside ftrace_get_offsets_<call>. Link: http://lkml.kernel.org/r/1393651938-16418-3-git-send-email-filbranden@google.comSigned-off-by: NFilipe Brandenburger <filbranden@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Filipe Brandenburger 提交于
This fixes expansion of the len argument in __dynamic_array macros. The previous code from commit 7d536cb3 would not fully evaluate the expression before multiplying its result by the size of the type. This went unnoticed because the length stored in the high 16 bits of the offset (which is the one that was broken here) is only used by filter_pred_strloc which only acts on strings for which the size of the type is 1. Link: http://lkml.kernel.org/r/1393651938-16418-2-git-send-email-filbranden@google.comSigned-off-by: NFilipe Brandenburger <filbranden@google.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
The functions that assign the contents for the ftrace events are defined by the TRACE_EVENT() macros. Each event has its own unique way to assign data to its buffer. When you have over 500 events, that means there's 500 functions assigning data uniquely for each event (not really that many, as DECLARE_EVENT_CLASS() and multiple DEFINE_EVENT()s will only need a single function). By making helper functions in the core kernel to do some of the work instead, we can shrink the size of the kernel down a bit. With a kernel configured with 502 events, the change in size was: text data bss dec hex filename 12987390 1913504 9785344 24686238 178ae9e /tmp/vmlinux 12959102 1913504 9785344 24657950 178401e /tmp/vmlinux.patched That's a total of 28288 bytes, which comes down to 56 bytes per event. Link: http://lkml.kernel.org/r/20120810034708.370808175@goodmis.orgSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
The code that shows array fields for events is defined for all events. This can add up quite a bit when you have over 500 events. By making helper functions in the core kernel to do the work instead, we can shrink the size of the kernel down a bit. With a kernel configured with 502 events, the change in size was: text data bss dec hex filename 12990946 1913568 9785344 24689858 178bcc2 /tmp/vmlinux 12987390 1913504 9785344 24686238 178ae9e /tmp/vmlinux.patched That's a total of 3556 bytes, which comes down to 7 bytes per event. Although it's not much, this code is just called at initialization of the events. Link: http://lkml.kernel.org/r/20120810034708.084036335@goodmis.orgSigned-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
The code for trace events to format the raw recorded event data into human readable format in the 'trace' file is repeated for every event in the system. When you have over 500 events, this can add up quite a bit. By making helper functions in the core kernel to do the work instead, we can shrink the size of the kernel down a bit. With a kernel configured with 502 events, the change in size was: text data bss dec hex filename 12991007 1913568 9785344 24689919 178bcff /tmp/vmlinux.orig 12990946 1913568 9785344 24689858 178bcc2 /tmp/vmlinux.patched Note, this version does not save as much as the version of this patch I had a few years ago. That is because in the mean time, commit f71130de ("tracing: Add a helper function for event print functions") did a lot of the work my original patch did. But this change helps slightly, and is part of a larger clean up to reduce the size much further. Link: http://lkml.kernel.org/r/20120810034707.378538034@goodmis.org Cc: Li Zefan <lizefan@huawei.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 10 1月, 2014 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
The event trigger code that checks for callback triggers before and after recording of an event has lots of flags checks. This code is duplicated throughout the ftrace events, kprobes and system calls. They all do the exact same checks against the event flags. Added helper functions ftrace_trigger_soft_disabled(), event_trigger_unlock_commit() and event_trigger_unlock_commit_regs() that consolidated the code and these are used instead. Link: http://lkml.kernel.org/r/20140106222703.5e7dbba2@gandalf.local.homeAcked-by: NTom Zanussi <tom.zanussi@linux.intel.com> Tested-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 22 12月, 2013 1 次提交
-
-
由 Tom Zanussi 提交于
Add a generic event_command.set_trigger_filter() op implementation and have the current set of trigger commands use it - this essentially gives them all support for filters. Syntactically, filters are supported by adding 'if <filter>' just after the command, in which case only events matching the filter will invoke the trigger. For example, to add a filter to an enable/disable_event command: echo 'enable_event:system:event if common_pid == 999' > \ .../othersys/otherevent/trigger The above command will only enable the system:event event if the common_pid field in the othersys:otherevent event is 999. As another example, to add a filter to a stacktrace command: echo 'stacktrace if common_pid == 999' > \ .../somesys/someevent/trigger The above command will only trigger a stacktrace if the common_pid field in the event is 999. The filter syntax is the same as that described in the 'Event filtering' section of Documentation/trace/events.txt. Because triggers can now use filters, the trigger-invoking logic needs to be moved in those cases - e.g. for ftrace_raw_event_calls, if a trigger has a filter associated with it, the trigger invocation now needs to happen after the { assign; } part of the call, in order for the trigger condition to be tested. There's still a SOFT_DISABLED-only check at the top of e.g. the ftrace_raw_events function, so when an event is soft disabled but not because of the presence of a trigger, the original SOFT_DISABLED behavior remains unchanged. There's also a bit of trickiness in that some triggers need to avoid being invoked while an event is currently in the process of being logged, since the trigger may itself log data into the trace buffer. Thus we make sure the current event is committed before invoking those triggers. To do that, we split the trigger invocation in two - the first part (event_triggers_call()) checks the filter using the current trace record; if a command has the post_trigger flag set, it sets a bit for itself in the return value, otherwise it directly invoks the trigger. Once all commands have been either invoked or set their return flag, event_triggers_call() returns. The current record is then either committed or discarded; if any commands have deferred their triggers, those commands are finally invoked following the close of the current event by event_triggers_post_call(). To simplify the above and make it more efficient, the TRIGGER_COND bit is introduced, which is set only if a soft-disabled trigger needs to use the log record for filter testing or needs to wait until the current log record is closed. The syscall event invocation code is also changed in analogous ways. Because event triggers need to be able to create and free filters, this also adds a couple external wrappers for the existing create_filter and free_filter functions, which are too generic to be made extern functions themselves. Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 21 12月, 2013 1 次提交
-
-
由 Tom Zanussi 提交于
Add a 'trigger' file for each trace event, enabling 'trace event triggers' to be set for trace events. 'trace event triggers' are patterned after the existing 'ftrace function triggers' implementation except that triggers are written to per-event 'trigger' files instead of to a single file such as the 'set_ftrace_filter' used for ftrace function triggers. The implementation is meant to be entirely separate from ftrace function triggers, in order to keep the respective implementations relatively simple and to allow them to diverge. The event trigger functionality is built on top of SOFT_DISABLE functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file flags which is checked when any trace event fires. Triggers set for a particular event need to be checked regardless of whether that event is actually enabled or not - getting an event to fire even if it's not enabled is what's already implemented by SOFT_DISABLE mode, so trigger mode directly reuses that. Event trigger essentially inherit the soft disable logic in __ftrace_event_enable_disable() while adding a bit of logic and trigger reference counting via tm_ref on top of that in a new trace_event_trigger_enable_disable() function. Because the base __ftrace_event_enable_disable() code now needs to be invoked from outside trace_events.c, a wrapper is also added for those usages. The triggers for an event are actually invoked via a new function, event_triggers_call(), and code is also added to invoke them for ftrace_raw_event calls as well as syscall events. The main part of the patch creates a new trace_events_trigger.c file to contain the trace event triggers implementation. The standard open, read, and release file operations are implemented here. The open() implementation sets up for the various open modes of the 'trigger' file. It creates and attaches the trigger iterator and sets up the command parser. If opened for reading set up the trigger seq_ops. The read() implementation parses the event trigger written to the 'trigger' file, looks up the trigger command, and passes it along to that event_command's func() implementation for command-specific processing. The release() implementation does whatever cleanup is needed to release the 'trigger' file, like releasing the parser and trigger iterator, etc. A couple of functions for event command registration and unregistration are added, along with a list to add them to and a mutex to protect them, as well as an (initially empty) registration function to add the set of commands that will be added by future commits, and call to it from the trace event initialization code. also added are a couple trigger-specific data structures needed for these implementations such as a trigger iterator and a struct for trigger-specific data. A couple structs consisting mostly of function meant to be implemented in command-specific ways, event_command and event_trigger_ops, are used by the generic event trigger command implementations. They're being put into trace.h alongside the other trace_event data structures and functions, in the expectation that they'll be needed in several trace_event-related files such as trace_events_trigger.c and trace_events.c. The event_command.func() function is meant to be called by the trigger parsing code in order to add a trigger instance to the corresponding event. It essentially coordinates adding a live trigger instance to the event, and arming the triggering the event. Every event_command func() implementation essentially does the same thing for any command: - choose ops - use the value of param to choose either a number or count version of event_trigger_ops specific to the command - do the register or unregister of those ops - associate a filter, if specified, with the triggering event The reg() and unreg() ops allow command-specific implementations for event_trigger_op registration and unregistration, and the get_trigger_ops() op allows command-specific event_trigger_ops selection to be parameterized. When a trigger instance is added, the reg() op essentially adds that trigger to the triggering event and arms it, while unreg() does the opposite. The set_filter() function is used to associate a filter with the trigger - if the command doesn't specify a set_filter() implementation, the command will ignore filters. Each command has an associated trigger_type, which serves double duty, both as a unique identifier for the command as well as a value that can be used for setting a trigger mode bit during trigger invocation. The signature of func() adds a pointer to the event_command struct, used to invoke those functions, along with a command_data param that can be passed to the reg/unreg functions. This allows func() implementations to use command-specific blobs and supports code re-use. The event_trigger_ops.func() command corrsponds to the trigger 'probe' function that gets called when the triggering event is actually invoked. The other functions are used to list the trigger when needed, along with a couple mundane book-keeping functions. This also moves event_file_data() into trace.h so it can be used outside of trace_events.c. Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com> Idea-by: NSteve Rostedt <rostedt@goodmis.org> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 26 11月, 2013 1 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer then the following oops will happen: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c127a17b>] strlen+0x10/0x1a *pde = 00000000 ^M Oops: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000 EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1 EIP is at strlen+0x10/0x1a EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8 ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0 Stack: f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046 00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M 00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000 Call Trace: [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0 [<c1082a93>] ? ktime_get+0x29/0x69 [<c1082a93>] ? ktime_get+0x29/0x69 [<c106bc61>] lock_release+0x57/0x1a5 [<c1082a93>] ? ktime_get+0x29/0x69 [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75 [<c1082a93>] ? ktime_get+0x29/0x69^M [<c1082a93>] ktime_get+0x29/0x69 [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426 [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d [<c10bc184>] ? time_hardirqs_off+0xe/0x28 [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62 [<c1079242>] cpu_startup_entry+0x64/0x192 [<c102299c>] start_secondary+0x277/0x27c Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80 CR2: 0000000000000000 ---[ end trace 01bc47bf519ec1b2 ]--- New tracepoints have been added that have allowed for NULL pointers being assigned to strings. To fix this, change the TRACE_EVENT() code to check for NULL and if it is, it will assign "(null)" to it instead (similar to what glibc printf does). Reported-by: NShuah Khan <shuah.kh@samsung.com> Reported-by: NJovi Zhangwei <jovi.zhangwei@gmail.com> Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com Fixes: 9cbf1176 ("tracing/events: provide string with undefined size support") Cc: stable@vger.kernel.org # 2.6.31+ Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 19 11月, 2013 1 次提交
-
-
由 Peter Zijlstra 提交于
Vince's perf-trinity fuzzer found yet another 'interesting' problem. When we sample the irq_work_exit tracepoint with period==1 (or PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an infinite event generation loop: ,-> <IPI> | irq_work_exit() -> | trace_irq_work_exit() -> | ... | __perf_event_overflow() -> (due to fasync) | irq_work_queue() -> (irq_work_list must be empty) '--------- arch_irq_work_raise() Similar things can happen due to regular poll() wakeups if we exceed the ring-buffer wakeup watermark, or have an event_limit. To avoid this, dis-allow sampling this particular tracepoint. In order to achieve this, create a special perf_perm function pointer for each event and call this (when set) on trying to create a tracepoint perf event. [ roasted: use expr... to allow for ',' in your expression ] Reported-by: NVince Weaver <vincent.weaver@maine.edu> Tested-by: NVince Weaver <vincent.weaver@maine.edu> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jones <davej@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 11月, 2013 1 次提交
-
-
由 Tom Zanussi 提交于
The trace event filters are still tied to event calls rather than event files, which means you don't get what you'd expect when using filters in the multibuffer case: Before: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 2048 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 Setting the filter in tracing/instances/test1/events shouldn't affect the same event in tracing/events as it does above. After: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 We'd like to just move the filter directly from ftrace_event_call to ftrace_event_file, but there are a couple cases that don't yet have multibuffer support and therefore have to continue using the current event_call-based filters. For those cases, a new USE_CALL_FILTER bit is added to the event_call flags, whose main purpose is to keep the old behavior for those cases until they can be updated with multibuffer support; at that point, the USE_CALL_FILTER flag (and the new associated call_filter_check_discard() function) can go away. The multibuffer support also made filter_current_check_discard() redundant, so this change removes that function as well and replaces it with filter_check_discard() (or call_filter_check_discard() as appropriate). Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 14 8月, 2013 3 次提交
-
-
由 Oleg Nesterov 提交于
perf_trace_buf_prepare() + perf_trace_buf_submit(task => NULL) make no sense if hlist_empty(head). Change perf_trace_##call() to check ->perf_events beforehand and do nothing if it is empty. This removes the overhead for tasks without events associated with them. For example, "perf record -e sched:sched_switch -p1" attaches the counter(s) to the single task, but every task in system will do perf_trace_buf_prepare/submit() just to realize that it was not attached to this event. However, we can only do this if __task == NULL, so we also add the __builtin_constant_p(__task) check. With this patch "perf bench sched pipe" shows approximately 4% improvement when "perf record -p1" runs in parallel, many thanks to Steven for the testing. Link: http://lkml.kernel.org/r/20130806160847.GA2746@redhat.comTested-by: NDavid Ahern <dsahern@gmail.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Oleg Nesterov 提交于
The next patch tries to avoid the costly perf_trace_buf_* calls when possible but there is a problem. We can only do this if __task == NULL, perf_tp_event(task != NULL) has the additional code for this case. Unfortunately, TP_perf_assign/__perf_xxx which changes the default values of __count/__task variables for perf_trace_buf_submit() is called "too late", after we already did perf_trace_buf_prepare(), and the optimization above can't work. So this patch simply embeds __perf_xxx() into TP_ARGS(), this way DECLARE_EVENT_CLASS() can use the result of assignments hidden in "args" right after ftrace_get_offsets_##call() which is mostly trivial. This allows us to have the fast-path "__task != NULL" check at the start, see the next patch. Link: http://lkml.kernel.org/r/20130806160844.GA2739@redhat.comTested-by: NDavid Ahern <dsahern@gmail.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Oleg Nesterov 提交于
To simplify the review of the next patches: 1. We are going to reimplent __perf_task/counter and embedd them into TP_ARGS(). expand TRACE_EVENT(sched_stat_runtime) into DECLARE_EVENT_CLASS() + DEFINE_EVENT(), this way they can use different TP_ARGS's. 2. Change perf_trace_##call() macro to do perf_fetch_caller_regs() right before perf_trace_buf_prepare(). This way it evaluates TP_ARGS() asap, the next patch explores this fact. Note: after 87f44bbc perf_trace_buf_prepare() doesn't need "struct pt_regs *regs", perhaps it makes sense to remove this argument. And perhaps we can teach perf_trace_buf_submit() to accept regs == NULL and do fetch_caller_regs(CALLER_ADDR1) in this case. 3. Cosmetic, but the typecast from "void*" buys nothing. It just adds the noise, remove it. Link: http://lkml.kernel.org/r/20130806160841.GA2736@redhat.comAcked-by: NPeter Zijlstra <peterz@infradead.org> Tested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 19 7月, 2013 1 次提交
-
-
由 Oleg Nesterov 提交于
Every perf_trace_buf_prepare() caller does WARN_ONCE(size > PERF_MAX_TRACE_SIZE, message) and "message" is almost the same. Shift this WARN_ONCE() into perf_trace_buf_prepare(). This changes the meaning of _ONCE, but I think this is fine. - 4947014 2932448 10104832 17984294 1126b26 vmlinux + 4948422 2932448 10104832 17985702 11270a6 vmlinux on my build. Link: http://lkml.kernel.org/r/20130617170211.GA19813@redhat.comAcked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 21 6月, 2013 1 次提交
-
-
由 Steven Rostedt 提交于
Each TRACE_EVENT() adds several helper functions. If two or more trace events share the same structure and print format, they can also share most of these helper functions and save a lot of space from duplicate code. This is why the DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created. Some events require a trigger to be called at registering and unregistering of the event and to do so they use TRACE_EVENT_FN(). If multiple events require a trigger, they currently have no choice but to use TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately causes a lot of wasted duplicate code created. By adding a DEFINE_EVENT_FN(), these events can still use a DECLARE_EVENT_CLASS() and then define their own triggers. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/51C3236C.8030508@hds.comSigned-off-by: NSeiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 13 4月, 2013 1 次提交
-
-
由 zhangwei(Jovi) 提交于
The macro _TRACE_PROFILE_INIT was removed a long time ago, but an "#undef" guard was left behind. Remove it. Link: http://lkml.kernel.org/r/514684EE.6000805@huawei.comSigned-off-by: Nzhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 3月, 2013 6 次提交
-
-
由 Steven Rostedt (Red Hat) 提交于
In order to let triggers enable or disable events, we need a 'soft' method for doing so. For example, if a function probe is added that lets a user enable or disable events when a function is called, that change must be done without taking locks or a mutex, and definitely it can't sleep. But the full enabling of a tracepoint is expensive. By adding a 'SOFT_DISABLE' flag, and converting the flags to be updated without the protection of a mutex (using set/clear_bit()), this soft disable flag can be used to allow critical sections to enable or disable events from being traced (after the event has been placed into "SOFT_MODE"). Some caveats though: The comm recorder (to map pids with a comm) can not be soft disabled (yet). If you disable an event with with a "soft" disable and wait a while before reading the trace, the comm cache may be replaced and you'll get a bunch of <...> for comms in the trace. Reading the "enable" file for an event that is disabled will now give you "0*" where the '*' denotes that the tracepoint is still active but the event itself is "disabled". [ fixed _BIT used in & operation : thanks to Dan Carpenter and smatch ] Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Li Zefan 提交于
As we've added __init annotation to field-defining functions, we should add __refdata annotation to event_call variables, which reference those functions. Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.comReported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Li Zefan 提交于
Those functions are called either during kernel boot or module init. Before: $ dmesg | grep 'Freeing unused kernel memory' Freeing unused kernel memory: 1208k freed Freeing unused kernel memory: 1360k freed Freeing unused kernel memory: 1960k freed After: $ dmesg | grep 'Freeing unused kernel memory' Freeing unused kernel memory: 1236k freed Freeing unused kernel memory: 1388k freed Freeing unused kernel memory: 1960k freed Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.comSigned-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Li Zefan 提交于
Move duplicate code in event print functions to a helper function. This shrinks the size of the kernel by ~13K. text data bss dec hex filename 6596137 1743966 10138672 18478775 119f6b7 vmlinux.o.old 6583002 1743849 10138672 18465523 119c2f3 vmlinux.o.new Link: http://lkml.kernel.org/r/51258746.2060304@huawei.comSigned-off-by: NLi Zefan <lizefan@huawei.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
Pass the struct ftrace_event_file *ftrace_file to the trace_event_buffer_lock_reserve() (new function that replaces the trace_current_buffer_lock_reserver()). The ftrace_file holds a pointer to the trace_array that is in use. In the case of multiple buffers with different trace_arrays, this allows different events to be recorded into different buffers. Also fixed some of the stale comments in include/trace/ftrace.h Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Steven Rostedt 提交于
The trace events for ftrace are all defined via global variables. The arrays of events and event systems are linked to a global list. This prevents multiple users of the event system (what to enable and what not to). By adding descriptors to represent the event/file relation, as well as to which trace_array descriptor they are associated with, allows for more than one set of events to be defined. Once the trace events files have a link between the trace event and the trace_array they are associated with, we can create multiple trace_arrays that can record separate events in separate buffers. Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 14 11月, 2012 1 次提交
-
-
由 Shan Wei 提交于
When doing per-cpu helper optimizing work, find that this code is so puzzled. 1. It's mark as comment text, maybe a sample function for guidelines or a todo work. 2. But, this sample code is odd where struct perf_trace_buf is nonexistent. commit ce71b9 delete struct perf_trace_buf definition. Author: Frederic Weisbecker <fweisbec@gmail.com> Date: Sun Nov 22 05:26:55 2009 +0100 tracing: Use the perf recursion protection from trace event Is it necessary to keep there? just compile test. Link: http://lkml.kernel.org/r/50949FC9.6050202@gmail.comSigned-off-by: NShan Wei <davidshan@tencent.com> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-