1. 19 8月, 2009 1 次提交
  2. 12 8月, 2009 2 次提交
    • F
      tracing: Add ftrace event call parameter to its field descriptor handler · e8f9f4d7
      Frederic Weisbecker 提交于
      Add the struct ftrace_event_call as a parameter of its show_format()
      callback. This way we can use it from the syscall trace events to
      retrieve the syscall name from the ftrace event call parameter and
      describe its fields using the syscalls metadata.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      e8f9f4d7
    • J
      tracing: Add ftrace_event_call void * 'data' field · 69fd4f0e
      Jason Baron 提交于
      add an optional void * pointer to 'ftrace_event_call' that is
      passed in for regfunc and unregfunc.
      
      This prepares for syscall tracepoints creation by passing the name of
      the syscall we want to trace and then retrieve its number through our
      arch syscall table.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      69fd4f0e
  3. 10 8月, 2009 3 次提交
    • F
      perf_counter: Zero dead bytes from ftrace raw samples size alignment · 1853db0e
      Frederic Weisbecker 提交于
      After aligning the ftrace raw samples, there are dead bytes storing
      random data from the stack. We don't want to leak these to userspace,
      then zero these out.
      
      Before:
      
      	0x2de88 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 2c 00 00 00 00 00 00 00  h...h...,......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 40 7f 46 81 ff ff ff ff 00 10 1b 7f  h...@.F........
                                                            ^  ^  ^  ^
                                                               Leak
      
      After:
      
      	0x2d318 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 68 14 00 00 00 00 00 00  h...h...h......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 a0 80 46 81 ff ff ff ff 00 00 00 00  h.....F........
                                                            ^  ^  ^  ^
      							 Fixed
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249915116-5210-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      1853db0e
    • F
      perf_counter: Subtract the buffer size field from the event record size · 304703ab
      Frederic Weisbecker 提交于
      We compute the perf raw sample size by aligning the raw ftrace
      event size plus the buffer size field itself. We do that
      instead of aligning only the perf raw sample size, so that we
      might economize some in some cases.
      
      But this buffer size field is not stored in the perf raw
      sample, we must then substract its size from the buffer once we
      computed the alignment unless we may get a useless u32 field in
      the buffer.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090810141129.GA5124@nowhere>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      304703ab
    • P
      perf_counter: Correct PERF_SAMPLE_RAW output · a044560c
      Peter Zijlstra 提交于
      PERF_SAMPLE_* output switches should unconditionally output the
      correct format, as they are the only way to unambiguously parse
      the PERF_EVENT_SAMPLE data.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249896447.17467.74.camel@twins>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a044560c
  4. 09 8月, 2009 2 次提交
    • F
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker 提交于
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f413cdb8
    • P
      perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra 提交于
      Adds possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified make the swcounter increment with @foo instead
      of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3a659305
  5. 20 7月, 2009 2 次提交
  6. 11 6月, 2009 1 次提交
    • S
      tracing: do not translate event helper macros in print format · 6ff9a64d
      Steven Rostedt 提交于
      By moving the macro that creates the print format code above the
      defining of the event macro helpers (__get_str, __print_symbolic,
      and __get_dynamic_array), we get a little cleaner print format.
      
      Instead of:
      
        (char *)((void *)REC + REC->__data_loc_name)
      
      we get:
      
         __get_str(name)
      
      Instead of:
      
         ({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {
      
      we get:
      
         __print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      6ff9a64d
  7. 03 6月, 2009 1 次提交
    • S
      tracing: fix multiple use of __print_flags and __print_symbolic · 56d8bd3f
      Steven Whitehouse 提交于
      Here is an updated patch to include the extra call to
      trace_seq_init() as requested. This is vs. the latest
      -tip tree and fixes the use of multiple __print_flags
      and __print_symbolic in a single tracer. Also tested
      to ensure its working now:
      
      mount.gfs2-2534  [000]   235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
      mount.gfs2-2534  [000]   235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
      mount.gfs2-2534  [000]   235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
      glock_workqueue-2529  [000]   235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
      glock_workqueue-2529  [000]   235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      56d8bd3f
  8. 02 6月, 2009 3 次提交
    • L
      tracing/events: introduce __dynamic_array() · 7fcb7c47
      Li Zefan 提交于
      __string() is limited:
      
        - it's a char array, but we may want to define array with other types
        - a source string should be available, but we may just know the string size
      
      We introduce __dynamic_array() to break those limitations, and __string()
      becomes a wrapper of it. As a side effect, now __get_str() can be used
      in TP_fast_assign but not only TP_print.
      
      Take XFS for example, we have the string length in the dirent, but the
      string itself is not NULL-terminated, so __dynamic_array() can be used:
      
      TRACE_EVENT(xfs_dir2,
      	TP_PROTO(struct xfs_da_args *args),
      	TP_ARGS(args),
      
      	TP_STRUCT__entry(
      		__field(int, namelen)
      		__dynamic_array(char, name, args->namelen + 1)
      		...
      	),
      
      	TP_fast_assign(
      		char *name = __get_str(name);
      
      		if (args->namelen)
      			memcpy(name, args->name, args->namelen);
      		name[args->namelen] = '\0';
      
      		__entry->namelen = args->namelen;
      	),
      
      	TP_printk("name %.*s namelen %d",
      		  __entry->namelen ? __get_str(name) : NULL
      		  __entry->namelen)
      );
      
      [ Impact: allow defining dynamic size arrays ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384D2.3080403@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7fcb7c47
    • L
      tracing/events: put TP_fast_assign into braces · a9c1c3ab
      Li Zefan 提交于
      Currently TP_fast_assign has a limitation that we can't define local
      variables in it.
      
      Here's one use case when we introduce __dynamic_array():
      
      TP_fast_assign(
      	type *p = __get_dynamic_array(item);
      
      	foo(p);
      	bar(p);
      ),
      
      [ Impact: allow defining local variables in TP_fast_assign ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384B1.90100@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a9c1c3ab
    • L
      tracing/events: fix a typo in __string() format output · 6e25db44
      Li Zefan 提交于
      "tsize" should be "\tsize". Also remove the space before "__str_loc".
      
      Before:
       # cat tracing/events/irq/irq_handler_entry/format
              ...
              field:int irq;  offset:12;      size:4;
              field: __str_loc name;  offset:16;tsize:2;
              ...
      
      After:
       # cat tracing/events/irq/irq_handler_entry/format
      	...
              field:int irq;  offset:12;      size:4;
              field:__str_loc name;   offset:16;      size:2;
      	...
      
      [ Impact: standardize __string field description in events format file ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      6e25db44
  9. 28 5月, 2009 1 次提交
  10. 27 5月, 2009 2 次提交
    • S
      tracing: add __print_symbolic to trace events · 0f4fc29d
      Steven Rostedt 提交于
      This patch adds __print_symbolic which is similar to __print_flags but
      works for an enumeration type instead. That is, there is only a one to one
      mapping between the values and the symbols. When a match is made, then
      it is printed, otherwise the hex value is outputed.
      
      [ Impact: add interface for showing symbol names in events ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      0f4fc29d
    • S
      tracing: add __print_flags for events · be74b73a
      Steven Rostedt 提交于
      Developers have been asking for the ability in the ftrace event tracer
      to display names of bits in a flags variable.
      
      Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
      assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
      
      Some examples where this would be useful are the state flags in a context
      switch, kmalloc flags, and even permision flags in accessing files.
      
      [
        v2 changes include:
      
        Frederic Weisbecker's idea of using a mask instead of bits,
        thus we can output GFP_KERNEL instead of GPF_WAIT|GFP_IO|GFP_FS.
      
        Li Zefan's idea of allowing the caller of __print_flags to add their
        own delimiter (or no delimiter) where we can get for file permissions
        rwx instead of r|w|x.
      ]
      
      [
        v3 changes:
      
         Christoph Hellwig's idea of using an array instead of va_args.
      ]
      
      [ Impact: better displaying of flags in trace output ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      be74b73a
  11. 26 5月, 2009 1 次提交
  12. 29 4月, 2009 1 次提交
    • T
      tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi 提交于
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a118e4d1
  13. 24 4月, 2009 1 次提交
  14. 22 4月, 2009 3 次提交
    • F
      tracing/events: protect __get_str() · 6a74aa40
      Frederic Weisbecker 提交于
      The __get_str() macro is used in a code part then its content should be
      protected with parenthesis.
      
      [ Impact: make macro definition more robust ]
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      6a74aa40
    • F
      tracing/events: provide string with undefined size support · 9cbf1176
      Frederic Weisbecker 提交于
      This patch provides the support for dynamic size strings on
      event tracing.
      
      The key concept is to use a structure with an ending char array field of
      undefined size and use such ability to allocate the minimal size on the
      ring buffer to make one or more string entries fit inside, as opposite
      to a fixed length strings with upper bound.
      
      The strings themselves are represented using fields which have an offset
      value from the beginning of the entry.
      
      This patch provides three new macros:
      
      __string(item, src)
      
      This one declares a string to the structure inside TP_STRUCT__entry.
      You need to provide the name of the string field and the source that will
      be copied inside.
      This will also add the dynamic size of the string needed for the ring
      buffer entry allocation.
      A stack allocated structure is used to temporarily store the offset
      of each strings, avoiding double calls to strlen() on each event
      insertion.
      
      __get_str(field)
      
      This one will give you a pointer to the string you have created. This
      is an abstract helper to resolve the absolute address given the field
      name which is a relative address from the beginning of the trace_structure.
      
      __assign_str(dst, src)
      
      Use this macro to automatically perform the string copy from src to
      dst. src must be a variable to assign and dst is the name of a __string
      field.
      
      Example on how to use it:
      
      TRACE_EVENT(my_event,
      	TP_PROTO(char *src1, char *src2),
      
      	TP_ARGS(src1, src2),
      	TP_STRUCT__entry(
      		__string(str1, src1)
      		__string(str2, src2)
      	),
      	TP_fast_assign(
      		__assign_str(str1, src1);
      		__assign_str(str2, src2);
      	),
      	TP_printk("%s %s", __get_str(src1), __get_str(src2))
      )
      
      Of course you can mix-up any __field or __array inside this
      TRACE_EVENT. The position of the __string or __assign_str
      doesn't matter.
      
      Changes in v2:
      
      Address the suggestion of Steven Rostedt: drop the opening_string() macro
      and redefine __ending_string() to get the size of the string to be copied
      instead of overwritting the whole ring buffer allocation.
      
      Changes in v3:
      
      Address other suggestions of Steven Rostedt and Peter Zijlstra with
      some changes: drop the __ending_string and the need to have only one
      string field.
      Use offsets instead of absolute addresses.
      
      [ Impact: allow more compact memory usage for string tracing ]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      9cbf1176
    • L
      tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Li Zefan 提交于
      struct trace_entry->type is unsigned char, while trace event's id is
      int type, thus for a event with id >= 256, it's entry->type is cast
      to (id % 256), and then we can't see the trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a4f453b
  15. 17 4月, 2009 1 次提交
  16. 15 4月, 2009 2 次提交
    • S
      tracing/events: add support for modules to TRACE_EVENT · 6d723736
      Steven Rostedt 提交于
      Impact: allow modules to add TRACE_EVENTS on load
      
      This patch adds the final hooks to allow modules to use the TRACE_EVENT
      macro. A notifier and a data structure are used to link the TRACE_EVENTs
      defined in the module to connect them with the ftrace event tracing system.
      
      It also adds the necessary automated clean ups to the trace events when a
      module is removed.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      6d723736
    • S
      tracing/events: move the ftrace event tracing code to core · f42c85e7
      Steven Rostedt 提交于
      This patch moves the ftrace creation into include/trace/ftrace.h and
      simplifies the work of developers in adding new tracepoints.
      Just the act of creating the trace points in include/trace and including
      define_trace.h will create the events in the debugfs/tracing/events
      directory.
      
      This patch removes the need of include/trace/trace_events.h
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f42c85e7
  17. 14 4月, 2009 4 次提交
    • S
      tracing: consolidate trace and trace_event headers · ea20d929
      Steven Rostedt 提交于
      Impact: clean up
      
      Neil Horman (et. al.) criticized the way the trace events were broken up
      into two files. The reason for that was that ftrace needed to separate out
      the declarations from where the #include <linux/tracepoint.h> was used.
      It then dawned on me that the tracepoint.h header only needs to define the
      TRACE_EVENT macro if it is not already defined.
      
      The solution is simply to test if TRACE_EVENT is defined, and if it is not
      then the linux/tracepoint.h header can define it. This change consolidates
      all the <traces>.h and <traces>_event_types.h into the <traces>.h file.
      Reported-by: NNeil Horman <nhorman@tuxdriver.com>
      Reported-by: NTheodore Tso <tytso@mit.edu>
      Reported-by: NJiaying Zhang <jiayingz@google.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ea20d929
    • T
      tracing/filters: allow on-the-fly filter switching · 0a19e53c
      Tom Zanussi 提交于
      This patch allows event filters to be safely removed or switched
      on-the-fly while avoiding the use of rcu or the suspension of tracing of
      previous versions.
      
      It does it by adding a new filter_pred_none() predicate function which
      does nothing and by never deallocating either the predicates or any of
      the filter_pred members used in matching; the predicate lists are
      allocated and initialized during ftrace_event_calls initialization.
      
      Whenever a filter is removed or replaced, the filter_pred_* functions
      currently in use by the affected ftrace_event_call are immediately
      switched over to to the filter_pred_none() function, while the rest of
      the filter_pred members are left intact, allowing any currently
      executing filter_pred_* functions to finish up, using the values they're
      currently using.
      
      In the case of filter replacement, the new predicate values are copied
      into the old predicates after the above step, and the filter_pred_none()
      functions are replaced by the filter_pred_* functions for the new
      filter.  In this case, it is possible though very unlikely that a
      previous filter_pred_* is still running even after the
      filter_pred_none() switch and the switch to the new filter_pred_*.  In
      that case, however, because nothing has been deallocated in the
      filter_pred, the worst that can happen is that the old filter_pred_*
      function sees the new values and as a result produces either a false
      positive or a false negative, depending on the values it finds.
      
      So one downside to this method is that rarely, it can produce a bad
      match during the filter switch, but it should be possible to live with
      that, IMHO.
      
      The other downside is that at least in this patch the predicate lists
      are always pre-allocated, taking up memory from the start.  They could
      probably be allocated on first-use, and de-allocated when tracing is
      completely stopped - if this patch makes sense, I could create another
      one to do that later on.
      
      Oh, and it also places a restriction on the size of __arrays in events,
      currently set to 128, since they can't be larger than the now embedded
      str_val arrays in the filter_pred struct.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: paulmck@linux.vnet.ibm.com
      LKML-Reference: <1239610670.6660.49.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0a19e53c
    • T
      tracing/filters: use ring_buffer_discard_commit() in filter_check_discard() · eb02ce01
      Tom Zanussi 提交于
      This patch changes filter_check_discard() to make use of the new
      ring_buffer_discard_commit() function and modifies the current users to
      call the old commit function in the non-discard case.
      
      It also introduces a version of filter_check_discard() that uses the
      global trace buffer (filter_current_check_discard()) for those cases.
      
      v2 changes:
      
      - fix compile error noticed by Ingo Molnar
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1239178554.10295.36.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eb02ce01
    • S
      tracing/filters: use ring_buffer_discard_commit for discarded events · 77d9f465
      Steven Rostedt 提交于
      The ring_buffer_discard_commit makes better usage of the ring_buffer
      when an event has been discarded. It tries to remove it completely if
      possible.
      
      This patch converts the trace event filtering to use
      ring_buffer_discard_commit instead of the ring_buffer_event_discard.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      77d9f465
  18. 23 3月, 2009 4 次提交
    • F
      tracing/events: don't discard an event after commit · b118415b
      Frederic Weisbecker 提交于
      When we want to filter an event, the filter test is done after
      the event is commited to the ring-buffer to be discarded later if
      needed.
      
      But a reader could be reading this event while we are trying to discard
      it. Other kind of racy events can even happen because the event is
      commited and can be read and/or consumed.
      
      What we want is to discard the event before committing it.
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1237763919-21505-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b118415b
    • F
      tracing/events: don't use wake up for events · 07edf712
      Frederic Weisbecker 提交于
      Impact: fix hard-lockup with sched switch events
      
      Some ftrace events, such as sched wakeup, can be traced
      while the runqueue lock is hold. Since they are using
      trace_current_buffer_unlock_commit(), they call wake_up()
      which can try to grab the runqueue lock too, resulting in
      a deadlock.
      
      Now for all event, we call a new helper:
      trace_nowake_buffer_unlock_commit() which do pretty the same than
      trace_current_buffer_unlock_commit() except than it doesn't call
      trace_wake_up().
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237759847-21025-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      07edf712
    • T
      tracing: add per-event filtering · 7ce7e424
      Tom Zanussi 提交于
      This patch adds per-event filtering to the event tracing subsystem.
      
      It adds a 'filter' debugfs file to each event directory.  This file can
      be written to to set filters; reading from it will display the current
      set of filters set for that event.
      
      Basically, any field listed in the 'format' file for an event can be
      filtered on (including strings, but not yet other array types) using
      either matching ('==') or non-matching ('!=') 'predicates'.  A
      'predicate' can be either a single expression:
      
       # echo pid != 0 > filter
      
       # cat filter
       pid != 0
      
      or a compound expression of up to 8 sub-expressions combined using '&&'
      or '||':
      
       # echo comm == Xorg > filter
       # echo "&& sig != 29" > filter
      
       # cat filter
       comm == Xorg
       && sig != 29
      
      Only events having field values matching an expression will be available
      in the trace output; non-matching events are discarded.
      
      Note that a compound expression is built up by echoing each
      sub-expression separately - it's not the most efficient way to do
      things, but it keeps the parser simple and assumes that compound
      expressions will be relatively uncommon.  In any case, a subsequent
      patch introducing a way to set filters for entire subsystems should
      mitigate any need to do this for lots of events.
      
      Setting a filter without an '&&' or '||' clears the previous filter
      completely and sets the filter to the new expression:
      
       # cat filter
       comm == Xorg
       && sig != 29
      
       # echo comm != Xorg
      
       # cat filter
       comm != Xorg
      
      To clear a filter, echo 0 to the filter file:
      
       # echo 0 > filter
       # cat filter
       none
      
      The limit of 8 predicates for a compound expression is arbitrary - for
      efficiency, it's implemented as an array of pointers to predicates, and
      8 seemed more than enough for any filter...
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710665.7703.48.camel@charm-linux>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7ce7e424
    • T
      tracing: add run-time field descriptions for event filtering · cf027f64
      Tom Zanussi 提交于
      This patch makes the field descriptions defined for event tracing
      available at run-time, for the event-filtering mechanism introduced
      in a subsequent patch.
      
      The common event fields are prepended with 'common_' in the format
      display, allowing them to be distinguished from the other fields
      that might internally have same name and can therefore be
      unambiguously used in filters.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710639.7703.46.camel@charm-linux>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf027f64
  19. 20 3月, 2009 2 次提交
  20. 11 3月, 2009 3 次提交