1. 18 9月, 2009 1 次提交
  2. 13 9月, 2009 1 次提交
  3. 12 9月, 2009 1 次提交
    • S
      tracing: add lock depth to entries · 637e7e86
      Steven Rostedt 提交于
      This patch adds the lock depth of the big kernel lock to the generic
      entry header. This way we can see the depth of the lock and help
      in removing the BKL.
      
      Example:
      
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
         <idle>-0       2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
         <idle>-0       2.N..3 5902255253us+: lock_release: rcu_read_lock
         <idle>-0       2dN..3 5902255257us+: lock_acquire: xtime_lock
         <idle>-0       2dN..4 5902255259us : lock_acquire: clocksource_lock
         <idle>-0       2dN..4 5902255261us+: lock_release: clocksource_lock
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      637e7e86
  4. 11 9月, 2009 1 次提交
  5. 05 9月, 2009 1 次提交
    • S
      tracing: pass around ring buffer instead of tracer · e77405ad
      Steven Rostedt 提交于
      The latency tracers (irqsoff and wakeup) can swap trace buffers
      on the fly. If an event is happening and has reserved data on one of
      the buffers, and the latency tracer swaps the global buffer with the
      max buffer, the result is that the event may commit the data to the
      wrong buffer.
      
      This patch changes the API to the trace recording to be recieve the
      buffer that was used to reserve a commit. Then this buffer can be passed
      in to the commit.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e77405ad
  6. 31 8月, 2009 1 次提交
    • L
      tracing/filters: Defer pred allocation · 8e254c1d
      Li Zefan 提交于
      init_preds() allocates about 5392 bytes of memory (on x86_32) for
      a TRACE_EVENT. With my config, at system boot total memory occupied
      is:
      
      	5392 * (642 + 15) == 3459KB
      
      642 == cat available_events | wc -l
      15 == number of dirs in events/ftrace
      
      That's quite a lot, so we'd better defer memory allocation util
      it's needed, that's when filter is used.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8e254c1d
  7. 26 8月, 2009 2 次提交
    • L
      tracing/filters: Support filtering for char * strings · 87a342f5
      Li Zefan 提交于
      Usually, char * entries are dangerous in traces because the string
      can be released whereas a pointer to it can still wait to be read from
      the ring buffer.
      
      But sometimes we can assume it's safe, like in case of RO data
      (eg: __file__ or __line__, used in bkl trace event). If these RO data
      are in a module and so is the call to the trace event, then it's safe,
      because the ring buffer will be flushed once this module get unloaded.
      
      To allow char * to be treated as a string:
      
      	TRACE_EVENT(...,
      
      		TP_STRUCT__entry(
      			__field_ext(const char *, name, FILTER_PTR_STRING)
      			...
      		)
      
      		...
      	);
      
      The filtering will not dereference "char *" unless the developer
      explicitly sets FILTER_PTR_STR in __field_ext.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9287.90205@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      87a342f5
    • L
      tracing/filters: Add __field_ext() to TRACE_EVENT · 43b51ead
      Li Zefan 提交于
      Add __field_ext(), so a field can be assigned to a specific
      filter_type, which matches a corresponding filter function.
      
      For example, a later patch will allow this:
      	__field_ext(const char *, str, FILTER_PTR_STR);
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9272.60507095@cn.fujitsu.com>
      
      [
        Fixed a -1 to FILTER_OTHER
        Forward ported to latest kernel.
      ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      43b51ead
  8. 19 8月, 2009 3 次提交
    • L
      tracing/syscalls: Add filtering support · 540b7b8d
      Li Zefan 提交于
      Add filtering support for syscall events:
      
       # echo 'mode == 0666' > events/syscalls/sys_enter_open
       # echo 'ret == 0' > events/syscalls/sys_exit_open
       # echo 1 > events/syscalls/sys_enter_open
       # echo 1 > events/syscalls/sys_exit_open
       # cat trace
       ...
         modprobe-3084 [001] 117.463140: sys_open(filename: 917d3e8, flags: 0, mode: 1b6)
         modprobe-3084 [001] 117.463176: sys_open -> 0x0
             less-3086 [001] 117.510455: sys_open(filename: 9c6bdb8, flags: 8000, mode: 1b6)
         sendmail-2574 [001] 122.145840: sys_open(filename: b807a365, flags: 0, mode: 1b6)
       ...
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFCB.1040006@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      540b7b8d
    • L
      tracing/events: Add trace_define_common_fields() · e647d6b3
      Li Zefan 提交于
      Extract duplicate code. Also prepare for the later patch.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFB8.1010304@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e647d6b3
    • L
      tracing/events: Add ftrace_event_call param to define_fields() · 14be96c9
      Li Zefan 提交于
      This parameter is needed by syscall events to add define_fields()
      handler.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF90.6060801@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      14be96c9
  9. 12 8月, 2009 2 次提交
    • F
      tracing: Add ftrace event call parameter to its field descriptor handler · e8f9f4d7
      Frederic Weisbecker 提交于
      Add the struct ftrace_event_call as a parameter of its show_format()
      callback. This way we can use it from the syscall trace events to
      retrieve the syscall name from the ftrace event call parameter and
      describe its fields using the syscalls metadata.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      e8f9f4d7
    • J
      tracing: Add ftrace_event_call void * 'data' field · 69fd4f0e
      Jason Baron 提交于
      add an optional void * pointer to 'ftrace_event_call' that is
      passed in for regfunc and unregfunc.
      
      This prepares for syscall tracepoints creation by passing the name of
      the syscall we want to trace and then retrieve its number through our
      arch syscall table.
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      69fd4f0e
  10. 09 8月, 2009 1 次提交
    • F
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker 提交于
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f413cdb8
  11. 06 8月, 2009 1 次提交
  12. 21 7月, 2009 1 次提交
    • L
      tracing/filters: improve subsystem filter · 1f9963cb
      Li Zefan 提交于
      Currently a subsystem filter should be applicable to all events
      under the subsystem, and if it failed, all the event filters
      will be cleared. Those behaviors make subsys filter much less
      useful:
      
        # echo 'vec == 1' > irq/softirq_entry/filter
        # echo 'irq == 5' > irq/filter
        bash: echo: write error: Invalid argument
        # cat irq/softirq_entry/filter
        none
      
      I'd expect it set the filter for irq_handler_entry/exit, and
      not touch softirq_entry/exit.
      
      The basic idea is, try to see if the filter can be applied
      to which events, and then just apply to the those events:
      
        # echo 'vec == 1' > softirq_entry/filter
        # echo 'irq == 5' > filter
        # cat irq_handler_entry/filter
        irq == 5
        # cat softirq_entry/filter
        vec == 1
      
      Changelog for v2:
      - do some cleanups to address Frederic's comments.
      Inspired-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A63D485.7030703@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      1f9963cb
  13. 02 6月, 2009 1 次提交
  14. 27 5月, 2009 2 次提交
    • S
      tracing: add __print_symbolic to trace events · 0f4fc29d
      Steven Rostedt 提交于
      This patch adds __print_symbolic which is similar to __print_flags but
      works for an enumeration type instead. That is, there is only a one to one
      mapping between the values and the symbols. When a match is made, then
      it is printed, otherwise the hex value is outputed.
      
      [ Impact: add interface for showing symbol names in events ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      0f4fc29d
    • S
      tracing: add __print_flags for events · be74b73a
      Steven Rostedt 提交于
      Developers have been asking for the ability in the ftrace event tracer
      to display names of bits in a flags variable.
      
      Instead of printing out c2, it would be easier to read FOO|BAR|GOO,
      assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7.
      
      Some examples where this would be useful are the state flags in a context
      switch, kmalloc flags, and even permision flags in accessing files.
      
      [
        v2 changes include:
      
        Frederic Weisbecker's idea of using a mask instead of bits,
        thus we can output GFP_KERNEL instead of GPF_WAIT|GFP_IO|GFP_FS.
      
        Li Zefan's idea of allowing the caller of __print_flags to add their
        own delimiter (or no delimiter) where we can get for file permissions
        rwx instead of r|w|x.
      ]
      
      [
        v3 changes:
      
         Christoph Hellwig's idea of using an array instead of va_args.
      ]
      
      [ Impact: better displaying of flags in trace output ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      be74b73a
  15. 09 5月, 2009 1 次提交
  16. 06 5月, 2009 1 次提交
  17. 29 4月, 2009 3 次提交
    • T
      tracing/filters: a better event parser · 8b372562
      Tom Zanussi 提交于
      Replace the current event parser hack with a better one.  Filters are
      no longer specified predicate by predicate, but all at once and can
      use parens and any of the following operators:
      
      numeric fields:
      
      ==, !=, <, <=, >, >=
      
      string fields:
      
      ==, !=
      
      predicates can be combined with the logical operators:
      
      &&, ||
      
      examples:
      
      "common_preempt_count > 4" > filter
      
      "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
      
      If there was an error, the erroneous string along with an error
      message can be seen by looking at the filter e.g.:
      
      ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
      ^
      parse_error: Field not found
      
      Currently the caret for an error always appears at the beginning of
      the filter; a real position should be used, but the error message
      should be useful even without it.
      
      To clear a filter, '0' can be written to the filter file.
      
      Filters can also be set or cleared for a complete subsystem by writing
      the same filter as would be written to an individual event to the
      filter file at the root of the subsytem.  Note however, that if any
      event in the subsystem lacks a field specified in the filter being
      set, the set will fail and all filters in the subsytem are
      automatically cleared.  This change from the previous version was made
      because using only the fields that happen to exist for a given event
      would most likely result in a meaningless filter.
      
      Because the logical operators are now implemented as predicates, the
      maximum number of predicates in a filter was increased from 8 to 16.
      
      [ Impact: add new, extended trace-filter implementation ]
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905899.6416.121.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8b372562
    • T
      tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi 提交于
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a118e4d1
    • T
      tracing/filters: move preds into event_filter object · 30e673b2
      Tom Zanussi 提交于
      Create a new event_filter object, and move the pred-related members
      out of the call and subsystem objects and into the filter object - the
      details of the filter implementation don't need to be exposed in the
      call and subsystem in any case, and it will also help make the new
      parser implementation a little cleaner.
      
      [ Impact: refactor trace-filter code to prepare for new features ]
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905887.6416.119.camel@tropicana>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      30e673b2
  18. 25 4月, 2009 1 次提交
    • S
      tracing/events: reuse trace event ids after overflow · 060fa5c8
      Steven Rostedt 提交于
      With modules being able to add trace events, and the max trace event
      counter is 16 bits (65536) we can overflow the counter easily
      with a simple while loop adding and removing modules that contain
      trace events.
      
      This patch links together the registered trace events and on overflow
      searches for available trace event ids. It will still fail if
      over 65536 events are registered, but considering that a typical
      kernel only has 22000 functions, 65000 events should be sufficient.
      Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      060fa5c8
  19. 24 4月, 2009 1 次提交
    • S
      tracing: increase size of number of possible events · 89ec0dee
      Steven Rostedt 提交于
      With the new event tracing registration, we must increase the number
      of events that can be registered. Currently the type field is only
      one byte, which leaves us only 256 possible events.
      
      Since we do not save the CPU number in the tracer anymore (it is determined
      by the per cpu ring buffer that is used) we have an extra byte to use.
      
      This patch increases the size of type from 1 byte (256 events) to
      2 bytes (65,536 events).
      
      It also adds a WARN_ON_ONCE if we exceed that limit.
      
      [ Impact: allow more than 255 events ]
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      89ec0dee
  20. 22 4月, 2009 1 次提交
    • L
      tracing/events: make struct trace_entry->type to be int type · 7a4f453b
      Li Zefan 提交于
      struct trace_entry->type is unsigned char, while trace event's id is
      int type, thus for a event with id >= 256, it's entry->type is cast
      to (id % 256), and then we can't see the trace output of this event.
      
       # insmod trace-events-sample.ko
       # echo foo_bar > /mnt/tracing/set_event
       # cat /debug/tracing/events/trace-events-sample/foo_bar/id
       256
       # cat /mnt/tracing/trace_pipe
                 <...>-3548  [001]   215.091142: Unknown type 0
                 <...>-3548  [001]   216.089207: Unknown type 0
                 <...>-3548  [001]   217.087271: Unknown type 0
                 <...>-3548  [001]   218.085332: Unknown type 0
      
      [ Impact: fix output for trace events with id >= 256 ]
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49EEDB0E.5070207@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7a4f453b
  21. 15 4月, 2009 3 次提交
    • S
      tracing/events: add support for modules to TRACE_EVENT · 6d723736
      Steven Rostedt 提交于
      Impact: allow modules to add TRACE_EVENTS on load
      
      This patch adds the final hooks to allow modules to use the TRACE_EVENT
      macro. A notifier and a data structure are used to link the TRACE_EVENTs
      defined in the module to connect them with the ftrace event tracing system.
      
      It also adds the necessary automated clean ups to the trace events when a
      module is removed.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      6d723736
    • S
      tracing/events: convert event call sites to use a link list · a59fd602
      Steven Rostedt 提交于
      Impact: makes it possible to define events in modules
      
      The events are created by reading down the section that they are linked
      in by the macros. But this is not scalable to modules. This patch converts
      the manipulations to use a global link list, and on boot up it adds
      the items in the section to the list.
      
      This change will allow modules to add their tracing events to the list as
      well.
      
      Note, this change alone does not permit modules to use the TRACE_EVENT macros,
      but the change is needed for them to eventually do so.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a59fd602
    • S
      tracing/events: move declarations from trace directory to core include · 97f20251
      Steven Rostedt 提交于
      In preparation to allowing trace events to happen in modules, we need
      to move some of the local declarations in the kernel/trace directory
      into include/linux.
      
      This patch simply moves the declarations and performs no context changes.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      97f20251