1. February 28, 2009: 3 commits
    • tracing: add raw fast tracing interface for trace events · fd994989
      Authored by Steven Rostedt
      This patch adds the interface to enable the C style trace points.
      In the directory /debugfs/tracing/events/subsystem/event
      we now have three files:
      
       enable : values 0 or 1 to enable or disable the trace event.
      
       available_types: values 'raw' and 'printf' which indicate the tracing
             types available for the trace point. If a developer does not
             use the TRACE_EVENT_FORMAT macro and just uses the TRACE_FORMAT
             macro, then only 'printf' will be available. This file is
             read only.
      
       type: values 'raw' or 'printf'. This indicates which type of tracing
             is active for that trace point. 'printf' is the default and
             if 'raw' is not available, this file is read only.
      
       # echo raw > /debug/tracing/events/sched/sched_wakeup/type
       # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable
      
       These commands enable C style tracing for the sched_wakeup trace point.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add raw trace point recording infrastructure · c32e827b
      Authored by Steven Rostedt
      Impact: lower overhead tracing
      
      The current event tracer can automatically pick up trace points
      that are registered with the TRACE_FORMAT macro. But it required
      a printf format string and parsing. Although this adds the ability
      to get guaranteed information like task names and such, it takes
      a hit in processing overhead. That processing can add about 500-1000
      nanoseconds of overhead, but in some cases even that is considered
      too much, and we want to shave off as much of this overhead as
      possible.
      
      Tom Zanussi recently posted tracing patches to lkml that are based
      on a nice idea about capturing the data via C structs using
      STRUCT_ENTER, STRUCT_EXIT type of macros.
      
      I liked that method very much, but did not like the implementation
      that required a developer to add data/code in several disjoint
      locations.
      
      This patch extends the event_tracer macros to do a similar "raw C"
      approach to the one Tom Zanussi took. But instead of developers
      needing to tweak a bunch of code all over the place, they can do it
      all in one macro - preferably placed near the code that it is
      tracing. That makes it much more likely that tracepoints will be
      maintained on an ongoing basis by those who modify the code.
      
      The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
      a developer may still use the lower level DECLARE_TRACE macros
      if they don't care about getting their traces automatically in the event
      tracer.)
      
      They can also use the existing TRACE_FORMAT if they don't need to code
      the tracepoint in C, but just want to use the convenience of printf.
      
      So if the developer wants to "hardwire" a tracepoint in the fastest
      possible way, and wants to acquire their data via a user space utility
      in a raw binary format, or wants to see it in the trace output but not
      sacrifice any performance, then they can implement the faster but
      more complex TRACE_EVENT_FORMAT macro.
      
      Here's what usage looks like:
      
        TRACE_EVENT_FORMAT(name,
      	TPPROTO(proto),
      	TPARGS(args),
      	TPFMT(fmt, fmt_args),
      	TRACE_STRUCT(
      		TRACE_FIELD(type1, item1, assign1)
      		TRACE_FIELD(type2, item2, assign2)
      			[...]
      	),
      	TPRAWFMT(raw_fmt)
      	);
      
      Note: name, proto, args, and fmt are all identical to what TRACE_FORMAT
      uses.
      
       name: the unique identifier of the trace point
       proto: the prototype that the trace point uses
       args: the args in the prototype
       fmt: printf format to use with the event printf tracer
       fmt_args: the printf arguments to match fmt
      
       TRACE_STRUCT starts the definition of a structure.
       Each item in the structure is defined with a TRACE_FIELD:
      
        TRACE_FIELD(type, item, assign)
      
       type: the C type of item.
       item: the name of the item in the structure
       assign: what to assign to the item in the trace point callback
      
       raw_fmt is a way to pretty print the struct. It must match
        the order in which the items are added in TRACE_STRUCT
      
       An example of this would be:
      
       TRACE_EVENT_FORMAT(sched_wakeup,
      	TPPROTO(struct rq *rq, struct task_struct *p, int success),
      	TPARGS(rq, p, success),
      	TPFMT("task %s:%d %s",
      	      p->comm, p->pid, success?"succeeded":"failed"),
      	TRACE_STRUCT(
      		TRACE_FIELD(pid_t, pid, p->pid)
      		TRACE_FIELD(int, success, success)
      	),
      	TPRAWFMT("task %d success=%d")
      	);
      
       This creates a unique struct of:
      
       struct {
      	pid_t		pid;
      	int		success;
       };
      
       And the way the callback would assign these values would be:
      
      	entry->pid = p->pid;
      	entry->success = success;
      
      The nice part about this is that the creation of the assignment is done
      via macro magic in the event tracer.  Once the TRACE_EVENT_FORMAT is
      created, the developer will then have a faster method to record
      into the ring buffer. They do not need to worry about the tracer itself.
      
      The developer would only need to touch the files in include/trace/*.h
      
      Again, I would like to give special thanks to Tom Zanussi for this
      nice idea.
      
      Idea-from: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add interface to write into current tracer buffer · ef5580d0
      Authored by Steven Rostedt
      Right now all tracers must manage their own trace buffers. This was
      to enforce tracers to be independent in case we finally decide to
      allow each tracer to have their own trace buffer.
      
      But now we are adding event tracing that writes to the current tracer's
      buffer. This adds an interface to allow events to write to the current
      tracer's buffer without having to manage their own, since event tracing
      has no "tracer" of its own and is just a way to hook into any other tracer.
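      As a rough sketch of how an event might use such an interface (the
      helper names and exact signatures here reflect my reading of this
      change and should be treated as assumptions, not the definitive API):

      	struct ring_buffer_event *event;
      	struct ftrace_entry *entry;

      	/* reserve space in whatever buffer the current tracer owns */
      	event = trace_current_buffer_lock_reserve(TRACE_FN, sizeof(*entry),
      						   flags, pc);
      	if (!event)
      		return;
      	entry = ring_buffer_event_data(event);
      	entry->ip = ip;
      	entry->parent_ip = parent_ip;
      	trace_current_buffer_unlock_commit(event, flags, pc);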
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  2. February 25, 2009: 2 commits
    • tracing/core: make the read callbacks reentrant · d7350c3f
      Authored by Frederic Weisbecker
      Now that several per-cpu files can be read or spliced at the
      same time, we want the read/splice callbacks for tracing files to be
      reentrant.
      
      Until now, a single global mutex (trace_types_lock) serialized
      the access to tracing_read_pipe(), tracing_splice_read_pipe(),
      and the seq helpers.
      
      I.e., if a user tries to read trace_pipe0 and
      trace_pipe1 at the same time, the access to the function
      tracing_read_pipe() is contended and one reader must wait for
      the other to finish its read call.
      
      The trace_types_lock mutex is mostly here to serialize the access
      to the global current tracer (current_trace), which can be
      changed concurrently. Although the iter struct keeps a private
      pointer to this tracer, its callbacks can be changed by another
      function.
      
      The method used here is to no longer keep a private reference to
      the tracer inside the iterator but to make a copy of it inside
      the iterator. Then it checks on subsequent read calls whether the
      tracer has changed. This is not costly because the current
      tracer is not expected to change often, so we use a branch
      prediction hint for that.
      
      Moreover, we add a private mutex to the iterator (there is one
      iterator per file descriptor) to serialize the accesses in case
      of multiple consumers per file descriptor (which would be a
      silly idea from the user). Note that this is not to protect the
      ring buffer, since the ring buffer already serializes the
      readers' accesses. This is to prevent trace weirdness in
      case of concurrent consumers. But these mutexes can be dropped
      anyway; that would not result in any crash. Just tell me what
      you think about it.
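      Sketched very roughly (an illustration of the idea, not a quote of
      the patch):

      	/* on open: snapshot the current tracer instead of keeping a pointer */
      	mutex_lock(&trace_types_lock);
      	*iter->trace = *current_trace;
      	mutex_unlock(&trace_types_lock);

      	/* on each read: re-copy only if the tracer changed in the meantime */
      	mutex_lock(&trace_types_lock);
      	if (unlikely(old_tracer != current_trace && current_trace)) {
      		old_tracer = current_trace;
      		*iter->trace = *current_trace;
      	}
      	mutex_unlock(&trace_types_lock);

      	/* serialize multiple consumers of the same file descriptor */
      	mutex_lock(&iter->mutex);
      	/* ... read from the ring buffer ... */
      	mutex_unlock(&iter->mutex);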
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/core: introduce per cpu tracing files · b04cc6b1
      Authored by Frederic Weisbecker
      Impact: split up tracing output per cpu
      
      Currently, in the tracing debugfs directory, three files are
      available to the user to let him extract the trace output:

      - trace is an iterator through the ring-buffer. It's a reader
        but not a consumer. It doesn't block when no more traces are
        available.

      - latency_trace is pretty similar to the former, except that it adds
        more information such as preempt count, irq flags, ...
      
      - trace_pipe is a reader and a consumer, it will also block
        waiting for traces if necessary (heh, yes it's a pipe).
      
      The traces coming from different cpus are currently mixed up
      inside these files. Sometimes it messes up the information,
      sometimes it's useful, depending on what the tracer
      captures.

      The tracing_cpumask file is useful to filter the output and
      select only the traces captured on a custom defined set of cpus.
      But it is still not powerful enough to extract at the same time
      one trace buffer per cpu.
      
      So this patch creates a new directory: /debug/tracing/per_cpu/.
      
      Inside this directory, you will now find one trace_pipe file and
      one trace file per cpu.
      
      Which means if you have two cpus, you will have:
      
       trace0
       trace1
       trace_pipe0
       trace_pipe1
      
      And of course, reading these files will have the same effect
      as with the usual tracing files, except that you will only see
      the traces from the given cpu.
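      As a usage sketch, a small user-space program can dump each cpu's
      non-consuming trace file in turn (this assumes debugfs is mounted on
      /debug and two cpus, as in the example above):

      	#include <stdio.h>

      	int main(void)
      	{
      		char path[64], line[512];

      		for (int cpu = 0; cpu < 2; cpu++) {
      			snprintf(path, sizeof(path),
      				 "/debug/tracing/per_cpu/trace%d", cpu);
      			FILE *f = fopen(path, "r");
      			if (!f)
      				continue;
      			printf("--- cpu %d ---\n", cpu);
      			while (fgets(line, sizeof(line), f))
      				fputs(line, stdout);
      			fclose(f);
      		}
      		return 0;
      	}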
      
      The original all-in-one-cpu trace files are still available in
      their original place.
      
      Until now, only one consumer was allowed on trace_pipe to avoid
      racy consuming on the ring-buffer. Now the approach has changed a
      bit: you can have only one consumer per cpu.

      Which means you are allowed to read trace_pipe0 and trace_pipe1
      concurrently, but you can't have two readers on trace_pipe0 or
      trace_pipe1.

      Following the same logic, if there is one reader on the common
      trace_pipe, you can not at the same time have another reader on
      trace_pipe0 or on trace_pipe1, because trace_pipe is in essence
      already a consumer of all the cpu buffers.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. February 18, 2009: 1 commit
    • tracing/core: use appropriate waiting on trace_pipe · 6eaaa5d5
      Authored by Frederic Weisbecker
      Impact: api and pipe waiting change
      
      Currently, the waiting used in tracing_read_pipe() is done through a
      100 msecs schedule_timeout() loop which periodically checks if there
      are traces on the buffer.
      
      This can cause small latencies for programs which are reading the incoming
      events.
      
      This patch makes the reader wait on the trace_wait waitqueue, except
      for a few tracers such as the sched and function tracers which might
      already hold the runqueue lock while waking up the reader.

      This is performed through a new callback wait_pipe() on struct tracer.
      If none is implemented by a specific tracer, the default wait on the
      trace_wait queue is attached.
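      Roughly, a tracer that cannot sleep on the waitqueue opts out like
      this (a sketch; the callback signature is assumed from the
      description above):

      	/* poll instead of sleeping on trace_wait, because the wakeup
      	 * path may be called with the runqueue lock held */
      	static void poll_wait_pipe(struct trace_iterator *iter)
      	{
      		set_current_state(TASK_INTERRUPTIBLE);
      		schedule_timeout(HZ / 10);	/* 100 msecs, then re-check */
      	}

      	/* in the tracer definition: .wait_pipe = poll_wait_pipe, */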
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. February 10, 2009: 1 commit
  5. February 9, 2009: 3 commits
  6. February 8, 2009: 2 commits
    • trace: trivial fixes in comment typos. · 57794a9d
      Authored by Wenji Huang
      Impact: clean up
      
      Fixed several typos in the comments.
      Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • trace: remove deprecated entry->cpu · 1830b52d
      Authored by Steven Rostedt
      Impact: fix to prevent developers from using entry->cpu
      
      With the new ring buffer infrastructure, the cpu for the entry is
      implicit with which CPU buffer it is on.
      
      The original code used to record the current cpu into the generic
      entry header, which could be retrieved via entry->cpu. When the
      ring buffer was introduced, the users were converted to use the
      cpu number of whichever cpu ring buffer was in use (this was passed
      to the tracers by the iterator: iter->cpu).
      
      Unfortunately, the cpu item in the entry structure was never removed.
      This allowed developers to use it instead of the proper iter->cpu,
      unknowingly using an uninitialized variable. This was not the fault
      of the developers, since it would seem like the logical place to
      retrieve the cpu identifier.
      
      This patch removes the cpu item from the entry structure and fixes
      all the users that should have been using iter->cpu.
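      Illustratively (a sketch, not a quote from the patch), an output
      callback that used to do the former now does the latter:

      	/* before: reads a field that is no longer initialized */
      	trace_seq_printf(s, "[%03d] ", entry->cpu);

      	/* after: take the cpu from the iterator instead */
      	trace_seq_printf(s, "[%03d] ", iter->cpu);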
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  7. February 6, 2009: 2 commits
    • trace: Call tracing_reset_online_cpus before tracer->init() · b6f11df2
      Authored by Arnaldo Carvalho de Melo
      Impact: cleanup
      
      To make it easy for ftrace plugin writers, as this was open coded in
      the existing plugins.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Introduce trace_buffer_{lock_reserve,unlock_commit} · 51a763dd
      Authored by Arnaldo Carvalho de Melo
      Impact: new API
      
      These new functions do what previously was being open coded, reducing
      the number of details ftrace plugin writers have to worry about.
      
      It also standardizes the handling of stacktrace, userstacktrace and
      other trace options we may introduce in the future.
      
      With this patch, for instance, the blk tracer (and some others already
      in the tree) can use the "userstacktrace" /d/tracing/trace_options
      facility.
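      A sketch of the intended usage pattern in a plugin (the argument
      order here is my reading of the new API and may not be exact):

      	struct ring_buffer_event *event;
      	struct ctx_switch_entry *entry;

      	event = trace_buffer_lock_reserve(tr, TRACE_CTX, sizeof(*entry),
      					  flags, pc);
      	if (!event)
      		return;
      	entry = ring_buffer_event_data(event);
      	entry->prev_pid = prev->pid;
      	entry->next_pid = next->pid;
      	trace_buffer_unlock_commit(tr, event, flags, pc);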
      
      $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
      linux-2.6-tip/kernel/trace/trace.c:
        trace_vprintk              |   -5
        trace_graph_return         |  -22
        trace_graph_entry          |  -26
        trace_function             |  -45
        __ftrace_trace_stack       |  -27
        ftrace_trace_userstack     |  -29
        tracing_sched_switch_trace |  -66
        tracing_stop               |   +1
        trace_seq_to_user          |   -1
        ftrace_trace_special       |  -63
        ftrace_special             |   +1
        tracing_sched_wakeup_trace |  -70
        tracing_reset_online_cpus  |   -1
       13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
      
      linux-2.6-tip/block/blktrace.c:
        __blk_add_trace |  -58
       1 function changed, 58 bytes removed, diff: -58
      
      linux-2.6-tip/kernel/trace/trace.c:
        trace_buffer_lock_reserve  |  +88
        trace_buffer_unlock_commit |  +86
       2 functions changed, 174 bytes added, diff: +174
      
      /tmp/vmlinux.after:
       16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. February 5, 2009: 1 commit
  9. February 3, 2009: 1 commit
    • trace: better manage the context info for events · c4a8e8be
      Authored by Frederic Weisbecker
      Impact: make trace_event more convenient for tracers
      
      All tracers (for the moment) that use the struct trace_event want to
      have the context info printed before their own output: the pid/cmdline,
      cpu, and timestamp.
      
      But some other tracers that want to implement their trace_event
      callbacks will not necessarily need this information, or they may want
      to format it as they wish.
      
      This patch adds a new default-enabled trace option:
      TRACE_ITER_CONTEXT_INFO. When it is disabled through:

      echo nocontext-info > /debugfs/tracing/trace_options

      the pid, cpu and timestamp headers will not be printed.
      
      I.e., with the sched_switch tracer with context-info (default):
      
           bash-2935 [001] 100.356561: 2935:120:S ==> [001]  0:140:R <idle>
         <idle>-0    [000] 100.412804:    0:140:R   + [000] 11:115:S events/0
         <idle>-0    [000] 100.412816:    0:140:R ==> [000] 11:115:R events/0
       events/0-11   [000] 100.412829:   11:115:S ==> [000]  0:140:R <idle>
      
      Without context-info:
      
       2935:120:S ==> [001]  0:140:R <idle>
          0:140:R   + [000] 11:115:S events/0
          0:140:R ==> [000] 11:115:R events/0
         11:115:S ==> [000]  0:140:R <idle>
      
      A tracer can disable it at runtime by clearing the bit
      TRACE_ITER_CONTEXT_INFO in trace_flags.
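      For example, a tracer's init callback could do (a sketch, assuming
      direct access to the global trace_flags as described):

      	trace_flags &= ~TRACE_ITER_CONTEXT_INFO;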
      
      The print routines were renamed to trace_print_context and
      trace_print_lat_context, so that they can be used by tracers if they
      want to use them for one of the trace_event callbacks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. January 26, 2009: 1 commit
    • blktrace: add ftrace plugin · c71a8961
      Authored by Arnaldo Carvalho de Melo
      Impact: New way of using the blktrace infrastructure
      
      This drops the requirement for userspace utilities in order to use the
      blktrace facility.

      Configuration is done through sysfs, adding a "trace" directory to the
      partition directory, where blktrace can be enabled for the associated
      request_queue.
      
      The same filters present in the IOCTL interface are present as sysfs
      device attributes.
      
      The /sys/block/sdX/sdXN/trace/enable file allows tracing without any
      filters.
      
      The other files in this directory: pid, act_mask, start_lba and end_lba
      can be used with the same meaning as with the IOCTL interface.
      
      Using the sysfs interface will only set up the request_queue->blk_trace
      fields; tracing will only take place when the "blk" tracer is selected
      via the ftrace interface, as in the following example:
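      For instance (using the sdX/sdXN placeholders from above; the actual
      device names will vary):

      	# echo 1 > /sys/block/sdX/sdXN/trace/enable
      	# echo blk > /d/tracing/current_tracer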
      
      To see the trace, one can use the /d/tracing/trace file or the
      /d/tracing/trace_pipe file, with semantics defined in the ftrace
      documentation in Documentation/ftrace.txt.
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3046.491224:   8,1    A WBS 6367 + 8 <- (8,1) 6304
             kjournald-305   [000]  3046.491227:   8,1    Q   R 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491236:   8,1    G  RB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491239:   8,1    P  NS [kjournald]
             kjournald-305   [000]  3046.491242:   8,1    I RBS 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491251:   8,1    D  WB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491610:   8,1    U  WS [kjournald] 1
                <idle>-0     [000]  3046.511914:   8,1    C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      The default line context (prefix) format is the one described in the ftrace
      documentation, with the blktrace specific bits using its existing format,
      described in blkparse(8).
      
      If one wants to have the classic blktrace formatting, this is possible by
      using:
      
      [root@f10-1 ~]# echo blk_classic > /t/trace_options
      [root@f10-1 ~]# cat /t/trace
        8,1    0  3046.491224   305  A WBS 6367 + 8 <- (8,1) 6304
        8,1    0  3046.491227   305  Q   R 6367 + 8 [kjournald]
        8,1    0  3046.491236   305  G  RB 6367 + 8 [kjournald]
        8,1    0  3046.491239   305  P  NS [kjournald]
        8,1    0  3046.491242   305  I RBS 6367 + 8 [kjournald]
        8,1    0  3046.491251   305  D  WB 6367 + 8 [kjournald]
        8,1    0  3046.491610   305  U  WS [kjournald] 1
        8,1    0  3046.511914     0  C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      Using the ftrace standard format allows more flexibility, such
      as the ability of asking for backtraces via trace_options:
      
      [root@f10-1 ~]# echo noblk_classic > /t/trace_options
      [root@f10-1 ~]# echo stacktrace > /t/trace_options
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3318.826779:   8,1    A WBS 6375 + 8 <- (8,1) 6312
             kjournald-305   [000]  3318.826782:
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
       <= child_rip
             kjournald-305   [000]  3318.826836:   8,1    Q   R 6375 + 8 [kjournald]
             kjournald-305   [000]  3318.826837:
       <= generic_make_request
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
      
      Please read the ftrace documentation to use additional, standardized
      tracing filters such as /d/tracing/trace_cpumask, etc.
      
      See also /d/tracing/trace_mark to add comments in the trace stream;
      it is equivalent to the /d/block/sdaN/msg interface.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. January 20, 2009: 1 commit
  12. January 16, 2009: 2 commits
    • ftrace: remove static from function tracer functions · a225cdd2
      Authored by Steven Rostedt
      Impact: clean up
      
      After reorganizing the functions in trace.c and trace_functions.c,
      they no longer need to be in the global context. This patch makes the
      functions and one variable static.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: add stack trace to function tracer · 53614991
      Authored by Steven Rostedt
      Impact: new feature to stack trace any function
      
      Chris Mason asked about being able to pick and choose a function
      and get a stack trace from it. This feature enables his request.
      
       # echo io_schedule > /debug/tracing/set_ftrace_filter
       # echo function > /debug/tracing/current_tracer
       # echo func_stack_trace > /debug/tracing/trace_options
      
      Produces the following in /debug/tracing/trace:
      
             kjournald-702   [001]   135.673060: io_schedule <-sync_buffer
             kjournald-702   [002]   135.673671:
       <= sync_buffer
       <= __wait_on_bit
       <= out_of_line_wait_on_bit
       <= __wait_on_buffer
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
      
      Note, be careful about turning this on without filtering the functions.
      You may find that you have a 10 second lag between typing and seeing
      what you typed. This is why the stack trace for the function tracer
      does not use the same stack_trace flag as the other tracers use.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. January 14, 2009: 1 commit
  14. January 11, 2009: 1 commit
    • tracing/ftrace: handle more than one stat file per tracer · 034939b6
      Authored by Frederic Weisbecker
      Impact: new API for tracers
      
      Make the stat tracing API reentrant. Also provide a new directory,
      /debugfs/tracing/trace_stat, which will contain all the stat files for
      the currently active tracer.

      Now a tracer will, if desired, provide a zero terminated array of
      tracer_stat structures.
      Each one contains the callbacks necessary for one stat file.
      It has to provide at least a name for its stat file, an iterator with
      stat_start/stat_next callbacks, and an output callback for one stat
      entry, as in the sketch below.
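      For instance (a sketch only; the field names are inferred from the
      description above and may not match the tree exactly):

      	static struct tracer_stat branch_stats[] = {
      		{
      			.name		= "annotated",
      			.stat_start	= annotated_stat_start,
      			.stat_next	= annotated_stat_next,
      			.stat_show	= annotated_stat_show,	/* output one entry */
      		},
      		{
      			.name		= "all",
      			.stat_start	= all_stat_start,
      			.stat_next	= all_stat_next,
      			.stat_show	= all_stat_show,
      		},
      		{ }	/* zero terminated */
      	};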
      
      Also adapt the branch tracer to this new API.
      We create two files, "all" and "annotated", inside the
      /debugfs/tracing/trace_stat directory, making both stats simultaneously
      available instead of needing to change an option to switch from one
      stat file to another.

      The output of these stats hasn't changed.
      
      Changes in v2:
      
      _ Apply the previous memory leak fix (rebase against tip/master)
      
      Changes in v3:
      
      _ Merge the patch that adapted the branch tracer to this API into this
        patch so as not to break the kernel build.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. January 1, 2009: 1 commit
  16. December 30, 2008: 1 commit
    • tracing/kmemtrace: normalize the raw tracer event to the unified tracing API · 36994e58
      Authored by Frederic Weisbecker
      Impact: new tracer plugin
      
      This patch adapts kmemtrace raw events tracing to the unified tracing API.
      
      To enable and use this tracer, just do the following:
      
       echo kmemtrace > /debugfs/tracing/current_tracer
       cat /debugfs/tracing/trace
      
      You will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
      type_id 1 call_site 18446744071565527833 ptr 18446612134395152256
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
      type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
      type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
      
      That was to stay backward compatible with the format output produced
      in linux/tracepoint.h.

      This is the default output, but note that I tried something else.
      
      If you change an option:
      
      echo kmem_minimalistic > /debugfs/trace_options
      
      and then cat /debugfs/trace, you will have the following output:
      
       # tracer: kmemtrace
       #
       #
       # ALLOC  TYPE  REQ   GIVEN  FLAGS           POINTER         NODE    CALLER
       # FREE   |      |     |       |              |   |            |        |
       # |
      
         -      C                            0xffff88007c088780          file_free_rcu
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc780     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc870     -1   d_alloc
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         +      K    240    240   000000d0   0xffff8800790dc960     -1   d_alloc
         +      K   1304   1312   000000d0   0xffff8800791d7340     -1   reiserfs_alloc_inode
         -      C                            0xffff88007cad6000          putname
         +      K   4096   4096   000000d0   0xffff88007cad6000     -1   getname
         -      C                            0xffff88007cad6000          putname
         +      K    992   1000   000000d0   0xffff880079045b58     -1   alloc_inode
         +      K    768   1024   000080d0   0xffff88007c096400     -1   alloc_pipe_info
         +      K    240    240   000000d0   0xffff8800790dca50     -1   d_alloc
         +      K    272    320   000080d0   0xffff88007c088780     -1   get_empty_filp
         +      K    272    320   000080d0   0xffff88007c088000     -1   get_empty_filp
      
      Yeah I shall confess kmem_minimalistic should be: kmem_alternative.
      
      Whatever, I find it more readable, but this is a personal opinion of course.
      We can drop it if you want.
      
      On the ALLOC/FREE column, + means an allocation and - a free.
      
      On the type column, you have K = kmalloc, C = cache, P = page
      
      I would like the flags to be GFP_* strings, but it would not be easy
      to do that without breaking the columns with strings....
      
      About the node...it seems to always be -1. I don't know why but that shouldn't
      be difficult to find.
      
      I moved linux/tracepoint.h to trace/tracepoint.h as well. I think it
      would be easier to find the tracer headers if they are all in their
      common directory.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. December 29, 2008: 4 commits
    • tracing/ftrace: make trace_find_cmdline() generally available · f7d48cbd
      Authored by Ingo Molnar
      Impact: build fix
      
      On !CONFIG_CONTEXT_SWITCH_TRACER trace_find_cmdline() is not defined:
      
       kernel/trace/trace_output.c: In function 'trace_ctxwake_print':
       kernel/trace/trace_output.c:499: error: implicit declaration of function 'trace_find_cmdline'
       kernel/trace/trace_output.c:499: warning: assignment makes pointer from integer without a cast
      
      Move it to the generic section in trace.h.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/ftrace: provide the base infrastructure for histogram tracing · dbd0b4b3
      Authored by Frederic Weisbecker
      Impact: extend the tracing API
      
      The goal of this patch is to normalize and ease the
      implementation of statistical (histogram) tracing.

      It implements a trace_stat file in the /debugfs/tracing directory where
      one can print a one-shot output of statistics/histogram entries.
      
      A tracer has to provide two basic iterator callbacks:
      
        stat_start() => the first entry
        stat_next(prev, idx) => the next one.
      
      Note that it is adapted for arrays, hash tables or lists... since it
      provides a pointer to the previous entry and the current index of the
      iterator.

      These two callbacks are called to get a snapshot of the statistics at
      each opening of the trace_stat file, so the values are updated between
      two "cat trace_stat" runs. And the tracer is free to lock its data
      during the iteration to keep the values consistent.
      
      Since it is almost always interesting to sort statistical values to
      address the problems by priority, this infrastructure also provides a
      "sorting" of the stat entries if desired. A tracer just has to provide
      a stat_cmp callback to compare two entries, and the stat tracing
      infrastructure will build a sorted list of the given entries.

      A last callback, called stat_headers, can be implemented by a tracer to
      output headers on its trace.

      If one of these callbacks is changed at runtime, the tracer just has to
      signal it to the stat tracing API by calling the init_tracer_stat()
      helper.
      
      Changes in V2:
      
      - Fix a memory leak if the user opens multiple times the trace_stat file
        without closing it. Now we always free our list before rebuilding it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: set up trace event hash infrastructure · f0868d1e
      Authored by Steven Rostedt
      Impact: simplify/generalize/refactor trace.c
      
      The trace.c file is becoming more difficult to maintain due to the
      growing number of events. There are several formats in which an event
      may be printed. This patch sets up the infrastructure of an event hash
      to allow events to register how they should be printed.
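      For instance, an event could register its own output routine along
      these lines (a sketch; the struct layout and callback names here are
      assumptions, not the exact code):

      	static struct trace_event trace_fn_event = {
      		.type	= TRACE_FN,
      		.trace	= trace_fn_trace,	/* pretty-print one entry */
      	};

      	/* somewhere in the init path */
      	register_ftrace_event(&trace_fn_event);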
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: remove obsolete print continue functionality · c47956d9
      Authored by Steven Rostedt
      Impact: cleanup, remove obsolete code
      
      Now that the ring buffer used by ftrace allows for variable length
      entries, we do not need the 'cont' feature of the buffer.  That code
      made other parts of ftrace more complex, and removing it simplifies
      the ftrace code.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  18. December 19, 2008: 1 commit
  19. December 17, 2008: 1 commit
    • tracing/ftrace: add the printk-msg-only option · 66896a85
      Authored by Frederic Weisbecker
      Impact: display ftrace_printk messages "as is"
      
      By default, ftrace_printk() messages are output along with some other
      information like the pid, caller, ...
      Sometimes a developer just wants to have the ftrace_printk output left
      "as is", without the other information.

      This is done by providing a default-off option called printk-msg-only.
      To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`
      
      Before the patch:
      
                 <...>-2739  [000]   145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
                 <...>-2739  [000]   145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep
      
      After the patch and the printk-msg-only option enabled:
      
      I'm an ftrace_printk msg in __might_sleep
      I'm another ftrace_printk msg in __might_sleep
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. December 12, 2008: 1 commit
  21. December 5, 2008: 1 commit
    • tracing/ftrace: fix the check of ftrace_trace_task · 77d683f3
      Authored by Frederic Weisbecker
      Impact: fix default empty traces on function-graph-tracer
      
      The current ftrace_trace_task() checks if ftrace_pid_trace is allocated
      and returns 1 if it is.
      If it is NULL, it will check the pid tracing flag bit for the current
      task (which is not set by default).
      So by default, a task is not traced.
      Actually all tasks should be traced by default, and filtered by pid
      only when ftrace_pid_trace is allocated.

      The appropriate condition should be to return 1 if filter_by_pid is
      set; see the sketch below.
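      In other words, something along these lines (a sketch of the intended
      check; the helper name for the per-task flag test is assumed):

      	static inline int ftrace_trace_task(struct task_struct *task)
      	{
      		/* no pid filter installed: trace every task by default */
      		if (!ftrace_pid_trace)
      			return 1;

      		/* a filter is installed: trace only flagged tasks */
      		return test_tsk_trace_trace(task);
      	}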
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. December 4, 2008: 5 commits
    • tracing/function-graph-tracer: handle ftrace_printk entries · 1fd8f2a3
      Authored by Frederic Weisbecker
      Handle the TRACE_PRINT entries from the function graph tracer
      and output them as a C comment just below the function that called
      it, as if it were a comment inside this function.
      
      Example with an ftrace_printk inside might_sleep() function:
      
      void __might_sleep(char *file, int line)
      {
      	static unsigned long prev_jiffy;	/* ratelimiting */
      
      	ftrace_printk("Hi I'm a comment in might_sleep() :-)");
      
      A chunk of a resulting trace:
      
       0)               |        _reiserfs_free_block() {
       0)               |          reiserfs_read_bitmap_block() {
       0)               |            __bread() {
       0)               |              __getblk() {
       0)               |                __find_get_block() {
       0)   0.698 us    |                  mark_page_accessed();
       0)   2.267 us    |                }
       0)               |                __might_sleep() {
       0)               |                  /* Hi I'm a comment in might_sleep() :-) */
       0)   1.321 us    |                }
       0)   5.872 us    |              }
       0)   7.313 us    |            }
       0)   8.718 us    |          }
      
      And this patch brings two minor fixes:
      
      - The newline after a switch-out task has disappeared
      - The "|" sign just before the cpu number on task-switch has been deleted.
      
       0)   0.616 us    |                pick_next_task_rt();
       0)   1.457 us    |                _spin_trylock();
       0)   0.653 us    |                _spin_unlock();
       0)   0.728 us    |                _spin_trylock();
       0)   0.631 us    |                _spin_unlock();
       0)   0.729 us    |                native_load_sp0();
       0)   0.593 us    |                native_load_tls();
       ------------------------------------------
       0)    cat-2834    =>   migrati-3
       ------------------------------------------
      
       0)               |    finish_task_switch() {
       0)   0.841 us    |      _spin_unlock_irq();
       0)   0.616 us    |      post_schedule_rt();
       0)   3.882 us    |    }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: fix typo and missing inline function · 6b253930
      Authored by Ingo Molnar
      Impact: fix build bugs
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: use struct pid · 978f3a45
      Authored by Steven Rostedt
      Impact: clean up, extend PID filtering to PID namespaces
      
      Eric Biederman suggested using the struct pid for filtering on
      pids in the kernel. This patch is based on a demonstration
      of an implementation that Eric sent me in an email.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: trace single pid for function graph tracer · 804a6851
      Authored by Steven Rostedt
      Impact: New feature
      
      This patch makes the changes to set_ftrace_pid apply to the function
      graph tracer.
      
        # echo $$ > /debugfs/tracing/set_ftrace_pid
        # echo function_graph > /debugfs/tracing/current_tracer
      
      This will cause only the current task to be traced. Note, the trace flags are
      also inherited by child processes, so the children of the shell
      will also be traced.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: graph of a single function · ea4e2bc4
      Authored by Steven Rostedt
      This patch adds the file:
      
         /debugfs/tracing/set_graph_function
      
      which can be used along with the function graph tracer.
      
      When this file is empty, the function graph tracer will act as
      usual. When the file has a function in it, the function graph
      tracer will only trace that function.
      
      For example:
      
       # echo blk_unplug > /debugfs/tracing/set_graph_function
       # cat /debugfs/tracing/trace
       [...]
       ------------------------------------------
       | 2)  make-19003  =>  kjournald-2219
       ------------------------------------------
      
       2)               |  blk_unplug() {
       2)               |    dm_unplug_all() {
       2)               |      dm_get_table() {
       2)      1.381 us |        _read_lock();
       2)      0.911 us |        dm_table_get();
       2)      1. 76 us |        _read_unlock();
       2) +   12.912 us |      }
       2)               |      dm_table_unplug_all() {
       2)               |        blk_unplug() {
       2)      0.778 us |          generic_unplug_device();
       2)      2.409 us |        }
       2)      5.992 us |      }
       2)      0.813 us |      dm_table_put();
       2) +   29. 90 us |    }
       2) +   34.532 us |  }
      
      You can add up to 32 functions into this file. Currently we limit it
      to 32, but this may change with later improvements.
      
      To add another function, use the append '>>':
      
        # echo sys_read >> /debugfs/tracing/set_graph_function
        # cat /debugfs/tracing/set_graph_function
        blk_unplug
        sys_read
      
      Using the '>' will clear out the function and write anew:
      
        # echo sys_write > /debug/tracing/set_graph_function
        # cat /debug/tracing/set_graph_function
        sys_write
      
      Note, if you have function graph running while doing this, the small
      time between clearing it and updating it will cause the graph to
      record all functions. This should not be an issue because after
      it sets the filter, only those functions will be recorded from then on.
      If you need to record only a particular function, then set this
      file first before starting the function graph tracer. In the future
      this side effect may be corrected.
      
      The set_graph_function file is similar to set_ftrace_filter, but
      it does not take wildcards nor does it allow more than one
      function to be set with a single write. There is no technical reason
      why this is the case; I just have not had the time yet to implement that.
      
      Note, dynamic ftrace must be enabled for this to appear because it
      uses the dynamic ftrace records to match the name to the mcount
      call sites.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  23. December 3, 2008: 1 commit
  24. November 26, 2008: 2 commits