1. 06 Feb, 2009 3 commits
    • A
      trace: Call tracing_reset_online_cpus before tracer->init() · b6f11df2
      Arnaldo Carvalho de Melo committed
      Impact: cleanup
      
      This makes things easier for ftrace plugin writers, as the call was
      open coded in the existing plugins.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      tracing: Introduce trace_buffer_{lock_reserve,unlock_commit} · 51a763dd
      Arnaldo Carvalho de Melo committed
      Impact: new API
      
      These new functions do what previously was being open coded, reducing
      the number of details ftrace plugin writers have to worry about.
      
      It also standardizes the handling of stacktrace, userstacktrace and
      other trace options we may introduce in the future.
      
      With this patch, for instance, the blk tracer (and some others already
      in the tree) can use the "userstacktrace" /d/tracing/trace_options
      facility.
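
The reserve/commit pattern that these helpers wrap can be pictured with a small user-space sketch. This is not the kernel implementation: `toy_buffer`, `toy_reserve` and `toy_commit` are hypothetical names, and the real helpers also deal with locking and the stacktrace options mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a trace buffer; not kernel code. */
#define TOY_BUF_SIZE 64

struct toy_buffer {
	unsigned char data[TOY_BUF_SIZE];
	size_t write;	/* end of reserved space */
	size_t commit;	/* end of committed (reader-visible) data */
};

/*
 * Reserve len bytes for one event. Returns NULL when an earlier event is
 * still uncommitted or when the buffer has no room left.
 */
static void *toy_reserve(struct toy_buffer *buf, size_t len)
{
	void *slot;

	assert(buf != NULL);
	if (buf->write != buf->commit)
		return NULL;
	if (len > TOY_BUF_SIZE - buf->write)
		return NULL;
	slot = &buf->data[buf->write];
	buf->write += len;
	return slot;
}

/* Publish the reserved bytes so that readers may consume them. */
static void toy_commit(struct toy_buffer *buf)
{
	assert(buf != NULL);
	buf->commit = buf->write;
}
```

A caller reserves a slot, fills it in, and commits; keeping both steps behind one pair of functions is what lets common options be handled in a single place.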
      
      $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
      linux-2.6-tip/kernel/trace/trace.c:
        trace_vprintk              |   -5
        trace_graph_return         |  -22
        trace_graph_entry          |  -26
        trace_function             |  -45
        __ftrace_trace_stack       |  -27
        ftrace_trace_userstack     |  -29
        tracing_sched_switch_trace |  -66
        tracing_stop               |   +1
        trace_seq_to_user          |   -1
        ftrace_trace_special       |  -63
        ftrace_special             |   +1
        tracing_sched_wakeup_trace |  -70
        tracing_reset_online_cpus  |   -1
       13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
      
      linux-2.6-tip/block/blktrace.c:
        __blk_add_trace |  -58
       1 function changed, 58 bytes removed, diff: -58
      
      linux-2.6-tip/kernel/trace/trace.c:
        trace_buffer_lock_reserve  |  +88
        trace_buffer_unlock_commit |  +86
       2 functions changed, 174 bytes added, diff: +174
      
      /tmp/vmlinux.after:
       16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      ring_buffer: remove unused flags parameter · 0a987751
      Arnaldo Carvalho de Melo committed
      Impact: API change, cleanup
      
      From ring_buffer_{lock_reserve,unlock_commit}.
      
      $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
      linux-2.6-tip/kernel/trace/trace.c:
        trace_vprintk              |  -14
        trace_graph_return         |  -14
        trace_graph_entry          |  -10
        trace_function             |   -8
        __ftrace_trace_stack       |   -8
        ftrace_trace_userstack     |   -8
        tracing_sched_switch_trace |   -8
        ftrace_trace_special       |  -12
        tracing_sched_wakeup_trace |   -8
       9 functions changed, 90 bytes removed, diff: -90
      
      linux-2.6-tip/block/blktrace.c:
        __blk_add_trace |   -1
       1 function changed, 1 bytes removed, diff: -1
      
      /tmp/vmlinux.after:
       10 functions changed, 91 bytes removed, diff: -91
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 05 Feb, 2009 6 commits
  3. 04 Feb, 2009 1 commit
  4. 03 Feb, 2009 4 commits
    • A
      trace: Change struct trace_event callbacks parameter list · 2c9b238e
      Arnaldo Carvalho de Melo committed
      Impact: API change
      
      The trace_seq and trace_entry are in trace_iterator, where there are
      more fields that may be needed by tracers, so just pass the
      trace_iterator, as is already the case for struct tracer->print_line.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • F
      trace: better manage the context info for events · c4a8e8be
      Frederic Weisbecker committed
      Impact: make trace_event more convenient for tracers
      
      All tracers (for the moment) that use the struct trace_event want to
      have the context info printed before their own output: the pid/cmdline,
      cpu, and timestamp.
      
      But some other tracers that want to implement their trace_event
      callbacks will not necessarily need this information, or they may want
      to format it in their own way.
      
      This patch adds a new default-enabled trace option,
      TRACE_ITER_CONTEXT_INFO. When it is disabled through:
      
      echo nocontext-info > /debugfs/tracing/trace_options
      
      the pid, cpu and timestamp headers will not be printed.
      
      E.g. with the sched_switch tracer with context-info (default):
      
           bash-2935 [001] 100.356561: 2935:120:S ==> [001]  0:140:R <idle>
         <idle>-0    [000] 100.412804:    0:140:R   + [000] 11:115:S events/0
         <idle>-0    [000] 100.412816:    0:140:R ==> [000] 11:115:R events/0
       events/0-11   [000] 100.412829:   11:115:S ==> [000]  0:140:R <idle>
      
      Without context-info:
      
       2935:120:S ==> [001]  0:140:R <idle>
          0:140:R   + [000] 11:115:S events/0
          0:140:R ==> [000] 11:115:R events/0
         11:115:S ==> [000]  0:140:R <idle>
      
      A tracer can disable it at runtime by clearing the bit
      TRACE_ITER_CONTEXT_INFO in trace_flags.
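
How a single flag bit can gate the context prefix is sketched below in user-space C. The names `TOY_CONTEXT_INFO` and `toy_print_line` and the bit value are assumptions made for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bit; the kernel's TRACE_ITER_CONTEXT_INFO value may differ. */
#define TOY_CONTEXT_INFO (1UL << 0)

/*
 * Format one trace line into out. The "comm-pid [cpu] timestamp:" prefix is
 * emitted only while TOY_CONTEXT_INFO is set in flags.
 */
static int toy_print_line(char *out, size_t size, unsigned long flags,
			  const char *comm, int pid, int cpu,
			  const char *ts, const char *payload)
{
	assert(out != NULL && size > 0);
	if (flags & TOY_CONTEXT_INFO)
		return snprintf(out, size, "%s-%d [%03d] %s: %s",
				comm, pid, cpu, ts, payload);
	return snprintf(out, size, "%s", payload);
}
```

Clearing the bit with `flags &= ~TOY_CONTEXT_INFO` before calling `toy_print_line` leaves only the tracer's own payload, mirroring the two outputs shown above.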
      
      The print routines were renamed to trace_print_context and
      trace_print_lat_context, so that they can be used by tracers if they
      want to use them for one of the trace_event callbacks.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: let boot trace be chosen by command line · 79fb0768
      Steven Rostedt committed
      Now that we have a working ftrace=<tracer> function, make the boot
      tracer get activated by it. This way we can turn it on or off without
      recompiling the kernel, as well as keeping the selftests on. The
      selftests are disabled whenever a default tracer starts running.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: fix default boot up tracer · b2821ae6
      Steven Rostedt committed
      Peter Zijlstra started the functionality to start up a default
      tracer at bootup. This patch finishes the work.
      
      Now if you add 'ftrace=<tracer>' to the command line, when that tracer
      is registered on bootup, that tracer is selected and starts tracing.
      
      Note, all selftests for tracers that are registered after this tracer
      are disabled. This prevents the selftests from disturbing the running
      tracer, or the running tracer from disturbing the selftests.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 30 Jan, 2009 1 commit
  6. 29 Jan, 2009 2 commits
  7. 26 Jan, 2009 3 commits
    • A
      blktrace: add ftrace plugin · c71a8961
      Arnaldo Carvalho de Melo committed
      Impact: New way of using the blktrace infrastructure
      
      This drops the requirement of userspace utilities to use the blktrace
      facility.
      
      Configuration is done through sysfs, adding a "trace" directory to the
      partition directory where blktrace can be enabled for the associated
      request_queue.
      
      The same filters present in the IOCTL interface are present as sysfs
      device attributes.
      
      The /sys/block/sdX/sdXN/trace/enable file allows tracing without any
      filters.
      
      The other files in this directory: pid, act_mask, start_lba and end_lba
      can be used with the same meaning as with the IOCTL interface.
      
      Using the sysfs interface will only set up the request_queue->blk_trace
      fields; tracing will only take place when the "blk" tracer is selected
      via the ftrace interface, as in the following example:
      
      To see the trace, one can use the /d/tracing/trace file or the
      /d/tracing/trace_pipe file, with semantics defined in the ftrace
      documentation in Documentation/ftrace.txt.
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3046.491224:   8,1    A WBS 6367 + 8 <- (8,1) 6304
             kjournald-305   [000]  3046.491227:   8,1    Q   R 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491236:   8,1    G  RB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491239:   8,1    P  NS [kjournald]
             kjournald-305   [000]  3046.491242:   8,1    I RBS 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491251:   8,1    D  WB 6367 + 8 [kjournald]
             kjournald-305   [000]  3046.491610:   8,1    U  WS [kjournald] 1
                <idle>-0     [000]  3046.511914:   8,1    C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      The default line context (prefix) format is the one described in the ftrace
      documentation, with the blktrace specific bits using its existing format,
      described in blkparse(8).
      
      If one wants to have the classic blktrace formatting, this is possible by
      using:
      
      [root@f10-1 ~]# echo blk_classic > /t/trace_options
      [root@f10-1 ~]# cat /t/trace
        8,1    0  3046.491224   305  A WBS 6367 + 8 <- (8,1) 6304
        8,1    0  3046.491227   305  Q   R 6367 + 8 [kjournald]
        8,1    0  3046.491236   305  G  RB 6367 + 8 [kjournald]
        8,1    0  3046.491239   305  P  NS [kjournald]
        8,1    0  3046.491242   305  I RBS 6367 + 8 [kjournald]
        8,1    0  3046.491251   305  D  WB 6367 + 8 [kjournald]
        8,1    0  3046.491610   305  U  WS [kjournald] 1
        8,1    0  3046.511914     0  C  RS 6367 + 8 [6367]
      [root@f10-1 ~]#
      
      Using the ftrace standard format allows more flexibility, such
      as the ability to ask for backtraces via trace_options:
      
      [root@f10-1 ~]# echo noblk_classic > /t/trace_options
      [root@f10-1 ~]# echo stacktrace > /t/trace_options
      
      [root@f10-1 ~]# cat /t/trace
             kjournald-305   [000]  3318.826779:   8,1    A WBS 6375 + 8 <- (8,1) 6312
             kjournald-305   [000]  3318.826782:
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
       <= child_rip
             kjournald-305   [000]  3318.826836:   8,1    Q   R 6375 + 8 [kjournald]
             kjournald-305   [000]  3318.826837:
       <= generic_make_request
       <= submit_bio
       <= submit_bh
       <= sync_dirty_buffer
       <= journal_commit_transaction
       <= kjournald
       <= kthread
      
      Please read the ftrace documentation to use additional, standardized
      tracing filters such as /d/tracing/trace_cpumask, etc.
      
      See also /d/tracing/trace_mark to add comments in the trace stream,
      which is equivalent to the /d/block/sdaN/msg interface.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      ftrace: add ftrace_vprintk · 9011262a
      Arnaldo Carvalho de Melo committed
      Impact: new helper function
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • R
      kmemtrace: fix printk format warnings · cc2f6d90
      Randy Dunlap committed
      Fix kmemtrace printk warnings:
      
        kernel/trace/kmemtrace.c:142: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t'
        kernel/trace/kmemtrace.c:147: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t'
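
The warnings arise because size_t is not the same type as long on every architecture. In standard C the matching conversion uses the `z` length modifier; the user-space sketch below shows that with a hypothetical helper, `toy_format_size`, and is not the text of the patch itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a size_t right-aligned in a field of width 4. The 'z' length
 * modifier makes the conversion match size_t regardless of its width.
 */
static int toy_format_size(char *out, size_t size, size_t bytes)
{
	assert(out != NULL && size > 0);
	return snprintf(out, size, "%4zu", bytes);
}
```

Compared with casting the argument to long, the length modifier keeps the format string and the argument type in agreement without a conversion.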
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 23 Jan, 2009 3 commits
    • F
      tracing/function-graph-tracer: various fixes and features · 9005f3eb
      Frederic Weisbecker committed
      This patch brings various bugfixes:
      
      - Drop the first irrelevant task switch at the very beginning of a trace.
      
      - Drop the OVERHEAD word from the headers, the DURATION word is sufficient
        and will not overlap other columns.
      
      - Make the headers fit their respective columns well, whatever the
        selected options.
      
      E.g., with the default options:
      
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
      
        1)   0.646 us    |                    }
        1)               |                    mem_cgroup_del_lru_list() {
        1)   0.624 us    |                      lookup_page_cgroup();
        1)   1.970 us    |                    }
      
       echo funcgraph-proc > trace_options
      
       # tracer: function_graph
       #
       # CPU  TASK/PID        DURATION                  FUNCTION CALLS
       # |    |    |           |   |                     |   |   |   |
      
        0)   bash-2937    |   0.895 us    |                }
        0)   bash-2937    |   0.888 us    |                __rcu_read_unlock();
        0)   bash-2937    |   0.864 us    |                conv_uni_to_pc();
        0)   bash-2937    |   1.015 us    |                __rcu_read_lock();
      
       echo nofuncgraph-cpu > trace_options
       echo nofuncgraph-proc > trace_options
      
       # tracer: function_graph
       #
       #   DURATION                  FUNCTION CALLS
       #    |   |                     |   |   |   |
      
         3.752 us    |                  native_pud_val();
         0.616 us    |                  native_pud_val();
         0.624 us    |                  native_pmd_val();
      
      As for features, one can now disable the duration (this will hide the
      overhead too, for convenience and because one doesn't need the
      overhead without the duration):
      
       echo nofuncgraph-duration > trace_options
      
       # tracer: function_graph
       #
       #                FUNCTION CALLS
       #                |   |   |   |
      
                 cap_vm_enough_memory() {
                   __vm_enough_memory() {
                     vm_acct_memory();
                   }
                 }
               }
      
      And lastly, an option to print the absolute time:
      
       //Restart from default options
       echo funcgraph-abstime > trace_options
      
       # tracer: function_graph
       #
       #      TIME       CPU  DURATION                  FUNCTION CALLS
       #       |         |     |   |                     |   |   |   |
      
         261.339774 |   1) + 42.823 us   |    }
         261.339775 |   1)   1.045 us    |    _spin_lock_irq();
         261.339777 |   1)   0.940 us    |    _spin_lock_irqsave();
         261.339778 |   1)   0.752 us    |    _spin_unlock_irqrestore();
         261.339780 |   1)   0.857 us    |    _spin_unlock_irq();
         261.339782 |   1)               |    flush_to_ldisc() {
         261.339783 |   1)               |      tty_ldisc_ref() {
         261.339783 |   1)               |        tty_ldisc_try() {
         261.339784 |   1)   1.075 us    |          _spin_lock_irqsave();
         261.339786 |   1)   0.842 us    |          _spin_unlock_irqrestore();
         261.339788 |   1)   4.211 us    |        }
         261.339788 |   1)   5.662 us    |      }
      
      The format is seconds.usecs.
      
      I guess no one needs nanosecond precision here; the main goal is to have
      an overview of the general timings of events, and to see the place where
      the trace switches from one cpu to another.
      
      E.g.:
      
         274.874760 |   1)   0.676 us    |      _spin_unlock();
         274.874762 |   1)   0.609 us    |      native_load_sp0();
         274.874763 |   1)   0.602 us    |      native_load_tls();
         274.878739 |   0)   0.722 us    |                  }
         274.878740 |   0)   0.714 us    |                  native_pmd_val();
         274.878741 |   0)   0.730 us    |                  native_pmd_val();
      
      Here there is a difference of about 4000 usecs when we switch the cpu.
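
The gap can be checked by converting the two seconds.usecs timestamps on either side of the cpu switch (274.874763 and 274.878739) to whole microseconds. The helpers below are a hypothetical user-space sketch and have nothing to do with the tracer's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a seconds.usecs timestamp, given as its two integer parts, to microseconds. */
static int64_t toy_to_usecs(int64_t secs, int64_t usecs)
{
	assert(usecs >= 0 && usecs < 1000000);
	return secs * 1000000 + usecs;
}

/* Gap in microseconds between an earlier (s0.u0) and a later (s1.u1) timestamp. */
static int64_t toy_gap_usecs(int64_t s0, int64_t u0, int64_t s1, int64_t u1)
{
	return toy_to_usecs(s1, u1) - toy_to_usecs(s0, u0);
}
```

For the two lines above, `toy_gap_usecs(274, 874763, 274, 878739)` gives 3976, which is the roughly 4000 usecs quoted.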
      
      Changes in V2:
      
      - Completely fix the first pointless task switch.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: fix logic to start/stop counting · b06a8301
      Steven Rostedt committed
      The logic in the tracing_start/stop code prevents the WARN_ON
      from ever detecting if a start/stop pair was mismatched.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: remove internal irqsoff disabling for trace output · 94523e81
      Steven Rostedt committed
      Impact: cleanup of duplicate features
      
      The trace output disables the ring buffer and prevents tracing from
      occurring. The code in irqsoff to do the same thing is no longer needed.
      This patch removes it.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 22 Jan, 2009 6 commits
  10. 21 Jan, 2009 5 commits
    • S
      trace: set max latency variable to zero on default · 1092307d
      Steven Rostedt committed
      Impact: trace max latencies on start of latency tracing
      
      This patch sets the max latency to zero whenever one of the
      irq variant tracers or the wakeup tracer is set as the current tracer.
      
      Most developers expect to see output when starting up a latency
      tracer. But since the max_latency is already set to max, and
      it takes a latency greater than max_latency to be recorded, there
      is no trace. This is not the expected behavior and has even confused
      me.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: stop all recording to ring buffer on ftrace_dump · a442e5e0
      Steven Rostedt committed
      Impact: limit ftrace dump output
      
      Currently ftrace_dump only calls ftrace_kill, which is a fast way
      to prevent the function tracer functions from being called (it just sets
      a flag and clears the function to call, nothing else). It is better
      to also turn off any recording to the ring buffers.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • S
      trace: print ftrace_dump at KERN_EMERG log level · faf6861e
      Steven Rostedt committed
      Impact: fix to print out ftrace_dump when expected
      
      I was debugging a hard race condition, only to find out that
      after I hit the race, my log level was not set to show
      KERN_INFO. The time it took to trigger the race was wasted because
      I did not capture the trace.
      
      Since ftrace_dump is only called from kernel oops (and only when
      it is set in the kernel command line to do so), or when a
      developer adds it to their own local tree, the log level of
      the print should be at KERN_EMERG to make sure the print appears.
      
      ftrace_dump is not called by a normal user setup, and will not
      add extra unwanted print out to the console. There is no reason
      it should be at KERN_INFO.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • L
      ring_buffer: reset write when reserve buffer fail · 551b4048
      Lai Jiangshan committed
      Impact: reset struct buffer_page.write during an interrupt storm
      
      If struct buffer_page.write is not reset, any subsequent commit
      will corrupt the ring_buffer:
      
      static inline void
      rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
      {
      	......
      		cpu_buffer->commit_page->commit =
      			cpu_buffer->commit_page->write;
      	......
      }
      
      When "if (RB_WARN_ON(cpu_buffer, next_page == reader_page))" triggers, the
      ring_buffer is disabled, but some reserved buffers may not have been
      committed, so we need to reset struct buffer_page.write.
      
      When "if (unlikely(next_page == cpu_buffer->commit_page))" triggers, the
      ring_buffer is still available, and we should not corrupt it.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • F
      tracing/function-graph-tracer: fix a regression while suspend to disk · 00f57f54
      Frederic Weisbecker committed
      Impact: fix a crash during kernel image restore
      
      When the function graph tracer is running during suspend to disk, some racy
      and dangerous things happen to this tracer.
      
      The current task will save its registers including the stack pointer which
      contains the return address hooked by the tracer. But the current task will
      continue to enter other functions after that to save the memory, and then
      it will store other return addresses, and finally lose the old depth which
      matches the return address saved in the old stack (during the register saving).
      
      So on image restore, the code will return to wrong addresses.
      And there are other things: on restore, the task will have its "current"
      pointer overwritten during register restoring... switching from one task to
      another... It would be insane to try to trace function graphs at these
      stages.
      
      This patch makes the function graph tracer listen for power events, disabling
      its tracing for the current task (the one that performs the
      hibernation work) during suspend/resume to disk, which makes tracing safe
      during hibernation.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 20 Jan, 2009 6 commits