1. 25 3月, 2009 4 次提交
    • S
      tracing: adding function timings to function profiler · 0706f1c4
      Steven Rostedt 提交于
      If the function graph trace is enabled, the function profiler will
      use it to take the timing of the functions.
      
       cat /debug/tracing/trace_stat/functions
      
        Function                               Hit    Time
        --------                               ---    ----
        mwait_idle                             127    183028.4 us
        schedule                                26    151997.7 us
        __schedule                              31    151975.1 us
        sys_wait4                                2    74080.53 us
        do_wait                                  2    74077.80 us
        sys_newlstat                           138    39929.16 us
        do_path_lookup                         179    39845.79 us
        vfs_lstat_fd                           138    39761.97 us
        user_path_at                           153    39469.58 us
        path_walk                              179    39435.76 us
        __link_path_walk                       189    39143.73 us
      [...]
      
      Note the times are skewed due to the function graph tracer not taking
      into account schedules.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      0706f1c4
    • S
      tracing: move function profiler data out of function struct · 493762fc
      Steven Rostedt 提交于
      Impact: reduce size of memory in function profiler
      
      The function profiler originally introduces its counters into the
      function records itself. There is 20 thousand different functions on
      a normal system, and that is adding 20 thousand counters for profiling
      event when not needed.
      
      A normal run of the profiler yields only a couple of thousand functions
      executed, depending on what is being profiled. This means we have around
      18 thousand useless counters.
      
      This patch rectifies this by moving the data out of the function
      records used by dynamic ftrace. Data is preallocated to hold the functions
      when the profiling begins. Checks are made during profiling to see if
      more recorcds should be allocated, and they are allocated if it is safe
      to do so.
      
      This also removes the dependency from using dynamic ftrace, and also
      removes the overhead by having it enabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      493762fc
    • S
      tracing: add function profiler · bac429f0
      Steven Rostedt 提交于
      Impact: new profiling feature
      
      This patch adds a function profiler. In debugfs/tracing/ two new
      files are created.
      
        function_profile_enabled  - to enable or disable profiling
      
        trace_stat/functions   - the profiled functions.
      
      For example:
      
        echo 1 > /debugfs/tracing/function_profile_enabled
        ./hackbench 50
        echo 0 > /debugfs/tracing/function_profile_enabled
      
      yields:
      
        cat /debugfs/tracing/trace_stat/functions
      
        Function                               Hit
        --------                               ---
        _spin_lock                        10106442
        _spin_unlock                      10097492
        kfree                              6013704
        _spin_unlock_irqrestore            4423941
        _spin_lock_irqsave                 4406825
        __phys_addr                        4181686
        __slab_free                        4038222
        dput                               4030130
        path_put                           4023387
        unroll_tree_refs                   4019532
      [...]
      
      The most hit functions are listed first. Functions that are not
      hit are not listed.
      
      This feature depends on and uses dynamic function tracing. When the
      function profiling is disabled, no overhead occurs. But it still
      takes up around 300KB to hold the data, thus it is not recomended
      to keep it enabled for systems low on memory.
      
      When a '1' is echoed into the function_profile_enabled file, the
      counters for is function is reset back to zero. Thus you can see what
      functions are hit most by different programs.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      bac429f0
    • S
      tracing: add handler to trace_stat · 42548008
      Steven Rostedt 提交于
      Currently, if a trace_stat user wants a handle to some private data,
      the trace_stat infrastructure does not supply a way to do that.
      
      This patch passes the trace_stat structure to the start function of
      the trace_stat code.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      42548008
  2. 24 3月, 2009 8 次提交
    • L
      tracing: use union for multi-usages field · ee000b7f
      Lai Jiangshan 提交于
      Impact: cleanup
      
      struct dyn_ftrace::ip has different usages in his lifecycle,
      we use union for it. And also for struct dyn_ftrace::flags.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ee000b7f
    • L
      ftrace: show virtual PID · cc59c9e8
      Lai Jiangshan 提交于
      Impact: fix PID output under namespaces
      
      When current namespace is not the global namespace,
      pid read from set_ftrace_pid is no correct.
      
       # ~/newpid_namespace_run bash
       # echo $$
       1
       # echo 1 > set_ftrace_pid
       # cat set_ftrace_pid
       3756
      
      Since we write virtual PID to set_ftrace_pid, we need get
      virtual PID when we read it.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C84D65.9050606@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cc59c9e8
    • S
      function-graph: add option for include sleep times · be6f164a
      Steven Rostedt 提交于
      Impact: give user a choice to show times spent while sleeping
      
      The user may want to see the time a function spent sleeping.
      This patch adds the trace option "sleep-time" to allow that.
      The "sleep-time" option is default on.
      
       echo sleep-time > /debug/tracing/trace_options
      
      produces:
      
       ------------------------------------------
       2)  avahi-d-3428  =>    <idle>-0
       ------------------------------------------
      
       2)               |      finish_task_switch() {
       2)   0.621 us    |        _spin_unlock_irq();
       2)   2.202 us    |      }
       2) ! 1002.197 us |    }
       2) ! 1003.521 us |  }
      
      where as,
      
       echo nosleep-time > /debug/tracing/trace_options
      
      produces:
      
       0)    <idle>-0    =>  yum-upd-3416
       ------------------------------------------
      
       0)               |              finish_task_switch() {
       0)   0.643 us    |                _spin_unlock_irq();
       0)   2.342 us    |              }
       0) + 41.302 us   |            }
       0) + 42.453 us   |          }
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      be6f164a
    • S
      function-graph: ignore times across schedule · 8aef2d28
      Steven Rostedt 提交于
      Impact: more accurate timings
      
      The current method of function graph tracing does not take into
      account the time spent when a task is not running. This shows functions
      that call schedule have increased costs:
      
       3) + 18.664 us   |      }
       ------------------------------------------
       3)    <idle>-0    =>  kblockd-123
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   1.441 us    |        _spin_unlock_irq();
       3)   3.966 us    |      }
       3) ! 2959.433 us |    }
       3) ! 2961.465 us |  }
      
      This patch uses the tracepoint in the scheduling context switch to
      account for time that has elapsed while a task is scheduled out.
      Now we see:
      
       ------------------------------------------
       3)    <idle>-0    =>  edac-po-1067
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   0.685 us    |        _spin_unlock_irq();
       3)   2.331 us    |      }
       3) + 41.439 us   |    }
       3) + 42.663 us   |  }
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      8aef2d28
    • S
      function-graph: prevent more than one tracer registering · 05ce5818
      Steven Rostedt 提交于
      Impact: prevent crash due to multiple function graph tracers
      
      The function graph tracer can currently only handle a single tracer
      being registered. If another tracer registers with the function
      graph tracer it can crash the system.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      05ce5818
    • S
      function-graph: moved the timestamp from arch to generic code · 5d1a03dc
      Steven Rostedt 提交于
      This patch move the timestamp from happening in the arch specific
      code into the general code. This allows for better control by the tracer
      to time manipulation.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      5d1a03dc
    • S
      tracing: fix memory leak in trace_stat · 09833521
      Steven Rostedt 提交于
      If the function profiler does not have any items recorded and one were
      to cat the function stat file, the kernel would take a BUG with a NULL
      pointer dereference.
      
      Looking further into this, I found that returning NULL from stat_start
      did not stop the stat logic, and would later call stat_next. This breaks
      from the way seq_file works, so I looked into fixing the stat code.
      
      This is where I noticed that the last next_entry is never freed.
      It is allocated, and if the stat_next returns NULL, the code breaks out
      of the loop, unlocks the mutex and exits. We never link the next_entry
      nor do we free it. Thus it is a real memory leak.
      
      This patch rearranges the code a bit to not only fix the memory leak,
      but also to act more like seq_file where nothing is printed if there
      is nothing to print. That is, stat_start returns NULL.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      09833521
    • A
      tracing: Fix TRACING_SUPPORT dependency for PPC32 · 45b95608
      Anton Vorontsov 提交于
      commit 40ada30f ("tracing: clean up menu"),
      despite the "clean up" in its purpose, introduced a behavioural
      change for Kconfig symbols: we no longer able to select tracing
      support on PPC32 (because IRQFLAGS_SUPPORT isn't yet implemented).
      
      The IRQFLAGS_SUPPORT is not mandatory for most tracers, tracing core
      has a special case for platforms w/o irqflags (which, by the way, has
      become useless as of the commit above).
      
      Though according to Ingo Molnar, there was periodic build failures on
      weird, unmaintained architectures that had no irqflags-tracing support
      and hence didn't know the raw_irqs_save/restore primitives. Thus we'd
      better not enable irqflags-less tracing for all architectures.
      
      This patch restores the old behaviour for PPC32, and thus brings the
      tracing back. Other architectures can either add themselves to the
      exception list or (better) implement TRACE_IRQFLAGS_SUPPORT.
      Signed-off-by: NAnton Vorontsov <avorontsov@ru.mvista.com>
      Acked-b: Steven Rostedt <rostedt@goodmis.org>
      Cc: linuxppc-dev@ozlabs.org
      LKML-Reference: <20090323220724.GA9851@oksana.dev.rtsoft.ru>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45b95608
  3. 23 3月, 2009 4 次提交
    • F
      tracing/ftrace: check if debugfs is registered before creating files · 3e1f60b8
      Frederic Weisbecker 提交于
      Impact: fix a crash with ftrace={nop,boot} parameter
      
      If the nop or initcall tracers are launched as boot tracers,
      they will attempt to create their option directory and files.
      But these tracers are registered very early and then assigned
      as "boot tracers" very early if asked to.
      
      Since they do this before debugfs has been registered (core initcall),
      a crash is triggered.
      
      Another early tracers could also come later. So we fix it by
      checking if debugfs is initialized before creating the root
      tracing directory.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237759847-21025-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3e1f60b8
    • F
      debugfs: function to know if debugfs is initialized · c0f92ba9
      Frederic Weisbecker 提交于
      Impact: add new debugfs API
      
      With ftrace, some tracers are registered in early initcalls
      and attempt to create files on the debugfs filesystem.
      Depending on when they are activated, they can try to create their
      file at any time. Some checks can be done on the tracing area
      but providing a helper to know if debugfs is registered make it
      really more easy.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237759847-21025-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c0f92ba9
    • D
      tracing: fix four sparse warnings · b8b94265
      Dmitri Vorobiev 提交于
      Impact: cleanup.
      
      This patch fixes the following sparse warnings:
      
       kernel/trace/trace.c:385:9: warning: symbol 'trace_seq_to_buffer' was
       not declared. Should it be static?
      
       kernel/trace/trace_clock.c:29:13: warning: symbol 'trace_clock_local'
       was not declared. Should it be static?
      
       kernel/trace/trace_clock.c:54:13: warning: symbol 'trace_clock' was not
       declared. Should it be static?
      
       kernel/trace/trace_clock.c:74:13: warning: symbol 'trace_clock_global'
       was not declared. Should it be static?
      Signed-off-by: NDmitri Vorobiev <dmitri.vorobiev@movial.com>
      LKML-Reference: <1237741871-5827-4-git-send-email-dmitri.vorobiev@movial.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8b94265
    • I
      Merge branches 'tracing/ftrace', 'tracing/hw-breakpoints',... · a524446f
      Ingo Molnar 提交于
      Merge branches 'tracing/ftrace', 'tracing/hw-breakpoints', 'tracing/ring-buffer', 'tracing/textedit' and 'linus' into tracing/core
      a524446f
  4. 22 3月, 2009 2 次提交
  5. 21 3月, 2009 1 次提交
    • F
      tracing/ring-buffer: don't annotate rb_cpu_notify with __cpuinit · 09c9e84d
      Frederic Weisbecker 提交于
      Impact: remove a section warning
      
      CONFIG_DEBUG_SECTION_MISMATCH raises the following warning on -tip:
      
        WARNING: kernel/trace/built-in.o(.text+0x5bc5): Section mismatch in
        reference from the function ring_buffer_alloc() to the function
        .cpuinit.text:rb_cpu_notify()
        The function ring_buffer_alloc() references
        the function __cpuinit rb_cpu_notify().
      
      This is actually harmless. The code in the ring buffer don't build
      rb_cpu_notify and other cpu hotplug stuffs when !CONFIG_HOTPLUG_CPU
      so we have no risk to reference freed memory here (it would even
      be harmless if we unconditionally build it because register_cpu_notifier
      would do nothing when !CONFIG_HOTPLUG_CPU.
      
      But since ring_buffer_alloc() can be called everytime, we don't want it
      to be annotated with __cpuinit so we drop the __cpuinit from
      rb_cpu_notify.
      
      This is not a waste of memory because it is only defined and used on
      CONFIG_HOTPLUG_CPU.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237606416-22268-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      09c9e84d
  6. 20 3月, 2009 18 次提交
  7. 19 3月, 2009 3 次提交
    • F
      tracing/ring-buffer: fix non cpu hotplug case · 3bf832ce
      Frederic Weisbecker 提交于
      Impact: fix warning with irqsoff tracer
      
      The ring buffer allocates its buffers on pre-smp time (early_initcall).
      It means that, at first, only the boot cpu buffer is allocated and
      the ring-buffer cpumask only has the boot cpu set (cpu_online_mask).
      
      Later, the secondary cpu will show up and the ring-buffer will be notified
      about this event: the appropriate buffer will be allocated and the cpumask
      will be updated.
      
      Unfortunately, if !CONFIG_CPU_HOTPLUG, the ring-buffer will not be
      notified about the secondary cpus, meaning that the cpumask will have
      only the cpu boot set, and only one cpu buffer allocated.
      
      We fix that by using cpu_possible_mask if !CONFIG_CPU_HOTPLUG.
      
      This patch fixes the following warning with irqsoff tracer running:
      
      [  169.317794] WARNING: at kernel/trace/trace.c:466 update_max_tr_single+0xcc/0xf3()
      [  169.318002] Hardware name: AMILO Li 2727
      [  169.318002] Modules linked in:
      [  169.318002] Pid: 5624, comm: bash Not tainted 2.6.29-rc8-tip-02636-g6aafa6c #11
      [  169.318002] Call Trace:
      [  169.318002]  [<ffffffff81036182>] warn_slowpath+0xea/0x13d
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d1>] ? ftrace_call+0x0/0x2b
      [  169.318002]  [<ffffffff8101ef10>] ? ftrace_modify_code+0xa9/0x108
      [  169.318002]  [<ffffffff8106e27f>] ? trace_hardirqs_off+0x25/0x27
      [  169.318002]  [<ffffffff8149afe7>] ? _spin_unlock_irqrestore+0x1f/0x2d
      [  169.318002]  [<ffffffff81064f52>] ? ring_buffer_reset_cpu+0xf6/0xfb
      [  169.318002]  [<ffffffff8106637c>] ? ring_buffer_reset+0x36/0x48
      [  169.318002]  [<ffffffff8106aeda>] update_max_tr_single+0xcc/0xf3
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002]  [<ffffffff8106e3ea>] stop_critical_timing+0x142/0x204
      [  169.318002]  [<ffffffff8106e4cf>] trace_hardirqs_on_caller+0x23/0x25
      [  169.318002]  [<ffffffff8149ac28>] trace_hardirqs_on_thunk+0x3a/0x3c
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002] ---[ end trace db76cbf775a750cf ]---
      
      Because this tracer may try to swap two cpu ring buffers for an
      unregistered cpu on the ring buffer.
      
      This patch might also fix a fair loss of traces due to unallocated buffers
      for secondary cpus.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-b: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237470453-5427-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3bf832ce
    • S
      function-graph: consolidate prologues for output · ac5f6c96
      Steven Rostedt 提交于
      Impact: clean up
      
      The prologue of the function graph entry, return and comments all
      start out pretty much the same. Each of these duplicate code and
      do so slightly differently.
      
      This patch consolidates the printing of the pid, absolute time,
      cpu and proc (and for entry, the interrupt).
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      ac5f6c96
    • L
      ftrace: protect running nmi (V3) · e9d9df44
      Lai Jiangshan 提交于
      When I review the sensitive code ftrace_nmi_enter(), I found
      the atomic variable nmi_running does protect NMI VS do_ftrace_mod_code(),
      but it can not protects NMI(entered nmi) VS NMI(ftrace_nmi_enter()).
      
      cpu#1                   | cpu#2                 | cpu#3
      ftrace_nmi_enter()      | do_ftrace_mod_code()  |
        not modify            |                       |
      ------------------------|-----------------------|--
      executing               | set mod_code_write = 1|
      executing             --|-----------------------|--------------------
      executing               |                       | ftrace_nmi_enter()
      executing               |                       |    do modify
      ------------------------|-----------------------|-----------------
      ftrace_nmi_exit()       |                       |
      
      cpu#3 may be being modified the code which is still being executed on cpu#1,
      it will have undefined results and possibly take a GPF, this patch
      prevents it occurred.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49C0B411.30003@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      e9d9df44