1. 27 Jun, 2009 1 commit
    • tracing: Fix stack tracer sysctl handling · a32c7765
      Li Zefan committed
      This made my machine freeze completely:
      
        # echo 1 > /proc/sys/kernel/stack_tracer_enabled
        # echo 2 > /proc/sys/kernel/stack_tracer_enabled
      
      The cause is that register_ftrace_function() was called twice.
      
      Also fix the ftrace_enabled sysctl, though nothing bad seemed to
      happen when I tested it.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A448D17.9010305@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
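
The double registration described in this commit can be modelled outside the kernel. The sketch below is a hypothetical Python model, not the kernel code: `StackTracerSysctl`, `write` and the `registrations` counter are illustrative names. It assumes the handler should act only when the boolean enabled state actually changes, so that writing 1 and then 2 registers the callback once.

```python
class StackTracerSysctl:
    """Toy model of a sysctl handler that toggles a tracer callback."""

    def __init__(self):
        self.enabled = False      # last enabled state seen by the handler
        self.registrations = 0    # how many times the callback is registered

    def write(self, value):
        new_state = bool(value)   # any non-zero value means "enabled"
        if new_state == self.enabled:
            return                # no state change: do not register again
        self.enabled = new_state
        if new_state:
            self.registrations += 1   # stands in for register_ftrace_function()
        else:
            self.registrations -= 1   # stands in for the matching unregister
```

Writing 1 and then 2 to this model leaves `registrations` at 1, and writing 0 afterwards brings it back to 0.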
  2. 26 Jun, 2009 1 commit
  3. 24 Jun, 2009 3 commits
    • ftrace: Fix t_hash_start() · d82d6244
      Li Zefan committed
      When the output of set_ftrace_filter is larger than PAGE_SIZE,
      t_hash_start() is called a second time and restarts from the head
      of the hlist, which is wrong and causes some entries to be output
      twice.
      
      Worse, if the hlist is large enough, reading set_ftrace_filter
      never finishes and loops forever.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A41876E.2060407@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
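
The duplicated entries come from a start routine that ignores the position it is handed. The following is a simplified Python model of page-by-page reading, not the seq_file implementation: `read_all`, `start_from_pos` and `start_from_head` are hypothetical names, and a "page" is assumed to hold a fixed number of entries.

```python
def read_all(items, start, page_size):
    """Read `items` in pages of `page_size` entries, calling start() per page."""
    out, pos = [], 0
    while True:
        idx = start(items, pos)      # called once per page
        if idx is None:
            return out
        page = items[idx:idx + page_size]
        out.extend(page)
        pos += len(page)             # the reader advances pos, not start()


def start_from_pos(items, pos):
    """Correct: resume at the entry that corresponds to pos."""
    return pos if pos < len(items) else None


def start_from_head(items, pos):
    """Buggy: always restart at the head of the list."""
    return 0 if pos < len(items) else None
```

With five entries and a page size of two, `start_from_pos` returns every entry once, while `start_from_head` keeps re-emitting the first page.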
    • ftrace: Don't manipulate @pos in t_start() · 694ce0a5
      Li Zefan committed
      It's rather confusing that in t_start(), in some cases @pos is
      incremented, and in some cases it's decremented and then incremented.
      
      This patch rewrites t_start() in a much more general way.
      
      This also fixes a bug: if ftrace_filtered == 1, functions that have
      tracer hooks are not printed, because this branch is never reached:
      
      static void *t_start(...)
      {
      	...
      	if (!p)
      		return t_hash_start(m, pos);
      	return p;
      }
      
      Before:
        # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
        # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
        sys_open
      
      After:
        # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
        # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
        sys_open
        sys_write:traceon:count=4
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A41874B.4090507@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: Don't increment @pos in g_start() · 85951842
      Li Zefan committed
      It's wrong to increment @pos in g_start(). It causes some entries
      to be lost when reading set_graph_function, if the output of the
      file is larger than PAGE_SIZE.
      Reviewed-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A418738.7090401@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 20 Jun, 2009 1 commit
  5. 03 Jun, 2009 4 commits
    • function-graph: always initialize task ret_stack · 84047e36
      Steven Rostedt committed
      On creating a new task while running the function graph tracer, if
      we fail to allocate the ret_stack, and then fail the fork, the
      code will free the parent ret_stack. This is because the child
      duplicated the parent and currently points to the parent's ret_stack.
      
      This patch always initializes the task's ret_stack to NULL.
      
      [ Impact: prevent crash of parent on low memory during fork ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
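
The fork failure path in this commit can be illustrated with a small model. This is a hedged Python sketch rather than kernel code: `Task`, `fork`, `alloc` and `freed` are hypothetical names, and the child is assumed to start as a shallow copy of the parent.

```python
import copy


class Task:
    def __init__(self, ret_stack=None):
        self.ret_stack = ret_stack


def fork(parent, alloc, freed):
    """Model a fork whose ret_stack allocation may fail.

    `alloc` returns a new stack or None; `freed` collects freed stacks.
    Returns the child, or None if the fork failed.
    """
    child = copy.copy(parent)   # child starts as a duplicate of the parent
    child.ret_stack = None      # the fix: drop the inherited parent pointer
    stack = alloc()
    if stack is None:
        # Error path: free whatever the child owns. With the reset above
        # this frees nothing; without it the parent's stack would be freed.
        if child.ret_stack is not None:
            freed.append(child.ret_stack)
        return None
    child.ret_stack = stack
    return child
```

When `alloc` fails, the parent's stack never reaches `freed`, which is the property the commit is after.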
    • function-graph: add memory barriers for accessing task's ret_stack · 26c01624
      Steven Rostedt committed
      The code that handles the task's ret_stack allocation for every task
      assumes that only an interrupt can cause issues (even though interrupts
      are disabled).
      
      In reality, the code is allocating the ret_stack for tasks that may be
      running on other CPUs, and there are no sufficient memory barriers to
      handle this case.
      
      [ Impact: prevent crash due to using of uninitialized ret_stack variables ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • function-graph: enable the stack after initialization of other variables · 82310a32
      Steven Rostedt committed
      The function graph tracer checks if the task_struct has ret_stack defined
      to know if it is OK or not to use it. The initialization is done for
      all tasks by one process, but the idle tasks use the same initialization
      used by new tasks.
      
      If an interrupt happens on an idle task that just had the ret_stack
      created, but before the rest of the initialization took place, then
      we can corrupt the return address of the functions.
      
      This patch moves the setting of the task_struct's ret_stack to after
      the other variables have been initialized.
      
      [ Impact: prevent kernel panic on idle task when starting function graph ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • function-graph: only allocate init tasks if it was not already done · 179c498a
      Steven Rostedt committed
      When the function graph tracer is enabled, it calls the initialization
      needed for the init tasks that would be called on all created tasks.
      
      The problem is that this is called every time the function graph tracer
      is enabled, and the ret_stack is allocated for the idle tasks each time.
      Thus, the old ret_stack is lost and a memory leak is created.
      
      This is also dangerous because if an interrupt happened on another CPU
      with the init task and the ret_stack is replaced, we then lose all the
      return pointers for the interrupt, and a crash would take place.
      
      [ Impact: fix memory leak and possible crash due to race ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
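
The "allocate only once" rule from this commit can be sketched as follows. This is an illustrative Python model with hypothetical names (`IdleTasks`, `start_graph_tracing`, `allocations`); it assumes one idle task per CPU and represents a ret_stack as a list.

```python
class IdleTasks:
    """Toy model of per-CPU idle tasks and their ret_stack allocation."""

    def __init__(self, nr_cpus):
        self.ret_stacks = [None] * nr_cpus   # one slot per idle task
        self.allocations = 0                 # total allocations performed

    def start_graph_tracing(self):
        for cpu, stack in enumerate(self.ret_stacks):
            if stack is None:                # only allocate if not already done
                self.ret_stacks[cpu] = []
                self.allocations += 1
```

Enabling the tracer repeatedly leaves the existing stacks in place, so nothing is leaked and no in-flight return entries are discarded.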
  6. 02 Jun, 2009 2 commits
    • ftrace: do not profile functions when disabled · 0f6ce3de
      Steven Rostedt committed
      A race was found that if one were to enable and disable the function
      profiler repeatedly, then the system can panic. This was because a profiled
      function may be preempted just before disabling interrupts. While
      the profiler is disabled and then reenabled, the preempted function
      could start again, and access the hash as it is being initialized.
      
      This just adds a check in the irq disabled part to check if the profiler
      is enabled, and if it is not then it will just exit.
      
      When the system is disabled, the profile_enabled variable is cleared
      before calling the unregistering of the function profiler. This
      unregistering calls stop machine which also acts as a synchronize schedule.
      
      [ Impact: fix panic in enabling/disabling function profiler ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: add kernel command line function filtering · 2af15d6a
      Steven Rostedt committed
      When using ftrace=function on the command line to trace functions
      on boot up, one can not filter out functions that are commonly called.
      
      This patch adds two new ftrace command line commands.
      
        ftrace_notrace=function-list
        ftrace_filter=function-list
      
      Where function-list is a comma separated list of functions to filter.
      The ftrace_notrace will make the functions listed not be included
      in the function tracing, and ftrace_filter will only trace the functions
      listed.
      
      These two act the same as the debugfs/tracing/set_ftrace_notrace and
      debugfs/tracing/set_ftrace_filter respectively.
      
      The simple glob expressions that are allowed by the filter files can also
      be used by the command line interface.
      
      	ftrace_notrace=rcu*,*lock,*spin*
      
      Will not trace any function that starts with rcu, ends with lock, or has
      the word spin in it.
      
      Note, if the self tests are enabled, they may interfere with the filtering
      set by the command lines.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
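
The matching rules described in this commit message (prefix, suffix and substring globs in a comma-separated list) can be sketched in a few lines. This is an illustrative Python model, not the kernel parser: `parse_function_list`, `glob_match` and `is_traced` are hypothetical names, and only the three glob forms named in the message plus exact matches are handled.

```python
def parse_function_list(arg):
    """Split a comma-separated function list, ignoring empty items."""
    return [p for p in arg.split(",") if p]


def glob_match(pattern, name):
    """Match the simple glob forms 'x*', '*x', '*x*', or an exact name."""
    front = pattern.startswith("*")
    back = pattern.endswith("*") and len(pattern) > 1
    core = pattern.strip("*")
    if front and back:
        return core in name            # '*x*': x appears anywhere
    if front:
        return name.endswith(core)     # '*x': name ends with x
    if back:
        return name.startswith(core)   # 'x*': name starts with x
    return name == pattern             # no wildcard: exact match


def is_traced(name, notrace):
    """True if `name` is not excluded by an ftrace_notrace-style list."""
    return not any(glob_match(p, name) for p in parse_function_list(notrace))
```

With the list `rcu*,*lock,*spin*` from the message, `rcu_read_unlock`, `mutex_lock` and `_spin_unlock` are excluded, while `schedule` is still traced.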
  7. 18 May, 2009 1 commit
  8. 17 Apr, 2009 1 commit
    • ftrace: use module notifier for function tracer · 93eb677d
      Steven Rostedt committed
      The hooks in the module code for the function tracer must be called
      before any of that module code runs. The function tracer hooks
      modify the module (replacing calls to mcount to nops). If the code
      is executed while the change occurs, then the CPU can take a GPF.
      
      To handle the above with a bit of paranoia, I originally implemented
      the hooks as calls directly from the module code.
      
      After examining the notifier calls, it looks as though the start up
      notify is called before any of the module's code is executed. This makes
      the use of the notify safe with ftrace.
      
      Only the startup notify is required to be "safe". The shutdown simply
      removes the entries from the ftrace function list, and does not modify
      any code.
      
      This change has another benefit. It removes an issue with a reverse
      dependency between the ftrace_lock and module_mutex mutexes.
      
      [ Impact: fix lock dependency bug, cleanup ]
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  9. 15 Apr, 2009 1 commit
    • tracing/events: move trace point headers into include/trace/events · ad8d75ff
      Steven Rostedt committed
      Impact: clean up
      
      Create a sub directory in include/trace called events to keep the
      trace point headers in their own separate directory. Only headers that
      declare trace points should be defined in this directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  10. 07 Apr, 2009 2 commits
  11. 01 Apr, 2009 1 commit
    • function-graph: allow unregistering twice · 2aad1b76
      Steven Rostedt committed
      Impact: fix to permanent disabling of function graph tracer
      
      There should be nothing to prevent a tracer from unregistering a
      function graph callback more than once. This can simplify error paths.
      
      But currently, the counter does not account for multiple unregistering
      of the function graph callback. If it happens, the function graph
      tracer will be permanently disabled.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
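
A counter that tolerates repeated unregistering can be sketched as below. This is a hypothetical Python model (`GraphTracer`, `register`, `unregister` and `active` are illustrative names); it assumes that registration is refused while the counter is non-zero, so a counter that went negative would block every later registration.

```python
class GraphTracer:
    """Toy model of a single-user function graph callback registry."""

    def __init__(self):
        self.active = 0          # number of registered callbacks (0 or 1)

    def register(self):
        if self.active:          # a non-zero counter means "busy"
            return False
        self.active += 1
        return True

    def unregister(self):
        if not self.active:      # already unregistered: nothing to undo
            return
        self.active -= 1
```

Calling `unregister()` twice leaves `active` at 0, so a later `register()` still succeeds.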
  12. 30 Mar, 2009 1 commit
  13. 26 Mar, 2009 4 commits
  14. 25 Mar, 2009 5 commits
    • function-graph: add option to calculate graph time or not · a2a16d6a
      Steven Rostedt committed
      Graph time is the time a function spends executing the functions it calls.
      Thus if function A calls B, if graph-time is set, then the time for
      A includes B. This is the default behavior. But if graph-time is off,
      then the time spent executing B is subtracted from A.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
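
The difference between the two settings can be worked through on a tiny call tree. This is an illustrative Python sketch, not the profiler's code: `profile_times` is a hypothetical helper, and each call is assumed to be a tuple of name, total duration in microseconds, and a list of child calls.

```python
def profile_times(call, graph_time=True, out=None):
    """Charge time to each function in a call tree.

    With graph_time set, a function is charged its full duration.
    Otherwise the time spent in its direct children is subtracted.
    """
    if out is None:
        out = {}
    name, total, children = call
    child_time = sum(child[1] for child in children)
    charged = total if graph_time else total - child_time
    out[name] = out.get(name, 0) + charged
    for child in children:
        profile_times(child, graph_time, out)
    return out
```

For A running 10 us in total and calling B for 4 us, graph-time on charges A with 10 us, and graph-time off charges A with 6 us; B is charged 4 us either way.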
    • tracing: make the function profiler per cpu · cafb168a
      Steven Rostedt committed
      Impact: speed enhancement
      
      By making the function profiler record in per cpu data we not only
      get better readings and avoid races, we also do not have to take
      any locks.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
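
The per-CPU idea can be modelled with one counter table per CPU. This is a hedged Python sketch with hypothetical names (`PerCpuProfiler`, `hit`, `merged`); the merged view is an addition for illustration and is not something the commit message describes.

```python
from collections import Counter


class PerCpuProfiler:
    """Toy model: each CPU updates only its own table, so no lock is needed."""

    def __init__(self, nr_cpus):
        self.stats = [Counter() for _ in range(nr_cpus)]   # one table per CPU

    def hit(self, cpu, func):
        self.stats[cpu][func] += 1    # touches only this CPU's table

    def merged(self):
        """Illustrative only: sum the per-CPU tables into one view."""
        return sum(self.stats, Counter())
```

Because writers never share a table, the hot path has nothing to serialize on; only a reader that wants a combined view has to walk all tables.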
    • tracing: adding function timings to function profiler · 0706f1c4
      Steven Rostedt committed
      If the function graph trace is enabled, the function profiler will
      use it to take the timing of the functions.
      
       cat /debug/tracing/trace_stat/functions
      
        Function                               Hit    Time
        --------                               ---    ----
        mwait_idle                             127    183028.4 us
        schedule                                26    151997.7 us
        __schedule                              31    151975.1 us
        sys_wait4                                2    74080.53 us
        do_wait                                  2    74077.80 us
        sys_newlstat                           138    39929.16 us
        do_path_lookup                         179    39845.79 us
        vfs_lstat_fd                           138    39761.97 us
        user_path_at                           153    39469.58 us
        path_walk                              179    39435.76 us
        __link_path_walk                       189    39143.73 us
      [...]
      
      Note the times are skewed due to the function graph tracer not taking
      into account schedules.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: move function profiler data out of function struct · 493762fc
      Steven Rostedt committed
      Impact: reduce size of memory in function profiler
      
      The function profiler originally put its counters into the
      function records themselves. There are 20 thousand different functions
      on a normal system, and that adds 20 thousand counters for profiling
      even when they are not needed.
      
      A normal run of the profiler yields only a couple of thousand functions
      executed, depending on what is being profiled. This means we have around
      18 thousand useless counters.
      
      This patch rectifies this by moving the data out of the function
      records used by dynamic ftrace. Data is preallocated to hold the functions
      when the profiling begins. Checks are made during profiling to see if
      more records should be allocated, and they are allocated if it is safe
      to do so.
      
      This also removes the dependency on dynamic ftrace, and also
      removes the overhead of having it enabled.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: add function profiler · bac429f0
      Steven Rostedt committed
      Impact: new profiling feature
      
      This patch adds a function profiler. In debugfs/tracing/ two new
      files are created.
      
        function_profile_enabled  - to enable or disable profiling
      
        trace_stat/functions   - the profiled functions.
      
      For example:
      
        echo 1 > /debugfs/tracing/function_profile_enabled
        ./hackbench 50
        echo 0 > /debugfs/tracing/function_profile_enabled
      
      yields:
      
        cat /debugfs/tracing/trace_stat/functions
      
        Function                               Hit
        --------                               ---
        _spin_lock                        10106442
        _spin_unlock                      10097492
        kfree                              6013704
        _spin_unlock_irqrestore            4423941
        _spin_lock_irqsave                 4406825
        __phys_addr                        4181686
        __slab_free                        4038222
        dput                               4030130
        path_put                           4023387
        unroll_tree_refs                   4019532
      [...]
      
      The most hit functions are listed first. Functions that are not
      hit are not listed.
      
      This feature depends on and uses dynamic function tracing. When the
      function profiling is disabled, no overhead occurs. But it still
      takes up around 300KB to hold the data, thus it is not recommended
      to keep it enabled for systems low on memory.
      
      When a '1' is echoed into the function_profile_enabled file, the
      counters for each function are reset back to zero. Thus you can see
      what functions are hit most by different programs.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
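
The behaviour described in this commit message (count hits while enabled, list the most-hit functions first, omit functions that were never hit, reset on writing 1) can be sketched as a small model. This is illustrative Python with hypothetical names (`FunctionProfiler`, `set_enabled`, `record`, `report`); it assumes every write of a non-zero value resets the counters.

```python
class FunctionProfiler:
    """Toy model of a hit-counting function profiler."""

    def __init__(self):
        self.enabled = False
        self.hits = {}            # only functions that were hit appear here

    def set_enabled(self, value):
        if value:
            self.hits = {}        # writing 1 resets the counters to zero
        self.enabled = bool(value)

    def record(self, func):
        if self.enabled:          # no counting while profiling is disabled
            self.hits[func] = self.hits.get(func, 0) + 1

    def report(self):
        """Return (function, hits) pairs, most hit first."""
        return sorted(self.hits.items(), key=lambda kv: kv[1], reverse=True)
```

A run that enables the profiler, records some calls and disables it again yields a report ordered by hit count; enabling it a second time starts from an empty table.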
  15. 24 Mar, 2009 5 commits
    • tracing: use union for multi-usages field · ee000b7f
      Lai Jiangshan committed
      Impact: cleanup
      
      struct dyn_ftrace::ip has different usages during its lifecycle, so
      we use a union for it. The same applies to struct dyn_ftrace::flags.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: show virtual PID · cc59c9e8
      Lai Jiangshan committed
      Impact: fix PID output under namespaces
      
      When the current namespace is not the global namespace, the
      pid read from set_ftrace_pid is not correct.
      
       # ~/newpid_namespace_run bash
       # echo $$
       1
       # echo 1 > set_ftrace_pid
       # cat set_ftrace_pid
       3756
      
      Since we write a virtual PID to set_ftrace_pid, we need to get
      the virtual PID when we read it.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49C84D65.9050606@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • function-graph: add option for include sleep times · be6f164a
      Steven Rostedt committed
      Impact: give user a choice to show times spent while sleeping
      
      The user may want to see the time a function spent sleeping.
      This patch adds the trace option "sleep-time" to allow that.
      The "sleep-time" option is default on.
      
       echo sleep-time > /debug/tracing/trace_options
      
      produces:
      
       ------------------------------------------
       2)  avahi-d-3428  =>    <idle>-0
       ------------------------------------------
      
       2)               |      finish_task_switch() {
       2)   0.621 us    |        _spin_unlock_irq();
       2)   2.202 us    |      }
       2) ! 1002.197 us |    }
       2) ! 1003.521 us |  }
      
      whereas,
      
       echo nosleep-time > /debug/tracing/trace_options
      
      produces:
      
       0)    <idle>-0    =>  yum-upd-3416
       ------------------------------------------
      
       0)               |              finish_task_switch() {
       0)   0.643 us    |                _spin_unlock_irq();
       0)   2.342 us    |              }
       0) + 41.302 us   |            }
       0) + 42.453 us   |          }
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • function-graph: ignore times across schedule · 8aef2d28
      Steven Rostedt committed
      Impact: more accurate timings
      
      The current method of function graph tracing does not take into
      account the time spent when a task is not running. This shows functions
      that call schedule have increased costs:
      
       3) + 18.664 us   |      }
       ------------------------------------------
       3)    <idle>-0    =>  kblockd-123
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   1.441 us    |        _spin_unlock_irq();
       3)   3.966 us    |      }
       3) ! 2959.433 us |    }
       3) ! 2961.465 us |  }
      
      This patch uses the tracepoint in the scheduling context switch to
      account for time that has elapsed while a task is scheduled out.
      Now we see:
      
       ------------------------------------------
       3)    <idle>-0    =>  edac-po-1067
       ------------------------------------------
      
       3)               |      finish_task_switch() {
       3)   0.685 us    |        _spin_unlock_irq();
       3)   2.331 us    |      }
       3) + 41.439 us   |    }
       3) + 42.663 us   |  }
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
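
One way to keep scheduled-out time from being charged to a function is to shift the entry timestamps of all pending calls when the task is switched back in. The sketch below is a hypothetical Python model of that idea (`TaskTimer` and its methods are illustrative names, timestamps are plain integers in microseconds); it is not claimed to be how the patch implements it.

```python
class TaskTimer:
    """Toy model of per-task call timing that skips scheduled-out time."""

    def __init__(self):
        self.call_starts = []      # entry timestamps of functions still running
        self.sched_out_at = None   # when the task was last switched out

    def enter(self, now):
        self.call_starts.append(now)

    def leave(self, now):
        return now - self.call_starts.pop()   # reported duration

    def sched_out(self, now):
        self.sched_out_at = now

    def sched_in(self, now):
        if self.sched_out_at is None:
            return
        slept = now - self.sched_out_at
        # Shift every pending entry time forward so the sleep is not charged.
        self.call_starts = [t + slept for t in self.call_starts]
        self.sched_out_at = None
```

If a task enters two nested functions at 0 and 10, is switched out from 20 to 1020, and returns at 1030 and 1040, the model reports 20 us and 40 us instead of durations inflated by the 1000 us spent off the CPU.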
    • function-graph: prevent more than one tracer registering · 05ce5818
      Steven Rostedt committed
      Impact: prevent crash due to multiple function graph tracers
      
      The function graph tracer can currently only handle a single tracer
      being registered. If another tracer registers with the function
      graph tracer it can crash the system.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  16. 17 Mar, 2009 1 commit
  17. 13 Mar, 2009 4 commits
  18. 06 Mar, 2009 2 commits