1. 21 May, 2014 1 commit
  2. 28 Apr, 2014 1 commit
    • S
      ftrace/module: Hardcode ftrace_module_init() call into load_module() · a949ae56
      Steven Rostedt (Red Hat) committed
      A race exists between module loading and enabling of function tracer.
      
      	CPU 1				CPU 2
      	-----				-----
        load_module()
         module->state = MODULE_STATE_COMING
      
      				register_ftrace_function()
      				 mutex_lock(&ftrace_lock);
      				 ftrace_startup()
      				  update_ftrace_function();
      				   ftrace_arch_code_modify_prepare()
      				    set_all_module_text_rw();
      				   <enables-ftrace>
      				    ftrace_arch_code_modify_post_process()
      				     set_all_module_text_ro();
      
      				[ here all module text is set to RO,
      				  including the module that is
      				  loading!! ]
      
         blocking_notifier_call_chain(MODULE_STATE_COMING);
          ftrace_init_module()
      
           [ tries to modify code, but it's RO, and fails!
             ftrace_bug() is called]
      
      When this race happens, ftrace_bug() will produce a nasty warning and
      all of the function tracing features will be disabled until reboot.

      The simple solution is to treat module load the same way the core
      kernel is treated at boot: hardcode the ftrace function modification
      that converts calls to mcount into nops. This is done in init/main.c,
      and there's no reason it could not be done in load_module(). This gives
      better control of the changes and doesn't tie the state of the
      module to its notifiers as much. Ftrace is special, it needs to be
      treated as such.
      
      This works because ftrace_module_init() is called while the module
      is in MODULE_STATE_UNFORMED, a state that is ignored by the
      set_all_module_text_ro() call.
      
      Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com
      Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@vger.kernel.org # 2.6.38+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a949ae56
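The ordering argument in this commit can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `model_` functions, the `struct module` fields, and the idea of tracking read-only state with a boolean are all assumptions made for illustration. It shows why patching while the module is still in MODULE_STATE_UNFORMED is safe, while patching after the module has reached MODULE_STATE_COMING can fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; only the state names come from the commit. */
enum module_state {
    MODULE_STATE_LIVE,
    MODULE_STATE_COMING,
    MODULE_STATE_GOING,
    MODULE_STATE_UNFORMED,
};

struct module {
    enum module_state state;
    bool text_ro;       /* true once the module text is read-only */
    bool nops_patched;  /* true once mcount calls were turned into nops */
};

/* Models set_all_module_text_ro(): unformed modules are skipped. */
static void model_set_all_module_text_ro(struct module *mods, size_t n)
{
    assert(mods != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mods[i].state == MODULE_STATE_UNFORMED)
            continue;
        mods[i].text_ro = true;
    }
}

/* Models the ftrace init step: patching fails when the text is read-only. */
static bool model_ftrace_module_init(struct module *mod)
{
    assert(mod != NULL);
    if (mod->text_ro)
        return false;   /* the kernel would end up in ftrace_bug() here */
    mod->nops_patched = true;
    return true;
}
```

In this model, a module that is still MODULE_STATE_UNFORMED when the other CPU enables tracing keeps writable text, so the init step succeeds. A module that is already MODULE_STATE_COMING is made read-only first and the init step fails.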
  3. 12 Mar, 2014 1 commit
  4. 07 Mar, 2014 2 commits
  5. 21 Feb, 2014 2 commits
  6. 03 Jan, 2014 1 commit
  7. 06 Nov, 2013 1 commit
  8. 19 Oct, 2013 1 commit
    • N
      ftrace: Add set_graph_notrace filter · 29ad23b0
      Namhyung Kim committed
      The set_graph_notrace filter is analogous to set_ftrace_notrace and
      can be used for eliminating uninteresting parts of function graph trace
      output.  It also works nicely with set_graph_function.
      
        # cd /sys/kernel/debug/tracing/
        # echo do_page_fault > set_graph_function
        # perf ftrace live true
         2)               |  do_page_fault() {
         2)               |    __do_page_fault() {
         2)   0.381 us    |      down_read_trylock();
         2)   0.055 us    |      __might_sleep();
         2)   0.696 us    |      find_vma();
         2)               |      handle_mm_fault() {
         2)               |        handle_pte_fault() {
         2)               |          __do_fault() {
         2)               |            filemap_fault() {
         2)               |              find_get_page() {
         2)   0.033 us    |                __rcu_read_lock();
         2)   0.035 us    |                __rcu_read_unlock();
         2)   1.696 us    |              }
         2)   0.031 us    |              __might_sleep();
         2)   2.831 us    |            }
         2)               |            _raw_spin_lock() {
         2)   0.046 us    |              add_preempt_count();
         2)   0.841 us    |            }
         2)   0.033 us    |            page_add_file_rmap();
         2)               |            _raw_spin_unlock() {
         2)   0.057 us    |              sub_preempt_count();
         2)   0.568 us    |            }
         2)               |            unlock_page() {
         2)   0.084 us    |              page_waitqueue();
         2)   0.126 us    |              __wake_up_bit();
         2)   1.117 us    |            }
         2)   7.729 us    |          }
         2)   8.397 us    |        }
         2)   8.956 us    |      }
         2)   0.085 us    |      up_read();
         2) + 12.745 us   |    }
         2) + 13.401 us   |  }
        ...
      
        # echo handle_mm_fault > set_graph_notrace
        # perf ftrace live true
         1)               |  do_page_fault() {
         1)               |    __do_page_fault() {
         1)   0.205 us    |      down_read_trylock();
         1)   0.041 us    |      __might_sleep();
         1)   0.344 us    |      find_vma();
         1)   0.069 us    |      up_read();
         1)   4.692 us    |    }
         1)   5.311 us    |  }
        ...
      
      Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      29ad23b0
  9. 20 Jun, 2013 1 commit
    • S
      tracing: Disable tracing on warning · de7edd31
      Steven Rostedt (Red Hat) committed
      Add a traceoff_on_warning option in both the kernel command line as well
      as a sysctl option. When set, any WARN*() function that is hit will cause
      the tracing_on variable to be cleared, which disables writing to the
      ring buffer.
      
      This is useful especially when tracing a bug with function tracing. When
      a warning is hit, the print caused by the warning can flood the trace with
      the functions that produce the output for the warning. This can make the
      resulting trace useless by either hiding where the bug happened, or worse,
      by overflowing the buffer and losing the trace of the bug totally.
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      de7edd31
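The behaviour described in this commit can be sketched as a tiny user-space model. Everything here is hypothetical: `struct trace_model` and the `model_` functions are made up for illustration, and only the names `traceoff_on_warning` and `tracing_on` come from the commit message. The sketch shows that once a warning fires with the option set, later events no longer reach the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state involved; not the kernel's data layout. */
struct trace_model {
    bool traceoff_on_warning; /* the option (command line or sysctl) */
    bool tracing_on;          /* gates writes to the ring buffer */
    size_t entries;           /* number of events recorded so far */
};

/* Record one event, but only while tracing is on. */
static void model_trace_event(struct trace_model *t)
{
    assert(t != NULL);
    if (t->tracing_on)
        t->entries++;
}

/* Stands in for the WARN*() path: clears tracing_on if the option is set. */
static void model_warn(struct trace_model *t)
{
    assert(t != NULL);
    if (t->traceoff_on_warning)
        t->tracing_on = false;
}
```

With the option set, the events leading up to the warning stay in the buffer and the output generated by the warning itself is not recorded on top of them.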
  10. 12 Jun, 2013 1 commit
  11. 10 May, 2013 1 commit
    • M
      ftrace, kprobes: Fix a deadlock on ftrace_regex_lock · f04f24fb
      Masami Hiramatsu committed
      Fix a deadlock on ftrace_regex_lock which happens when setting
      an enable_event trigger on a dynamic kprobe event as below.
      
      ----
      sh-2.05b# echo p vfs_symlink > kprobe_events
      sh-2.05b# echo vfs_symlink:enable_event:kprobes:p_vfs_symlink_0 > set_ftrace_filter
      
      =============================================
      [ INFO: possible recursive locking detected ]
      3.9.0+ #35 Not tainted
      ---------------------------------------------
      sh/72 is trying to acquire lock:
       (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810ba6c1>] ftrace_set_hash+0x81/0x1f0
      
      but task is already holding lock:
       (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810b7cbd>] ftrace_regex_write.isra.29.part.30+0x3d/0x220
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(ftrace_regex_lock);
        lock(ftrace_regex_lock);
      
       *** DEADLOCK ***
      ----
      
      To fix that, this introduces a finer-grained regex_lock for each
      ftrace_ops. ftrace_regex_lock is too coarse: it protects all
      filter/notrace_hash operations, but it no longer needs to be a global
      lock now that multiple ftrace_ops are supported, because each
      ftrace_ops has its own filter/notrace_hash.
      
      Link: http://lkml.kernel.org/r/20130509054417.30398.84254.stgit@mhiramat-M0-7522
      
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Tom Zanussi <tom.zanussi@intel.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      [ Added initialization flag and automate mutex initialization for
        non ftrace.c ftrace_probes. ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f04f24fb
  12. 13 Apr, 2013 2 commits
  13. 09 Apr, 2013 1 commit
    • S
      ftrace: Do not call stub functions in control loop · 395b97a3
      Steven Rostedt (Red Hat) committed
      The function tracing control loop used by perf spits out a warning
      if the called function is not a control function. This is because
      the control function references a per cpu allocated data structure
      on struct ftrace_ops that is not allocated for other types of
      functions.
      
      Commit 0a016409 "ftrace: Optimize the function tracer list loop"
      optimized all function tracing loops for the case of a single
      registered ops. Unfortunately, this allows for a slight race when
      tracing starts or ends, where the stub function might be called
      after the current registered ops is removed. In this case we get
      the following dump:
      
      root# perf stat -e ftrace:function sleep 1
      [   74.339105] WARNING: at include/linux/ftrace.h:209 ftrace_ops_control_func+0xde/0xf0()
      [   74.349522] Hardware name: PRIMERGY RX200 S6
      [   74.357149] Modules linked in: sg igb iTCO_wdt ptp pps_core iTCO_vendor_support i7core_edac dca lpc_ich i2c_i801 coretemp edac_core crc32c_intel mfd_core ghash_clmulni_intel dm_multipath acpi_power_meter pcspkr microcode vhost_net tun macvtap macvlan nfsd kvm_intel kvm auth_rpcgss nfs_acl lockd sunrpc uinput xfs libcrc32c sd_mod crc_t10dif sr_mod cdrom mgag200 i2c_algo_bit drm_kms_helper ttm qla2xxx mptsas ahci drm libahci scsi_transport_sas mptscsih libata scsi_transport_fc i2c_core mptbase scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
      [   74.446233] Pid: 1377, comm: perf Tainted: G        W    3.9.0-rc1 #1
      [   74.453458] Call Trace:
      [   74.456233]  [<ffffffff81062e3f>] warn_slowpath_common+0x7f/0xc0
      [   74.462997]  [<ffffffff810fbc60>] ? rcu_note_context_switch+0xa0/0xa0
      [   74.470272]  [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
      [   74.478117]  [<ffffffff81062e9a>] warn_slowpath_null+0x1a/0x20
      [   74.484681]  [<ffffffff81102ede>] ftrace_ops_control_func+0xde/0xf0
      [   74.491760]  [<ffffffff8162f400>] ftrace_call+0x5/0x2f
      [   74.497511]  [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
      [   74.503486]  [<ffffffff8162f400>] ? ftrace_call+0x5/0x2f
      [   74.509500]  [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
      [   74.516088]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
      [   74.522268]  [<ffffffff810fbc65>] ? synchronize_sched+0x5/0x50
      [   74.528837]  [<ffffffff811041a2>] ? __unregister_ftrace_function+0xa2/0x1a0
      [   74.536696]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
      [   74.542878]  [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
      [   74.548869]  [<ffffffff81105c67>] unregister_ftrace_function+0x27/0x50
      [   74.556243]  [<ffffffff8111eadf>] perf_ftrace_event_register+0x9f/0x140
      [   74.563709]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
      [   74.569887]  [<ffffffff8162402d>] ? mutex_lock+0x1d/0x50
      [   74.575898]  [<ffffffff8111e94e>] perf_trace_destroy+0x2e/0x50
      [   74.582505]  [<ffffffff81127ba9>] tp_perf_event_destroy+0x9/0x10
      [   74.589298]  [<ffffffff811295d0>] free_event+0x70/0x1a0
      [   74.595208]  [<ffffffff8112a579>] perf_event_release_kernel+0x69/0xa0
      [   74.602460]  [<ffffffff816254d5>] ? _cond_resched+0x5/0x40
      [   74.608667]  [<ffffffff8112a640>] put_event+0x90/0xc0
      [   74.614373]  [<ffffffff8112a740>] perf_release+0x10/0x20
      [   74.620367]  [<ffffffff811a3044>] __fput+0xf4/0x280
      [   74.625894]  [<ffffffff811a31de>] ____fput+0xe/0x10
      [   74.631387]  [<ffffffff81083697>] task_work_run+0xa7/0xe0
      [   74.637452]  [<ffffffff81014981>] do_notify_resume+0x71/0xb0
      [   74.643843]  [<ffffffff8162fa92>] int_signal+0x12/0x17
      
      To fix this a new ftrace_ops flag is added that denotes the ftrace_list_end
      ftrace_ops stub as just that, a stub. This flag is now checked in the
      control loop and the function is not called if the flag is set.
      
      Thanks to Jovi for not just reporting the bug, but also pointing out
      where the bug was in the code.
      
      Link: http://lkml.kernel.org/r/514A8855.7090402@redhat.com
      Link: http://lkml.kernel.org/r/1364377499-1900-15-git-send-email-jovi.zhangwei@huawei.com
      Tested-by: WANG Chao <chaowang@redhat.com>
      Reported-by: WANG Chao <chaowang@redhat.com>
      Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      395b97a3
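The fix described in this commit, a flag that marks the list-end stub so the control loop skips it, can be sketched in user space as follows. This is a simplified model under stated assumptions: the `MODEL_FL_*` flag values, the `struct model_ops` layout, and the use of a plain `int *` in place of per-cpu data are all hypothetical. The loop returns the number of warnings it would have raised instead of printing them.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FL_CONTROL (1u << 0) /* hypothetical flag values */
#define MODEL_FL_STUB    (1u << 1)

struct model_ops {
    void (*func)(unsigned long ip, struct model_ops *self);
    unsigned int flags;
    int *disabled;          /* stands in for per-cpu control data; NULL if absent */
    unsigned long hits;
    struct model_ops *next;
};

/* The list-end stub: does nothing and owns no control data. */
static void model_stub(unsigned long ip, struct model_ops *self)
{
    (void)ip;
    (void)self;
}

static struct model_ops model_list_end = {
    .func = model_stub,
    .flags = MODEL_FL_STUB,
};

/* A sample control callback that just counts its invocations. */
static void model_count(unsigned long ip, struct model_ops *self)
{
    (void)ip;
    self->hits++;
}

/* Walks the list; returns how many warnings would have been raised. */
static int model_control_loop(struct model_ops *head, unsigned long ip)
{
    int warnings = 0;

    assert(head != NULL);
    for (struct model_ops *op = head; op != NULL; op = op->next) {
        if (op->flags & MODEL_FL_STUB)
            continue;               /* the fix: never call the stub */
        if (!(op->flags & MODEL_FL_CONTROL) || op->disabled == NULL) {
            warnings++;             /* not a control function */
            continue;
        }
        if (*op->disabled == 0)
            op->func(ip, op);
    }
    return warnings;
}
```

If the stub check is removed from this model, walking a list that consists only of `model_list_end` yields one warning, which corresponds to the race window described above.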
  14. 15 Mar, 2013 1 commit
    • S
      ftrace: Clean up function probe methods · e67efb93
      Steven Rostedt (Red Hat) committed
      When a function probe is created, a "callback" method is called for
      each function that the probe is attached to. On release of the probe,
      each function entry calls the "free" method.
      
      First, "callback" is a confusing name and does not really match what
      it does. Callback sounds like it will be called when the probe
      triggers. But that's not the case. This is really an "init" function,
      so let's rename it as such.
      
      Secondly, both "init" and "free" do not pass enough information back
      to the handlers. Pass back the ops, ip and data for each time the
      method is called. We have the information, might as well use it.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e67efb93
  15. 22 Jan, 2013 1 commit
  16. 18 Dec, 2012 1 commit
  17. 31 Jul, 2012 4 commits
  18. 20 Jul, 2012 4 commits
    • S
      ftrace/x86: Add separate function to save regs · 08f6fba5
      Steven Rostedt committed
      Add a way to have different functions calling different trampolines.
      If a ftrace_ops wants regs saved on the return, then have only the
      functions with ops registered to save regs. Functions registered by
      other ops would not be affected, unless the functions overlap.
      
      If one ftrace_ops registered functions A, B and C and another ops
      registered functions to save regs on A and D, then only functions
      A and D would be saving regs. Function B and C would work as normal.
      Although A is registered by both ops: normal and saves regs; this is fine
      as saving the regs is needed to satisfy one of the ops that calls it
      but the regs are ignored by the other ops function.
      
      x86_64 implements the full regs saving, and i386 just passes a NULL
      for regs to satisfy the ftrace_ops passing. An arch must supply
      both the regs and ftrace_ops parameters, even if regs is just NULL.
      
      It is OK for an arch to pass NULL regs. All function trace users that
      require regs passing must add the flag FTRACE_OPS_FL_SAVE_REGS when
      registering the ftrace_ops. If the arch does not support saving regs
      then the ftrace_ops will fail to register. Setting the flag
      FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED instead prevents the
      ftrace_ops from failing to register. In this case, the handler may
      either check if regs is not NULL or check if
      ARCH_SUPPORTS_FTRACE_SAVE_REGS is set.
      If the arch supports passing regs it will set this macro and pass regs
      for ops that request them. All other archs will just pass NULL.
      
      Link: http://lkml.kernel.org/r/20120711195745.107705970@goodmis.org
      
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      08f6fba5
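The registration rules in this commit message can be summarized in a short user-space sketch. It is a model only: the `MODEL_FL_*` values, `struct model_ops`, and both helper functions are hypothetical names introduced here, and the `arch_saves_regs` parameter stands in for whether the arch sets ARCH_SUPPORTS_FTRACE_SAVE_REGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FL_SAVE_REGS              (1u << 0) /* hypothetical values */
#define MODEL_FL_SAVE_REGS_IF_SUPPORTED (1u << 1)

struct model_ops {
    unsigned int flags;
};

/* Returns 0 if the ops may be registered, -1 if registration fails. */
static int model_register(const struct model_ops *ops, bool arch_saves_regs)
{
    assert(ops != NULL);
    /* A hard requirement for regs fails on an arch that cannot save them. */
    if ((ops->flags & MODEL_FL_SAVE_REGS) && !arch_saves_regs)
        return -1;
    return 0;
}

/* True if the callback would receive non-NULL regs in this model. */
static bool model_callback_gets_regs(const struct model_ops *ops,
                                     bool arch_saves_regs)
{
    const unsigned int want =
        MODEL_FL_SAVE_REGS | MODEL_FL_SAVE_REGS_IF_SUPPORTED;

    assert(ops != NULL);
    return arch_saves_regs && (ops->flags & want) != 0;
}
```

A handler registered with the IF_SUPPORTED variant therefore always registers, but has to check regs for NULL before using it.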
    • S
      ftrace: Return pt_regs to function trace callback · a1e2e31d
      Steven Rostedt committed
      Pass the pt_regs to the function tracer callback as the 4th parameter.
      
      Later patches that implement regs passing for the architectures will require
      having the ftrace_ops set the SAVE_REGS flag, which will tell the arch
      to take the time to pass a full set of pt_regs to the ftrace_ops callback
      function. If the arch does not support it then it should pass NULL.
      
      If an arch can pass full regs, then it should define:
       ARCH_SUPPORTS_FTRACE_SAVE_REGS to 1
      
      Link: http://lkml.kernel.org/r/20120702201821.019966811@goodmis.org
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      a1e2e31d
    • S
      ftrace: Consolidate arch dependent functions with 'list' function · ccf3672d
      Steven Rostedt committed
      As the function tracer starts to get more features, the support for
      these features will spread out throughout the different architectures
      over time. These features boil down to what each arch does in the
      mcount trampoline (the ftrace_caller).

      Currently there are two features that are not the same throughout the
      archs.
      
       1) Support to stop function tracing before the callback
       2) passing of the ftrace ops
      
      Both of these require placing an indirect function to support the
      features if the mcount trampoline does not.
      
      On a side note, for all architectures, when more than one callback
      is registered to the function tracer, an intermediate 'list' function
      is called by the mcount trampoline to iterate through the callbacks
      that are registered.
      
      Instead of making a separate function for each of these features,
      and requiring several indirect calls, just use the single 'list' function
      as the intermediate, to handle all cases. If an arch does not support
      the 'stop function tracing' or the passing of ftrace ops, just force
      it to use the list function that will handle the features required.
      
      This makes the code cleaner and simpler and removes a lot of
       #ifdefs in the code.
      
      Link: http://lkml.kernel.org/r/20120612225424.495625483@goodmis.org
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ccf3672d
    • S
      ftrace: Pass ftrace_ops as third parameter to function trace callback · 2f5f6ad9
      Steven Rostedt committed
      Currently the function trace callback receives only the ip and parent_ip
      of the function that it traced. It would be more powerful to also pass
      the ops that registered the function. This allows the same function
      to act differently depending on what ftrace_ops registered it.
      
      Link: http://lkml.kernel.org/r/20120612225424.267254552@goodmis.org
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      2f5f6ad9
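The benefit of passing the ops as a third parameter can be shown with a minimal user-space sketch. The type and function names below (`model_func_t`, `struct model_ops`, `model_trace`) are hypothetical and only mirror the shape of the callback described in the commit. One shared callback keeps separate state per registering ops.

```c
#include <assert.h>
#include <stddef.h>

struct model_ops;

/* Callback shape after the change: ip, parent_ip, and the registering ops. */
typedef void (*model_func_t)(unsigned long ip, unsigned long parent_ip,
                             struct model_ops *op);

struct model_ops {
    model_func_t func;
    unsigned long count;    /* private state owned by this ops */
};

/* One callback shared by several ops; it updates whichever ops called it. */
static void model_count_calls(unsigned long ip, unsigned long parent_ip,
                              struct model_ops *op)
{
    (void)ip;
    (void)parent_ip;
    assert(op != NULL);
    op->count++;
}

/* Stands in for the trampoline invoking the callback of one ops. */
static void model_trace(struct model_ops *op, unsigned long ip,
                        unsigned long parent_ip)
{
    assert(op != NULL && op->func != NULL);
    op->func(ip, parent_ip, op);
}
```

Without the ops argument, the shared callback would have no way to tell which registration triggered it and would need one function per ops or global state.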
  19. 17 May, 2012 3 commits
  20. 08 May, 2012 1 commit
  21. 28 Apr, 2012 1 commit
    • S
      ftrace/x86: Have arch x86_64 use breakpoints instead of stop machine · 08d636b6
      Steven Rostedt committed
      This method changes x86 to add a breakpoint to the mcount locations
      instead of calling stop machine.
      
      Now that iret can be handled by NMIs, we perform the following to
      update code:
      
      1) Add a breakpoint to all locations that will be modified
      
      2) Sync all cores
      
      3) Update all locations to be either a nop or call (except breakpoint
         op)
      
      4) Sync all cores
      
      5) Replace the breakpoint with the new code.
      
      6) Sync all cores
      
      [
        Added updates that Masami suggested:
         Use unlikely(modifying_ftrace_code) in int3 trap to keep kprobes efficient.
         Don't use NOTIFY_* in ftrace handler in int3 as it is not a notifier.
      ]
      
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      08d636b6
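The six steps listed in this commit can be sketched against a plain byte buffer. This is a single-threaded illustration only, so it cannot demonstrate the cross-CPU safety that motivates the scheme. `model_patch` and `model_sync_cores` are hypothetical names, and the sync helper is an empty placeholder. The 5-byte instruction size and the 0xcc breakpoint opcode are the usual x86 values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_INSN_SIZE 5     /* size of an x86 call rel32 or 5-byte nop */
#define MODEL_INT3      0xcc  /* x86 breakpoint opcode */

/* Placeholder: a real implementation would synchronize all CPUs here. */
static void model_sync_cores(void)
{
}

/* Rewrites one 5-byte patch site following the order of the steps above. */
static void model_patch(unsigned char *site, const unsigned char *new_insn)
{
    assert(site != NULL && new_insn != NULL);

    site[0] = MODEL_INT3;                          /* 1) add breakpoint */
    model_sync_cores();                            /* 2) sync all cores */
    memcpy(site + 1, new_insn + 1,
           MODEL_INSN_SIZE - 1);                   /* 3) update the tail */
    model_sync_cores();                            /* 4) sync all cores */
    site[0] = new_insn[0];                         /* 5) replace breakpoint */
    model_sync_cores();                            /* 6) sync all cores */
}
```

The point of the ordering is that the first byte is a breakpoint for the whole time the remaining four bytes are inconsistent, so another CPU reaching the site traps instead of executing a half-written instruction.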
  22. 22 Feb, 2012 2 commits
    • J
      ftrace, perf: Add filter support for function trace event · 5500fa51
      Jiri Olsa committed
      Add support for filtering the function trace event via the perf
      interface. It is now possible to use the filter interface
      in the perf tool like:
      
        perf record -e ftrace:function --filter="(ip == mm_*)" ls
      
      The filter syntax is restricted to the 'ip' field only,
      and the following operators are accepted: '==', '!=' and '||', ending
      up with filter strings like:
      
        ip == f1[, ]f2 ... || ip != f3[, ]f4 ...
      
      with comma ',' or space ' ' as a function separator. If the
      space ' ' is used as a separator, the right side of the
      operator needs to be enclosed in double quotes '"', e.g.:
      
        perf record -e ftrace:function --filter '(ip == do_execve,sys_*,ext*)' ls
        perf record -e ftrace:function --filter '(ip == "do_execve,sys_*,ext*")' ls
        perf record -e ftrace:function --filter '(ip == "do_execve sys_* ext*")' ls
      
      The '==' operator adds a trace filter with the same effect as one
      added via the set_ftrace_filter file.

      The '!=' operator adds a trace filter with the same effect as one
      added via the set_ftrace_notrace file.

      The right side of the '!=' and '==' operators is a list of functions
      or regexps to be added to the filter, separated by comma or space.
      
      The '||' operator is used for connecting multiple filter definitions
      together. It is possible to have more than one '==' and '!='
      operators within one filter string.
      
      Link: http://lkml.kernel.org/r/1329317514-8131-8-git-send-email-jolsa@redhat.com
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      5500fa51
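The clause structure of the filter string can be illustrated with a small user-space sketch. It is not the kernel's parser: `model_parse_filter` and `struct model_filter_counts` are hypothetical, and the sketch only splits the string on '||' and classifies each clause by its first '==' or '!=' operator. It does not validate the 'ip' field or expand the function list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How many clauses of each kind a filter string contains. */
struct model_filter_counts {
    int filter;   /* "ip == ..." clauses (set_ftrace_filter-like) */
    int notrace;  /* "ip != ..." clauses (set_ftrace_notrace-like) */
    int bad;      /* clauses without a recognized operator */
};

static struct model_filter_counts model_parse_filter(const char *s)
{
    struct model_filter_counts c = { 0, 0, 0 };

    assert(s != NULL);
    while (*s != '\0') {
        /* A clause runs up to the next "||" or the end of the string. */
        const char *end = strstr(s, "||");
        size_t len = end ? (size_t)(end - s) : strlen(s);
        int kind = 0;   /* 0 = none, 1 = "==", 2 = "!=" */

        for (size_t i = 0; i + 1 < len && kind == 0; i++) {
            if (s[i] == '=' && s[i + 1] == '=')
                kind = 1;
            else if (s[i] == '!' && s[i + 1] == '=')
                kind = 2;
        }
        if (kind == 1)
            c.filter++;
        else if (kind == 2)
            c.notrace++;
        else
            c.bad++;

        s += len;
        if (end)
            s += 2;     /* skip the "||" separator */
    }
    return c;
}
```

For a string such as `ip == f1,f2 || ip != f3` this model reports one filter clause and one notrace clause, matching the two files the commit message maps the operators to.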
    • J
      ftrace: Add enable/disable ftrace_ops control interface · e248491a
      Jiri Olsa committed
      Add a way to temporarily enable/disable ftrace_ops. The change
      follows the same approach used for 'global' ftrace_ops.

      Introduce 2 global ftrace_ops - control_ops and ftrace_control_list
      which take over all ftrace_ops registered with the FTRACE_OPS_FL_CONTROL
      flag. In addition, a new per-cpu flag called 'disabled' is added to
      ftrace_ops to provide the control information for each cpu.
      
      When ftrace_ops with FTRACE_OPS_FL_CONTROL is registered, it is
      set as disabled for all cpus.
      
      The ftrace_control_list contains all the registered 'control' ftrace_ops.
      The control_ops provides function which iterates ftrace_control_list
      and does the check for 'disabled' flag on current cpu.
      
      Adding 3 inline functions:
        ftrace_function_local_disable/ftrace_function_local_enable
        - enable/disable the ftrace_ops on current cpu
        ftrace_function_local_disabled
        - get the ftrace_ops::disabled value for the current cpu
      
      Link: http://lkml.kernel.org/r/1329317514-8131-2-git-send-email-jolsa@redhat.com
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      e248491a
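The per-cpu enable/disable interface can be modeled in user space with a fixed-size array. This is a sketch under assumptions: the array of ints replaces real per-cpu storage, `MODEL_NR_CPUS` is arbitrary, the cpu is passed explicitly instead of being the current cpu, and treating 'disabled' as a counter that enable decrements and disable increments is an assumption not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4   /* arbitrary size for the model */

struct model_ops {
    int disabled[MODEL_NR_CPUS];  /* stands in for the per-cpu 'disabled' */
    unsigned long hits;
};

/* Registration: a control ops starts out disabled on every cpu. */
static void model_register_control(struct model_ops *ops)
{
    assert(ops != NULL);
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        ops->disabled[cpu] = 1;
    ops->hits = 0;
}

static void model_local_enable(struct model_ops *ops, int cpu)
{
    assert(ops != NULL && cpu >= 0 && cpu < MODEL_NR_CPUS);
    ops->disabled[cpu]--;
}

static void model_local_disable(struct model_ops *ops, int cpu)
{
    assert(ops != NULL && cpu >= 0 && cpu < MODEL_NR_CPUS);
    ops->disabled[cpu]++;
}

static bool model_local_disabled(const struct model_ops *ops, int cpu)
{
    assert(ops != NULL && cpu >= 0 && cpu < MODEL_NR_CPUS);
    return ops->disabled[cpu] != 0;
}

/* Stands in for control_ops: call the ops only where it is enabled. */
static void model_control_call(struct model_ops *ops, int cpu)
{
    if (!model_local_disabled(ops, cpu))
        ops->hits++;
}
```

Because the state is kept per cpu, enabling the ops on one cpu leaves it disabled everywhere else, which is what lets a user such as perf trace only where its event is active.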
  23. 03 Feb, 2012 1 commit
  24. 08 Jan, 2012 1 commit
  25. 21 Dec, 2011 4 commits