1. 15 June 2011, 13 commits
    • tracing/kprobes: Fix kprobe-tracer to support stack trace · 1fd8df2c
      Committed by Masami Hiramatsu
      Fix the kprobe-tracer to support kernel stack traces correctly.
      Since the execution path of kprobe-based dynamic events is different
      from that of other tracepoint-based events, the normal ftrace_trace_stack()
      doesn't work correctly. To fix that, this introduces ftrace_trace_stack_regs(),
      which traces the stack via pt_regs instead of the current stack register.
      
      e.g.
      
       # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events
       # echo 1 > /sys/kernel/debug/tracing/options/stacktrace
       # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
       # head -n 20 /sys/kernel/debug/tracing/trace
                  bash-2968  [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca)
                  bash-2968  [000] 10297.050247: <stack trace>
       => schedule_timeout
       => n_tty_read
       => tty_read
       => vfs_read
       => sys_read
       => system_call_fastpath
           kworker/0:1-2940  [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca)
           kworker/0:1-2940  [000] 10297.050266: <stack trace>
       => worker_thread
       => kthread
       => kernel_thread_helper
                  sshd-1132  [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca)
                  sshd-1132  [000] 10297.050365: <stack trace>
       => sysret_careful
      
      Note: Even with this fix, the first entry will be skipped
      if the probe is put on the function entry area before
      the frame pointer is set up (usually the first 4 bytes,
      push %bp; mov %sp,%bp, on x86), because the stack unwinder
      depends on the frame pointer.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Set __GFP_NORETRY flag for ring buffer allocating process · d7ec4bfe
      Committed by Vaibhav Nagarnaik
      The tracing ring buffer is allocated from kernel memory. While
      allocating a large chunk of memory, an OOM condition might be triggered,
      destabilizing the system, and random processes might get killed during
      the allocation.
      
      This patch adds the __GFP_NORETRY flag to the ring buffer allocation
      calls so that they fail gracefully if the system cannot complete the
      allocation request.
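      A minimal sketch of the allocation pattern, assuming a simplified call
      site (the function name below is illustrative, not the exact
      ring_buffer.c code):

        #include <linux/gfp.h>
        #include <linux/slab.h>

        /* Illustrative sketch: __GFP_NORETRY tells the allocator to give up
         * and return NULL rather than retrying aggressively, so a large
         * buffer request fails with -ENOMEM instead of pushing the system
         * into the OOM killer. */
        static void *alloc_buffer_chunk(size_t size)
        {
                return kzalloc(size, GFP_KERNEL | __GFP_NORETRY);
        }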
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Link: http://lkml.kernel.org/r/1307491302-9236-1-git-send-email-vnagarnaik@google.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Convert to kstrtoul_from_user · 22fe9b54
      Committed by Peter Huewe
      This patch replaces the code for getting an unsigned long from a
      userspace buffer with a single call to kstrtoul_from_user().
      This makes the code easier to read and less error prone.
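      A hedged sketch of the resulting write-handler shape (the function and
      file are illustrative; kstrtoul_from_user() is the real helper):

        #include <linux/fs.h>
        #include <linux/kernel.h>       /* kstrtoul_from_user() */

        /* Illustrative sketch: the copy_from_user() + strict_strtoul()
         * sequence becomes a single call that copies and parses the
         * user buffer. */
        static ssize_t example_write(struct file *filp, const char __user *ubuf,
                                     size_t cnt, loff_t *ppos)
        {
                unsigned long val;
                int ret;

                ret = kstrtoul_from_user(ubuf, cnt, 10, &val);  /* base 10 */
                if (ret)
                        return ret;

                /* ... use val ... */
                *ppos += cnt;
                return cnt;
        }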
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Link: http://lkml.kernel.org/r/1307476707-14762-1-git-send-email-peterhuewe@gmx.de
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing, function_graph: Add context-info support for function_graph tracer · 749230b0
      Committed by Jiri Olsa
      The function_graph tracer does not follow the global context-info option.
      Add a TRACE_ITER_CONTEXT_INFO trace_flags check to honor it; a hedged
      sketch of the check follows the example output below.
      
      With following commands:
      	# echo function_graph > ./current_tracer
      	# echo 0 > options/context-info
      	# cat trace
      
      This is what it looked like before:
      # tracer: function_graph
      #
      #     TIME        CPU  DURATION                  FUNCTION CALLS
      #      |          |     |   |                     |   |   |   |
       1)   0.079 us    |          } /* __vma_link_rb */
       1)   0.056 us    |          copy_page_range();
       1)               |          security_vm_enough_memory() {
      ...
      
      This is what it looks like now:
      # tracer: function_graph
      #
        } /* update_ts_time_stats */
        timekeeping_max_deferment();
      ...
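      A hedged sketch of where the added check sits (the surrounding function
      is simplified; TRACE_ITER_CONTEXT_INFO and trace_flags are the names used
      by the tracer core):

        /* Illustrative sketch: skip the per-entry context columns (task, PID,
         * CPU, timestamp) whenever the global context-info option is off. */
        static int print_graph_context(struct trace_iterator *iter)
        {
                if (!(trace_flags & TRACE_ITER_CONTEXT_INFO))
                        return 0;       /* option off: print nothing */

                /* ... print comm, pid, cpu and timestamp for this entry ... */
                return 1;
        }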
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1307113131-10045-6-git-send-email-jolsa@redhat.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing, function_graph: Remove lock-depth from latency trace · 199abfab
      Committed by Jiri Olsa
      The lock_depth field was removed in commit
      e6e1e259 ("tracing: Remove lock_depth from event entry").

      Remove the lock_depth info from the function_graph latency header as well.
      
      With following commands:
      	# echo function_graph > ./current_tracer
      	# echo 1 > options/latency-format
      	# cat trace
      
      This is what it looked like before:
      # tracer: function_graph
      #
      # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+
      # --------------------------------------------------------------------
      # latency: 0 us, #59756/311298, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
      #    -----------------
      #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
      #    -----------------
      #
      #      _-----=> irqs-off
      #     / _----=> need-resched
      #    | / _---=> hardirq/softirq
      #    || / _--=> preempt-depth
      #    ||| / _-=> lock-depth
      #    |||| /
      # CPU|||||  DURATION                  FUNCTION CALLS
      # |  |||||   |   |                     |   |   |   |
       0)  ....  0.068 us    |    } /* __rcu_read_unlock */
      ...
      
      This is what it looks like now:
      # tracer: function_graph
      #
      # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+
      # --------------------------------------------------------------------
      # latency: 0 us, #59747/1744610, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
      #    -----------------
      #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
      #    -----------------
      #
      #      _-----=> irqs-off
      #     / _----=> need-resched
      #    | / _---=> hardirq/softirq
      #    || / _--=> preempt-depth
      #    ||| /
      # CPU||||  DURATION                  FUNCTION CALLS
      # |  ||||   |   |                     |   |   |   |
       0)  ..s.  1.641 us    |  } /* __rcu_process_callbacks */
      ...
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1307113131-10045-5-git-send-email-jolsa@redhat.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing, function: Fix trace header to follow context-info option · f56e7f8e
      Committed by Jiri Olsa
      The header display of the function tracer does not follow
      the context-info option, so field names are displayed even
      if this option is off.

      Add a check for the TRACE_ITER_CONTEXT_INFO trace_flags bit.
      
      With following commands:
      	# echo function > ./current_tracer
      	# echo 0 > options/context-info
      	# cat trace
      
      This is what it looked like before:
      # tracer: function
      #
      #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
      #              | |       |          |         |
      add_preempt_count <-schedule
      rcu_note_context_switch <-schedule
      ...
      
      This is what it looks like now:
      # tracer: function
      #
      _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel
      ...
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1307113131-10045-4-git-send-email-jolsa@redhat.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing, function_graph: Merge overhead and duration display functions · ffeb80fc
      Committed by Jiri Olsa
      The functions print_graph_overhead() and print_graph_duration() display
      data for a single field: DURATION.

      I merged them into a single function, print_graph_duration(),
      and added a way to display the empty parts of the field.

      This way the print_graph_irq() function can use this column to display
      the IRQ signs if needed, while the DURATION field details stay inside
      the print_graph_duration() function.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1307113131-10045-3-git-send-email-jolsa@redhat.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing, function_graph: Remove dependency of abstime and duration fields on latency · 321e68b0
      Committed by Jiri Olsa
      The display of absolute time and duration fields is based on the
      latency field. This was added during the irqsoff/wakeup tracers
      graph support changes.
      
      It's causing confusion about which fields will be displayed for the
      function_graph tracer itself. So I'm removing this dependency, and
      adding absolute time and duration fields to the preemptirqsoff,
      preemptoff, irqsoff and wakeup tracers.
      
      With following commands:
      	# echo function_graph > ./current_tracer
      	# cat trace
      
      This is what it looked like before:
      # tracer: function_graph
      #
      #     TIME        CPU  DURATION                  FUNCTION CALLS
      #      |          |     |   |                     |   |   |   |
       0)   0.068 us    |          } /* page_add_file_rmap */
       0)               |          _raw_spin_unlock() {
      ...
      
      This is what it looks like now:
      # tracer: function_graph
      #
      # CPU  DURATION                  FUNCTION CALLS
      # |     |   |                     |   |   |   |
       0)   0.068 us    |                } /* add_preempt_count */
       0)   0.993 us    |              } /* vfsmount_lock_local_lock */
      ...
      
      For preemptirqsoff preemptoff irqsoff wakeup tracers,
      this is what it looked like before:
      SNIP
      #                       _-----=> irqs-off
      #                      / _----=> need-resched
      #                     | / _---=> hardirq/softirq
      #                     || / _--=> preempt-depth
      #                     ||| / _-=> lock-depth
      #                     |||| /
      # CPU  TASK/PID       |||||  DURATION                  FUNCTION CALLS
      # |     |    |        |||||   |   |                     |   |   |   |
       1)    <idle>-0    |  d..1  0.000 us    |  acpi_idle_enter_simple();
      ...
      
      This is what it looks like now:
      SNIP
      #
      #                                       _-----=> irqs-off
      #                                      / _----=> need-resched
      #                                     | / _---=> hardirq/softirq
      #                                     || / _--=> preempt-depth
      #                                     ||| /
      #     TIME        CPU  TASK/PID       ||||  DURATION                  FUNCTION CALLS
      #      |          |     |    |        ||||   |   |                     |   |   |   |
         19.847735 |   1)    <idle>-0    |  d..1  0.000 us    |  acpi_idle_enter_simple();
      ...
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1307113131-10045-2-git-send-email-jolsa@redhat.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Fixed an include coding style issue · bd38c0e6
      Committed by Paul McQuade
      Removed <asm/ftrace.h> because <linux/ftrace.h> was already included.
      Fixed the brace placement of struct definitions to follow the coding style.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul McQuade <tungstentide@gmail.com>
      Link: http://lkml.kernel.org/r/4DE59711.3090900@gmail.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add disable_on_free option · cf30cf67
      Committed by Steven Rostedt
      Add a trace option to disable tracing on free. When this option is
      set, a write into the free_buffer file will not only shrink the
      ring buffer down to zero, but it will also disable tracing.
      
      Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add a proc file to stop tracing and free buffer · 4f271a2a
      Committed by Vaibhav Nagarnaik
      The proc file entry buffer_size_kb is used to set the size of the tracing
      buffer, and the memory used to expand the buffer is kernel memory. Consider
      a use case where tracing is handled by a user space utility that acts
      as a gatekeeper for tracing requests. In an OOM condition, tracing is
      considered a low-priority task, and if the utility gets killed the ring
      buffer memory cannot be released back to the kernel.
      
      This patch adds a proc file called "free_buffer" whose purpose is to
      stop tracing and free up the ring buffer when it is closed.
      
      The user space process can then set the desired size in the buffer_size_kb
      file and hold an open fd to the "free_buffer" file. Under an OOM condition,
      if the process gets killed, the kernel closes the file descriptor, and the
      release handler stops tracing and releases the kernel memory
      automatically.
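      A hedged user-space sketch of that flow (the file paths are the tracing
      debugfs entries named above; open modes and error handling are simplified):

        #include <fcntl.h>
        #include <unistd.h>

        /* Illustrative sketch: the gatekeeper expands the buffer, then holds
         * an open fd on free_buffer. If it is OOM-killed, the kernel closes
         * the fd and the release handler stops tracing and frees the ring
         * buffer memory. */
        int main(void)
        {
                const char *size_path = "/sys/kernel/debug/tracing/buffer_size_kb";
                const char *free_path = "/sys/kernel/debug/tracing/free_buffer";
                int fd, guard;

                fd = open(size_path, O_WRONLY);
                write(fd, "65536", 5);          /* expand the per-cpu buffers */
                close(fd);

                guard = open(free_path, O_WRONLY); /* held for the process lifetime */

                /* ... drive tracing from here; exiting (or being killed)
                 * closes 'guard', which shrinks the buffer and releases
                 * the memory ... */
                pause();
                return 0;
        }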
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Link: http://lkml.kernel.org/r/1308012717-11148-1-git-send-email-vnagarnaik@google.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Use NUMA allocation for per-cpu ring buffer pages · 7ea59064
      Committed by Vaibhav Nagarnaik
      The tracing ring buffer is a group of per-cpu ring buffers where
      allocation and logging is done on a per-cpu basis. The events that are
      generated on a particular CPU are logged in the corresponding buffer.
      This is to provide wait-free writes between CPUs and good NUMA node
      locality while accessing the ring buffer.
      
      However, the allocation routines consider NUMA locality only for buffer
      page metadata and not for the actual buffer page. This causes the pages
      to be allocated on the NUMA node local to the CPU where the allocation
      routine is running at the time.
      
      This patch fixes the problem by using a NUMA node specific allocation
      routine so that the pages are allocated from a NUMA node local to the
      logging CPU.
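      A hedged sketch of the node-aware page allocation (simplified; the real
      change is in the ring buffer's per-cpu page allocation path):

        #include <linux/gfp.h>
        #include <linux/mm.h>
        #include <linux/topology.h>

        /* Illustrative sketch: allocate the data page on the NUMA node local
         * to the CPU that will log into it, rather than on the node of
         * whichever CPU happens to run the allocation routine. */
        static void *alloc_cpu_buffer_page(int cpu)
        {
                struct page *page;

                page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
                if (!page)
                        return NULL;

                return page_address(page);
        }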
      
      I tested with the getuid_microbench from autotest. It is a simple binary
      that calls getuid() in a loop and measures the average time for the
      syscall to complete. The following command was used to test:
      $ getuid_microbench 1000000
      
      Comparing the numbers on kernels with and without this patch shows
      that logging latency decreases by 30-50 ns/call:
      tracing with non-NUMA allocation - 569 ns/call
      tracing with NUMA allocation     - 512 ns/call
      Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Schedule a delayed work to call wakeup() · e7e2ee89
      Committed by Vaibhav Nagarnaik
      When syscall tracing is used by concurrent processes, the wakeup() that is
      called in the event commit function causes contention on the spin lock
      of the waitqueue. I enabled the sys_enter_getuid and sys_exit_getuid
      tracepoints, and by running getuid_microbench from autotest in parallel
      I found that the contention causes an exponential latency increase in the
      tracing path.
      
      The autotest binary getuid_microbench calls getuid() in a tight loop for
      the given number of iterations and measures the average time required to
      complete a single invocation of syscall.
      
      The patch schedules a delayed work 2 ms after an event commit asks to
      wake up the trace wait_queue. This removes the delay caused by
      contention on the wakeup() spin lock and amortizes the wakeup() calls
      scheduled over the 2 ms period.
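      A hedged sketch of the deferral pattern (names are illustrative; the real
      change lives in the trace event commit path):

        #include <linux/jiffies.h>
        #include <linux/wait.h>
        #include <linux/workqueue.h>

        /* Illustrative sketch: instead of waking readers (and contending on
         * the waitqueue spinlock) on every event commit, queue one delayed
         * work item that issues a single wakeup ~2 ms later. Commits arriving
         * inside that window are amortized into the same wakeup. */
        static DECLARE_WAIT_QUEUE_HEAD(trace_wait);

        static void wakeup_readers(struct work_struct *work)
        {
                wake_up_all(&trace_wait);
        }
        static DECLARE_DELAYED_WORK(wakeup_work, wakeup_readers);

        static void trace_wakeup_deferred(void)
        {
                /* does nothing if the work is already pending */
                schedule_delayed_work(&wakeup_work, msecs_to_jiffies(2));
        }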
      
      In the following example, the script enables the sys_enter_getuid and
      sys_exit_getuid tracepoints and runs the getuid_microbench in parallel
      with the given number of processes. The output clearly shows the latency
      increase caused by contentions.
      
      $ ~/getuid.sh 1
      1000000 calls in 0.720974253 s (720.974253 ns/call)
      
      $ ~/getuid.sh 2
      1000000 calls in 1.166457554 s (1166.457554 ns/call)
      1000000 calls in 1.168933765 s (1168.933765 ns/call)
      
      $ ~/getuid.sh 3
      1000000 calls in 1.783827516 s (1783.827516 ns/call)
      1000000 calls in 1.795553270 s (1795.553270 ns/call)
      1000000 calls in 1.796493376 s (1796.493376 ns/call)
      
      $ ~/getuid.sh 4
      1000000 calls in 4.483041796 s (4483.041796 ns/call)
      1000000 calls in 4.484165388 s (4484.165388 ns/call)
      1000000 calls in 4.484850762 s (4484.850762 ns/call)
      1000000 calls in 4.485643576 s (4485.643576 ns/call)
      
      $ ~/getuid.sh 5
      1000000 calls in 6.497521653 s (6497.521653 ns/call)
      1000000 calls in 6.502000236 s (6502.000236 ns/call)
      1000000 calls in 6.501709115 s (6501.709115 ns/call)
      1000000 calls in 6.502124100 s (6502.124100 ns/call)
      1000000 calls in 6.502936358 s (6502.936358 ns/call)
      
      After the patch, the latencies scale better.
      1000000 calls in 0.728720455 s (728.720455 ns/call)
      
      1000000 calls in 0.842782857 s (842.782857 ns/call)
      1000000 calls in 0.883803135 s (883.803135 ns/call)
      
      1000000 calls in 0.902077764 s (902.077764 ns/call)
      1000000 calls in 0.902838202 s (902.838202 ns/call)
      1000000 calls in 0.908896885 s (908.896885 ns/call)
      
      1000000 calls in 0.932523515 s (932.523515 ns/call)
      1000000 calls in 0.958009672 s (958.009672 ns/call)
      1000000 calls in 0.986188020 s (986.188020 ns/call)
      1000000 calls in 0.989771102 s (989.771102 ns/call)
      
      1000000 calls in 0.933518391 s (933.518391 ns/call)
      1000000 calls in 0.958897947 s (958.897947 ns/call)
      1000000 calls in 1.031038897 s (1031.038897 ns/call)
      1000000 calls in 1.089516025 s (1089.516025 ns/call)
      1000000 calls in 1.141998347 s (1141.998347 ns/call)
      Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Rubin <mrubin@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 09 June 2011, 1 commit
    • tracing: Fix regression in printk_formats file · db5e7ecc
      Committed by Steven Rostedt
      The fix for the printk_formats of modules broke the
      printk_formats of trace_printks in the kernel.

      The update of what to show via the seq_file was only done
      if the passed-in fmt was NULL, which happens only on the first
      iteration. The result was that the first format was shown every time
      instead of iterating through the available formats.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 08 June 2011, 2 commits
  4. 07 June 2011, 1 commit
  5. 26 May 2011, 5 commits
    • ftrace: Add internal recursive checks · b1cff0ad
      Committed by Steven Rostedt
      Witold reported a reboot caused by the selftests of the dynamic function
      tracer. He sent me a config and I used ktest to do a config_bisect on it
      (as my config did not cause the crash). It pointed out that the problem
      config was CONFIG_PROVE_RCU.
      
      What happened was that if multiple callbacks are attached to the
      function tracer, we iterate a list of callbacks. Because the list is
      managed by synchronize_sched() and preempt_disable, the access to the
      pointers uses rcu_dereference_raw().
      
      When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
      debugging functions, which happen to be traced. The tracing of the debug
      function would then call rcu_dereference_raw() which would then call the
      debug function and then... well you get the idea.
      
      I first wrote two different patches to solve this bug.
      
      1) add a __rcu_dereference_raw() that would not do any checks.
      2) add notrace to the offending debug functions.
      
      Both of these patches worked.
      
      Talking with Paul McKenney on IRC, he suggested adding recursion
      detection instead. This seemed to be a better solution, so I decided to
      implement it. As the task_struct already has a trace_recursion field to
      detect recursion in the ring buffer, and it only allows a very small
      number of levels, I decided to use that same variable to add flags that
      detect recursion inside the function tracer infrastructure (a hedged
      sketch follows below).
      
      I plan to change it so that the task struct bit can be checked in
      mcount, but as that requires changes to all archs, I will hold that off
      to the next merge window.
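      A hedged sketch of the guard (the bit value and function are
      illustrative; trace_recursion is the real task_struct field mentioned
      above):

        #include <linux/preempt.h>
        #include <linux/sched.h>

        #define TRACE_INTERNAL_BIT      (1 << 11)       /* illustrative bit */

        /* Illustrative sketch: a per-task recursion guard around the callback
         * that walks the list of registered ftrace_ops. If anything called
         * from here (such as the PROVE_RCU debug checks) re-enters the
         * function tracer, the nested call sees the bit already set and
         * returns immediately instead of recursing forever. */
        static void ftrace_list_func_guarded(unsigned long ip, unsigned long parent_ip)
        {
                if (current->trace_recursion & TRACE_INTERNAL_BIT)
                        return;

                current->trace_recursion |= TRACE_INTERNAL_BIT;
                preempt_disable_notrace();
                /* ... iterate the registered ftrace_ops and call their handlers ... */
                preempt_enable_notrace();
                current->trace_recursion &= ~TRACE_INTERNAL_BIT;
        }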
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
      Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine · 2fc1b6f0
      Committed by liubo
      Filesystems like Btrfs have some "ULL" macros, and when these macros are
      passed to a tracepoint's __print_symbolic(), 64->32 truncation warnings are
      produced when compiling on a 32-bit box.
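      A hedged fragment of how the new helper might be used inside a
      TP_printk() (the event fields and flag values here are made up):

        /* Illustrative fragment of a TRACE_EVENT()'s TP_printk(): because the
         * helper takes a u64, 64-bit "ULL" flag macros can be decoded on
         * 32-bit builds without the 64->32 truncation warning that plain
         * __print_symbolic() produced. */
        TP_printk("root = %llu, op = %s",
                  __entry->root_objectid,
                  __print_symbolic_u64(__entry->op,
                                       { 1ULL, "CREATE" },
                                       { 2ULL, "DELETE" }))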
      Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/4DACE6E0.7000507@cn.fujitsu.com
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Set ops->flag to enabled even on static function tracing · 3b6cfdb1
      Committed by Steven Rostedt
      When dynamic ftrace is not configured, the ops->flags still needs
      to have its FTRACE_OPS_FL_ENABLED bit set in ftrace_startup().
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Have event with function tracer check error return · 17bb615a
      Committed by Steven Rostedt
      The self tests for the event tracer do not check whether function
      tracing was successfully activated. They need to do so before continuing
      the tests, otherwise the wrong errors may be reported.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Have ftrace_startup() return failure code · a1cd6173
      Committed by Steven Rostedt
      register_ftrace_function() returns an error code on failure,
      except when the call to ftrace_startup() fails. Add an error return to
      ftrace_startup() if it fails to start, allowing register_ftrace_function()
      to return a proper error value.
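      With the change, a caller can trust the return value; a hedged sketch
      (my_ops is an assumed struct ftrace_ops defined elsewhere):

        /* Illustrative sketch: the error from ftrace_startup() now propagates
         * out of register_ftrace_function(), so callers can check it. */
        static int __init my_tracer_init(void)
        {
                int ret = register_ftrace_function(&my_ops);

                if (ret)
                        pr_warn("could not enable function tracing: %d\n", ret);
                return ret;
        }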
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 19 May 2011, 13 commits
    • ftrace: Add self-tests for multiple function trace users · 95950c2e
      Committed by Steven Rostedt
      Add some basic sanity tests for multiple users of the function
      tracer at startup.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Modify ftrace_set_filter/notrace to take ops · 936e074b
      Committed by Steven Rostedt
      Since users of the function tracer can now pick and choose which
      functions they want to trace agnostically from other users of the
      function tracer, we need to pass the ops struct to the ftrace_set_filter()
      functions.
      
      The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
      are added to keep the old filter functionality, which is used to modify
      the generic function tracers.
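      A hedged sketch of the resulting call shapes (the filtered function name
      is just an example, my_ops is an assumed struct ftrace_ops, and the
      argument details are a recollection of this era's API, not a quote of it):

        #include <linux/ftrace.h>
        #include <linux/string.h>

        /* Illustrative sketch: each ftrace_ops now carries its own filter
         * hashes, so the ops is passed explicitly; the old "filter the
         * generic tracers" behavior is reached through the *_global_*
         * variants. */
        static void __init setup_filters(void)
        {
                static char buf[] = "vfs_read";

                /* restrict only my_ops' callback to vfs_read() */
                ftrace_set_filter(&my_ops, buf, strlen(buf), 1);

                /* the old semantics, now spelled explicitly */
                ftrace_set_global_filter(buf, strlen(buf), 1);
        }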
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Committed by Steven Rostedt
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Implement separate user function filtering · b848914c
      Committed by Steven Rostedt
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has its own hash of the functions it wants to trace
      and a hash of those it does not want to trace. An empty hash for
      the functions to trace denotes that all functions not in the
      notrace hash should be traced.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Free hash with call_rcu_sched() · 07fd5515
      Committed by Steven Rostedt
      When a hash is modified and might be in use, we need to perform
      an RCU-sched operation (call_rcu_sched()) on it, as the hashes will soon
      be used directly in the function tracer callback.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Have global_ops store the functions that are to be traced · 2b499381
      Committed by Steven Rostedt
      This is a step towards each ops structure defining its own set
      of functions to trace. As the current code for pids and the like
      is specific to the global_ops, it is restructured to be used
      with the global ops.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Add ops parameter to ftrace_startup/shutdown functions · bd69c30b
      Committed by Steven Rostedt
      In order to allow different ops to enable different functions,
      the ftrace_startup() and ftrace_shutdown() functions need the
      ops parameter passed to them.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Add enabled_functions file · 647bcd03
      Committed by Steven Rostedt
      Add the enabled_functions file that is used to show all the
      functions that have been enabled for tracing as well as their
      ref counts. This helps to see whether any function has been registered
      and which functions are being traced.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Use counters to enable functions to trace · ed926f9b
      Committed by Steven Rostedt
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSB of the flags (actually just
      the top 32bits). The rest of the bits (30 bits) are now used as
      a ref counter. Every time a tracer registers functions to trace,
      those functions will have their counters incremented.
      
      When tracing is enabled, to determine if a function should be traced,
      the counter is examined, and if it is non-zero it is set to trace.
      
      When a ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Separate hash allocation and assignment · 33dc9b12
      Committed by Steven Rostedt
      When filtering, allocate a hash to insert the function records.
      After the filtering is complete, assign it to the ftrace_ops structure.
      
      This allows the ftrace_ops structure to have a much smaller array of
      hash buckets instead of wasting a lot of memory.
      
      A read only empty_hash is created to be the minimum size that any ftrace_ops
      can point to.
      
      When a new hash is created, it has the following steps:
      
      o Allocate a default hash.
      o Walk the function records assigning the filtered records to the hash
      o Allocate a new hash with the appropriate size buckets
      o Move the entries from the default hash to the new hash.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Create a global_ops to hold the filter and notrace hashes · f45948e8
      Committed by Steven Rostedt
      Combine the filter and notrace hashes to be accessed by a single entity,
      the global_ops. The global_ops is a ftrace_ops structure that is passed
      to different functions that can read or modify the filtering of the
      function tracer.
      
      The ftrace_ops structure was modified to hold a filter and notrace
      hashes so that later patches may allow each ftrace_ops to have its own
      set of rules to what functions may be filtered.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Use hash instead for FTRACE_FL_FILTER · 1cf41dd7
      Committed by Steven Rostedt
      When multiple users are allowed to have their own set of functions
      to trace, having the FTRACE_FL_FILTER flag will not be enough to
      handle the accounting of those users. Each user will need their own
      set of functions.
      
      Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
      temporary until the rest of the function filtering accounting
      gets in.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions · b448c4e3
      Committed by Steven Rostedt
      To prepare for the accounting system that will allow multiple users of
      the function tracer, having the FTRACE_FL_NOTRACE as a flag in the
      dyn_trace record does not make sense.
      
      All ftrace_ops will soon have a hash of functions they should trace
      and not trace. Making a global hash of functions not to trace makes
      this transition easier.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  7. 07 May 2011, 1 commit
  8. 30 April 2011, 4 commits