1. 13 10月, 2016 1 次提交
    • L
      Disable the __builtin_return_address() warning globally after all · ef6000b4
      Linus Torvalds 提交于
      This affectively reverts commit 377ccbb4 ("Makefile: Mute warning
      for __builtin_return_address(>0) for tracing only") because it turns out
      that it really isn't tracing only - it's all over the tree.
      
      We already also had the warning disabled separately for mm/usercopy.c
      (which this commit also removes), and it turns out that we will also
      want to disable it for get_lock_parent_ip(), that is used for at least
      TRACE_IRQFLAGS.  Which (when enabled) ends up being all over the tree.
      
      Steven Rostedt had a patch that tried to limit it to just the config
      options that actually triggered this, but quite frankly, the extra
      complexity and abstraction just isn't worth it.  We have never actually
      had a case where the warning is actually useful, so let's just disable
      it globally and not worry about it.
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef6000b4
  2. 03 9月, 2016 1 次提交
    • S
      tracing: Added hardware latency tracer · e7c15cd8
      Steven Rostedt (Red Hat) 提交于
      The hardware latency tracer has been in the PREEMPT_RT patch for some time.
      It is used to detect possible SMIs or any other hardware interruptions that
      the kernel is unaware of. Note, NMIs may also be detected, but that may be
      good to note as well.
      
      The logic is pretty simple. It simply creates a thread that spins on a
      single CPU for a specified amount of time (width) within a periodic window
      (window). These numbers may be adjusted by their cooresponding names in
      
         /sys/kernel/tracing/hwlat_detector/
      
      The defaults are window = 1000000 us (1 second)
                       width  =  500000 us (1/2 second)
      
      The loop consists of:
      
      	t1 = trace_clock_local();
      	t2 = trace_clock_local();
      
      Where trace_clock_local() is a variant of sched_clock().
      
      The difference of t2 - t1 is recorded as the "inner" timestamp and also the
      timestamp  t1 - prev_t2 is recorded as the "outer" timestamp. If either of
      these differences are greater than the time denoted in
      /sys/kernel/tracing/tracing_thresh then it records the event.
      
      When this tracer is started, and tracing_thresh is zero, it changes to the
      default threshold of 10 us.
      
      The hwlat tracer in the PREEMPT_RT patch was originally written by
      Jon Masters. I have modified it quite a bit and turned it into a
      tracer.
      Based-on-code-by: NJon Masters <jcm@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e7c15cd8
  3. 03 8月, 2016 1 次提交
    • S
      Makefile: Mute warning for __builtin_return_address(>0) for tracing only · 377ccbb4
      Steven Rostedt 提交于
      With the latest gcc compilers, they give a warning if
      __builtin_return_address() parameter is greater than 0. That is because if
      it is used by a function called by a top level function (or in the case of
      the kernel, by assembly), it can try to access stack frames outside the
      stack and crash the system.
      
      The tracing system uses __builtin_return_address() of up to 2! But it is
      well aware of the dangers that it may have, and has even added precautions
      to protect against it (see the thunk code in arch/x86/entry/thunk*.S)
      
      Linus originally added KBUILD_CFLAGS that would suppress the warning for the
      entire kernel, as simply adding KBUILD_CFLAGS to the tracing directory
      wouldn't work. The tracing directory plays a bit with the CFLAGS and
      requires a little more logic.
      
      This adds that special logic to only suppress the warning for the tracing
      directory. If it is used anywhere else outside of tracing, the warning will
      still be triggered.
      
      Link: http://lkml.kernel.org/r/20160728223043.51996267@grimm.local.homeTested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      377ccbb4
  4. 20 4月, 2016 2 次提交
    • T
      tracing: Add 'hist' event trigger command · 7ef224d1
      Tom Zanussi 提交于
      'hist' triggers allow users to continually aggregate trace events,
      which can then be viewed afterwards by simply reading a 'hist' file
      containing the aggregation in a human-readable format.
      
      The basic idea is very simple and boils down to a mechanism whereby
      trace events, rather than being exhaustively dumped in raw form and
      viewed directly, are automatically 'compressed' into meaningful tables
      completely defined by the user.
      
      This is done strictly via single-line command-line commands and
      without the aid of any kind of programming language or interpreter.
      
      A surprising number of typical use cases can be accomplished by users
      via this simple mechanism.  In fact, a large number of the tasks that
      users typically do using the more complicated script-based tracing
      tools, at least during the initial stages of an investigation, can be
      accomplished by simply specifying a set of keys and values to be used
      in the creation of a hash table.
      
      The Linux kernel trace event subsystem happens to provide an extensive
      list of keys and values ready-made for such a purpose in the form of
      the event format files associated with each trace event.  By simply
      consulting the format file for field names of interest and by plugging
      them into the hist trigger command, users can create an endless number
      of useful aggregations to help with investigating various properties
      of the system.  See Documentation/trace/events.txt for examples.
      
      hist triggers are implemented on top of the existing event trigger
      infrastructure, and as such are consistent with the existing triggers
      from a user's perspective as well.
      
      The basic syntax follows the existing trigger syntax.  Users start an
      aggregation by writing a 'hist' trigger to the event of interest's
      trigger file:
      
        # echo hist:keys=xxx [ if filter] > event/trigger
      
      Once a hist trigger has been set up, by default it continually
      aggregates every matching event into a hash table using the event key
      and a value field named 'hitcount'.
      
      To view the aggregation at any point in time, simply read the 'hist'
      file in the same directory as the 'trigger' file:
      
        # cat event/hist
      
      The detailed syntax provides additional options for user control, and
      is described exhaustively in Documentation/trace/events.txt and in the
      virtual tracing/README file in the tracing subsystem.
      
      Link: http://lkml.kernel.org/r/72d263b5e1853fe9c314953b65833c3aa75479f2.1457029949.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com>
      Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Reviewed-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7ef224d1
    • T
      tracing: Add lock-free tracing_map · 08d43a5f
      Tom Zanussi 提交于
      Add tracing_map, a special-purpose lock-free map for tracing.
      
      tracing_map is designed to aggregate or 'sum' one or more values
      associated with a specific object of type tracing_map_elt, which
      is associated by the map to a given key.
      
      It provides various hooks allowing per-tracer customization and is
      separated out into a separate file in order to allow it to be shared
      between multiple tracers, but isn't meant to be generally used outside
      of that context.
      
      The tracing_map implementation was inspired by lock-free map
      algorithms originated by Dr. Cliff Click:
      
       http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable
       http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf
      
      Link: http://lkml.kernel.org/r/b43d68d1add33582a396f553c8ef705a33a6a748.1449767187.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com>
      Tested-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Reviewed-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      08d43a5f
  5. 02 4月, 2015 2 次提交
    • I
      bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable · e1abf2cc
      Ingo Molnar 提交于
      So bpf_tracing.o depends on CONFIG_BPF_SYSCALL - but that's not its only
      dependency, it also depends on the tracing infrastructure and on kprobes,
      without which it will fail to build with:
      
        In file included from kernel/trace/bpf_trace.c:14:0:
        kernel/trace/trace.h: In function ‘trace_test_and_set_recursion’:
        kernel/trace/trace.h:491:28: error: ‘struct task_struct’ has no member named ‘trace_recursion’
          unsigned int val = current->trace_recursion;
        [...]
      
      It took quite some time to trigger this build failure, because right now
      BPF_SYSCALL is very obscure, depends on CONFIG_EXPERT. So also make BPF_SYSCALL
      more configurable, not just under CONFIG_EXPERT.
      
      If BPF_SYSCALL, tracing and kprobes are enabled then enable the bpf_tracing
      gateway as well.
      
      We might want to make this an interactive option later on, although
      I'd not complicate it unnecessarily: enabling BPF_SYSCALL is enough of
      an indicator that the user wants BPF support.
      
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e1abf2cc
    • A
      tracing, perf: Implement BPF programs attached to kprobes · 2541517c
      Alexei Starovoitov 提交于
      BPF programs, attached to kprobes, provide a safe way to execute
      user-defined BPF byte-code programs without being able to crash or
      hang the kernel in any way. The BPF engine makes sure that such
      programs have a finite execution time and that they cannot break
      out of their sandbox.
      
      The user interface is to attach to a kprobe via the perf syscall:
      
      	struct perf_event_attr attr = {
      		.type	= PERF_TYPE_TRACEPOINT,
      		.config	= event_id,
      		...
      	};
      
      	event_fd = perf_event_open(&attr,...);
      	ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
      
      'prog_fd' is a file descriptor associated with BPF program
      previously loaded.
      
      'event_id' is an ID of the kprobe created.
      
      Closing 'event_fd':
      
      	close(event_fd);
      
      ... automatically detaches BPF program from it.
      
      BPF programs can call in-kernel helper functions to:
      
        - lookup/update/delete elements in maps
      
        - probe_read - wraper of probe_kernel_read() used to access any
          kernel data structures
      
      BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is
      architecture dependent) and return 0 to ignore the event and 1 to store
      kprobe event into the ring buffer.
      
      Note, kprobes are a fundamentally _not_ a stable kernel ABI,
      so BPF programs attached to kprobes must be recompiled for
      every kernel version and user must supply correct LINUX_VERSION_CODE
      in attr.kern_version during bpf_prog_load() call.
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2541517c
  6. 29 1月, 2015 1 次提交
  7. 13 12月, 2014 1 次提交
  8. 20 11月, 2014 2 次提交
  9. 01 7月, 2014 1 次提交
  10. 30 5月, 2014 1 次提交
    • S
      tracing: Add tracepoint benchmark tracepoint · 81dc9f0e
      Steven Rostedt (Red Hat) 提交于
      In order to help benchmark the time tracepoints take, a new config
      option is added called CONFIG_TRACEPOINT_BENCHMARK. When this option
      is set a tracepoint is created called "benchmark:benchmark_event".
      When the tracepoint is enabled, it kicks off a kernel thread that
      goes into an infinite loop (calling cond_sched() to let other tasks
      run), and calls the tracepoint. Each iteration will record the time
      it took to write to the tracepoint and the next iteration that
      data will be passed to the tracepoint itself. That is, the tracepoint
      will report the time it took to do the previous tracepoint.
      The string written to the tracepoint is a static string of 128 bytes
      to keep the time the same. The initial string is simply a write of
      "START". The second string records the cold cache time of the first
      write which is not added to the rest of the calculations.
      
      As it is a tight loop, it benchmarks as hot cache. That's fine because
      we care most about hot paths that are probably in cache already.
      
      An example of the output:
      
           START
           first=3672 [COLD CACHED]
           last=632 first=3672 max=632 min=632 avg=316 std=446 std^2=199712
           last=278 first=3672 max=632 min=278 avg=303 std=316 std^2=100337
           last=277 first=3672 max=632 min=277 avg=296 std=258 std^2=67064
           last=273 first=3672 max=632 min=273 avg=292 std=224 std^2=50411
           last=273 first=3672 max=632 min=273 avg=288 std=200 std^2=40389
           last=281 first=3672 max=632 min=273 avg=287 std=183 std^2=33666
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      81dc9f0e
  11. 21 12月, 2013 1 次提交
    • T
      tracing: Add basic event trigger framework · 85f2b082
      Tom Zanussi 提交于
      Add a 'trigger' file for each trace event, enabling 'trace event
      triggers' to be set for trace events.
      
      'trace event triggers' are patterned after the existing 'ftrace
      function triggers' implementation except that triggers are written to
      per-event 'trigger' files instead of to a single file such as the
      'set_ftrace_filter' used for ftrace function triggers.
      
      The implementation is meant to be entirely separate from ftrace
      function triggers, in order to keep the respective implementations
      relatively simple and to allow them to diverge.
      
      The event trigger functionality is built on top of SOFT_DISABLE
      functionality.  It adds a TRIGGER_MODE bit to the ftrace_event_file
      flags which is checked when any trace event fires.  Triggers set for a
      particular event need to be checked regardless of whether that event
      is actually enabled or not - getting an event to fire even if it's not
      enabled is what's already implemented by SOFT_DISABLE mode, so trigger
      mode directly reuses that.  Event trigger essentially inherit the soft
      disable logic in __ftrace_event_enable_disable() while adding a bit of
      logic and trigger reference counting via tm_ref on top of that in a
      new trace_event_trigger_enable_disable() function.  Because the base
      __ftrace_event_enable_disable() code now needs to be invoked from
      outside trace_events.c, a wrapper is also added for those usages.
      
      The triggers for an event are actually invoked via a new function,
      event_triggers_call(), and code is also added to invoke them for
      ftrace_raw_event calls as well as syscall events.
      
      The main part of the patch creates a new trace_events_trigger.c file
      to contain the trace event triggers implementation.
      
      The standard open, read, and release file operations are implemented
      here.
      
      The open() implementation sets up for the various open modes of the
      'trigger' file.  It creates and attaches the trigger iterator and sets
      up the command parser.  If opened for reading set up the trigger
      seq_ops.
      
      The read() implementation parses the event trigger written to the
      'trigger' file, looks up the trigger command, and passes it along to
      that event_command's func() implementation for command-specific
      processing.
      
      The release() implementation does whatever cleanup is needed to
      release the 'trigger' file, like releasing the parser and trigger
      iterator, etc.
      
      A couple of functions for event command registration and
      unregistration are added, along with a list to add them to and a mutex
      to protect them, as well as an (initially empty) registration function
      to add the set of commands that will be added by future commits, and
      call to it from the trace event initialization code.
      
      also added are a couple trigger-specific data structures needed for
      these implementations such as a trigger iterator and a struct for
      trigger-specific data.
      
      A couple structs consisting mostly of function meant to be implemented
      in command-specific ways, event_command and event_trigger_ops, are
      used by the generic event trigger command implementations.  They're
      being put into trace.h alongside the other trace_event data structures
      and functions, in the expectation that they'll be needed in several
      trace_event-related files such as trace_events_trigger.c and
      trace_events.c.
      
      The event_command.func() function is meant to be called by the trigger
      parsing code in order to add a trigger instance to the corresponding
      event.  It essentially coordinates adding a live trigger instance to
      the event, and arming the triggering the event.
      
      Every event_command func() implementation essentially does the
      same thing for any command:
      
         - choose ops - use the value of param to choose either a number or
           count version of event_trigger_ops specific to the command
         - do the register or unregister of those ops
         - associate a filter, if specified, with the triggering event
      
      The reg() and unreg() ops allow command-specific implementations for
      event_trigger_op registration and unregistration, and the
      get_trigger_ops() op allows command-specific event_trigger_ops
      selection to be parameterized.  When a trigger instance is added, the
      reg() op essentially adds that trigger to the triggering event and
      arms it, while unreg() does the opposite.  The set_filter() function
      is used to associate a filter with the trigger - if the command
      doesn't specify a set_filter() implementation, the command will ignore
      filters.
      
      Each command has an associated trigger_type, which serves double duty,
      both as a unique identifier for the command as well as a value that
      can be used for setting a trigger mode bit during trigger invocation.
      
      The signature of func() adds a pointer to the event_command struct,
      used to invoke those functions, along with a command_data param that
      can be passed to the reg/unreg functions.  This allows func()
      implementations to use command-specific blobs and supports code
      re-use.
      
      The event_trigger_ops.func() command corrsponds to the trigger 'probe'
      function that gets called when the triggering event is actually
      invoked.  The other functions are used to list the trigger when
      needed, along with a couple mundane book-keeping functions.
      
      This also moves event_file_data() into trace.h so it can be used
      outside of trace_events.c.
      
      Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.comSigned-off-by: NTom Zanussi <tom.zanussi@linux.intel.com>
      Idea-by: NSteve Rostedt <rostedt@goodmis.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      85f2b082
  12. 14 9月, 2012 1 次提交
  13. 31 7月, 2012 1 次提交
  14. 07 5月, 2012 2 次提交
    • S
      tracing: Provide trace events interface for uprobes · f3f096cf
      Srikar Dronamraju 提交于
      Implements trace_event support for uprobes. In its current form
      it can be used to put probes at a specified offset in a file and
      dump the required registers when the code flow reaches the
      probed address.
      
      The following example shows how to dump the instruction pointer
      and %ax a register at the probed text address.  Here we are
      trying to probe zfree in /bin/zsh:
      
       # cd /sys/kernel/debug/tracing/
       # cat /proc/`pgrep  zsh`/maps | grep /bin/zsh | grep r-xp
       00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
       # objdump -T /bin/zsh | grep -w zfree
       0000000000446420 g    DF .text  0000000000000012  Base
       zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
       # cat uprobe_events
       p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420
       # echo 1 > events/uprobes/enable
       # sleep 20
       # echo 0 > events/uprobes/enable
       # cat trace
       # tracer: nop
       #
       #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
       #              | |       |          |         |
                    zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
                    zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f3f096cf
    • S
      tracing: Extract out common code for kprobes/uprobes trace events · 8ab83f56
      Srikar Dronamraju 提交于
      Move parts of trace_kprobe.c that can be shared with upcoming
      trace_uprobe.c. Common code to kernel/trace/trace_probe.h and
      kernel/trace/trace_probe.c. There are no functional changes.
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120409091144.8343.76218.sendpatchset@srdronam.in.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8ab83f56
  15. 12 4月, 2012 1 次提交
  16. 30 9月, 2011 1 次提交
  17. 28 9月, 2011 1 次提交
  18. 20 8月, 2011 1 次提交
  19. 08 1月, 2011 1 次提交
  20. 05 8月, 2010 1 次提交
  21. 20 7月, 2010 1 次提交
  22. 16 7月, 2010 1 次提交
    • F
      tracing: Remove ksym tracer · 5d550467
      Frederic Weisbecker 提交于
      The ksym (breakpoint) ftrace plugin has been superseded by perf
      tools that are much more poweful to use the cpu breakpoints.
      This tracer doesn't bring more feature. It has been deprecated
      for a while now, lets remove it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      5d550467
  23. 09 6月, 2010 2 次提交
  24. 26 3月, 2010 1 次提交
    • P
      x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra 提交于
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      
      Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
      was never used and ptrace-block-step can be implemented using a
      much simpler approach.
      
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      faa4602e
  25. 10 3月, 2010 1 次提交
  26. 28 12月, 2009 1 次提交
    • L
      perf events: Remove CONFIG_EVENT_PROFILE · 07b139c8
      Li Zefan 提交于
      Quoted from Ingo:
      
      | This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE -
      | it's an unnecessary Kconfig complication. If both PERF_EVENTS and
      | EVENT_TRACING is enabled we should expose generic tracepoints.
      |
      | Nor is it limited to event 'profiling', so it has become a misnomer as
      | well.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      07b139c8
  27. 04 11月, 2009 1 次提交
    • M
      tracing/kprobes: Rename Kprobe-tracer to kprobe-event · 77b44d1b
      Masami Hiramatsu 提交于
      Rename Kprobes-based event tracer to kprobes-based tracing event
      (kprobe-event), since it is not a tracer but an extensible
      tracing event interface.
      
      This also changes CONFIG_KPROBE_TRACER to CONFIG_KPROBE_EVENT
      and sets it y by default.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      LKML-Reference: <20091104001247.3454.14131.stgit@harusame>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      77b44d1b
  28. 19 9月, 2009 1 次提交
  29. 27 8月, 2009 2 次提交
    • M
      tracing: Add kprobe-based event tracer · 413d37d1
      Masami Hiramatsu 提交于
      Add kprobes-based event tracer on ftrace.
      
      This tracer is similar to the events tracer which is based on Tracepoint
      infrastructure. Instead of Tracepoint, this tracer is based on kprobes
      (kprobe and kretprobe). It probes anywhere where kprobes can probe(this
       means, all functions body except for __kprobes functions).
      
      Similar to the events tracer, this tracer doesn't need to be activated
      via current_tracer, instead of that, just set probe points via
      /sys/kernel/debug/tracing/kprobe_events. And you can set filters on each
      probe events via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.
      
      This tracer supports following probe arguments for each probe.
      
        %REG  : Fetch register REG
        sN    : Fetch Nth entry of stack (N >= 0)
        sa    : Fetch stack address.
        @ADDR : Fetch memory at ADDR (ADDR should be in kernel)
        @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
        aN    : Fetch function argument. (N >= 0)
        rv    : Fetch return value.
        ra    : Fetch return address.
        +|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address.
      
      See Documentation/trace/kprobetrace.txt in the next patch for details.
      
      Changes from v13:
       - Support 'sa' for stack address.
       - Use call->data instead of container_of() macro.
      
      [fweisbec@gmail.com: Fixed conflict against latest tracing/core]
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203510.31965.29123.stgit@localhost.localdomain>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      413d37d1
    • D
      net: Temporarily backout SKB sources tracer. · 31ffe249
      David S. Miller 提交于
      Steven Rostedt has suggested that Neil work with the tracing
      folks, trying to use TRACE_EVENT as the mechanism for
      implementation.  And if that doesn't workout we can investigate
      other solutions such as that one which was tried here.
      
      This reverts the following 2 commits:
      
      5a165657
      ("net: skb ftracer - Add config option to enable new ftracer (v3)")
      
      9ec04da7
      ("net: skb ftracer - Add actual ftrace code to kernel (v3)")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      31ffe249
  30. 14 8月, 2009 1 次提交
  31. 10 6月, 2009 1 次提交
    • L
      tracing/events: convert block trace points to TRACE_EVENT() · 55782138
      Li Zefan 提交于
      TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
      these new capabilities to this tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
        ...
      
      Cons:
      
        - no dev_t info for the output of plug, unplug_timer and unplug_io events.
          no dev_t info for getrq and sleeprq events if bio == NULL.
          no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
      
          This is mainly because we can't get the deivce from a request queue.
          But this may change in the future.
      
        - A packet command is converted to a string in TP_assign, not TP_print.
          While blktrace do the convertion just before output.
      
          Since pc requests should be rather rare, this is not a big issue.
      
        - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
          has a unique format, which means we have some unused data in a trace entry.
      
          The overhead is minimized by using __dynamic_array() instead of __array().
      
      I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:
      
            dd                   dd + ioctl blktrace       dd + TRACE_EVENT (splice)
      1     7.36s, 42.7 MB/s     7.50s, 42.0 MB/s          7.41s, 42.5 MB/s
      2     7.43s, 42.3 MB/s     7.48s, 42.1 MB/s          7.43s, 42.4 MB/s
      3     7.38s, 42.6 MB/s     7.45s, 42.2 MB/s          7.41s, 42.5 MB/s
      
      So the overhead of tracing is very small, and no regression when using
      those trace events vs blktrace.
      
      And the binary output of TRACE_EVENT is much smaller than blktrace:
      
       # ls -l -h
       -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
       -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
       -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out
      
      Following are some comparisons between TRACE_EVENT and blktrace:
      
      plug:
        kjournald-480   [000]   303.084981: block_plug: [kjournald]
        kjournald-480   [000]   303.084981:   8,0    P   N [kjournald]
      
      unplug_io:
        kblockd/0-118   [000]   300.052973: block_unplug_io: [kblockd/0] 1
        kblockd/0-118   [000]   300.052974:   8,0    U   N [kblockd/0] 1
      
      remap:
        kjournald-480   [000]   303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
        kjournald-480   [000]   303.085043:   8,0    A   W 102736992 + 8 <- (8,8) 33384
      
      bio_backmerge:
        kjournald-480   [000]   303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
        kjournald-480   [000]   303.085086:   8,0    M   W 102737032 + 8 [kjournald]
      
      getrq:
        kjournald-480   [000]   303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084975:   8,0    G   W 102736984 + 8 [kjournald]
      
        bash-2066  [001]  1072.953770:   8,0    G   N [bash]
        bash-2066  [001]  1072.953773: block_getrq: 0,0 N 0 + 0 [bash]
      
      rq_complete:
        konsole-2065  [001]   300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
        konsole-2065  [001]   300.053191:   8,0    C   W 103669040 + 16 [0]
      
        ksoftirqd/1-7   [001]  1072.953811:   8,0    C   N (5a 00 08 00 00 00 00 00 24 00) [0]
        ksoftirqd/1-7   [001]  1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]
      
      rq_insert:
        kjournald-480   [000]   303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
        kjournald-480   [000]   303.084986:   8,0    I   W 102736984 + 8 [kjournald]
      
      Changelog from v2 -> v3:
      
      - use the newly introduced __dynamic_array().
      
      Changelog from v1 -> v2:
      
      - use __string() instead of __array() to minimize the memory required
        to store hex dump of rq->cmd().
      
      - support large pc requests.
      
      - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
      
      - some cleanups.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      55782138
  32. 03 6月, 2009 1 次提交
  33. 06 5月, 2009 1 次提交
    • S
      ring-buffer: add benchmark and tester · 5092dbc9
      Steven Rostedt 提交于
      This patch adds code that can benchmark the ring buffer as well as
      test it. This code can be compiled into the kernel (not recommended)
      or as a module.
      
      A separate ring buffer is used to not interfer with other users, like
      ftrace. It creates a producer and a consumer (option to disable creation
      of the consumer) and will run for 10 seconds, then sleep for 10 seconds
      and then repeat.
      
      While running, the producer will write 10 byte loads into the ring
      buffer with just putting in the current CPU number. The reader will
      continually try to read the buffer. The reader will alternate from reading
      the buffer via event by event, or by full pages.
      
      The output is a pr_info, thus it will fill up the syslogs.
      
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9000349 (usecs)
        Overruns: 12578640
        Read:     5358440  (by events)
        Entries:  0
        Total:    17937080
        Missed:   0
        Hit:      17937080
        Entries per millisec: 1993
        501 ns per entry
        Sleeping for 10 secs
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9936350 (usecs)
        Overruns: 0
        Read:     28146644  (by pages)
        Entries:  74
        Total:    28146718
        Missed:   0
        Hit:      28146718
        Entries per millisec: 2832
        353 ns per entry
        Sleeping for 10 secs
      
      Time:      is the time the test ran
      Overruns:  the number of events that were overwritten and not read
      Read:      the number of events read (either by pages or events)
      Entries:   the number of entries left in the buffer
                       (the by pages will only read full pages)
      Total:     Entries + Read + Overruns
      Missed:    the number of entries that failed to write
      Hit:       the number of entries that were written
      
      The above example shows that it takes ~353 nanosecs per entry when
      there is a reader, reading by pages (and no overruns)
      
      The event by event reader slowed the producer down to 501 nanosecs.
      
      [ Impact: see how changes to the ring buffer affect stability and performance ]
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5092dbc9
  34. 15 4月, 2009 1 次提交