1. 04 3月, 2010 1 次提交
  2. 27 2月, 2010 1 次提交
    • S
      ftrace: Add function names to dangling } in function graph tracer · f1c7f517
      Steven Rostedt 提交于
      The function graph tracer is currently the most invasive tracer
      in the ftrace family. It can easily overflow the buffer even with
      10megs per CPU. This means that events can often be lost.
      
      On start up, or after events are lost, if the function return is
      recorded but the function enter was lost, all we get to see is the
      exiting '}'.
      
      Here is how a typical trace output starts:
      
       [tracing] cat trace
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
        0) + 91.897 us   |                  }
        0) ! 567.961 us  |                }
        0)   <========== |
        0) ! 579.083 us  |                _raw_spin_lock_irqsave();
        0)   4.694 us    |                _raw_spin_unlock_irqrestore();
        0) ! 594.862 us  |              }
        0) ! 603.361 us  |            }
        0) ! 613.574 us  |          }
        0) ! 623.554 us  |        }
        0)   3.653 us    |        fget_light();
        0)               |        sock_poll() {
      
      There are a series of '}' with no matching "func() {". There's no information
      to what functions these ending brackets belong to.
      
      This patch adds a stack on the per cpu structure used in outputting
      the function graph tracer to keep track of what function was outputted.
      Then on a function exit event, it checks the depth to see if the
      function exit has a matching entry event. If it does, then it only
      prints the '}', otherwise it adds the function name after the '}'.
      
      This allows function exit events to show what function they belong to
      at trace output startup, when the entry was lost due to ring buffer
      overflow, or even after a new task is scheduled in.
      
      Here is what the above trace will look like after this patch:
      
       [tracing] cat trace
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
        0) + 91.897 us   |                  } (irq_exit)
        0) ! 567.961 us  |                } (smp_apic_timer_interrupt)
        0)   <========== |
        0) ! 579.083 us  |                _raw_spin_lock_irqsave();
        0)   4.694 us    |                _raw_spin_unlock_irqrestore();
        0) ! 594.862 us  |              } (add_wait_queue)
        0) ! 603.361 us  |            } (__pollwait)
        0) ! 613.574 us  |          } (tcp_poll)
        0) ! 623.554 us  |        } (sock_poll)
        0)   3.653 us    |        fget_light();
        0)               |        sock_poll() {
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f1c7f517
  3. 25 2月, 2010 6 次提交
  4. 17 2月, 2010 2 次提交
  5. 14 2月, 2010 1 次提交
  6. 12 2月, 2010 1 次提交
  7. 10 2月, 2010 1 次提交
    • S
      tracing: Add correct/incorrect to sort keys for branch annotation output · ede55c9d
      Steven Rostedt 提交于
      The branch annotation is a bit difficult to see the worst offenders
      because it only sorts by percentage:
      
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0      163 100 qdisc_restart                  sch_generic.c        179
             0      163 100 pfifo_fast_dequeue             sch_generic.c        447
             0        4 100 pskb_trim_rcsum                skbuff.h             1689
             0        4 100 llc_rcv                        llc_input.c          170
             0       18 100 psmouse_interrupt              psmouse-base.c       304
             0        3 100 atkbd_interrupt                atkbd.c              389
             0        5 100 usb_alloc_dev                  usb.c                437
             0       11 100 vsscanf                        vsprintf.c           1897
             0        2 100 IS_ERR                         err.h                34
             0       23 100 __rmqueue_fallback             page_alloc.c         865
             0        4 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 142
             0        3 100 move_masked_irq                migration.c          11
      
      Adding the incorrect and correct values as sort keys makes this file a
      bit more informative:
      
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0   366541 100 audit_syscall_entry            auditsc.c            1637
             0   366538 100 audit_syscall_exit             auditsc.c            1685
             0   115839 100 sched_info_switch              sched_stats.h        269
             0    74567 100 sched_info_queued              sched_stats.h        222
             0    66578 100 sched_info_dequeued            sched_stats.h        177
             0    15113 100 trace_workqueue_insertion      workqueue.h          38
             0    15107 100 trace_workqueue_execution      workqueue.h          45
             0     3622 100 syscall_trace_leave            ptrace.c             1772
             0     2750 100 sched_move_task                sched.c              10100
             0     2750 100 sched_move_task                sched.c              10110
             0     1815 100 pre_schedule_rt                sched_rt.c           1462
             0      837 100 audit_alloc                    auditsc.c            879
             0      814 100 tcp_mss_split_point            tcp_output.c         1302
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ede55c9d
  8. 04 2月, 2010 2 次提交
    • M
      ftrace: Remove record freezing · f24bb999
      Masami Hiramatsu 提交于
      Remove record freezing. Because kprobes never puts probe on
      ftrace's mcount call anymore, it doesn't need ftrace to check
      whether kprobes on it.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100202214925.4694.73469.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f24bb999
    • M
      ftrace/alternatives: Introducing *_text_reserved functions · 2cfa1978
      Masami Hiramatsu 提交于
      Introducing *_text_reserved functions for checking the text
      address range is partially reserved or not. This patch provides
      checking routines for x86 smp alternatives and dynamic ftrace.
      Since both functions modify fixed pieces of kernel text, they
      should reserve and protect those from other dynamic text
      modifier, like kprobes.
      
      This will also be extended when introducing other subsystems
      which modify fixed pieces of kernel text. Dynamic text modifiers
      should avoid those.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: przemyslaw@pawelczyk.it
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <20100202214911.4694.16587.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2cfa1978
  9. 02 2月, 2010 1 次提交
    • L
      tracing: Fix circular dead lock in stack trace · 4f48f8b7
      Lai Jiangshan 提交于
      When we cat <debugfs>/tracing/stack_trace, we may cause circular lock:
      sys_read()
        t_start()
           arch_spin_lock(&max_stack_lock);
      
        t_show()
           seq_printf(), vsnprintf() .... /* they are all trace-able,
             when they are traced, max_stack_lock may be required again. */
      
      The following script can trigger this circular dead lock very easy:
      #!/bin/bash
      
      echo 1 > /proc/sys/kernel/stack_tracer_enabled
      
      mount -t debugfs xxx /mnt > /dev/null 2>&1
      
      (
      # make check_stack() zealous to require max_stack_lock
      for ((; ;))
      {
      	echo 1 > /mnt/tracing/stack_max_size
      }
      ) &
      
      for ((; ;))
      {
      	cat /mnt/tracing/stack_trace > /dev/null
      }
      
      To fix this bug, we increase the percpu trace_active before
      require the lock.
      Reported-by: NLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B67D4F9.9080905@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      4f48f8b7
  10. 29 1月, 2010 3 次提交
  11. 27 1月, 2010 3 次提交
    • M
      tracing/documentation: Cover new frame pointer semantics · 03688970
      Mike Frysinger 提交于
      Update the graph tracer examples to cover the new frame pointer semantics
      (in terms of passing it along).  Move the HAVE_FUNCTION_GRAPH_FP_TEST docs
      out of the Kconfig, into the right place, and expand on the details.
      Signed-off-by: NMike Frysinger <vapier@gentoo.org>
      LKML-Reference: <1264165967-18938-1-git-send-email-vapier@gentoo.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      03688970
    • S
      ring-buffer: Check for end of page in iterator · 3c05d748
      Steven Rostedt 提交于
      If the iterator comes to an empty page for some reason, or if
      the page is emptied by a consuming read. The iterator code currently
      does not check if the iterator is pass the contents, and may
      return a false entry.
      
      This patch adds a check to the ring buffer iterator to test if the
      current page has been completely read and sets the iterator to the
      next page if necessary.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      3c05d748
    • S
      ring-buffer: Check if ring buffer iterator has stale data · 492a74f4
      Steven Rostedt 提交于
      Usually reads of the ring buffer is performed by a single task.
      There are two types of reads from the ring buffer.
      
      One is a consuming read which will consume the entry that was read
      and the next read will be the entry that follows.
      
      The other is an iterator that will let the user read the contents of
      the ring buffer without modifying it. When an iterator is allocated,
      writes to the ring buffer are disabled to protect the iterator.
      
      The problem exists when consuming reads happen while an iterator is
      allocated. Specifically, the kind of read that swaps out an entire
      page (used by splice) and replaces it with a new read. If the iterator
      is on the page that is swapped out, then the next read may read
      from this swapped out page and return garbage.
      
      This patch adds a check when reading the iterator to make sure that
      the iterator contents are still valid. If a consuming read has taken
      place, the iterator is reset.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      492a74f4
  12. 26 1月, 2010 1 次提交
    • S
      tracing: Prevent kernel oops with corrupted buffer · 74bf4076
      Steven Rostedt 提交于
      If the contents of the ftrace ring buffer gets corrupted and the trace
      file is read, it could create a kernel oops (usualy just killing the user
      task thread). This is caused by the checking of the pid in the buffer.
      If the pid is negative, it still references the cmdline cache array,
      which could point to an invalid address.
      
      The simple fix is to test for negative PIDs.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      74bf4076
  13. 17 1月, 2010 2 次提交
    • M
      tracing/kprobe: Update kprobe tracing self test for new syntax · 231e36f4
      Masami Hiramatsu 提交于
      Update kprobe tracing self test for new syntax (it supports
      deleting individual probes, and drops $argN support)
      and behavior change (new probes are disabled in default).
      
      This selftest includes the following checks:
      
       - Adding function-entry probe and return probe with arguments.
       - Enabling these probes.
       - Deleting it individually.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100114051211.7814.29436.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      231e36f4
    • F
      tracing: Drop the tr check from the graph tracing path · 24a53652
      Frederic Weisbecker 提交于
      Each time we save a function entry from the function graph
      tracer, we check if the trace array is set, which is wasteful
      because it is set anyway before we start the tracer. All we need
      is to ensure we have good read and write orderings. When we set
      the trace array, we just need to guarantee it to be visible
      before starting tracing.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <1263453795-7496-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      24a53652
  14. 15 1月, 2010 6 次提交
  15. 13 1月, 2010 1 次提交
    • M
      tracing/kprobe: Drop function argument access syntax · 14640106
      Masami Hiramatsu 提交于
      Drop function argument access syntax, because the function
      arguments depend on not only architecture but also
      compile-options and function API. And now, we have perf-probe
      for finding register/memory assigned to each argument.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: linuxppc-dev@ozlabs.org
      LKML-Reference: <20100105224648.19431.52309.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      14640106
  16. 07 1月, 2010 8 次提交
    • S
      ring-buffer: Add rb_list_head() wrapper around new reader page next field · 0e1ff5d7
      Steven Rostedt 提交于
      If the very unlikely case happens where the writer moves the head by one
      between where the head page is read and where the new reader page
      is assigned _and_ the writer then writes and wraps the entire ring buffer
      so that the head page is back to what was originally read as the head page,
      the page to be swapped will have a corrupted next pointer.
      
      Simple solution is to wrap the assignment of the next pointer with a
      rb_list_head().
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0e1ff5d7
    • D
      ring-buffer: Wrap a list.next reference with rb_list_head() · 5ded3dc6
      David Sharp 提交于
      This reference at the end of rb_get_reader_page() was causing off-by-one
      writes to the prev pointer of the page after the reader page when that
      page is the head page, and therefore the reader page has the RB_PAGE_HEAD
      flag in its list.next pointer. This eventually results in a GPF in a
      subsequent call to rb_set_head_page() (usually from rb_get_reader_page())
      when that prev pointer is dereferenced. The dereferenced register would
      characteristically have an address that appears shifted left by one byte
      (eg, ffxxxxxxxxxxxxyy instead of ffffxxxxxxxxxxxx) due to being written at
      an address one byte too high.
      Signed-off-by: NDavid Sharp <dhsharp@google.com>
      LKML-Reference: <1262826727-9090-1-git-send-email-dhsharp@google.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5ded3dc6
    • S
      tracing: Add stack dump to trace_printk if stacktrace option is set · d931369b
      Steven Rostedt 提交于
      If the ftrace stacktrace option is set, then add the stack dumps to
      trace_printk.
      Requested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d931369b
    • L
      tracing: Consolidate protection of reader access to the ring buffer · 7e53bd42
      Lai Jiangshan 提交于
      At the beginning, access to the ring buffer was fully serialized
      by trace_types_lock. Patch d7350c3f gives more freedom to readers,
      and patch b04cc6b1 adds code to protect trace_pipe and cpu#/trace_pipe.
      
      But actually it is not enough, ring buffer readers are not always
      read-only, they may consume data.
      
      This patch makes accesses to trace, trace_pipe, trace_pipe_raw
      cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw serialized.
      And removes tracing_reader_cpumask which is used to protect trace_pipe.
      
      Details:
      
      Ring buffer serializes readers, but it is low level protection.
      The validity of the events (which returns by ring_buffer_peek() ..etc)
      are not protected by ring buffer.
      
      The content of events may become garbage if we allow another process to consume
      these events concurrently:
        A) the page of the consumed events may become a normal page
           (not reader page) in ring buffer, and this page will be rewritten
           by the events producer.
        B) The page of the consumed events may become a page for splice_read,
           and this page will be returned to system.
      
      This patch adds trace_access_lock() and trace_access_unlock() primitives.
      
      These primitives allow multi process access to different cpu ring buffers
      concurrently.
      
      These primitives don't distinguish read-only and read-consume access.
      Multi read-only access is also serialized.
      
      And we don't use these primitives when we open files,
      we only use them when we read files.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      7e53bd42
    • L
      tracing: Remove show_format and related macros from TRACE_EVENT · 0fa0edaf
      Lai Jiangshan 提交于
      The previous patches added the use of print_fmt string and changes
      the trace_define_field() function to also create the fields and
      format output for the event format files.
      
         text	   data	    bss	    dec	    hex	filename
      5857201	1355780	9336808	16549789	 fc879d	vmlinux
      5884589	1351684	9337896	16574169	 fce6d9	vmlinux-orig
      
      The above shows the size of the vmlinux after this patch set
      compared to the vmlinux-orig which is before the patch set.
      
      This saves us 27k on text, 1k on bss and adds just 4k of data.
      
      The total savings of 24k in size.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B273D4D.40604@cn.fujitsu.com>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      0fa0edaf
    • L
      tracing: Use defined fields and print_fmt to print formats · 5a65e956
      Lai Jiangshan 提交于
      The calls ftrace_format_##call() and ftrace_define_fields_##call()
      are almost duplicate in functionality. With the addition of the
      print_fmt in previous patches, these two functions can be merged
      into one.
      
      The trace_define_field() defines the fields and links them into
      the struct ftrace_event_call. The previous patches introduced
      the print_fmt field and this can now be used with the trace_define_field()
      to create the event format file fields and print_fmt field.
      
      The struct ftrace_event_call->fields are used to print the fields
      The struct ftrace_event_call->print_fmt is used to print
      the "print fmt: XXXXXXXXXXX" line.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B273D49.5000006@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      5a65e956
    • S
      tracing: Have syscall tracing call its own init function · c7ef3a90
      Steven Rostedt 提交于
      In the clean up of having all events call one specific function,
      the syscall event init was changed to call this helper function.
      
      With the new print_fmt updates, the syscalls need to do special
      initializations. This patch converts the syscall events to call
      its own init function again.
      
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c7ef3a90
    • L
      tracing/kprobes: Init print_fmt for kprobe events · a342a028
      Lai Jiangshan 提交于
      This is part of a patch set that removes the show_format method
      in the ftrace event macros.
      
      Add the print_fmt initialization to the kprobe events.
      The print_fmt is still not used, but will be in the follow up
      patches.
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B273D45.3080100@cn.fujitsu.com>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a342a028