1. 16 12月, 2009 1 次提交
  2. 15 12月, 2009 4 次提交
  3. 14 12月, 2009 13 次提交
  4. 12 12月, 2009 1 次提交
    • S
      tracing: Add stack trace to irqsoff tracer · cc51a0fc
      Steven Rostedt 提交于
      The irqsoff and friends tracers help in finding causes of latency in the
      kernel. The also work with the function tracer to show what was happening
      when interrupts or preemption are disabled. But the function tracer has
      a bit of an overhead and can cause exagerated readings.
      
      Currently, when tracing with /proc/sys/kernel/ftrace_enabled = 0, where the
      function tracer is disabled, the information that is provided can end up
      being useless. For example, a 2 and a half millisecond latency only showed:
      
       # tracer: preemptirqsoff
       #
       # preemptirqsoff latency trace v1.1.5 on 2.6.32
       # --------------------------------------------------------------------
       # latency: 2463 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
       #    -----------------
       #    | task: -4242 (uid:0 nice:0 policy:0 rt_prio:0)
       #    -----------------
       #  => started at: _spin_lock_irqsave
       #  => ended at:   remove_wait_queue
       #
       #
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
       hackbenc-4242    2d....    0us!: trace_hardirqs_off <-_spin_lock_irqsave
       hackbenc-4242    2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
       hackbenc-4242    2...1. 2466us : trace_preempt_on <-remove_wait_queue
      
      The above lets us know that hackbench with pid 2463 grabbed a spin lock
      somewhere and enabled preemption at remove_wait_queue. This helps a little
      but where this actually happened is not informative.
      
      This patch adds the stack dump to the end of the irqsoff tracer. This provides
      the following output:
      
       hackbenc-4242    2d....    0us!: trace_hardirqs_off <-_spin_lock_irqsave
       hackbenc-4242    2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
       hackbenc-4242    2...1. 2466us : trace_preempt_on <-remove_wait_queue
       hackbenc-4242    2...1. 2467us : <stack trace>
        => sub_preempt_count
        => _spin_unlock_irqrestore
        => remove_wait_queue
        => free_poll_entry
        => poll_freewait
        => do_sys_poll
        => sys_poll
        => system_call_fastpath
      
      Now we see that the culprit of this latency was the free_poll_entry code.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      cc51a0fc
  5. 11 12月, 2009 4 次提交
    • S
      tracing: Add trace_dump_stack() · 03889384
      Steven Rostedt 提交于
      I've been asked a few times about how to find out what is calling
      some location in the kernel. One way is to use dynamic function tracing
      and implement the func_stack_trace. But this only finds out who is
      calling a particular function. It does not tell you who is calling
      that function and entering a specific if conditional.
      
      I have myself implemented a quick version of trace_dump_stack() for
      this purpose a few times, and just needed it now. This is when I realized
      that this would be a good tool to have in the kernel like trace_printk().
      
      Using trace_dump_stack() is similar to dump_stack() except that it
      writes to the trace buffer instead and can be used in critical locations.
      
      For example:
      
      @@ -5485,8 +5485,12 @@ need_resched_nonpreemptible:
       	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
       		if (unlikely(signal_pending_state(prev->state, prev)))
       			prev->state = TASK_RUNNING;
      -		else
      +		else {
       			deactivate_task(rq, prev, 1);
      +			trace_printk("Deactivating task %s:%d\n",
      +				     prev->comm, prev->pid);
      +			trace_dump_stack();
      +		}
       		switch_count = &prev->nvcsw;
       	}
      
      Produces:
      
                 <...>-3249  [001]   296.105269: schedule: Deactivating task ntpd:3249
                 <...>-3249  [001]   296.105270: <stack trace>
       => schedule
       => schedule_hrtimeout_range
       => poll_schedule_timeout
       => do_select
       => core_sys_select
       => sys_select
       => system_call_fastpath
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      03889384
    • S
      ring-buffer: Move resize integrity check under reader lock · dd7f5943
      Steven Rostedt 提交于
      While using an application that does splice on the ftrace ring
      buffer at start up, I triggered an integrity check failure.
      
      Looking into this, I discovered that resizing the buffer performs
      an integrity check after the buffer is resized. This check unfortunately
      is preformed after it releases the reader lock. If a reader is
      reading the buffer it may cause the integrity check to trigger a
      false failure.
      
      This patch simply moves the integrity checker under the protection
      of the ring buffer reader lock.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      dd7f5943
    • S
      ring-buffer: Use sync sched protection on ring buffer resizing · 18421015
      Steven Rostedt 提交于
      There was a comment in the ring buffer code that says the calling
      layers should prevent tracing or reading of the ring buffer while
      resizing. I have discovered that the tracers do not honor this
      arrangement.
      
      This patch moves the disabling and synchronizing the ring buffer to
      a higher layer during resizing. This guarantees that no writes
      are occurring while the resize takes place.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      18421015
    • T
      tracing: Fix wrong usage of strstrip in trace_ksyms · d954fbf0
      Thomas Gleixner 提交于
      strstrip returns a pointer to the first non space character, but the
      code in parse_ksym_trace_str() ignores that.
      
      strstrip is now must_check and therefor we get the correct warning:
      kernel/trace/trace_ksym.c:294: warning:
      ignoring return value of ‘strstrip’, declared with attribute warn_unused_result
      
      We are really not interested in leading whitespace here.
      
      Fix that and cleanup the dozen kfree() exit pathes.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      d954fbf0
  6. 10 12月, 2009 5 次提交
    • C
      tracing: Remove comparing of NULL to va_list in trace_array_vprintk() · f2942487
      Carsten Emde 提交于
      Olof Johansson stated the following:
      
        Comparing a va_list with NULL is bogus. It's supposed to be treated like
        an opaque type and only be manipulated with va_* accessors.
      
      Olof noticed that this code broke the ARM builds:
      
          kernel/trace/trace.c: In function 'trace_array_vprintk':
          kernel/trace/trace.c:1364: error: invalid operands to binary == (have 'va_list' and 'void *')
          kernel/trace/trace.c: In function 'tracing_mark_write':
          kernel/trace/trace.c:3349: error: incompatible type for argument 3 of 'trace_vprintk'
      
      This patch partly reverts c13d2f7c and
      re-installs the original mark_printk() mechanism.
      Reported-by: NOlof Johansson <olof@lixom.net>
      Signed-off-by: NCarsten Emde <C.Emde@osadl.org>
      LKML-Reference: <4B1BAB74.104@osadl.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      f2942487
    • J
      tracing: Fix function graph trace_pipe to properly display failed entries · be1eca39
      Jiri Olsa 提交于
      There is a case where the graph tracer might get confused and omits
      displaying of a single record.  This applies mostly with the trace_pipe
      since it is unlikely that the trace_seq buffer will overflow with the
      trace file.
      
      As the function_graph tracer goes through the trace entries keeping a
      pointer to the current record:
      
      current ->  func1 ENTRY
                  func2 ENTRY
                  func2 RETURN
                  func1 RETURN
      
      When an function ENTRY is encountered, it moves the pointer to the
      next entry to check if the function is a nested or leaf function.
      
                  func1 ENTRY
      current ->  func2 ENTRY
                  func2 RETURN
                  func1 RETURN
      
      If the rest of the writing of the function fills the trace_seq buffer,
      then the trace_pipe read will ignore this entry. The next read will
      Now start at the current location, but the first entry (func1) will
      be discarded.
      
      This patch keeps a copy of the current entry in the iterator private
      storage and will keep track of when the trace_seq buffer fills. When
      the trace_seq buffer fills, it will reuse the copy of the entry in the
      next iteration.
      
      [
        This patch has been largely modified by Steven Rostedt in order to
        clean it up and simplify it. The original idea and concept was from
        Jirka and for that, this patch will go under his name to give him
        the credit he deserves. But because this was modify by Steven Rostedt
        anything wrong with the patch should be blamed on Steven.
      ]
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1259067458-27143-1-git-send-email-jolsa@redhat.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      be1eca39
    • J
      tracing: Add full state to trace_seq · d184b31c
      Johannes Berg 提交于
      The trace_seq buffer might fill up, and right now one needs to check the
      return value of each printf into the buffer to check for that.
      
      Instead, have the buffer keep track of whether it is full or not, and
      reject more input if it is full or would have overflowed with an input
      that wasn't added.
      
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      d184b31c
    • S
      tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3
      Steven Rostedt 提交于
      If the seq_read fills the buffer it will call s_start again on the next
      itertation with the same position. This causes a problem with the
      function_graph tracer because it consumes the iteration in order to
      determine leaf functions.
      
      What happens is that the iterator stores the entry, and the function
      graph plugin will look at the next entry. If that next entry is a return
      of the same function and task, then the function is a leaf and the
      function_graph plugin calls ring_buffer_read which moves the ring buffer
      iterator forward (the trace iterator still points to the function start
      entry).
      
      The copying of the trace_seq to the seq_file buffer will fail if the
      seq_file buffer is full. The seq_read will not show this entry.
      The next read by userspace will cause seq_read to again call s_start
      which will reuse the trace iterator entry (the function start entry).
      But the function return entry was already consumed. The function graph
      plugin will think that this entry is a nested function and not a leaf.
      
      To solve this, the trace code now checks the return status of the
      seq_printf (trace_print_seq). If the writing to the seq_file buffer
      fails, we set a flag in the iterator (leftover) and we do not reset
      the trace_seq buffer. On the next call to s_start, we check the leftover
      flag, and if it is set, we just reuse the trace_seq buffer and do not
      call into the plugin print functions.
      
      Before this patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)               |          __fsnotify_parent() {
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      After the patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)   0.548 us    |          __fsnotify_parent();
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      [
        Updated the patch to fix a missing return 0 from the trace_print_seq()
        stub when CONFIG_TRACING is disabled.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      ]
      Reported-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a63ce5b3
    • S
      tracing: Only call pipe_close if pipe_close is defined · 29bf4a5e
      Steven Rostedt 提交于
      This fixes a cut and paste error that had pipe_close get called
      if pipe_open was defined (not pipe_close).
      Reported-by: NKosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20091209153204.F4CD.A69D9226@jp.fujitsu.com>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      29bf4a5e
  7. 09 12月, 2009 2 次提交
    • F
      tracing/kprobes: Fix field creation's bad error handling · 822a6961
      Frederic Weisbecker 提交于
      When we define the common event fields in kprobe, we invert the error
      handling and return immediately in case of success. Then we omit
      to define specific kprobes fields (ip and nargs), and specific
      kretprobes fields (func, ret_ip, nargs). And we only define them
      when we fail to create common fields.
      
      The most visible consequence is that we can't create filter for
      k(ret)probes specific fields.
      
      This patch re-invert the success/error handling to fix it.
      Reported-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1260263815-5167-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      822a6961
    • M
      trace-kprobe: Support delete probe syntax · a7c312be
      Masami Hiramatsu 提交于
      Support delete probe syntax. The syntax is "-:[group/]event".
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      LKML-Reference: <20091208220316.10142.39192.stgit@dhcp-100-2-132.bos.redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7c312be
  8. 08 12月, 2009 1 次提交
    • S
      tracing: Add pipe_close interface · c521efd1
      Steven Rostedt 提交于
      An ftrace plugin can add a pipe_open interface when the user opens
      trace_pipe. But if the plugin allocates something within the pipe_open
      it can not free it because there exists no pipe_close. The hook to
      the trace file open has a corresponding close. The closing of the
      trace_pipe file should also have a corresponding close.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      c521efd1
  9. 06 12月, 2009 1 次提交
  10. 02 12月, 2009 7 次提交
  11. 01 12月, 2009 1 次提交
    • M
      trace_kprobes: Fix a memory leak bug and check kstrdup() return value · ba8665d7
      Masami Hiramatsu 提交于
      Fix a memory leak case in create_trace_probe(). When an argument
      is too long (> MAX_ARGSTR_LEN), it just jumps to error path. In
      that case tp->args[i].name is not released.
      This also fixes a bug to check kstrdup()'s return value.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20091201001919.10235.56455.stgit@harusame>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ba8665d7