1. 06 Mar 2010, 3 commits
    • tracing: Update the comm field in the right variable in update_max_tr · 1acaa1b2
      Committed by Arnaldo Carvalho de Melo
      The latency output showed:
      
       #    | task: -3 (uid:0 nice:0 policy:1 rt_prio:99)
      
      The comm is missing after "task:", so the output looks like just a
      minus sign followed by 3. The correct display should be:
      
       #    | task: migration/0-3 (uid:0 nice:0 policy:1 rt_prio:99)
      
      The problem is that the comm is being stored in the wrong data
      structure. The max_tr.data[cpu] is what stores the comm, not the
      tr->data[cpu].
      
      Before this patch the max_tr.data[cpu]->comm was zeroed and the /debug/trace
      ended up showing just the '-' sign followed by the pid.
      
      Also remove a needless initialization of max_data.
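      The fix can be sketched with stub structures (a hedged illustration: struct trace_array_cpu here is a simplified stand-in, and save_max_comm is a hypothetical name for the copy the patch performs). The point is that the comm must land in the max snapshot's per-cpu data, because that is what the latency header is printed from:

```c
#include <assert.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Illustrative stubs; the real kernel structures are richer. */
struct trace_array_cpu {
    char comm[TASK_COMM_LEN];
    int pid;
};

struct task_stub {
    char comm[TASK_COMM_LEN];
    int pid;
};

/* The fix in miniature: save the comm into the *max* snapshot's
 * per-cpu data (what the latency header prints), not the live
 * tr->data[cpu]. */
static void save_max_comm(struct trace_array_cpu *max_data,
                          const struct task_stub *tsk)
{
    strncpy(max_data->comm, tsk->comm, TASK_COMM_LEN - 1);
    max_data->comm[TASK_COMM_LEN - 1] = '\0';
    max_data->pid = tsk->pid;
}
```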
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1267824230-23861-1-git-send-email-acme@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • function-graph: Use comment notation for func names of dangling '}' · a094fe04
      Committed by Steven Rostedt
      When a '}' does not have a matching function start, the name is printed
      within parentheses. But this makes it easy to confuse an ending '}'
      with a function start. This patch makes the function name appear in
      C comment notation instead.
      
      Old view:
       3)   1.281 us    |            } (might_fault)
       3)   3.620 us    |          } (filldir)
       3)   5.251 us    |        } (call_filldir)
       3)               |        call_filldir() {
       3)               |          filldir() {
      
      New view:
       3)   1.281 us    |            } /* might_fault */
       3)   3.620 us    |          } /* filldir */
       3)   5.251 us    |        } /* call_filldir */
       3)               |        call_filldir() {
       3)               |          filldir() {
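      The change in notation amounts to a one-branch formatting decision. A minimal sketch (the helper name and buffer size are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* With a matching entry we print a bare '}'; otherwise the function
 * name follows in C comment notation, so it cannot be confused with a
 * "func() {" entry line. */
static const char *format_return(const char *func, int matched)
{
    static char buf[64];

    if (matched)
        snprintf(buf, sizeof(buf), "}");
    else
        snprintf(buf, sizeof(buf), "} /* %s */", func);
    return buf;
}
```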
      Requested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • function-graph: Fix unused reference to ftrace_set_func() · 801c29fd
      Committed by Steven Rostedt
      The declaration of ftrace_set_func() is at the start of the ftrace.c file
      and wrapped in an #ifdef CONFIG_FUNCTION_GRAPH condition. If function
      graph tracing is enabled but CONFIG_DYNAMIC_FTRACE is not, a warning
      is issued that the function is declared static but unused.
      
      This really should have been placed within the CONFIG_FUNCTION_GRAPH
      condition that uses ftrace_set_func().
      
      Moving the declaration down fixes the warning and makes the code cleaner.
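      The guard placement can be illustrated with stand-in macros (CONFIG_FUNCTION_GRAPH and CONFIG_DYNAMIC_FTRACE below are plain defines, not real Kconfig output): declaring the static helper only inside the guard that contains its caller avoids the unused-function warning.

```c
#include <assert.h>

/* Stand-in config: FUNCTION_GRAPH on, DYNAMIC_FTRACE off -- exactly
 * the combination that produced the warning. */
#define CONFIG_FUNCTION_GRAPH 1
/* CONFIG_DYNAMIC_FTRACE deliberately not defined */

#ifdef CONFIG_FUNCTION_GRAPH
# ifdef CONFIG_DYNAMIC_FTRACE
/* Declared only where it is used, so no "defined but not used"
 * warning when DYNAMIC_FTRACE is off. */
static int ftrace_set_func(void) { return 0; }
# endif

static int graph_trace_init(void)
{
# ifdef CONFIG_DYNAMIC_FTRACE
    return ftrace_set_func();
# else
    return 1;
# endif
}
#endif
```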
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 01 Mar 2010, 1 commit
    • tracing: Include irqflags headers from trace clock · ae1f3038
      Committed by Frederic Weisbecker
      trace_clock.c includes spinlock.h, which ends up including
      asm/system.h, which in turn includes linux/irqflags.h in x86.
      
      So the definition of raw_local_irq_save happens to be pulled in there,
      but this is not the case on parisc:
      
         tip/kernel/trace/trace_clock.c:86: error: implicit declaration of function 'raw_local_irq_save'
         tip/kernel/trace/trace_clock.c:112: error: implicit declaration of function 'raw_local_irq_restore'
      
      We need to include linux/irqflags.h directly from trace_clock.c
      to avoid such build errors.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 27 Feb 2010, 1 commit
    • ftrace: Add function names to dangling } in function graph tracer · f1c7f517
      Committed by Steven Rostedt
      The function graph tracer is currently the most invasive tracer
      in the ftrace family. It can easily overflow the buffer even with
      10 MB per CPU, which means that events are often lost.
      
      On start up, or after events are lost, if a function return is
      recorded but the matching function entry was lost, all we get to
      see is the exiting '}'.
      
      Here is how a typical trace output starts:
      
       [tracing] cat trace
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
        0) + 91.897 us   |                  }
        0) ! 567.961 us  |                }
        0)   <========== |
        0) ! 579.083 us  |                _raw_spin_lock_irqsave();
        0)   4.694 us    |                _raw_spin_unlock_irqrestore();
        0) ! 594.862 us  |              }
        0) ! 603.361 us  |            }
        0) ! 613.574 us  |          }
        0) ! 623.554 us  |        }
        0)   3.653 us    |        fget_light();
        0)               |        sock_poll() {
      
      There is a series of '}' with no matching "func() {". There is no
      information about which functions these ending braces belong to.
      
      This patch adds a stack to the per-cpu structure used in outputting
      the function graph trace, to keep track of which function was
      output. Then, on a function exit event, it checks the depth to see
      whether the exit has a matching entry event. If it does, it prints
      only the '}'; otherwise it adds the function name after the '}'.
      
      This allows function exit events to show what function they belong to
      at trace output startup, when the entry was lost due to ring buffer
      overflow, or even after a new task is scheduled in.
      
      Here is what the above trace will look like after this patch:
      
       [tracing] cat trace
       # tracer: function_graph
       #
       # CPU  DURATION                  FUNCTION CALLS
       # |     |   |                     |   |   |   |
        0) + 91.897 us   |                  } (irq_exit)
        0) ! 567.961 us  |                } (smp_apic_timer_interrupt)
        0)   <========== |
        0) ! 579.083 us  |                _raw_spin_lock_irqsave();
        0)   4.694 us    |                _raw_spin_unlock_irqrestore();
        0) ! 594.862 us  |              } (add_wait_queue)
        0) ! 603.361 us  |            } (__pollwait)
        0) ! 613.574 us  |          } (tcp_poll)
        0) ! 623.554 us  |        } (sock_poll)
        0)   3.653 us    |        fget_light();
        0)               |        sock_poll() {
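      The per-cpu stack described above can be sketched as follows (names, sizes, and the seen-flag bookkeeping are illustrative, not the kernel's actual implementation):

```c
#include <assert.h>
#include <string.h>

#define FGRAPH_DEPTH 32

/* Per-cpu record of which entry events have been seen at each depth. */
struct fgraph_cpu_data {
    const char *funcs[FGRAPH_DEPTH];
    int seen[FGRAPH_DEPTH];
};

/* On a function entry event, remember the name at this depth. */
static void record_entry(struct fgraph_cpu_data *d, int depth,
                         const char *func)
{
    if (depth >= 0 && depth < FGRAPH_DEPTH) {
        d->funcs[depth] = func;
        d->seen[depth] = 1;
    }
}

/* On a function exit event: NULL means the '}' has a matching entry
 * and is printed bare; otherwise the returned name is printed after
 * the '}'. */
static const char *name_for_exit(struct fgraph_cpu_data *d, int depth,
                                 const char *func)
{
    if (depth >= 0 && depth < FGRAPH_DEPTH && d->seen[depth]) {
        d->seen[depth] = 0;
        return NULL;
    }
    return func;
}
```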
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 25 Feb 2010, 6 commits
  5. 23 Feb 2010, 1 commit
  6. 17 Feb 2010, 4 commits
  7. 15 Feb 2010, 1 commit
    • perf_events: Fix FORK events · 6f93d0a7
      Committed by Peter Zijlstra
      Commit 22e19085 ("Honour event state for aux stream data")
      introduced a bug where we would drop FORK events.
      
      The thing is that we deliver FORK events to the child process'
      event, which at that time will be PERF_EVENT_STATE_INACTIVE
      because the child won't be scheduled in (we're in the middle of
      fork).
      
      Solve this in two ways: change the event state filter to exclude only
      disabled (STATE_OFF) or worse, and deliver FORK events to the
      current (parent) event.
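      The filter change can be sketched with stand-in state values (the enum below is illustrative, though it follows the ordering described above): INACTIVE events pass the new filter, while OFF and worse are still excluded.

```c
#include <assert.h>

/* Stand-in values: anything at or below OFF is disabled or broken. */
enum perf_event_state {
    PERF_EVENT_STATE_ERROR    = -2,
    PERF_EVENT_STATE_OFF      = -1,
    PERF_EVENT_STATE_INACTIVE =  0,
    PERF_EVENT_STATE_ACTIVE   =  1,
};

/* Before the fix: only ACTIVE events received aux stream data, which
 * drops FORK events delivered to the still-INACTIVE child event. */
static int filter_match_old(enum perf_event_state s)
{
    return s == PERF_EVENT_STATE_ACTIVE;
}

/* After the fix: exclude only disabled (OFF) or worse. */
static int filter_match_new(enum perf_event_state s)
{
    return s > PERF_EVENT_STATE_OFF;
}
```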
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      LKML-Reference: <1266142324.5273.411.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 14 Feb 2010, 1 commit
  9. 12 Feb 2010, 1 commit
  10. 10 Feb 2010, 2 commits
    • tracing: Add correct/incorrect to sort keys for branch annotation output · ede55c9d
      Committed by Steven Rostedt
      The branch annotation output makes it difficult to spot the worst
      offenders, because it only sorts by percentage:
      
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0      163 100 qdisc_restart                  sch_generic.c        179
             0      163 100 pfifo_fast_dequeue             sch_generic.c        447
             0        4 100 pskb_trim_rcsum                skbuff.h             1689
             0        4 100 llc_rcv                        llc_input.c          170
             0       18 100 psmouse_interrupt              psmouse-base.c       304
             0        3 100 atkbd_interrupt                atkbd.c              389
             0        5 100 usb_alloc_dev                  usb.c                437
             0       11 100 vsscanf                        vsprintf.c           1897
             0        2 100 IS_ERR                         err.h                34
             0       23 100 __rmqueue_fallback             page_alloc.c         865
             0        4 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 142
             0        3 100 move_masked_irq                migration.c          11
      
      Adding the incorrect and correct values as sort keys makes this file a
      bit more informative:
      
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0   366541 100 audit_syscall_entry            auditsc.c            1637
             0   366538 100 audit_syscall_exit             auditsc.c            1685
             0   115839 100 sched_info_switch              sched_stats.h        269
             0    74567 100 sched_info_queued              sched_stats.h        222
             0    66578 100 sched_info_dequeued            sched_stats.h        177
             0    15113 100 trace_workqueue_insertion      workqueue.h          38
             0    15107 100 trace_workqueue_execution      workqueue.h          45
             0     3622 100 syscall_trace_leave            ptrace.c             1772
             0     2750 100 sched_move_task                sched.c              10100
             0     2750 100 sched_move_task                sched.c              10110
             0     1815 100 pre_schedule_rt                sched_rt.c           1462
             0      837 100 audit_alloc                    auditsc.c            879
             0      814 100 tcp_mss_split_point            tcp_output.c         1302
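      The effect of the added sort keys can be sketched with a plain comparator (field and function names are illustrative): rows sort by fewest correct predictions first, ties broken by most incorrect, so the worst offenders rise to the top.

```c
#include <assert.h>
#include <stdlib.h>

struct branch_stat {
    unsigned long correct;
    unsigned long incorrect;
};

/* Fewest correct first; ties broken by most incorrect. */
static int branch_cmp(const void *a, const void *b)
{
    const struct branch_stat *x = a, *y = b;

    if (x->correct != y->correct)
        return x->correct < y->correct ? -1 : 1;
    if (x->incorrect != y->incorrect)
        return x->incorrect > y->incorrect ? -1 : 1;
    return 0;
}
```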
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Export the symbols of getboottime and monotonic_to_bootbased · c93d89f3
      Committed by Jason Wang
      Export getboottime and monotonic_to_bootbased so that they can
      be used by a following patch.
      
      Cc: stable@kernel.org
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
  11. 04 Feb 2010, 2 commits
  12. 03 Feb 2010, 6 commits
  13. 02 Feb 2010, 1 commit
    • tracing: Fix circular deadlock in stack trace · 4f48f8b7
      Committed by Lai Jiangshan
      When we cat <debugfs>/tracing/stack_trace, we may cause a circular lock dependency:
      sys_read()
        t_start()
           arch_spin_lock(&max_stack_lock);
      
        t_show()
           seq_printf(), vsnprintf() .... /* they are all trace-able,
             when they are traced, max_stack_lock may be required again. */
      
      The following script can trigger this circular deadlock very easily:
      #!/bin/bash
      
      echo 1 > /proc/sys/kernel/stack_tracer_enabled
      
      mount -t debugfs xxx /mnt > /dev/null 2>&1
      
      (
      # make check_stack() zealous to require max_stack_lock
      for ((; ;))
      {
      	echo 1 > /mnt/tracing/stack_max_size
      }
      ) &
      
      for ((; ;))
      {
      	cat /mnt/tracing/stack_trace > /dev/null
      }
      
      To fix this bug, we increase the percpu trace_active counter before
      taking the lock.
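      The guard can be sketched in miniature (a hedged simplification: trace_active is per-cpu in the kernel, and the actual spinlock is elided here). The reader raises the counter before taking max_stack_lock, so the stack tracer's probe backs out instead of deadlocking if it fires during output:

```c
#include <assert.h>

static int trace_active; /* per-cpu in the kernel */

/* The stack tracer's probe: bail out while the reader holds the lock. */
static int check_stack(void)
{
    if (trace_active)
        return 0;           /* re-entered during output: skip */
    /* ... take max_stack_lock and measure the stack ... */
    return 1;
}

/* The /tracing/stack_trace reader: raise the counter *before* taking
 * max_stack_lock, so a traced seq_printf() cannot re-enter
 * check_stack() and spin on the same lock. */
static int t_start(void)
{
    trace_active++;
    /* arch_spin_lock(&max_stack_lock); seq_printf(); ... */
    int probe_ran = check_stack();  /* simulates tracing during output */
    /* arch_spin_unlock(&max_stack_lock); */
    trace_active--;
    return probe_ran;
}
```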
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B67D4F9.9080905@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  14. 01 Feb 2010, 1 commit
    • softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume · d6ad3e28
      Committed by Jason Wessel
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets
      the time from hardware such as the TSC on x86. In this
      configuration kgdb will report a softlockup warning message on
      resuming or detaching from a debug session.
      
      Sequence of events in the problem case:
      
       1) "cpu sched clock" and "hardware time" are at 100 sec prior
          to a call to kgdb_handle_exception()
      
       2) Debugger waits in kgdb_handle_exception() for 80 sec and on
          exit the following is called ...  touch_softlockup_watchdog() -->
          __raw_get_cpu_var(touch_timestamp) = 0;
      
       3) "cpu sched clock" = 100s (it was not updated, because the
          interrupt was disabled in kgdb) but the "hardware time" = 180 sec
      
       4) The first timer interrupt after resuming from
          kgdb_handle_exception updates the watchdog from the "cpu sched clock"
      
      update_process_times() { ...  run_local_timers() -->
      softlockup_tick() --> check (touch_timestamp == 0) (it is "YES"
      here, we have set "touch_timestamp = 0" at kgdb) -->
      __touch_softlockup_watchdog() ***(A)--> reset "touch_timestamp"
      to "get_timestamp()" (Here, the "touch_timestamp" will still be
      set to 100s.)  ...
      
          scheduler_tick() ***(B)--> sched_clock_tick() (update "cpu sched
          clock" to "hardware time" = 180s) ...  }
      
       5) The Second timer interrupt handler appears to have a large
          jump and trips the softlockup warning.
      
      update_process_times() { ...  run_local_timers() -->
      softlockup_tick() --> "cpu sched clock" - "touch_timestamp" =
      180s-100s > 60s --> printk "soft lockup error messages" ...  }
      
      note: ***(A) reset "touch_timestamp" to
      "get_timestamp(this_cpu)"
      
      Why is "touch_timestamp" 100 sec, instead of 180 sec?
      
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of
      get_timestamp() is:
      
      get_timestamp(this_cpu)
       -->cpu_clock(this_cpu)
       -->sched_clock_cpu(this_cpu)
       -->__update_sched_clock(sched_clock_data, now)
      
      The __update_sched_clock() function uses the GTOD tick value to
      create a window to normalize the "now" values.  So if "now"
      value is too big for sched_clock_data, it will be ignored.
      
      The fix is to invoke sched_clock_tick() to update "cpu sched
      clock" in order to recover from this state.  This is done by
      introducing the function touch_softlockup_watchdog_sync(). This
      allows kgdb to request that the sched clock is updated when the
      watchdog thread runs the first time after a resume from kgdb.
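      The recovery path can be sketched with plain globals standing in for the per-cpu state (an illustrative simplification of the scenario above, not the kernel's code): the sync variant sets a flag asking the watchdog to run sched_clock_tick() before taking a fresh timestamp, so the new timestamp reflects hardware time rather than the stale sched clock.

```c
#include <assert.h>

/* Stand-ins for the state in the scenario above (times in seconds). */
static long hw_time = 180;           /* hardware time after kgdb resume */
static long sched_clock_time = 100;  /* stale: irqs were off in kgdb */
static long touch_timestamp = 100;
static int sync_requested;

static void touch_softlockup_watchdog_sync(void)
{
    sync_requested = 1;
    touch_timestamp = 0;
}

/* Returns 1 when a softlockup warning would be printed. */
static int softlockup_tick(void)
{
    if (touch_timestamp == 0) {
        if (sync_requested) {
            sched_clock_time = hw_time; /* sched_clock_tick() */
            sync_requested = 0;
        }
        touch_timestamp = sched_clock_time; /* fresh, not stale */
        return 0;
    }
    return sched_clock_time - touch_timestamp > 60;
}
```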
      
      [yong.zhang0@gmail.com: Use per cpu instead of an array]
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Dongdong Deng <Dongdong.Deng@windriver.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: peterz@infradead.org
      LKML-Reference: <1264631124-4837-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 30 Jan 2010, 2 commits
    • perf, hw_breakpoint, kgdb: Do not take mutex for kernel debugger · 5352ae63
      Committed by Jason Wessel
      This patch fixes the regression in functionality where the
      kernel debugger and the perf API do not nicely share hw
      breakpoint reservations.
      
      The kernel debugger cannot use any mutex_lock() calls because it
      can start the kernel running from an invalid context.
      
      A mutex-free version of the reservation API had to be created so
      the kernel debugger can safely update hw breakpoint
      reservations.
      
      The possibility for a breakpoint reservation to be concurrently
      processed at the time that kgdb interrupts the system is
      improbable. Should this corner case occur the end user is
      warned, and the kernel debugger will prohibit updating the
      hardware breakpoint reservations.
      
      Any time the kernel debugger reserves a hardware breakpoint it
      will be a system wide reservation.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-3-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, hw_breakpoints, kgdb: Fix kgdb to use hw_breakpoint API · cc096749
      Committed by Jason Wessel
      In the 2.6.33 kernel, the hw_breakpoint API is now used for the
      performance event counters.  The hw_breakpoint_handler() now
      consumes the hw breakpoints that were previously set by kgdb
      arch specific code.  In order for kgdb to work in conjunction
      with this core API change, kgdb must use some of the low level
      functions of the hw_breakpoint API to install, uninstall, and
      deal with hw breakpoint reservations.
      
      The kgdb core required a change to call kgdb_disable_hw_debug
      anytime a slave cpu enters kgdb_wait() in order to keep all the
      hw breakpoints in sync as well as to prevent hitting a hw
      breakpoint while kgdb is active.
      
      During the architecture specific initialization of kgdb, it will
      pre-allocate 4 disabled (struct perf_event **) structures.  Kgdb
      will use these to manage the capabilities for the 4 hw
      breakpoint registers, per cpu.  Right now the hw_breakpoint API
      does not have a way to ask how many breakpoints are available
      on each CPU, so it is possible that the install of a breakpoint
      might fail when kgdb restores the system to the run state.  The
      intent of this patch is to first get the basic functionality of
      hw breakpoints working and leave it to the person debugging the
      kernel to understand what hw breakpoints are in use and what
      restrictions have been imposed as a result.  Breakpoint
      constraints will be dealt with in a future patch.
      
      While atomic, the x86 specific kgdb code will call
      arch_uninstall_hw_breakpoint() and arch_install_hw_breakpoint()
      to manage the cpu specific hw breakpoints.
      
      The net result of these changes allow kgdb to use the same pool
      of hw_breakpoints that are used by the perf event API, but
      neither knows about future reservations for the available hw
      breakpoint slots.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: torvalds@linux-foundation.org
      LKML-Reference: <1264719883-7285-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 29 Jan 2010, 1 commit
    • tracing: Simplify test for function_graph tracing start point · ea2c68a0
      Committed by Lai Jiangshan
      In the function graph tracer, a calling function is to be traced
      only when it is enabled through the set_graph_function file,
      or when it is nested in an enabled function.
      
      The current code uses TSK_TRACE_FL_GRAPH to test whether it is nested
      or not. Looking at the code, we can see that:
      (trace->depth > 0) <==> (TSK_TRACE_FL_GRAPH is set)
      
      trace->depth states more explicitly that the call is nested,
      so we use it directly and simplify the code.
      
      No functionality is changed.
      TSK_TRACE_FL_GRAPH is not removed yet, it is left for future usage.
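      The simplification can be sketched as follows (an illustrative stand-in: ftrace_graph_addr() here is a stub for the set_graph_function lookup, and the entry struct is reduced to two fields):

```c
#include <assert.h>

struct ftrace_graph_ent {
    unsigned long func;
    int depth;
};

/* Stub for the set_graph_function lookup: pretend only one address
 * is in the enabled set. */
static int ftrace_graph_addr(unsigned long func)
{
    return func == 0xc0de;
}

/* Trace when the function itself is enabled, or when it is nested
 * inside an enabled one. trace->depth > 0 <==> TSK_TRACE_FL_GRAPH
 * is set, so the depth is tested directly. */
static int trace_graph_entry_ok(const struct ftrace_graph_ent *trace)
{
    return trace->depth > 0 || ftrace_graph_addr(trace->func);
}
```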
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <4B4DB0B6.7040607@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  17. 28 Jan 2010, 3 commits
  18. 27 Jan 2010, 3 commits
    • lockdep: Fix check_usage_backwards() error message · 48d50674
      Committed by Oleg Nesterov
      Lockdep has found the real bug, but the output doesn't look right to me:
      
      > =========================================================
      > [ INFO: possible irq lock inversion dependency detected ]
      > 2.6.33-rc5 #77
      > ---------------------------------------------------------
      > emacs/1609 just changed the state of lock:
      >  (&(&tty->ctrl_lock)->rlock){+.....}, at: [<ffffffff8127c648>] tty_fasync+0xe8/0x190
      > but this lock took another, HARDIRQ-unsafe lock in the past:
      >  (&(&sighand->siglock)->rlock){-.....}
      
      "HARDIRQ-unsafe" and "this lock took another" looks wrong, afaics.
      
      >   ... key      at: [<ffffffff81c054a4>] __key.46539+0x0/0x8
      >   ... acquired at:
      >    [<ffffffff81089af6>] __lock_acquire+0x1056/0x15a0
      >    [<ffffffff8108a0df>] lock_acquire+0x9f/0x120
      >    [<ffffffff81423012>] _raw_spin_lock_irqsave+0x52/0x90
      >    [<ffffffff8127c1be>] __proc_set_tty+0x3e/0x150
      >    [<ffffffff8127e01d>] tty_open+0x51d/0x5e0
      
      The stack-trace shows that this lock (ctrl_lock) was taken under
      ->siglock (which is hopefully irq-safe).
      
      This is a clear typo in check_usage_backwards(), where we tell the
      fancy print routine that we are going forwards.
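      The shape of the typo can be sketched with stand-in helpers (names and return values are illustrative, not lockdep's actual report routine): the backwards checker was passing the forwards flag to the report routine, so the report described the dependency in the wrong direction.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the report routine: the flag controls which direction
 * the printed dependency chain describes. */
static const char *inversion_report(int forwards)
{
    return forwards ? "forwards" : "backwards";
}

static const char *check_usage_forwards(void)
{
    return inversion_report(1);
}

/* The typo: this used to pass 1 as well; the fix passes 0. */
static const char *check_usage_backwards(void)
{
    return inversion_report(0);
}
```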
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100126181641.GA10460@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/documentation: Cover new frame pointer semantics · 03688970
      Committed by Mike Frysinger
      Update the graph tracer examples to cover the new frame pointer semantics
      (in terms of passing it along).  Move the HAVE_FUNCTION_GRAPH_FP_TEST docs
      out of the Kconfig, into the right place, and expand on the details.
      Signed-off-by: Mike Frysinger <vapier@gentoo.org>
      LKML-Reference: <1264165967-18938-1-git-send-email-vapier@gentoo.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Check for end of page in iterator · 3c05d748
      Committed by Steven Rostedt
      If the iterator comes to an empty page for some reason, or if
      the page is emptied by a consuming read, the iterator code currently
      does not check whether the iterator is past the contents, and may
      return a false entry.
      
      This patch adds a check to the ring buffer iterator to test if the
      current page has been completely read and sets the iterator to the
      next page if necessary.
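      The added check can be sketched with simplified structures (illustrative stubs; the real ring buffer tracks commit and read positions differently): before returning an entry, verify the read position has not reached the page's commit, and step to the next page when it has.

```c
#include <assert.h>
#include <stddef.h>

struct buffer_page {
    unsigned commit;            /* bytes of valid data on this page */
    struct buffer_page *next;
};

struct ring_buffer_iter {
    struct buffer_page *page;
    unsigned read;              /* read offset within the current page */
};

/* If the current page is exhausted (or was emptied by a consuming
 * read), advance to the next page instead of returning a false entry.
 * Returns 1 while a valid entry remains, 0 at the end. */
static int rb_iter_head_valid(struct ring_buffer_iter *iter)
{
    while (iter->page && iter->read >= iter->page->commit) {
        iter->page = iter->page->next;
        iter->read = 0;
    }
    return iter->page != NULL;
}
```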
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>