1. 22 Dec, 2009 1 commit
  2. 15 Dec, 2009 1 commit
  3. 14 Dec, 2009 13 commits
  4. 12 Dec, 2009 2 commits
    • tty: Move the leader test in disassociate · 5ec93d11
      Alan Cox committed
      There are two call points, both want to check that tty->signal->leader is
      set. Move the test into disassociate_ctty() as that will make locking
      changes easier in a bit.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • tracing: Add stack trace to irqsoff tracer · cc51a0fc
      Steven Rostedt committed
      The irqsoff and friends tracers help in finding causes of latency in the
      kernel. They also work with the function tracer to show what was happening
      when interrupts or preemption are disabled. But the function tracer has
      a bit of an overhead and can cause exaggerated readings.
      
      Currently, when tracing with /proc/sys/kernel/ftrace_enabled = 0, where the
      function tracer is disabled, the information that is provided can end up
      being useless. For example, a 2 and a half millisecond latency only showed:
      
       # tracer: preemptirqsoff
       #
       # preemptirqsoff latency trace v1.1.5 on 2.6.32
       # --------------------------------------------------------------------
       # latency: 2463 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
       #    -----------------
       #    | task: -4242 (uid:0 nice:0 policy:0 rt_prio:0)
       #    -----------------
       #  => started at: _spin_lock_irqsave
       #  => ended at:   remove_wait_queue
       #
       #
       #                  _------=> CPU#
       #                 / _-----=> irqs-off
       #                | / _----=> need-resched
       #                || / _---=> hardirq/softirq
       #                ||| / _--=> preempt-depth
       #                |||| /_--=> lock-depth
       #                |||||/     delay
       #  cmd     pid   |||||| time  |   caller
       #     \   /      ||||||   \   |   /
       hackbenc-4242    2d....    0us!: trace_hardirqs_off <-_spin_lock_irqsave
       hackbenc-4242    2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
       hackbenc-4242    2...1. 2466us : trace_preempt_on <-remove_wait_queue
      
      The above lets us know that hackbench with pid 4242 grabbed a spin lock
      somewhere and enabled preemption at remove_wait_queue. This helps a little,
      but it does not tell us where this actually happened.
      
      This patch adds the stack dump to the end of the irqsoff tracer. This provides
      the following output:
      
       hackbenc-4242    2d....    0us!: trace_hardirqs_off <-_spin_lock_irqsave
       hackbenc-4242    2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue
       hackbenc-4242    2...1. 2466us : trace_preempt_on <-remove_wait_queue
       hackbenc-4242    2...1. 2467us : <stack trace>
        => sub_preempt_count
        => _spin_unlock_irqrestore
        => remove_wait_queue
        => free_poll_entry
        => poll_freewait
        => do_sys_poll
        => sys_poll
        => system_call_fastpath
      
      Now we see that the culprit of this latency was the free_poll_entry code.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  5. 11 Dec, 2009 10 commits
    • tracing: Add trace_dump_stack() · 03889384
      Steven Rostedt committed
      I've been asked a few times about how to find out what is calling
      some location in the kernel. One way is to use dynamic function tracing
      and implement the func_stack_trace. But this only finds out who is
      calling a particular function. It does not tell you who is calling
      that function and entering a specific if conditional.
      
      I have myself implemented a quick version of trace_dump_stack() for
      this purpose a few times, and just needed it now. This is when I realized
      that this would be a good tool to have in the kernel like trace_printk().
      
      Using trace_dump_stack() is similar to dump_stack() except that it
      writes to the trace buffer instead and can be used in critical locations.
      
      For example:
      
      @@ -5485,8 +5485,12 @@ need_resched_nonpreemptible:
       	if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
       		if (unlikely(signal_pending_state(prev->state, prev)))
       			prev->state = TASK_RUNNING;
      -		else
      +		else {
       			deactivate_task(rq, prev, 1);
      +			trace_printk("Deactivating task %s:%d\n",
      +				     prev->comm, prev->pid);
      +			trace_dump_stack();
      +		}
       		switch_count = &prev->nvcsw;
       	}
      
      Produces:
      
                 <...>-3249  [001]   296.105269: schedule: Deactivating task ntpd:3249
                 <...>-3249  [001]   296.105270: <stack trace>
       => schedule
       => schedule_hrtimeout_range
       => poll_schedule_timeout
       => do_select
       => core_sys_select
       => sys_select
       => system_call_fastpath
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • kgdb: Always process the whole breakpoint list on activate or deactivate · 7f8b7ed6
      Jason Wessel committed
      This patch fixes 2 edge cases in using kgdb in conjunction with gdb.
      
      1) kgdb_deactivate_sw_breakpoints() should process the entire array of
         breakpoints.  The failure to do so results in breakpoints that you
         cannot remove, because a break point can only be removed if its
         state flag is set to BP_SET.
      
         The easy way to duplicate this problem is to plant a break point in
         a kernel module and then unload the kernel module.
      
      2) kgdb_activate_sw_breakpoints() should process the entire array of
         breakpoints.  The failure to do so results in missed breakpoints
         when a breakpoint cannot be activated.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
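The "process the entire array" idea in the kgdb breakpoint commit can be illustrated with a minimal userspace sketch. This is not the kernel code: `struct bkpt`, `set_breakpoint()` and `activate_all()` are hypothetical names, and the failure condition (odd addresses) exists only to simulate a breakpoint that cannot be activated.

```c
#include <stddef.h>

enum bp_state { BP_UNDEFINED, BP_SET, BP_ACTIVE };

struct bkpt {
    unsigned long addr;
    enum bp_state state;
};

/* Hypothetical stand-in for the hook that plants a breakpoint.
 * Odd addresses fail, to simulate e.g. an unloaded module. */
static int set_breakpoint(unsigned long addr)
{
    return (addr & 1) ? -1 : 0;
}

/* Activate every BP_SET entry. On failure, remember the error and
 * move on to the next slot instead of returning early. */
static int activate_all(struct bkpt *bps, size_t n)
{
    int ret = 0;

    for (size_t i = 0; i < n; i++) {
        if (bps[i].state != BP_SET)
            continue;
        if (set_breakpoint(bps[i].addr) != 0) {
            ret = -1;   /* report the failure, keep going */
            continue;
        }
        bps[i].state = BP_ACTIVE;
    }
    return ret;
}
```

With an early `return` in place of the `continue`, the entries after the failing one would be silently skipped, which is the "missed breakpoints" symptom the message describes.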
    • kgdb: continue and warn on signal passing from gdb · d625e9c0
      Jason Wessel committed
      On some architectures for the segv trap, gdb wants to pass the signal
      back on continue.  For kgdb this is not the default behavior, because
      it can cause the kernel to crash if you arbitrarily pass back an
      exception outside of kgdb.
      
      Instead of causing instability, pass a message back to gdb about the
      supported kgdb signal passing and execute a standard kgdb continue
      operation.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: allow for cpu switch when single stepping · 028e7b17
      Jason Wessel committed
      The kgdb core should not assume that a single step operation of a
      kernel thread will complete on the same CPU.  The single step flag is
      set at the "thread" level and it is possible in a multi cpu system
      that a kernel thread can get scheduled on another cpu the next time it
      is run.
      
      As a further safety net in case a slave cpu is hung, the debug master
      cpu will try 100 times before giving up and assuming control of the
      slave cpus is no longer possible.  It is more useful to be able to get
      some information out of kgdb instead of spinning forever.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
    • kgdb: Read buffer overflow · 84667d48
      Jason Wessel committed
      Roel Kluin reported an error found with Parfait.  We want to
      ensure that kgdb_info[-1] never gets accessed.
      
      Also check to ensure any negative tid does not exceed the size of the
      shadow CPU array, else report critical debug context because it is an
      internal kgdb failure.
      Reported-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
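A bounds check of the kind described in the kgdb read-overflow commit might look roughly like the sketch below. The encoding `tid = -(cpu + 2)` and the names `NR_SHADOW_CPUS` and `shadow_cpu_from_tid()` are assumptions made for illustration; the commit message itself does not spell them out.

```c
#define NR_SHADOW_CPUS 4   /* hypothetical size of the shadow CPU array */

/* Map a negative thread id to a shadow CPU index, assuming the
 * encoding tid = -(cpu + 2). Returns -1 when the tid would index
 * outside the array, so the caller never reads slot -1 or beyond
 * the last slot. */
static int shadow_cpu_from_tid(int tid)
{
    if (tid > -2 || tid < -(NR_SHADOW_CPUS + 1))
        return -1;
    return -tid - 2;
}
```

Both range limits are tested before any arithmetic on `tid`, so an extreme value cannot produce an out-of-range index.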
    • ring-buffer: Move resize integrity check under reader lock · dd7f5943
      Steven Rostedt committed
      While using an application that does splice on the ftrace ring
      buffer at start up, I triggered an integrity check failure.
      
      Looking into this, I discovered that resizing the buffer performs
      an integrity check after the buffer is resized. This check unfortunately
      is performed after it releases the reader lock. If a reader is
      reading the buffer it may cause the integrity check to trigger a
      false failure.
      
      This patch simply moves the integrity checker under the protection
      of the ring buffer reader lock.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Use sync sched protection on ring buffer resizing · 18421015
      Steven Rostedt committed
      There was a comment in the ring buffer code that says the calling
      layers should prevent tracing or reading of the ring buffer while
      resizing. I have discovered that the tracers do not honor this
      arrangement.
      
      This patch moves the disabling and synchronizing the ring buffer to
      a higher layer during resizing. This guarantees that no writes
      are occurring while the resize takes place.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix wrong usage of strstrip in trace_ksyms · d954fbf0
      Thomas Gleixner committed
      strstrip returns a pointer to the first non space character, but the
      code in parse_ksym_trace_str() ignores that.
      
      strstrip is now must_check and therefore we get the correct warning:
      kernel/trace/trace_ksym.c:294: warning:
      ignoring return value of ‘strstrip’, declared with attribute warn_unused_result
      
      We are really not interested in leading whitespace here.
      
      Fix that and clean up the dozen kfree() exit paths.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
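Why ignoring the return value of strstrip is a bug can be seen in a small userspace sketch. `strip()` below is a stand-in written for this example, not the kernel function: it trims trailing whitespace in place and returns a pointer past the leading whitespace, so the original buffer still begins with the spaces.

```c
#include <ctype.h>
#include <string.h>

/* Userspace stand-in for strstrip(): trailing whitespace is cut
 * in place; leading whitespace is skipped only in the returned
 * pointer, not removed from the buffer. */
static char *strip(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    while (*s && isspace((unsigned char)*s))
        s++;
    return s;
}
```

A caller that keeps using the original pointer after `strip(buf)` still sees the leading blanks; only the returned pointer refers to the cleaned string.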
    • sched: Remove forced2_migrations stats · b9889ed1
      Ingo Molnar committed
      This build warning:
      
       kernel/sched.c: In function 'set_task_cpu':
       kernel/sched.c:2070: warning: unused variable 'old_rq'
      
      Made me realize that the forced2_migrations stat looks pretty
      pointless (and a misnomer) - remove it.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_event: Fix variable initialization in other codepaths · 5e855db5
      Xiao Guangrong committed
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B20BAA6.7010609@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 10 Dec, 2009 11 commits
    • sched: Fix memory leak in two error corner cases · dfc12eb2
      Phil Carmody committed
      If the second in each of these pairs of allocations fails, then the
      first one will not be freed in the error route out.
      
      Found by a static code analysis tool.
      Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1260448177-28448-1-git-send-email-ext-phil.2.carmody@nokia.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
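The leak pattern in the sched commit (the second allocation of a pair fails, the first is never freed) can be sketched in userspace as follows. All names here are hypothetical; `flaky_alloc()` and `fail_after` exist only so that the failure path can be exercised.

```c
#include <stdlib.h>

struct pair {
    int *a;
    int *b;
};

typedef void *(*alloc_fn)(size_t);

/* Hypothetical allocator that succeeds fail_after times, then fails. */
static int fail_after;

static void *flaky_alloc(size_t n)
{
    if (fail_after-- <= 0)
        return NULL;
    return malloc(n);
}

/* Allocate both members. If the second allocation fails, release
 * the first before returning, so the error path does not leak. */
static int pair_init(struct pair *p, alloc_fn alloc)
{
    p->a = alloc(sizeof(*p->a));
    if (!p->a)
        return -1;

    p->b = alloc(sizeof(*p->b));
    if (!p->b) {
        free(p->a);     /* the step missing in the leaky version */
        p->a = NULL;
        return -1;
    }
    return 0;
}
```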
    • hrtimer: move timer stats helper functions to hrtimer.c · 5f201907
      Heiko Carstens committed
      There is no reason to make timer_stats_hrtimer_set_start_info and
      friends visible to the rest of the kernel. So move all of them to
      hrtimer.c.  Also make timer_stats_hrtimer_set_start_info a static
      inline function so it gets inlined and we avoid another function call.
      Based on a patch by Thomas Gleixner.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <20091210095629.GC4144@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • hrtimer: Tune hrtimer_interrupt hang logic · 41d2e494
      Thomas Gleixner committed
      The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
      execution time of the hrtimer callbacks.
      
      This is error-prone for virtual machines, where a guest vcpu can be
      scheduled out during the execution of the callbacks (and the callbacks
      themselves can do operations that translate to blocking operations in
      the hypervisor), which in turn can lead to a large min_delta_ns, rendering
      the system unusable.
      
      Replace the current heuristics with something more reliable. Allow the
      interrupt code to try 3 times to catch up with the lost time. If that
      fails use the total time spent in the interrupt handler to defer the
      next timer interrupt so the system can catch up with other things
      which got delayed. Limit that deferment to 100ms.
      
      The retry events and the maximum time spent in the interrupt handler
      are recorded and exposed via /proc/timer_list
      
      Inspired by a patch from Marcelo.
      Reported-by: Michael Tokarev <mjt@tls.msk.ru>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: kvm@vger.kernel.org
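The replacement heuristic in the hrtimer commit (up to 3 catch-up attempts, then defer by the time spent in the handler, capped at 100 ms) can be reduced to a small decision function. This is a simplified sketch under assumptions: `hang_deferral_ns()` and its parameters are invented for illustration, and the real interrupt handler does considerably more than this.

```c
#include <stdint.h>

#define MAX_RETRIES   3
#define MAX_DEFER_NS  100000000ULL   /* 100 ms cap from the commit text */

/* Decide how long to defer the next timer interrupt.
 * retries_needed: passes the handler needed to catch up (assumed input)
 * spent_ns:       total time spent in the handler (assumed input)
 * Returns 0 when the handler caught up within MAX_RETRIES passes,
 * otherwise the time spent, limited to MAX_DEFER_NS. */
static uint64_t hang_deferral_ns(int retries_needed, uint64_t spent_ns)
{
    if (retries_needed <= MAX_RETRIES)
        return 0;
    return spent_ns < MAX_DEFER_NS ? spent_ns : MAX_DEFER_NS;
}
```

The cap matters in the virtual-machine case the message describes: a vcpu that was scheduled out for a long time would otherwise produce an arbitrarily large deferral.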
    • sched: Fix build warning in get_update_sysctl_factor() · 4ca3ef71
      Mike Galbraith committed
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <new-submission>
    • lockdep: Avoid out of bounds array reference in save_trace() · ea5b41f9
      Luck, Tony committed
      ia64 found this the hard way (because we currently have a stub
      for save_stack_trace() that does nothing). But it would be a
      good idea to be cautious in case a real save_stack_trace()
      bailed out with an error before it set trace->nr_entries.
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: luming.yu@intel.com
      LKML-Reference: <4b2024d085302c2a2@agluck-desktop.sc.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
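The hazard in the lockdep commit is indexing `entries[nr_entries - 1]` when `nr_entries` is 0, which reads before the start of the array. A guarded version might look like the sketch below. The struct layout and the use of `ULONG_MAX` as a trailing terminator are assumptions for this example; the commit message only states that `nr_entries` may be left unset or zero.

```c
#include <limits.h>

/* Simplified stand-in for a stack trace record. */
struct trace {
    unsigned int nr_entries;
    unsigned long *entries;
};

/* Drop a trailing ULONG_MAX terminator if one is present.
 * The nr_entries != 0 test must come first: with zero entries,
 * entries[nr_entries - 1] would index far outside the array. */
static void trim_terminator(struct trace *t)
{
    if (t->nr_entries != 0 &&
        t->entries[t->nr_entries - 1] == ULONG_MAX)
        t->nr_entries--;
}
```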
    • perf_event: Fix perf_swevent_hrtimer() variable initialization · 21140f4d
      Xiao Guangrong committed
      fix:
      
       [<c0477471>] ? printk+0x1d/0x24
       [<c01c98f9>] ? perf_prepare_sample+0x269/0x280
       [<c0149231>] warn_slowpath_common+0x71/0xd0
       [<c01c98f9>] ? perf_prepare_sample+0x269/0x280
       [<c01492aa>] warn_slowpath_null+0x1a/0x20
       [<c01c98f9>] perf_prepare_sample+0x269/0x280
       [<c016e9f3>] ? cpu_clock+0x53/0x90
       [<c01cc368>] __perf_event_overflow+0x2a8/0x300
       [<c01ccc3b>] perf_event_overflow+0x1b/0x30
       [<c01ccccf>] perf_swevent_hrtimer+0x7f/0x120
      
      This is because the 'data.raw' variable is not initialized.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B208E93.1010801@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Remove comparing of NULL to va_list in trace_array_vprintk() · f2942487
      Carsten Emde committed
      Olof Johansson stated the following:
      
        Comparing a va_list with NULL is bogus. It's supposed to be treated like
        an opaque type and only be manipulated with va_* accessors.
      
      Olof noticed that this code broke the ARM builds:
      
          kernel/trace/trace.c: In function 'trace_array_vprintk':
          kernel/trace/trace.c:1364: error: invalid operands to binary == (have 'va_list' and 'void *')
          kernel/trace/trace.c: In function 'tracing_mark_write':
          kernel/trace/trace.c:3349: error: incompatible type for argument 3 of 'trace_vprintk'
      
      This patch partly reverts c13d2f7c and
      re-installs the original mark_printk() mechanism.
      Reported-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Carsten Emde <C.Emde@osadl.org>
      LKML-Reference: <4B1BAB74.104@osadl.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix function graph trace_pipe to properly display failed entries · be1eca39
      Jiri Olsa committed
      There is a case where the graph tracer might get confused and omits
      displaying of a single record.  This applies mostly with the trace_pipe
      since it is unlikely that the trace_seq buffer will overflow with the
      trace file.
      
      As the function_graph tracer goes through the trace entries keeping a
      pointer to the current record:
      
      current ->  func1 ENTRY
                  func2 ENTRY
                  func2 RETURN
                  func1 RETURN
      
      When a function ENTRY is encountered, it moves the pointer to the
      next entry to check if the function is a nested or leaf function.
      
                  func1 ENTRY
      current ->  func2 ENTRY
                  func2 RETURN
                  func1 RETURN
      
      If the rest of the writing of the function fills the trace_seq buffer,
      then the trace_pipe read will ignore this entry. The next read will
      now start at the current location, but the first entry (func1) will
      be discarded.
      
      This patch keeps a copy of the current entry in the iterator private
      storage and will keep track of when the trace_seq buffer fills. When
      the trace_seq buffer fills, it will reuse the copy of the entry in the
      next iteration.
      
      [
        This patch has been largely modified by Steven Rostedt in order to
        clean it up and simplify it. The original idea and concept was from
        Jirka and for that, this patch will go under his name to give him
        the credit he deserves. But because this was modified by Steven Rostedt
        anything wrong with the patch should be blamed on Steven.
      ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1259067458-27143-1-git-send-email-jolsa@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add full state to trace_seq · d184b31c
      Johannes Berg committed
      The trace_seq buffer might fill up, and right now one needs to check the
      return value of each printf into the buffer to check for that.
      
      Instead, have the buffer keep track of whether it is full or not, and
      reject more input if it is full or would have overflowed with an input
      that wasn't added.
      
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
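The "full state" idea from the trace_seq commit can be sketched in userspace: the buffer remembers that it has overflowed once and rejects all later input, so callers need not check every individual write. `struct seq`, `SEQ_SIZE` and `seq_add()` are hypothetical names for this example, and the buffer is kept tiny on purpose.

```c
#include <stdarg.h>
#include <stdio.h>

#define SEQ_SIZE 16   /* deliberately small, for illustration only */

struct seq {
    char buf[SEQ_SIZE];
    size_t len;
    int full;         /* sticky: once set, all further input is rejected */
};

/* Append formatted text. Returns 1 if added, 0 if rejected. */
static int seq_add(struct seq *s, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int n;

    if (s->full)
        return 0;

    room = SEQ_SIZE - s->len;
    va_start(ap, fmt);
    n = vsnprintf(s->buf + s->len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        s->buf[s->len] = '\0';   /* discard the partial write */
        s->full = 1;
        return 0;
    }
    s->len += (size_t)n;
    return 1;
}
```

Once an input has been dropped, even a short string that would fit is refused. That keeps the output from containing a gap followed by later data, which matches the behavior the message describes.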
    • tracing: Buffer the output of seq_file in case of filled buffer · a63ce5b3
      Steven Rostedt committed
      If the seq_read fills the buffer it will call s_start again on the next
      iteration with the same position. This causes a problem with the
      function_graph tracer because it consumes the iteration in order to
      determine leaf functions.
      
      What happens is that the iterator stores the entry, and the function
      graph plugin will look at the next entry. If that next entry is a return
      of the same function and task, then the function is a leaf and the
      function_graph plugin calls ring_buffer_read which moves the ring buffer
      iterator forward (the trace iterator still points to the function start
      entry).
      
      The copying of the trace_seq to the seq_file buffer will fail if the
      seq_file buffer is full. The seq_read will not show this entry.
      The next read by userspace will cause seq_read to again call s_start
      which will reuse the trace iterator entry (the function start entry).
      But the function return entry was already consumed. The function graph
      plugin will think that this entry is a nested function and not a leaf.
      
      To solve this, the trace code now checks the return status of the
      seq_printf (trace_print_seq). If the writing to the seq_file buffer
      fails, we set a flag in the iterator (leftover) and we do not reset
      the trace_seq buffer. On the next call to s_start, we check the leftover
      flag, and if it is set, we just reuse the trace_seq buffer and do not
      call into the plugin print functions.
      
      Before this patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)               |          __fsnotify_parent() {
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      After the patch:
      
       2)               |      fput() {
       2)               |        __fput() {
       2)   0.550 us    |          inotify_inode_queue_event();
       2)   0.548 us    |          __fsnotify_parent();
       2)   0.540 us    |          inotify_dentry_parent_queue_event();
      
      [
        Updated the patch to fix a missing return 0 from the trace_print_seq()
        stub when CONFIG_TRACING is disabled.
        Reported-by: Ingo Molnar <mingo@elte.hu>
      ]
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
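The leftover mechanism described in the seq_file commit can be modelled with a small userspace sketch. Every name here (`struct iter`, `struct sink`, `show()`) is hypothetical. The point is only that formatting an entry has a side effect (it consumes input), so a failed copy must be retried from the already formatted buffer rather than by formatting again.

```c
#include <stdio.h>
#include <string.h>

struct iter {
    char seq[32];     /* formatted entry, stands in for trace_seq */
    int leftover;     /* set when seq could not be copied out */
    int next_entry;   /* cursor; formatting an entry advances it */
};

/* Output buffer with limited space, stands in for seq_file. */
struct sink {
    char buf[64];
    size_t len;
    size_t cap;       /* must not exceed sizeof(buf) */
};

static int sink_write(struct sink *o, const char *s)
{
    size_t n = strlen(s);

    if (o->len + n > o->cap)
        return -1;
    memcpy(o->buf + o->len, s, n);
    o->len += n;
    return 0;
}

/* Emit one entry. If the previous copy failed, reuse the formatted
 * text instead of consuming another entry from the cursor. */
static int show(struct iter *it, struct sink *o)
{
    if (!it->leftover)
        snprintf(it->seq, sizeof(it->seq), "entry%d;", it->next_entry++);

    it->leftover = (sink_write(o, it->seq) != 0);
    return it->leftover ? -1 : 0;
}
```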
    • tracing: Only call pipe_close if pipe_close is defined · 29bf4a5e
      Steven Rostedt committed
      This fixes a cut and paste error that had pipe_close get called
      if pipe_open was defined (not pipe_close).
      Reported-by: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com>
      LKML-Reference: <20091209153204.F4CD.A69D9226@jp.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
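The cut and paste error fixed in the pipe_close commit is a general pattern with optional callbacks: the pointer that is tested must be the pointer that is called. A minimal sketch, with hypothetical names (`struct tracer_ops`, `release_pipe()`, `count_call()`):

```c
#include <stddef.h>

struct tracer_ops {
    void (*pipe_open)(void *ctx);    /* optional */
    void (*pipe_close)(void *ctx);   /* optional */
};

/* Example callback: counts how often it was invoked. */
static void count_call(void *ctx)
{
    ++*(int *)ctx;
}

static void release_pipe(const struct tracer_ops *ops, void *ctx)
{
    /* Test pipe_close itself. Testing pipe_open here would call a
     * NULL pointer for a tracer that defines only pipe_open. */
    if (ops->pipe_close)
        ops->pipe_close(ctx);
}
```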
  7. 09 Dec, 2009 2 commits