1. 01 5月, 2013 1 次提交
    • T
      dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo 提交于
      show_regs() is inherently arch-dependent but it does make sense to print
      generic debug information and some archs already do albeit in slightly
      different forms.  This patch introduces a generic function to print debug
      information from show_regs() so that different archs print out the same
      information and it's much easier to modify what's printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
      * s390 is special in that it used to print arch-specific information
        along with generic debug info.  Heiko and Martin think that the
        arch-specific extra isn't worth keeping s390 specfic implementation.
        Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: NJesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
  2. 20 6月, 2012 1 次提交
  3. 06 6月, 2012 1 次提交
  4. 09 5月, 2012 1 次提交
  5. 28 1月, 2012 1 次提交
  6. 27 1月, 2012 1 次提交
    • P
      bugs, x86: Fix printk levels for panic, softlockups and stack dumps · b0f4c4b3
      Prarit Bhargava 提交于
      rsyslog will display KERN_EMERG messages on a connected
      terminal.  However, these messages are useless/undecipherable
      for a general user.
      
      For example, after a softlockup we get:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Stack:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Call Trace:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
       d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <e8> ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89
      
      This happens because the printk levels for these messages are
      incorrect. Only an informational message should be displayed on
      a terminal.
      
      I modified the printk levels for various messages in the kernel
      and tested the output by using the drivers/misc/lkdtm.c kernel
      modules (ie, softlockups, panics, hard lockups, etc.) and
      confirmed that the console output was still the same and that
      the output to the terminals was correct.
      
      For example, in the case of a softlockup we now see the much
      more informative:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
       BUG: soft lockup - CPU4 stuck for 60s!
      
      instead of the above confusing messages.
      
      AFAICT, the messages no longer have to be KERN_EMERG.  In the
      most important case of a panic we set console_verbose().  As for
      the other less severe cases the correct data is output to the
      console and /var/log/messages.
      
      Successfully tested by me using the drivers/misc/lkdtm.c module.
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: dzickus@redhat.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b0f4c4b3
  7. 20 12月, 2011 1 次提交
  8. 03 7月, 2011 2 次提交
    • F
      x86: Don't use frame pointer to save old stack on irq entry · a2bbe750
      Frederic Weisbecker 提交于
      rbp is used in SAVE_ARGS_IRQ to save the old stack pointer
      in order to restore it later in ret_from_intr.
      
      It is convenient because we save its value in the irq regs
      and it's easily restored using the leave instruction.
      
      However this is a kind of abuse of the frame pointer which
      role is to help unwinding the kernel by chaining frames
      together, each node following the return address to the
      previous frame.
      
      But although we are breaking the frame by changing the stack
      pointer, there is no preceding return address before the new
      frame. Hence using the frame pointer to link the two stacks
      breaks the stack unwinders that find a random value instead of
      a return address here.
      
      There is no workaround that can work in every case. We are using
      the fixup_bp_irq_link() function to dereference that abused frame
      pointer in the case of non nesting interrupt (which means stack
      changed).
      But that doesn't fix the case of interrupts that don't change the
      stack (but we still have the unconditional frame link), which is
      the case of hardirq interrupting softirq. We have no way to detect
      this transition so the frame irq link is considered as a real frame
      pointer and the return address is dereferenced but it is still a
      spurious one.
      
      There are two possible results of this: either the spurious return
      address, a random stack value, luckily belongs to the kernel text
      and then the unwinding can continue and we just have a weird entry
      in the stack trace. Or it doesn't belong to the kernel text and
      unwinding stops there.
      
      This is the reason why stacktraces (including perf callchains) on
      irqs that interrupted softirqs don't work very well.
      
      To solve this, we don't save the old stack pointer on rbp anymore
      but we save it to a scratch register that we push on the new
      stack and that we pop back later on irq return.
      
      This preserves the whole frame chain without spurious return addresses
      in the middle and drops the need for the horrid fixup_bp_irq_link()
      workaround.
      
      And finally irqs that interrupt softirq are sanely unwinded.
      
      Before:
      
          99.81%         perf  [kernel.kallsyms]  [k] perf_pending_event
                         |
                         --- perf_pending_event
                             irq_work_run
                             smp_irq_work_interrupt
                             irq_work_interrupt
                            |
                            |--41.60%-- __read
                            |          |
                            |          |--99.90%-- create_worker
                            |          |          bench_sched_messaging
                            |          |          cmd_bench
                            |          |          run_builtin
                            |          |          main
                            |          |          __libc_start_main
                            |           --0.10%-- [...]
      
      After:
      
           1.64%  swapper  [kernel.kallsyms]  [k] perf_pending_event
                  |
                  --- perf_pending_event
                      irq_work_run
                      smp_irq_work_interrupt
                      irq_work_interrupt
                     |
                     |--95.00%-- arch_irq_work_raise
                     |          irq_work_queue
                     |          __perf_event_overflow
                     |          perf_swevent_overflow
                     |          perf_swevent_event
                     |          perf_tp_event
                     |          perf_trace_softirq
                     |          __do_softirq
                     |          call_softirq
                     |          do_softirq
                     |          irq_exit
                     |          |
                     |          |--73.68%-- smp_apic_timer_interrupt
                     |          |          apic_timer_interrupt
                     |          |          |
                     |          |          |--96.43%-- amd_e400_idle
                     |          |          |          cpu_idle
                     |          |          |          start_secondary
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      a2bbe750
    • F
      x86: Fetch stack from regs when possible in dump_trace() · 47ce11a2
      Frederic Weisbecker 提交于
      When regs are passed to dump_stack(), we fetch the frame
      pointer from the regs but the stack pointer is taken from
      the current frame.
      
      Thus the frame and stack pointers may not come from the same
      context. For example this can result in the unwinder to
      think the context is in irq, due to the current value of
      the stack, but the frame pointer coming from the regs points
      to a frame from another place. It then tries to fix up
      the irq link but ends up dereferencing a random frame
      pointer that doesn't belong to the irq stack:
      
      [ 9131.706906] ------------[ cut here ]------------
      [ 9131.707003] WARNING: at arch/x86/kernel/dumpstack_64.c:129 dump_trace+0x2aa/0x330()
      [ 9131.707003] Hardware name: AMD690VM-FMH
      [ 9131.707003] Perf: bad frame pointer = 0000000000000005 in callchain
      [ 9131.707003] Modules linked in:
      [ 9131.707003] Pid: 1050, comm: perf Not tainted 3.0.0-rc3+ #181
      [ 9131.707003] Call Trace:
      [ 9131.707003]  <IRQ>  [<ffffffff8104bd4a>] warn_slowpath_common+0x7a/0xb0
      [ 9131.707003]  [<ffffffff8104be21>] warn_slowpath_fmt+0x41/0x50
      [ 9131.707003]  [<ffffffff8178b873>] ? bad_to_user+0x6d/0x10be
      [ 9131.707003]  [<ffffffff8100c2da>] dump_trace+0x2aa/0x330
      [ 9131.707003]  [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50
      [ 9131.707003]  [<ffffffff8101b164>] perf_callchain_kernel+0x54/0x70
      [ 9131.707003]  [<ffffffff810d391f>] perf_prepare_sample+0x19f/0x2a0
      [ 9131.707003]  [<ffffffff810d546c>] __perf_event_overflow+0x16c/0x290
      [ 9131.707003]  [<ffffffff810d5430>] ? __perf_event_overflow+0x130/0x290
      [ 9131.707003]  [<ffffffff810107d3>] ? native_sched_clock+0x13/0x50
      [ 9131.707003]  [<ffffffff8100fbb9>] ? sched_clock+0x9/0x10
      [ 9131.707003]  [<ffffffff810752e5>] ? T.375+0x15/0x90
      [ 9131.707003]  [<ffffffff81084da4>] ? trace_hardirqs_on_caller+0x64/0x180
      [ 9131.707003]  [<ffffffff810817bd>] ? trace_hardirqs_off+0xd/0x10
      [ 9131.707003]  [<ffffffff810d5764>] perf_event_overflow+0x14/0x20
      [ 9131.707003]  [<ffffffff810d588c>] perf_swevent_hrtimer+0x11c/0x130
      [ 9131.707003]  [<ffffffff817821a1>] ? error_exit+0x51/0xb0
      [ 9131.707003]  [<ffffffff81072e93>] __run_hrtimer+0x83/0x1e0
      [ 9131.707003]  [<ffffffff810d5770>] ? perf_event_overflow+0x20/0x20
      [ 9131.707003]  [<ffffffff81073256>] hrtimer_interrupt+0x106/0x250
      [ 9131.707003]  [<ffffffff812a3bfd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
      [ 9131.707003]  [<ffffffff81024833>] smp_apic_timer_interrupt+0x53/0x90
      [ 9131.707003]  [<ffffffff81789053>] apic_timer_interrupt+0x13/0x20
      [ 9131.707003]  <EOI>  [<ffffffff817821a1>] ? error_exit+0x51/0xb0
      [ 9131.707003]  [<ffffffff8178219c>] ? error_exit+0x4c/0xb0
      [ 9131.707003] ---[ end trace b2560d4876709347 ]---
      
      Fix this by simply taking the stack pointer from regs->sp
      when regs are provided.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      47ce11a2
  9. 18 3月, 2011 1 次提交
    • N
      x86, dumpstack: Correct stack dump info when frame pointer is available · e8e999cf
      Namhyung Kim 提交于
      Current stack dump code scans entire stack and check each entry
      contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
      could mark whether the pointer is valid or not based on value of
      the frame pointer. Invalid entries could be preceded by '?' sign.
      
      However this was not going to happen because scan start point
      was always higher than the frame pointer so that they could not
      meet.
      
      Commit 9c0729dc ("x86: Eliminate bp argument from the stack
      tracing routines") delayed bp acquisition point, so the bp was
      read in lower frame, thus all of the entries were marked
      invalid.
      
      This patch fixes this by reverting above commit while retaining
      stack_frame() helper as suggested by Frederic Weisbecker.
      
      End result looks like below:
      
      before:
      
       [    3.508329] Call Trace:
       [    3.508551]  [<ffffffff814f35c9>] ? panic+0x91/0x199
       [    3.508662]  [<ffffffff814f3739>] ? printk+0x68/0x6a
       [    3.508770]  [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
       [    3.508876]  [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
       [    3.508975]  [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
       [    3.509216]  [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
       [    3.509335]  [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
       [    3.509442]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.509542]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.509641]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
      after:
      
       [    3.522991] Call Trace:
       [    3.523351]  [<ffffffff814f35b9>] panic+0x91/0x199
       [    3.523468]  [<ffffffff814f3729>] ? printk+0x68/0x6a
       [    3.523576]  [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
       [    3.523681]  [<ffffffff81a9821f>] mount_root+0x56/0x5a
       [    3.523780]  [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
       [    3.523885]  [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
       [    3.523987]  [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
       [    3.524228]  [<ffffffff814f6880>] ? restore_args+0x0/0x30
       [    3.524345]  [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
       [    3.524445]  [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
      
       -v5:
         * fix build breakage with oprofile
      
       -v4:
         * use 0 instead of regs->bp
         * separate out printk changes
      
       -v3:
         * apply comment from Frederic
         * add a couple of printk fixes
      Signed-off-by: NNamhyung Kim <namhyung@gmail.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Soren Sandmann <ssp@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e8e999cf
  10. 25 1月, 2011 1 次提交
    • J
      x86-64: Don't use pointer to out-of-scope variable in dump_trace() · 2e5aa682
      Jesper Juhl 提交于
      In arch/x86/kernel/dumpstack_64.c::dump_trace() we have this code:
      
      ...
        		if (!stack) {
        			unsigned long dummy;
        			stack = &dummy;
        			if (task && task != current)
        				stack = (unsigned long *)task->thread.sp;
        		}
      
        		bp = stack_frame(task, regs);
        		/*
        		 * Print function call entries in all stacks, starting at the
        		 * current stack address. If the stacks consist of nested
        		 * exceptions
        		 */
        		tinfo = task_thread_info(task);
      
        		for (;;) {
        			char *id;
        			unsigned long *estack_end;
        			estack_end = in_exception_stack(cpu, (unsigned long)stack,
        							&used, &id);
      ...
      
      You'll notice that we assign to 'stack' the address of the variable
      'dummy' which is only in-scope inside the 'if (!stack)'. So when we later
      access stack (at the end of the above, and assuming we did not take the
      'if (task && task != current)' branch) we'll be using the address of a
      variable that is no longer in scope. I believe this patch is the proper
      fix, but I freely admit that I'm not 100% certain.
      Signed-off-by: NJesper Juhl <jj@chaosbits.net>
      LKML-Reference: <alpine.LNX.2.00.1101242232590.10252@swampdragon.chaosbits.net>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      2e5aa682
  11. 18 11月, 2010 1 次提交
    • S
      x86: Eliminate bp argument from the stack tracing routines · 9c0729dc
      Soeren Sandmann Pedersen 提交于
      The various stack tracing routines take a 'bp' argument in which the
      caller is supposed to provide the base pointer to use, or 0 if doesn't
      have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
      defined, this means all callers in principle should either always pass
      0, or be conditional on CONFIG_FRAME_POINTER.
      
      However, there are only really three use cases for stack tracing:
      
      (a) Trace the current task, including IRQ stack if any
      (b) Trace the current task, but skip IRQ stack
      (c) Trace some other task
      
      In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
      be 0.  If it _is_ defined, then
      
      - in case (a) bp should be gotten directly from the CPU's register, so
        the caller should pass NULL for regs,
      
      - in case (b) the caller should should pass the IRQ registers to
        dump_trace(),
      
      - in case (c) bp should be gotten from the top of the task's stack, so
        the caller should pass NULL for regs.
      
      Hence, the bp argument is not necessary because the combination of
      task and regs is sufficient to determine an appropriate value for bp.
      
      This patch introduces a new inline function stack_frame(task, regs)
      that computes the desired bp. This function is then called from the
      two versions of dump_stack().
      Signed-off-by: NSoren Sandmann <ssp@redhat.com>
      Acked-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@infradead.org>,
      Cc: Frederic Weisbecker <fweisbec@gmail.com>,
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>,
      LKML-Reference: <m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com>>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      9c0729dc
  12. 24 10月, 2010 1 次提交
    • J
      x86, printk: Get rid of <0> from stack output · e4072a9a
      Jiri Slaby 提交于
      The stack output currently looks like this:
      
       7fffffffffffffff 0000000a00000000 ffffffff81093341 0000000000000046
      <0> ffff88003a545fd8 0000000000000000 0000000000000000 00007fffa39769c0
      <0> ffff88003e403f58 ffffffff8102fc4c ffff88003e403f58 ffff88003e403f78
      
      The superfluous <0> are caused by recent printk KERN_CONT
      change. <*> is now ignored in printk unless some text follows
      the level and even then it still has to be the first in the
      format message.
      
      Note that the log_lvl parameter is now completely ignored in
      show_stack_log_lvl and the stack is dumped with the default
      level (like for quite some time already). It behaves the same as
      the rest of the dump, function traces are dumped in the very
      same manner. Only Code and maybe some lines are printed with
      EMERG level.
      
      Unfortunately I see no way how to fix this conceptually to have
      the whole oops/BUG/panic output with the same level, so this
      removed only the superfluous characters for the time being.
      
      Just for illustration:
      
      <4>Process kworker/0:0 (pid: 0, threadinfo ffff88003c8a6000, task ffff88003c85c100)
      <0>Stack:
      <4> ffffffff818022c0 0000000a00000001 0000000000000001 0000000000000046
      <4> ffff88003c8a7fd8 0000000000000001 ffff88003c8a7e58 0000000000000000
      <4> ffff88003e503f48 ffffffff8102fc4c ffff88003e503f48 ffff88003e503f68
      <0>Call Trace:
      <0> <IRQ>
      <4> [<ffffffff8102fc4c>] ? call_softirq+0x1c/0x30 ...
      <0>Code: 00 01 00 00 65 8b 04 25 80 c5 00 00 c7 45 ...
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: jirislaby@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1287586131-16222-1-git-send-email-jslaby@suse.cz>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e4072a9a
  13. 09 6月, 2010 1 次提交
    • F
      x86: Unify dumpstack.h and stacktrace.h · c9cf4dbb
      Frederic Weisbecker 提交于
      arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
      declare headers of objects that deal with the same topic.
      Actually most of the files that include stacktrace.h also include
      dumpstack.h
      
      Although dumpstack.h seems more reserved for internals of stack
      traces, those are quite often needed to define specialized stack
      trace operations. And perf event arch headers are going to need
      access to such low level operations anyway. So don't continue to
      bother with dumpstack.h as it's not anymore about isolated deep
      internals.
      
      v2: fix struct stack_frame definition conflict in sysprof
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Soeren Sandmann <sandmann@daimi.au.dk>
      c9cf4dbb
  14. 10 3月, 2010 1 次提交
  15. 03 3月, 2010 1 次提交
    • F
      x86/stacktrace: Don't dereference bad frame pointers · 29044ad1
      Frederic Weisbecker 提交于
      Callers of a stacktrace might pass bad frame pointers. Those
      are usually checked for safety in stack walking helpers before
      any dereferencing, but this is not the case when we need to go
      through one more frame pointer that backlinks the irq stack to
      the previous one, as we don't have any reliable address boudaries
      to compare this frame pointer against.
      
      This raises crashes when we record callchains for ftrace events
      with perf because we don't use the right helpers to capture
      registers there. We get wrong frame pointers as we call
      task_pt_regs() even on kernel threads, which is a wrong thing
      as it gives us the initial state of any kernel threads freshly
      created. This is even not what we want for user tasks. What we want
      is a hot snapshot of registers when the ftrace event triggers, not
      the state before a task entered the kernel.
      
      This requires more thoughts to do it correctly though.
      So first put a guardian to ensure the given frame pointer
      can be dereferenced to avoid crashes. We'll think about how to fix
      the callers in a subsequent patch.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: 2.6.33.x <stable@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      29044ad1
  16. 04 2月, 2010 1 次提交
  17. 13 1月, 2010 1 次提交
  18. 17 12月, 2009 1 次提交
    • F
      perf events, x86/stacktrace: Make stack walking optional · 61c1917f
      Frederic Weisbecker 提交于
      The current print_context_stack helper that does the stack
      walking job is good for usual stacktraces as it walks through
      all the stack and reports even addresses that look unreliable,
      which is nice when we don't have frame pointers for example.
      
      But we have users like perf that only require reliable
      stacktraces, and those may want a more adapted stack walker, so
      lets make this function a callback in stacktrace_ops that users
      can tune for their needs.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      61c1917f
  19. 06 12月, 2009 1 次提交
    • F
      x86: Fixup wrong irq frame link in stacktraces · af2d8289
      Frederic Weisbecker 提交于
      When we enter in irq, two things can happen to preserve the link
      to the previous frame pointer:
      
      - If we were in an irq already, we don't switch to the irq stack
        as we are inside. We just need to save the previous frame
        pointer and to link the new one to the previous.
      
      - Otherwise we need another level of indirection. We enter the irq with
        the previous stack. We save the previous bp inside and make bp
        pointing to its saved address. Then we switch to the irq stack and
        push bp another time but to the new stack. This makes two levels to
        dereference instead of one.
      
      In the second case, the current stacktrace code omits the second level
      and loses the frame pointer accuracy. The stack that follows will then
      be considered as unreliable.
      
      Handling that makes the perf callchain happier.
      Before:
      
      43.94%  [k] _raw_read_lock
                  |
                  --- _read_lock
                     |
                     |--60.53%-- send_sigio
                     |          __kill_fasync
                     |          kill_fasync
                     |          evdev_pass_event
                     |          evdev_event
                     |          input_pass_event
                     |          input_handle_event
                     |          input_event
                     |          synaptics_process_byte
                     |          psmouse_handle_byte
                     |          psmouse_interrupt
                     |          serio_interrupt
                     |          i8042_interrupt
                     |          handle_IRQ_event
                     |          handle_edge_irq
                     |          handle_irq
                     |          __irqentry_text_start
                     |          ret_from_intr
                     |          |
                     |          |--30.43%-- __select
                     |          |
                     |          |--17.39%-- 0x454f15
                     |          |
                     |          |--13.04%-- __read
                     |          |
                     |          |--13.04%-- vread_hpet
                     |          |
                     |          |--13.04%-- _xcb_lock_io
                     |          |
                     |           --13.04%-- 0x7f630878ce8
      
      After:
      
          50.00%  [k] _raw_read_lock
                  |
                  --- _read_lock
                     |
                     |--98.97%-- send_sigio
                     |          __kill_fasync
                     |          kill_fasync
                     |          evdev_pass_event
                     |          evdev_event
                     |          input_pass_event
                     |          input_handle_event
                     |          input_event
                     |          |
                     |          |--96.88%-- synaptics_process_byte
                     |          |          psmouse_handle_byte
                     |          |          psmouse_interrupt
                     |          |          serio_interrupt
                     |          |          i8042_interrupt
                     |          |          handle_IRQ_event
                     |          |          handle_edge_irq
                     |          |          handle_irq
                     |          |          __irqentry_text_start
                     |          |          ret_from_intr
                     |          |          |
                     |          |          |--39.78%-- __const_udelay
                     |          |          |          |
                     |          |          |          |--91.89%-- ath5k_hw_register_timeout
                     |          |          |          |          ath5k_hw_noise_floor_calibration
                     |          |          |          |          ath5k_hw_reset
                     |          |          |          |          ath5k_reset
                     |          |          |          |          ath5k_config
                     |          |          |          |          ieee80211_hw_config
                     |          |          |          |          |
                     |          |          |          |          |--88.24%-- ieee80211_scan_work
                     |          |          |          |          |          worker_thread
                     |          |          |          |          |          kthread
                     |          |          |          |          |          child_rip
                     |          |          |          |          |
                     |          |          |          |           --11.76%-- ieee80211_scan_completed
                     |          |          |          |                     ieee80211_scan_work
                     |          |          |          |                     worker_thread
                     |          |          |          |                     kthread
                     |          |          |          |                     child_rip
                     |          |          |          |
                     |          |          |           --8.11%-- ath5k_hw_noise_floor_calibration
                     |          |          |                     ath5k_hw_reset
                     |          |          |                     ath5k_reset
                     |          |          |                     ath5k_config
      
      Note: This does not only affect perf events but also x86-64
      stacktraces. They were considered as unreliable once we quit
      the irq stack frame.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      af2d8289
  20. 26 11月, 2009 2 次提交
    • I
      x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks · 67f2de0b
      Ingo Molnar 提交于
      This warning:
      
      [  847.140022] rb_producer   D 0000000000000000  5928   519      2 0x00000000
      [  847.203627] BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/517
      [  847.207360] caller is show_stack_log_lvl+0x2e/0x241
      [  847.210364] Pid: 517, comm: khungtaskd Not tainted 2.6.32-rc8-tip+ #13761
      [  847.213395] Call Trace:
      [  847.215847]  [<ffffffff81413bde>] debug_smp_processor_id+0x1f0/0x20a
      [  847.216809]  [<ffffffff81015eae>] show_stack_log_lvl+0x2e/0x241
      [  847.220027]  [<ffffffff81018512>] show_stack+0x1c/0x1e
      [  847.223365]  [<ffffffff8107b7db>] sched_show_task+0xe4/0xe9
      [  847.226694]  [<ffffffff8112f21f>] check_hung_task+0x140/0x199
      [  847.230261]  [<ffffffff8112f4a8>] check_hung_uninterruptible_tasks+0x1b7/0x20f
      [  847.233371]  [<ffffffff8112f500>] ? watchdog+0x0/0x50
      [  847.236683]  [<ffffffff8112f54e>] watchdog+0x4e/0x50
      [  847.240034]  [<ffffffff810cee56>] kthread+0x97/0x9f
      [  847.243372]  [<ffffffff81012aea>] child_rip+0xa/0x20
      [  847.246690]  [<ffffffff81e43494>] ? restore_args+0x0/0x30
      [  847.250019]  [<ffffffff81e43083>] ? _spin_lock+0xe/0x10
      [  847.253351]  [<ffffffff810cedbf>] ? kthread+0x0/0x9f
      [  847.256833]  [<ffffffff81012ae0>] ? child_rip+0x0/0x20
      
      Happens because on preempt-RCU, khungd calls show_stack() with
      preemption enabled.
      
      Make sure we are not preemptible while walking the IRQ and exception
      stacks on 64-bit. (32-bit stack dumping is preemption safe.)
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      67f2de0b
    • I
      x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details · b8030906
      Ingo Molnar 提交于
      Make the initialization more readable, plus tidy up a few small
      visual details as well.
      
      No change in functionality.
      
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8030906
  21. 24 9月, 2009 1 次提交
  22. 02 7月, 2009 1 次提交
    • F
      perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d
      Frederic Weisbecker 提交于
      About every callchains recorded with perf record are filled up
      including the internal perfcounter nmi frame:
      
       perf_callchain
       perf_counter_overflow
       intel_pmu_handle_irq
       perf_counter_nmi_handler
       notifier_call_chain
       atomic_notifier_call_chain
       notify_die
       do_nmi
       nmi
      
      We want ignore this frame as it's not interesting for
      instrumentation. To solve this, we simply ignore every frames
      from nmi context.
      
      New example of "perf report -s sym -c" after this patch:
      
      9.59%  [k] search_by_key
                   4.88%
                      search_by_key
                      reiserfs_read_locked_inode
                      reiserfs_iget
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      
                   3.19%
                      search_by_key
                      search_by_entry_key
                      reiserfs_find_entry
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      [...]
      
      For now this patch only solves the problem in x86-64.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0406ca6d
  23. 18 1月, 2009 2 次提交
  24. 03 12月, 2008 1 次提交
    • S
      ftrace: print real return in dumpstack for function graph · 7ee991fb
      Steven Rostedt 提交于
      Impact: better dumpstack output
      
      I noticed in my crash dumps and even in the stack tracer that a
      lot of functions listed in the stack trace are simply
      return_to_handler which is ftrace graphs way to insert its own
      call into the return of a function.
      
      But we lose out where the actually function was called from.
      
      This patch adds in hooks to the dumpstack mechanism that detects
      this and finds the real function to print. Both are printed to
      let the user know that a hook is still in place.
      
      This does give a funny side effect in the stack tracer output:
      
              Depth   Size      Location    (80 entries)
              -----   ----      --------
        0)     4144      48   save_stack_trace+0x2f/0x4d
        1)     4096     128   ftrace_call+0x5/0x2b
        2)     3968      16   mempool_alloc_slab+0x16/0x18
        3)     3952     384   return_to_handler+0x0/0x73
        4)     3568    -240   stack_trace_call+0x11d/0x209
        5)     3808     144   return_to_handler+0x0/0x73
        6)     3664    -128   mempool_alloc+0x4d/0xfe
        7)     3792     128   return_to_handler+0x0/0x73
        8)     3664     -32   scsi_sg_alloc+0x48/0x4a [scsi_mod]
      
      As you can see, the real functions are now negative. This is due
      to them not being found inside the stack.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7ee991fb
  25. 28 10月, 2008 1 次提交
    • N
      x86: unify appropriate bits from dumpstack_32 and dumpstack_64 · 878719e8
      Neil Horman 提交于
      Impact: cleanup
      
      As promised, now that dumpstack_32 and dumpstack_64 have so many bits
      in common, we should merge the in-sync bits into a common file, to
      prevent them from diverging again.
      
      This patch removes bits which are common between dumpstack_32.c and
      dumpstack_64.c and places them in a common dumpstack.c which is built
      for both 32 and 64 bit arches.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NAlexander van Heukelum <heukelum@fastmail.fm>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      
       Makefile       |    2
       arch/x86/kernel/Makefile       |    2
       arch/x86/kernel/Makefile       |    2
       arch/x86/kernel/Makefile       |    2
       arch/x86/kernel/Makefile       |    2
       arch/x86/kernel/Makefile       |    2
       arch/x86/kernel/dumpstack.c    |  319 +++++++++++++++++++++++++++++++++++++++++
       arch/x86/kernel/dumpstack.h    |   39 +++++
       arch/x86/kernel/dumpstack_32.c |  294 -------------------------------------
       arch/x86/kernel/dumpstack_64.c |  285 ------------------------------------
       5 files changed, 363 insertions(+), 576 deletions(-)
      878719e8
  26. 22 10月, 2008 5 次提交
  27. 17 10月, 2008 1 次提交
  28. 13 10月, 2008 6 次提交