1. 26 Oct, 2016 1 commit
    • x86/dumpstack: Remove kernel text addresses from stack dump · bb5e5ce5
      Josh Poimboeuf committed
      Printing kernel text addresses in stack dumps is of questionable value,
      especially now that address randomization is becoming common.
      
      It can be a security issue because it leaks kernel addresses.  It also
      affects the usefulness of the stack dump.  Linus says:
      
        "I actually spend time cleaning up commit messages in logs, because
        useless data that isn't actually information (random hex numbers) is
        actively detrimental.
      
        It makes commit logs less legible.
      
        It also makes it harder to parse dumps.
      
        It's not useful. That makes it actively bad.
      
        I probably look at more oops reports than most people. I have not
        found the hex numbers useful for the last five years, because they are
        just randomized crap.
      
        The stack content thing just makes code scroll off the screen etc, for
        example."
      
      The only real downside to removing these addresses is that they can be
      used to disambiguate duplicate symbol names.  However such cases are
      rare, and the context of the stack dump should be enough to be able to
      figure it out.
      
      There's now a 'faddr2line' script which can be used to convert a
      function address to a file name and line:
      
        $ ./scripts/faddr2line ~/k/vmlinux write_sysrq_trigger+0x51/0x60
        write_sysrq_trigger+0x51/0x60:
        write_sysrq_trigger at drivers/tty/sysrq.c:1098
      
      Or gdb can be used:
      
        $ echo "list *write_sysrq_trigger+0x51" |gdb ~/k/vmlinux |grep "is in"
        (gdb) 0xffffffff815b5d83 is in driver_probe_device (/home/jpoimboe/git/linux/drivers/base/dd.c:378).
      
      (But note that when there are duplicate symbol names, gdb will only show
      the first symbol it finds.  faddr2line is recommended over gdb because
      it handles duplicates and it also does function size checking.)
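
As a rough illustration of the `symbol+offset/size` notation that faddr2line takes as input, the following C sketch splits such a string and rejects offsets that fall outside the function. The names `parse_func_addr()` and `struct func_addr` are hypothetical and are not part of scripts/faddr2line; the sketch only checks the offset against the size given in the string itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "symbol+0xOFFSET/0xSIZE" entry. */
struct func_addr {
	char name[128];
	unsigned long offset;
	unsigned long size;
};

/*
 * Split a call-trace entry such as "write_sysrq_trigger+0x51/0x60".
 * Returns 0 on success, -1 if the string is malformed or the offset
 * does not fall inside the function.
 */
static int parse_func_addr(const char *s, struct func_addr *out)
{
	const char *plus = strrchr(s, '+');
	size_t len;
	int consumed = 0;

	if (!plus || plus == s)
		return -1;

	len = (size_t)(plus - s);
	if (len >= sizeof(out->name))
		return -1;
	memcpy(out->name, s, len);
	out->name[len] = '\0';

	/* %lx accepts an optional "0x" prefix; %n records how much was read. */
	if (sscanf(plus, "+%lx/%lx%n", &out->offset, &out->size, &consumed) != 2)
		return -1;
	if (plus[consumed] != '\0')
		return -1;

	/* Simple size check: the offset must lie within the function. */
	if (out->offset >= out->size)
		return -1;

	return 0;
}
```

Calling `parse_func_addr("write_sysrq_trigger+0x51/0x60", &fa)` fills in the name, offset 0x51 and size 0x60, while an entry whose offset equals or exceeds its size is rejected.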
      
      Here's an example of what a stack dump looks like after this change:
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: sysrq_handle_crash+0x45/0x80
        PGD 36bfa067 [   29.650644] PUD 7aca3067
        Oops: 0002 [#1] PREEMPT SMP
        Modules linked in: ...
        CPU: 1 PID: 786 Comm: bash Tainted: G            E   4.9.0-rc1+ #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
        task: ffff880078582a40 task.stack: ffffc90000ba8000
        RIP: 0010:sysrq_handle_crash+0x45/0x80
        RSP: 0018:ffffc90000babdc8 EFLAGS: 00010296
        RAX: ffff880078582a40 RBX: 0000000000000063 RCX: 0000000000000001
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000292
        RBP: ffffc90000babdc8 R08: 0000000b31866061 R09: 0000000000000000
        R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
        R13: 0000000000000007 R14: ffffffff81ee8680 R15: 0000000000000000
        FS:  00007ffb43869700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000007a3e9000 CR4: 00000000001406e0
        Stack:
         ffffc90000babe00 ffffffff81572d08 ffffffff81572bd5 0000000000000002
         0000000000000000 ffff880079606600 00007ffb4386e000 ffffc90000babe20
         ffffffff81573201 ffff880036a3fd00 fffffffffffffffb ffffc90000babe40
        Call Trace:
         __handle_sysrq+0x138/0x220
         ? __handle_sysrq+0x5/0x220
         write_sysrq_trigger+0x51/0x60
         proc_reg_write+0x42/0x70
         __vfs_write+0x37/0x140
         ? preempt_count_sub+0xa1/0x100
         ? __sb_start_write+0xf5/0x210
         ? vfs_write+0x183/0x1a0
         vfs_write+0xb8/0x1a0
         SyS_write+0x58/0xc0
         entry_SYSCALL_64_fastpath+0x1f/0xc2
        RIP: 0033:0x7ffb42f55940
        RSP: 002b:00007ffd33bb6b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007ffb42f55940
        RDX: 0000000000000002 RSI: 00007ffb4386e000 RDI: 0000000000000001
        RBP: 0000000000000011 R08: 00007ffb4321ea40 R09: 00007ffb43869700
        R10: 00007ffb43869700 R11: 0000000000000246 R12: 0000000000778a10
        R13: 00007ffd33bb5c00 R14: 0000000000000007 R15: 0000000000000010
        Code: 34 e8 d0 34 bc ff 48 c7 c2 3b 2b 57 81 be 01 00 00 00 48 c7 c7 e0 dd e5 81 e8 a8 55 ba ff c7 05 0e 3f de 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 e8 4c 49 bc ff 84 c0 75 c3 48 c7
        RIP: sysrq_handle_crash+0x45/0x80 RSP: ffffc90000babdc8
        CR2: 0000000000000000
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/69329cb29b8f324bb5fcea14d61d224807fb6488.1477405374.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 21 Oct, 2016 2 commits
    • x86/dumpstack: Print any pt_regs found on the stack · 3b3fa11b
      Josh Poimboeuf committed
      Now that we can find pt_regs registers on the stack, print them.  Here's
      an example of what it looks like:
      
        Call Trace:
         <IRQ>
         [<ffffffff8144b793>] dump_stack+0x86/0xc3
         [<ffffffff81142c73>] hrtimer_interrupt+0xb3/0x1c0
         [<ffffffff8105eb86>] local_apic_timer_interrupt+0x36/0x60
         [<ffffffff818b27cd>] smp_apic_timer_interrupt+0x3d/0x50
         [<ffffffff818b06ee>] apic_timer_interrupt+0x9e/0xb0
        RIP: 0010:[<ffffffff818aef43>]  [<ffffffff818aef43>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff880079c4f760  EFLAGS: 00000202
        RAX: ffff880078738000 RBX: ffff88007d3da0c0 RCX: 0000000000000007
        RDX: 0000000000006d78 RSI: ffff8800787388f0 RDI: ffff880078738000
        RBP: ffff880079c4f768 R08: 0000002199088f38 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81e0d540
        R13: ffff8800369fb700 R14: 0000000000000000 R15: ffff880078738000
         <EOI>
         [<ffffffff810e1f14>] finish_task_switch+0xb4/0x250
         [<ffffffff810e1ed6>] ? finish_task_switch+0x76/0x250
         [<ffffffff818a7b61>] __schedule+0x3e1/0xb20
         ...
         [<ffffffff810759c8>] trace_do_page_fault+0x58/0x2c0
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
         [<ffffffff818b1dd8>] async_page_fault+0x28/0x30
        RIP: 0010:[<ffffffff8145b062>]  [<ffffffff8145b062>] __clear_user+0x42/0x70
        RSP: 0018:ffff880079c4fd38  EFLAGS: 00010202
        RAX: 0000000000000000 RBX: 0000000000000138 RCX: 0000000000000138
        RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000061b640
        RBP: ffff880079c4fd48 R08: 0000002198feefd7 R09: ffffffff82a40928
        R10: 0000000000000001 R11: 0000000000000000 R12: 000000000061b640
        R13: 0000000000000000 R14: ffff880079c50000 R15: ffff8800791d7400
         [<ffffffff8145b043>] ? __clear_user+0x23/0x70
         [<ffffffff8145b0fb>] clear_user+0x2b/0x40
         [<ffffffff812fbda2>] load_elf_binary+0x1472/0x1750
         [<ffffffff8129a591>] search_binary_handler+0xa1/0x200
         [<ffffffff8129b69b>] do_execveat_common.isra.36+0x6cb/0x9f0
         [<ffffffff8129b5f3>] ? do_execveat_common.isra.36+0x623/0x9f0
         [<ffffffff8129bcaa>] SyS_execve+0x3a/0x50
         [<ffffffff81003f5c>] do_syscall_64+0x6c/0x1e0
         [<ffffffff818afa3f>] entry_SYSCALL64_slow_path+0x25/0x25
        RIP: 0033:[<00007fd2e2f2e537>]  [<00007fd2e2f2e537>] 0x7fd2e2f2e537
        RSP: 002b:00007ffc449c5fc8  EFLAGS: 00000246
        RAX: ffffffffffffffda RBX: 00007ffc449c8860 RCX: 00007fd2e2f2e537
        RDX: 000000000127cc40 RSI: 00007ffc449c8860 RDI: 00007ffc449c6029
        RBP: 00007ffc449c60b0 R08: 65726f632d667265 R09: 00007ffc449c5e20
        R10: 00000000000005a7 R11: 0000000000000246 R12: 000000000127cc40
        R13: 000000000127ce05 R14: 00007ffc449c6029 R15: 000000000127ce01
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5cc2c512ec82cfba00dd22467644d4ed751a48c0.1476973742.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/dumpstack: Print stack identifier on its own line · 79439d8e
      Josh Poimboeuf committed
      show_trace_log_lvl() prints the stack id (e.g. "<IRQ>") without a
      newline so that any stack address printed after it will appear on the
      same line.  That causes the first stack address to be vertically
      misaligned with the rest, making it visually cluttered and slightly
      confusing:
      
        Call Trace:
         <IRQ> [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI> [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      It will look worse once we start printing pt_regs registers found in the
      middle of the stack:
      
        <IRQ> RIP: 0010:[<ffffffff8189c6c3>]  [<ffffffff8189c6c3>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff88007876f720  EFLAGS: 00000206
        RAX: ffff8800786caa40 RBX: ffff88007d5da140 RCX: 0000000000000007
        ...
      
      Improve readability by adding a newline to the stack name:
      
        Call Trace:
         <IRQ>
         [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI>
         [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      Now that "continued" lines are no longer needed, we can also remove the
      hack of using the empty string (aka KERN_CONT) and replace it with
      KERN_DEFAULT.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/9bdd6dee2c74555d45500939fcc155997dc7889e.1476973742.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 21 Sep, 2016 1 commit
    • x86/dumpstack: Fix show_stack() task pointer regression · 71f5443e
      Josh Poimboeuf committed
      With the following commit:
      
        e18bcccd ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
      
      the task pointer argument to show_stack_log_lvl() in show_stack() was
      inadvertently changed to 'current'.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: byungchul.park@lge.com
      Cc: fweisbec@gmail.com
      Cc: keescook@chromium.org
      Cc: linux-tip-commits@vger.kernel.org
      Cc: luto@amacapital.net
      Cc: nilayvaish@gmail.com
      Cc: rostedt@goodmis.org
      Cc: tip-bot for Josh Poimboeuf <tipbot@zytor.com>
      Fixes: e18bcccd ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
      Link: http://lkml.kernel.org/r/20160920155340.yhewlx7vmgmov5fb@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 20 Sep, 2016 2 commits
    • x86/dumpstack: Remove dump_trace() and related callbacks · c8fe4609
      Josh Poimboeuf committed
      All previous users of dump_trace() have been converted to use the new
      unwind interfaces, so we can remove it and the related
      print_context_stack() and print_context_stack_bp() callback functions.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5b97da3572b40b5a4d8e185cf2429308d0987a13.1474045023.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder · e18bcccd
      Josh Poimboeuf committed
      Convert show_trace_log_lvl() to use the new unwinder.  dump_trace() has
      been deprecated.
      
      show_trace_log_lvl() is special compared to other users of the unwinder.
      It's the only place where both reliable *and* unreliable addresses are
      needed.  With frame pointers enabled, most callers of the unwinder don't
      want to know about unreliable addresses.  But in this case, when we're
      dumping the stack to the console because something presumably went
      wrong, the unreliable addresses are useful:
      
      - They show stale data on the stack which can provide useful clues.
      
      - If something goes wrong with the unwinder, or if frame pointers are
        corrupt or missing, all the stack addresses still get shown.
      
      So in order to show all addresses on the stack, and at the same time
      figure out which addresses are reliable, we have to do the scanning and
      the unwinding in parallel.
      
      The scanning is done with the help of get_stack_info() to traverse the
      stacks.  The unwinding is done separately by the new unwinder.
      
      In theory we could simplify show_trace_log_lvl() by instead pushing some
      of this logic into the unwind code.  But then we would need some kind of
      "fake" frame logic in the unwinder which would add a lot of complexity
      and wouldn't be worth it in order to support only one user.
      
      Another benefit of this approach is that once we have a DWARF unwinder,
      we should be able to just plug it in with minimal impact to this code.
      
      Another change here is that callers of show_trace_log_lvl() don't need
      to provide the 'bp' argument.  The unwinder already finds the relevant
      frame pointer by unwinding until it reaches the first frame after the
      provided stack pointer.
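
The idea of scanning and unwinding in parallel can be sketched in user space as follows. This is a toy model under stated assumptions, not the kernel implementation: the stack is an array of words, "text" is an invented address range, and the frame-pointer unwinder (`struct toy_unwind` and its helpers) is hypothetical. Every text-looking word is reported, and a word counts as reliable only when it sits in the slot the unwinder currently expects.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed text range for this toy model. */
#define TEXT_START 0x1000UL
#define TEXT_END   0x2000UL

static int is_text(unsigned long word)
{
	return word >= TEXT_START && word < TEXT_END;
}

/*
 * Toy frame-pointer unwinder: stack[bp] holds the index of the caller's
 * frame record (0 terminates the chain) and stack[bp + 1] holds the
 * return address.
 */
struct toy_unwind {
	const unsigned long *stack;
	size_t nwords;
	size_t bp;
};

static int toy_done(const struct toy_unwind *u)
{
	return u->bp == 0 || u->bp + 1 >= u->nwords;
}

static size_t toy_ret_slot(const struct toy_unwind *u)
{
	return u->bp + 1;
}

static void toy_next(struct toy_unwind *u)
{
	u->bp = (size_t)u->stack[u->bp];
}

/*
 * Scan every word of the stack. Each text address is recorded; it is
 * flagged reliable only if the unwinder points at that same slot.
 * Returns the number of text addresses found.
 */
static size_t scan_stack(const unsigned long *stack, size_t nwords,
			 size_t first_bp, unsigned long *addrs,
			 int *reliable, size_t max)
{
	struct toy_unwind u = { stack, nwords, first_bp };
	size_t n = 0;

	for (size_t i = 0; i < nwords && n < max; i++) {
		int ok;

		if (!is_text(stack[i]))
			continue;

		ok = !toy_done(&u) && toy_ret_slot(&u) == i;
		addrs[n] = stack[i];
		reliable[n] = ok;
		n++;

		if (ok)
			toy_next(&u);
	}
	return n;
}

/* Print one entry, marking unreliable addresses with "? ". */
static void print_entry(unsigned long addr, int reliable)
{
	printf(" %s0x%lx\n", reliable ? "" : "? ", addr);
}
```

Because the scan visits every word regardless of what the unwinder does, stale text addresses still appear (flagged unreliable) even if the frame-pointer chain is broken.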
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/703b5998604c712a1f801874b43f35d6dac52ede.1474045023.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 16 Sep, 2016 1 commit
    • x86/dumpstack: Remove NULL task pointer convention · 81539169
      Josh Poimboeuf committed
      show_stack_log_lvl() and friends allow a NULL pointer for the
      task_struct to indicate the current task.  This creates confusion and
      can cause sneaky bugs.
      
      Instead require the caller to pass 'current' directly.
      
      This only changes the internal workings of the dumpstack code.  The
      dump_trace() and show_stack() interfaces still allow a NULL task
      pointer.  Those interfaces should probably be fixed as well.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 15 Sep, 2016 1 commit
    • x86/dumpstack: Add get_stack_info() interface · cb76c939
      Josh Poimboeuf committed
      valid_stack_ptr() is buggy: it assumes that all stacks are of size
      THREAD_SIZE, which is not true for exception stacks.  So the
      walk_stack() callbacks will need to know the location of the beginning
      of the stack as well as the end.
      
      Another issue is that in general the various features of a stack (type,
      size, next stack pointer, description string) are scattered around in
      various places throughout the stack dump code.
      
      Encapsulate all that information in a single place with a new stack_info
      struct and a get_stack_info() interface.
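
A minimal user-space sketch of what such an encapsulation might look like is given below. The `toy_` names are hypothetical and only loosely modelled on the description above; the point is that a bounds check uses the recorded begin and end of the particular stack instead of assuming a fixed THREAD_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack types for this sketch. */
enum toy_stack_type {
	TOY_STACK_UNKNOWN,
	TOY_STACK_TASK,
	TOY_STACK_IRQ,
	TOY_STACK_EXCEPTION,
};

/* Everything known about one stack, gathered in a single place. */
struct toy_stack_info {
	enum toy_stack_type type;
	unsigned long *begin;	/* first word of the stack */
	unsigned long *end;	/* one past the last word */
	unsigned long *next_sp;	/* stack pointer to continue from, if any */
};

/* Description string for a stack type. */
static const char *toy_stack_name(enum toy_stack_type type)
{
	switch (type) {
	case TOY_STACK_TASK:
		return "TASK";
	case TOY_STACK_IRQ:
		return "IRQ";
	case TOY_STACK_EXCEPTION:
		return "EXCEPTION";
	default:
		return "UNKNOWN";
	}
}

/*
 * True if the len bytes starting at addr lie entirely within the stack
 * described by info. No stack size is assumed; only begin and end count.
 */
static bool toy_on_stack(const struct toy_stack_info *info,
			 const void *addr, size_t len)
{
	uintptr_t begin = (uintptr_t)info->begin;
	uintptr_t end = (uintptr_t)info->end;
	uintptr_t p = (uintptr_t)addr;

	return p >= begin && p < end && len <= end - p;
}
```

With this shape, a small exception stack and a full-size task stack are checked by the same function, since each carries its own bounds.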
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 08 Sep, 2016 3 commits
  8. 24 Aug, 2016 3 commits
  9. 19 Aug, 2016 1 commit
  10. 15 Jul, 2016 2 commits
  11. 08 Jul, 2016 1 commit
    • x86/dumpstack: Add show_stack_regs() and use it · 81c2949f
      Borislav Petkov committed
      Add a helper to dump supplied pt_regs and use it in the MSR exception
      handling code to have precise stack traces pointing to the actual
      function causing the MSR access exception and not the stack frame of the
      exception handler itself.
      
      The new output looks like this:
      
       unchecked MSR access error: RDMSR from 0xdeadbeef at rIP: 0xffffffff8102ddb6 (early_init_intel+0x16/0x3a0)
        00000000756e6547 ffffffff81c03f68 ffffffff81dd0940 ffffffff81c03f10
        ffffffff81d42e65 0000000001000000 ffffffff81c03f58 ffffffff81d3e5a3
        0000800000000000 ffffffff81800080 ffffffffffffffff 0000000000000000
       Call Trace:
        [<ffffffff81d42e65>] early_cpu_init+0xe7/0x136
        [<ffffffff81d3e5a3>] setup_arch+0xa5/0x9df
        [<ffffffff81d38bb9>] start_kernel+0x9f/0x43a
        [<ffffffff81d38294>] x86_64_start_reservations+0x2f/0x31
        [<ffffffff81d383fe>] x86_64_start_kernel+0x168/0x176
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1467671487-10344-4-git-send-email-bp@alien8.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 25 Jun, 2016 1 commit
    • x86: fix up a few misc stack pointer vs thread_info confusions · aca9c293
      Linus Torvalds committed
      As the actual pointer value is the same for the thread stack allocation
      and the thread_info, code that confused the two worked fine, but will
      break when the thread info is moved away from the stack allocation.  It
      also looks very confusing.
      
      For example, the kprobe code wanted to know the current top of stack.
      To do that, it used this:
      
      	(unsigned long)current_thread_info() + THREAD_SIZE
      
      which did indeed give the correct value.  But it's not only a fairly
      nonsensical expression, it's also rather complex, especially since we
      actually have this:
      
      	static inline unsigned long current_top_of_stack(void)
      
      which not only gives us the value we are interested in, but also
      happens to be how "current_thread_info()" is currently defined:
      
      	(struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
      
      so using current_thread_info() to figure out the top of the stack really
      is a very round-about thing to do.
      
      The other cases are just simpler confusion about task_thread_info() vs
      task_stack_page(), which currently return the same pointer - but if you
      want the stack page, you really should be using the latter one.
      
      And there was one entirely unused assignment of the current stack to a
      thread_info pointer.
      
      All cleaned up to make more sense today, and make it easier to move the
      thread_info away from the stack in the future.
      
      No semantic changes.
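
The arithmetic behind the round-about expression can be checked with a small user-space sketch. All names here are hypothetical, and the 16 KiB `TOY_THREAD_SIZE` is an assumed value chosen only for illustration; addresses are modelled as plain 64-bit integers.

```c
#include <stdint.h>

/* Assumed stack size for this sketch only. */
#define TOY_THREAD_SIZE 0x4000ULL

/* Direct route: the top of the stack is the base plus the stack size. */
static uint64_t toy_top_of_stack(uint64_t stack_base)
{
	return stack_base + TOY_THREAD_SIZE;
}

/* thread_info derived from the top of stack, as in the definition quoted above. */
static uint64_t toy_thread_info_from_top(uint64_t top)
{
	return top - TOY_THREAD_SIZE;
}

/* Round-about route: go through thread_info, then add the size back. */
static uint64_t toy_top_roundabout(uint64_t top)
{
	return toy_thread_info_from_top(top) + TOY_THREAD_SIZE;
}
```

The round-about route subtracts and then re-adds the same constant, so it yields the same value as the direct one only while thread_info shares its address with the stack allocation.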
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 24 Jun, 2016 1 commit
  14. 31 Mar, 2016 1 commit
  15. 16 Mar, 2016 1 commit
  16. 20 Feb, 2016 1 commit
  17. 23 Mar, 2015 1 commit
  18. 24 Feb, 2015 1 commit
  19. 14 Feb, 2015 1 commit
    • x86_64: add KASan support · ef7f0d6a
      Andrey Ryabinin committed
      This patch adds arch-specific code for the kernel address sanitizer.
      
      16TB of virtual address space is used for shadow memory.  It's located
      in the range [ffffec0000000000 - fffffc0000000000], between vmemmap and
      the %esp fixup stacks.
      
      At an early stage we map the whole shadow region with the zero page.
      Later, after pages are mapped to the direct mapping address range, we
      unmap the zero pages from the corresponding shadow (see
      kasan_map_shadow()) and allocate and map real shadow memory, reusing
      the vmemmap_populate() function.
      
      Also replace __pa with __pa_nodebug before the shadow is initialized.
      __pa with CONFIG_DEBUG_VIRTUAL=y makes an external function call
      (__phys_addr).  __phys_addr is instrumented, so __asan_load could be
      called before the shadow area is initialized.
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Yuri Gribov <tetra2005@gmail.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jim Davis <jim.epost@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 24 Apr, 2014 1 commit
    • kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation · 9326638c
      Masami Hiramatsu committed
      Use NOKPROBE_SYMBOL macro for protecting functions
      from kprobes instead of __kprobes annotation under
      arch/x86.
      
      This applies nokprobe_inline annotation for some cases,
      because NOKPROBE_SYMBOL() will inhibit inlining by
      referring to the symbol address.
      
      This just folds a bunch of previous NOKPROBE_SYMBOL()
      cleanup patches for x86 into one patch.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Link: http://lkml.kernel.org/r/20140417081814.26341.51656.stgit@ltc230.yrl.intra.hitachi.co.jp
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Lebon <jlebon@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  21. 13 Nov, 2013 1 commit
    • x86/dumpstack: Fix printk_address for direct addresses · 5f01c988
      Jiri Slaby committed
      Consider a kernel crash in a module, simulated the following way:
      
       static int my_init(void)
       {
               char *map = (void *)0x5;
               *map = 3;
               return 0;
       }
       module_init(my_init);
      
      When we turn off FRAME_POINTERs, the very first instruction in
      that function causes a BUG. The problem is that we print IP in
      the BUG report using %pB (from printk_address). And %pB
      decrements the pointer by one to fix printing addresses of
      functions with tail calls.
      
      This was added in commit 71f9e598 ("x86, dumpstack: Use
      %pB format specifier for stack trace") to fix the call stack
      printouts.
      
      So instead of correct output:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
        IP: [<ffffffffa01ac000>] my_init+0x0/0x10 [pb173]
      
      We get:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000005
        IP: [<ffffffffa0152000>] 0xffffffffa0151fff
      
      To fix that, we use %pB only for stack address printouts (via the
      newly added printk_stack_address) and %pS for regs->ip (via
      printk_address). I.e. we revert to the old behaviour for all
      except call stacks. And since 'reliable' is 1 for all of the
      remaining printk_address callers, we remove that parameter from
      printk_address.
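
The effect of decrementing the address before the symbol lookup can be reproduced with a small user-space model. The symbol table and the `toy_` helpers below are hypothetical; `toy_lookup_backtrace()` mimics the %pB behaviour described above (look up the address minus one), while `toy_lookup_direct()` looks up the address unchanged.

```c
#include <stddef.h>

/* One entry of a hypothetical symbol table. */
struct toy_sym {
	const char *name;
	unsigned long start;
	unsigned long size;
};

/* Find the symbol whose [start, start + size) range contains addr. */
static const struct toy_sym *toy_lookup(const struct toy_sym *tab, size_t n,
					unsigned long addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= tab[i].start && addr - tab[i].start < tab[i].size)
			return &tab[i];
	}
	return NULL;
}

/* Direct lookup: suitable for an instruction pointer such as regs->ip. */
static const struct toy_sym *toy_lookup_direct(const struct toy_sym *tab,
					       size_t n, unsigned long addr)
{
	return toy_lookup(tab, n, addr);
}

/*
 * Backtrace lookup: a return address points just after the call, which
 * may be past the end of the caller when the call is its last
 * instruction, so step back by one before looking it up.
 */
static const struct toy_sym *toy_lookup_backtrace(const struct toy_sym *tab,
						  size_t n, unsigned long addr)
{
	return toy_lookup(tab, n, addr - 1);
}
```

With a function starting at 0x2000, the backtrace lookup of 0x2000 lands on 0x1fff and misses the function, which corresponds to the broken IP line shown above; a return address one past the end of a function is resolved correctly only by the backtrace lookup.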
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: joe@perches.com
      Cc: jirislaby@gmail.com
      Link: http://lkml.kernel.org/r/1382706418-8435-1-git-send-email-jslaby@suse.cz
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  22. 01 May, 2013 2 commits
    • dump_stack: consolidate dump_stack() implementations and unify their behaviors · 196779b9
      Tejun Heo committed
      Both dump_stack() and show_stack() are currently implemented by each
      architecture.  show_stack(NULL, NULL) dumps the backtrace for the
      current task as does dump_stack().  On some archs, dump_stack() prints
      extra information - pid, utsname and so on - in addition to the
      backtrace while the two are identical on other archs.
      
      The usages in arch-independent code of the two functions indicate
      show_stack(NULL, NULL) should print out bare backtrace while
      dump_stack() is used for debugging purposes when something went wrong,
      so it does make sense to print additional information on the task which
      triggered dump_stack().
      
      There's no reason to require archs to implement two separate but mostly
      identical functions.  It leads to unnecessary subtle information.
      
      This patch expands the dummy fallback dump_stack() implementation in
      lib/dump_stack.c such that it prints out debug information (taken from
      x86) and invokes show_stack(NULL, NULL) and drops arch-specific
      dump_stack() implementations in all archs except blackfin.  Blackfin's
      dump_stack() does something wonky that I don't understand.
      
      Debug information can be printed separately by calling
      dump_stack_print_info() so that arch-specific dump_stack()
      implementation can still emit the same debug information.  This is used
      in blackfin.
      
      This patch brings the following behavior changes.
      
      * On some archs, an extra level in backtrace for show_stack() could be
        printed.  This is because the top frame was determined in
        dump_stack() on those archs while generic dump_stack() can't do that
        reliably.  It can be compensated by inlining dump_stack() but not
        sure whether that'd be necessary.
      
      * Most archs didn't use to print debug info on dump_stack().  They do
        now.
      
      An example WARN dump follows.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Hardware name: empty
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a071>] init_workqueues+0x35/0x505
        ...
      
      v2: CPU number added to the generic debug info as requested by s390
          folks and dropped the s390 specific dump_stack().  This loses %ksp
          from the debug message which the maintainers think isn't important
          enough to keep the s390-specific dump_stack() implementation.
      
          dump_stack_print_info() is moved to kernel/printk.c from
          lib/dump_stack.c.  Because linkage is per object file,
          dump_stack_print_info() living in the same lib file as generic
          dump_stack() means that archs which implement custom dump_stack()
          - at this point, only blackfin - can't use dump_stack_print_info()
          as that will bring in the generic version of dump_stack() too.
          The v1 patch broke the build on blackfin due to this issue.  The build
          breakage was reported by Fengguang Wu.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	[s390 bits]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • T
      x86: don't show trace beyond show_stack(NULL, NULL) · a77f2a4e
      Tejun Heo committed
      There are multiple ways a task can be dumped - explicit call to
      dump_stack(), triggering WARN() or BUG(), through sysrq-t and so on.
      Most of what gets printed is up to each architecture and the current
      state is not particularly pretty.  Different pieces of information are
      presented differently depending on which path the dump takes and which
      architecture it's running on.  This is messy for no good reason and
      makes it exceedingly difficult to add or modify debug information to
      task dumps.
      
      In all archs except for s390, there's nothing arch-specific about the
      printed debug information.  This patchset updates all those archs to use
      the same helpers to consistently print out the same debug information.
      
      An example WARN dump after this patchset.
      
       WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
        0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
        ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e
        0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
       Call Trace:
        [<ffffffff81c614dc>] dump_stack+0x19/0x1b
        [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0
        [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8234a0c3>] init_workqueues+0x35/0x505
        ...
      
      And BUG dump.
      
       kernel BUG at kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      This patchset contains the following seven patches.
      
       0001-x86-don-t-show-trace-beyond-show_stack-NULL-NULL.patch
       0002-sparc32-make-show_stack-acquire-fp-if-_ksp-is-not-sp.patch
       0003-dump_stack-consolidate-dump_stack-implementations-an.patch
       0004-dmi-morph-dmi_dump_ids-into-dmi_format_ids-which-for.patch
       0005-dump_stack-implement-arch-specific-hardware-descript.patch
       0006-dump_stack-unify-debug-information-printed-by-show_r.patch
       0007-arc-print-fatal-signals-reduce-duplicated-informatio.patch
      
      0001-0002 update stack dumping functions in x86 and sparc32 in
      preparation.
      
      0003 makes all arches except blackfin use generic dump_stack().
      blackfin still uses the generic helper to print the same info.
      
      0004-0005 properly abstract DMI identifier printing in WARN() and
      show_regs() so that all dumps print out the information.  This enables
      show_regs() to use the same debug info message.
      
      0006 updates show_regs() of all arches to use a common generic helper
      to print debug info.
      
      0007 removes some duplicate information from arc dumps.
      
      While this patchset changes how debug info is printed on some archs,
      the printed information is always a superset of what used to be there.
      
      This patchset makes task dump debug messages consistent and enables
      adding more information.  Workqueue is scheduled to add worker
      information including the workqueue in use and work item specific
      description.
      
      While this patch touches a lot of archs, it isn't too likely to cause
      non-trivial conflicts with arch-specific changes and would probably be
      best to route together through -mm.
      
      x86 is tested but other archs are either only compile tested or not
      tested at all.  Changes to most archs are generally trivial.
      
      This patch:
      
      show_stack(current or NULL, NULL) is used to print the backtrace of the
      current task.  As trace beyond the function itself isn't of much
      interest to anyone, don't show it by determining sp and bp in
      show_stack()'s frame and passing them to show_stack_log_lvl().
      
      This brings show_stack(NULL, NULL)'s behavior in line with
      dump_stack().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 21 Jan, 2013 1 commit
  24. 20 Jun, 2012 1 commit
  25. 06 Jun, 2012 1 commit
  26. 16 May, 2012 1 commit
  27. 09 May, 2012 1 commit
  28. 24 Mar, 2012 1 commit
  29. 13 Mar, 2012 1 commit
  30. 27 Jan, 2012 1 commit
    • P
      bugs, x86: Fix printk levels for panic, softlockups and stack dumps · b0f4c4b3
      Prarit Bhargava committed
      rsyslog will display KERN_EMERG messages on a connected
      terminal.  However, these messages are useless/undecipherable
      for a general user.
      
      For example, after a softlockup we get:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Stack:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Call Trace:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
       kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
       d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <e8> ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89
      
      This happens because the printk levels for these messages are
      incorrect. Only an informational message should be displayed on
      a terminal.
      
      I modified the printk levels for various messages in the kernel
      and tested the output by using the drivers/misc/lkdtm.c kernel
      module (i.e., softlockups, panics, hard lockups, etc.) and
      confirmed that the console output was still the same and that
      the output to the terminals was correct.
      
      For example, in the case of a softlockup we now see the much
      more informative:
      
       Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
       BUG: soft lockup - CPU4 stuck for 60s!
      
      instead of the above confusing messages.
      
      AFAICT, the messages no longer have to be KERN_EMERG.  In the
      most important case of a panic we set console_verbose().  As for
      the other less severe cases the correct data is output to the
      console and /var/log/messages.
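
      The effect of the level change can be modelled in a few lines of
      userspace C.  This is a sketch under stated assumptions, not rsyslog
      or kernel code: it assumes the classic "<N>" record prefix with N in
      0..7, and a simplified daemon policy in which only emergency records
      are written to terminals.  record_level() and
      broadcast_to_terminals() are hypothetical helpers, and the level used
      for the lowered messages in the usage note is illustrative, since the
      text does not say which level each message received.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the KERN_EMERG..KERN_DEBUG ordering: 0 is the most severe. */
enum loglevel {
	LOGLEVEL_EMERG = 0,
	LOGLEVEL_ALERT,
	LOGLEVEL_CRIT,
	LOGLEVEL_ERR,
	LOGLEVEL_WARNING,
	LOGLEVEL_NOTICE,
	LOGLEVEL_INFO,
	LOGLEVEL_DEBUG
};

/* Extract the level from a "<N>text" record; -1 if there is no prefix. */
static int record_level(const char *rec)
{
	assert(rec != NULL);
	if (rec[0] == '<' && rec[1] >= '0' && rec[1] <= '7' && rec[2] == '>')
		return rec[1] - '0';
	return -1;
}

/* Assumed daemon policy: only emergency records reach user terminals. */
static int broadcast_to_terminals(const char *rec)
{
	return record_level(rec) == LOGLEVEL_EMERG;
}
```

      Under this model a record such as "<0>Call Trace:" is broadcast to
      every terminal, while the same text logged as "<4>Call Trace:" still
      reaches the console and the log file but is not broadcast.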
      
      Successfully tested by me using the drivers/misc/lkdtm.c module.
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: dzickus@redhat.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  31. 14 May, 2011 1 commit
  32. 12 May, 2011 1 commit