1. 03 1月, 2018 2 次提交
  2. 24 12月, 2017 1 次提交
    • V
      x86/dumpstack: Indicate in Oops whether PTI is configured and enabled · 5f26d76c
      Vlastimil Babka 提交于
      CONFIG_PAGE_TABLE_ISOLATION is relatively new and intrusive feature that may
      still have some corner cases which could take some time to manifest and be
      fixed. It would be useful to have Oops messages indicate whether it was
      enabled for building the kernel, and whether it was disabled during boot.
      
      Example of fully enabled:
      
      	Oops: 0001 [#1] SMP PTI
      
      Example of enabled during build, but disabled during boot:
      
      	Oops: 0001 [#1] SMP NOPTI
      
      We can decide to remove this after the feature has been tested in the field
      long enough.
      
      [ tglx: Made it use boot_cpu_has() as requested by Borislav ]
      Signed-off-by: NVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NEduardo Valentin <eduval@amazon.com>
      Acked-by: NDave Hansen <dave.hansen@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: bpetkov@suse.de
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: jkosina@suse.cz
      Cc: keescook@google.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5f26d76c
  3. 23 12月, 2017 2 次提交
    • T
      x86/cpu_entry_area: Move it out of the fixmap · 92a0f81d
      Thomas Gleixner 提交于
      Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big
      and 0-day already hit a case where the fixmap PTEs were cleared by
      cleanup_highmap().
      
      Aside of that the fixmap API is a pain as it's all backwards.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      92a0f81d
    • D
      x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack · 4fe2d8b1
      Dave Hansen 提交于
      If the kernel oopses while on the trampoline stack, it will print
      "<SYSENTER>" even if SYSENTER is not involved.  That is rather confusing.
      
      The "SYSENTER" stack is used for a lot more than SYSENTER now.  Give it a
      better string to display in stack dumps, and rename the kernel code to
      match.
      
      Also move the 32-bit code over to the new naming even though it still uses
      the entry stack only for SYSENTER.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4fe2d8b1
  4. 17 12月, 2017 6 次提交
    • A
      x86/entry: Clean up the SYSENTER_stack code · 0f9a4810
      Andy Lutomirski 提交于
      The existing code was a mess, mainly because C arrays are nasty.  Turn
      SYSENTER_stack into a struct, add a helper to find it, and do all the
      obvious cleanups this enables.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bpetkov@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0f9a4810
    • A
      x86/entry/64: Remove the SYSENTER stack canary · 7fbbd5cb
      Andy Lutomirski 提交于
      Now that the SYSENTER stack has a guard page, there's no need for a canary
      to detect overflow after the fact.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.572577316@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7fbbd5cb
    • A
      x86/entry: Remap the TSS into the CPU entry area · 72f5e08d
      Andy Lutomirski 提交于
      This has a secondary purpose: it puts the entry stack into a region
      with a well-controlled layout.  A subsequent patch will take
      advantage of this to streamline the SYSCALL entry code to be able to
      find it more easily.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bpetkov@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      72f5e08d
    • A
      x86/dumpstack: Handle stack overflow on all stacks · 6e60e583
      Andy Lutomirski 提交于
      We currently special-case stack overflow on the task stack.  We're
      going to start putting special stacks in the fixmap with a custom
      layout, so they'll have guard pages, too.  Teach the unwinder to be
      able to unwind an overflow of any of the stacks.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.802057305@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e60e583
    • A
      x86/dumpstack: Add get_stack_info() support for the SYSENTER stack · 33a2f1a6
      Andy Lutomirski 提交于
      get_stack_info() doesn't currently know about the SYSENTER stack, so
      unwinding will fail if we entered the kernel on the SYSENTER stack
      and haven't fully switched off.  Teach get_stack_info() about the
      SYSENTER stack.
      
      With future patches applied that run part of the entry code on the
      SYSENTER stack and introduce an intentional BUG(), I would get:
      
        PANIC: double fault, error_code: 0x0
        ...
        RIP: 0010:do_error_trap+0x33/0x1c0
        ...
        Call Trace:
        Code: ...
      
      With this patch, I get:
      
        PANIC: double fault, error_code: 0x0
        ...
        Call Trace:
         <SYSENTER>
         ? async_page_fault+0x36/0x60
         ? invalid_op+0x22/0x40
         ? async_page_fault+0x36/0x60
         ? sync_regs+0x3c/0x40
         ? sync_regs+0x2e/0x40
         ? error_entry+0x6c/0xd0
         ? async_page_fault+0x36/0x60
         </SYSENTER>
        Code: ...
      
      which is a lot more informative.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      33a2f1a6
    • J
      x86/unwinder: Handle stack overflows more gracefully · b02fcf9b
      Josh Poimboeuf 提交于
      There are at least two unwinder bugs hindering the debugging of
      stack-overflow crashes:
      
      - It doesn't deal gracefully with the case where the stack overflows and
        the stack pointer itself isn't on a valid stack but the
        to-be-dereferenced data *is*.
      
      - The ORC oops dump code doesn't know how to print partial pt_regs, for the
        case where if we get an interrupt/exception in *early* entry code
        before the full pt_regs have been saved.
      
      Fix both issues.
      
      http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@trebleSigned-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bpetkov@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b02fcf9b
  5. 30 7月, 2017 1 次提交
    • A
      x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads · 99504819
      Andy Lutomirski 提交于
      Now that pt_regs properly defines segment fields as 16-bit on 32-bit
      CPUs, there's no need to mask off the high word.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      99504819
  6. 18 7月, 2017 1 次提交
  7. 18 4月, 2017 1 次提交
    • J
      x86/unwind: Ensure stack pointer is aligned · e335bb51
      Josh Poimboeuf 提交于
      With frame pointers disabled, on some older versions of GCC (like
      4.8.3), it's possible for the stack pointer to get aligned at a
      half-word boundary:
      
        00000000000004d0 <fib_table_lookup>:
             4d0:       41 57                   push   %r15
             4d2:       41 56                   push   %r14
             4d4:       41 55                   push   %r13
             4d6:       41 54                   push   %r12
             4d8:       55                      push   %rbp
             4d9:       53                      push   %rbx
             4da:       48 83 ec 24             sub    $0x24,%rsp
      
      In such a case, the unwinder ends up reading the entire stack at the
      wrong alignment.  Then the last read goes past the end of the stack,
      hitting the stack guard page:
      
        BUG: stack guard page was hit at ffffc900217c4000 (stack is ffffc900217c0000..ffffc900217c3fff)
        kernel stack overflow (page fault): 0000 [#1] SMP
        ...
      
      Fix it by ensuring the stack pointer is properly aligned before
      unwinding.
      Reported-by: NJirka Hladky <jhladky@redhat.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 7c7900f8 ("x86/unwind: Add new unwind interface and implementations")
      Link: http://lkml.kernel.org/r/cff33847cc9b02fa548625aa23268ac574460d8d.1492436590.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e335bb51
  8. 27 3月, 2017 1 次提交
    • P
      x86/debug: Implement __WARN() using UD0 · 9a93848f
      Peter Zijlstra 提交于
      By using "UD0" for WARN()s we remove the function call and its possible
      __FILE__ and __LINE__ immediate arguments from the instruction stream.
      
      Total image size will not change much, what we win in the instruction
      stream we'll lose because of the __bug_table entries. Still, saves on
      I$ footprint and the total image size does go down a bit.
      
            text    data       filename
        10702123    4530992    defconfig-build/vmlinux.orig
        10682460    4530992    defconfig-build/vmlinux.patched
      
      (UML didn't seem to use GENERIC_BUG at all, so remove it)
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      9a93848f
  9. 02 3月, 2017 2 次提交
  10. 21 11月, 2016 1 次提交
  11. 18 11月, 2016 1 次提交
  12. 17 11月, 2016 1 次提交
  13. 26 10月, 2016 2 次提交
    • J
      x86/dumpstack: Remove raw stack dump · 0ee1dd9f
      Josh Poimboeuf 提交于
      For mostly historical reasons, the x86 oops dump shows the raw stack
      values:
      
        ...
        [registers]
        Stack:
         ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
         ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
         0000000000000002 0000000000000000 0000000000000000 0000000000000000
        Call Trace:
        ...
      
      This seems to be an artifact from long ago, and probably isn't needed
      anymore.  It generally just adds noise to the dump, and it can be
      actively harmful because it leaks kernel addresses.
      
      Linus says:
      
        "The stack dump actually goes back to forever, and it used to be
         useful back in 1992 or so. But it used to be useful mainly because
         stacks were simpler and we didn't have very good call traces anyway. I
         definitely remember having used them - I just do not remember having
         used them in the last ten+ years.
      
         Of course, it's still true that if you can trigger an oops, you've
         likely already lost the security game, but since the stack dump is so
         useless, let's aim to just remove it and make games like the above
         harder."
      
      This also removes the related 'kstack=' cmdline option and the
      'kstack_depth_to_print' sysctl.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0ee1dd9f
    • J
      x86/dumpstack: Remove kernel text addresses from stack dump · bb5e5ce5
      Josh Poimboeuf 提交于
      Printing kernel text addresses in stack dumps is of questionable value,
      especially now that address randomization is becoming common.
      
      It can be a security issue because it leaks kernel addresses.  It also
      affects the usefulness of the stack dump.  Linus says:
      
        "I actually spend time cleaning up commit messages in logs, because
        useless data that isn't actually information (random hex numbers) is
        actively detrimental.
      
        It makes commit logs less legible.
      
        It also makes it harder to parse dumps.
      
        It's not useful. That makes it actively bad.
      
        I probably look at more oops reports than most people. I have not
        found the hex numbers useful for the last five years, because they are
        just randomized crap.
      
        The stack content thing just makes code scroll off the screen etc, for
        example."
      
      The only real downside to removing these addresses is that they can be
      used to disambiguate duplicate symbol names.  However such cases are
      rare, and the context of the stack dump should be enough to be able to
      figure it out.
      
      There's now a 'faddr2line' script which can be used to convert a
      function address to a file name and line:
      
        $ ./scripts/faddr2line ~/k/vmlinux write_sysrq_trigger+0x51/0x60
        write_sysrq_trigger+0x51/0x60:
        write_sysrq_trigger at drivers/tty/sysrq.c:1098
      
      Or gdb can be used:
      
        $ echo "list *write_sysrq_trigger+0x51" |gdb ~/k/vmlinux |grep "is in"
        (gdb) 0xffffffff815b5d83 is in driver_probe_device (/home/jpoimboe/git/linux/drivers/base/dd.c:378).
      
      (But note that when there are duplicate symbol names, gdb will only show
      the first symbol it finds.  faddr2line is recommended over gdb because
      it handles duplicates and it also does function size checking.)
      
      Here's an example of what a stack dump looks like after this change:
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: sysrq_handle_crash+0x45/0x80
        PGD 36bfa067 [   29.650644] PUD 7aca3067
        Oops: 0002 [#1] PREEMPT SMP
        Modules linked in: ...
        CPU: 1 PID: 786 Comm: bash Tainted: G            E   4.9.0-rc1+ #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
        task: ffff880078582a40 task.stack: ffffc90000ba8000
        RIP: 0010:sysrq_handle_crash+0x45/0x80
        RSP: 0018:ffffc90000babdc8 EFLAGS: 00010296
        RAX: ffff880078582a40 RBX: 0000000000000063 RCX: 0000000000000001
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000292
        RBP: ffffc90000babdc8 R08: 0000000b31866061 R09: 0000000000000000
        R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
        R13: 0000000000000007 R14: ffffffff81ee8680 R15: 0000000000000000
        FS:  00007ffb43869700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000007a3e9000 CR4: 00000000001406e0
        Stack:
         ffffc90000babe00 ffffffff81572d08 ffffffff81572bd5 0000000000000002
         0000000000000000 ffff880079606600 00007ffb4386e000 ffffc90000babe20
         ffffffff81573201 ffff880036a3fd00 fffffffffffffffb ffffc90000babe40
        Call Trace:
         __handle_sysrq+0x138/0x220
         ? __handle_sysrq+0x5/0x220
         write_sysrq_trigger+0x51/0x60
         proc_reg_write+0x42/0x70
         __vfs_write+0x37/0x140
         ? preempt_count_sub+0xa1/0x100
         ? __sb_start_write+0xf5/0x210
         ? vfs_write+0x183/0x1a0
         vfs_write+0xb8/0x1a0
         SyS_write+0x58/0xc0
         entry_SYSCALL_64_fastpath+0x1f/0xc2
        RIP: 0033:0x7ffb42f55940
        RSP: 002b:00007ffd33bb6b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007ffb42f55940
        RDX: 0000000000000002 RSI: 00007ffb4386e000 RDI: 0000000000000001
        RBP: 0000000000000011 R08: 00007ffb4321ea40 R09: 00007ffb43869700
        R10: 00007ffb43869700 R11: 0000000000000246 R12: 0000000000778a10
        R13: 00007ffd33bb5c00 R14: 0000000000000007 R15: 0000000000000010
        Code: 34 e8 d0 34 bc ff 48 c7 c2 3b 2b 57 81 be 01 00 00 00 48 c7 c7 e0 dd e5 81 e8 a8 55 ba ff c7 05 0e 3f de 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 01 5d c3 e8 4c 49 bc ff 84 c0 75 c3 48 c7
        RIP: sysrq_handle_crash+0x45/0x80 RSP: ffffc90000babdc8
        CR2: 0000000000000000
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/69329cb29b8f324bb5fcea14d61d224807fb6488.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bb5e5ce5
  14. 21 10月, 2016 2 次提交
    • J
      x86/dumpstack: Print any pt_regs found on the stack · 3b3fa11b
      Josh Poimboeuf 提交于
      Now that we can find pt_regs registers on the stack, print them.  Here's
      an example of what it looks like:
      
        Call Trace:
         <IRQ>
         [<ffffffff8144b793>] dump_stack+0x86/0xc3
         [<ffffffff81142c73>] hrtimer_interrupt+0xb3/0x1c0
         [<ffffffff8105eb86>] local_apic_timer_interrupt+0x36/0x60
         [<ffffffff818b27cd>] smp_apic_timer_interrupt+0x3d/0x50
         [<ffffffff818b06ee>] apic_timer_interrupt+0x9e/0xb0
        RIP: 0010:[<ffffffff818aef43>]  [<ffffffff818aef43>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff880079c4f760  EFLAGS: 00000202
        RAX: ffff880078738000 RBX: ffff88007d3da0c0 RCX: 0000000000000007
        RDX: 0000000000006d78 RSI: ffff8800787388f0 RDI: ffff880078738000
        RBP: ffff880079c4f768 R08: 0000002199088f38 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81e0d540
        R13: ffff8800369fb700 R14: 0000000000000000 R15: ffff880078738000
         <EOI>
         [<ffffffff810e1f14>] finish_task_switch+0xb4/0x250
         [<ffffffff810e1ed6>] ? finish_task_switch+0x76/0x250
         [<ffffffff818a7b61>] __schedule+0x3e1/0xb20
         ...
         [<ffffffff810759c8>] trace_do_page_fault+0x58/0x2c0
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
         [<ffffffff818b1dd8>] async_page_fault+0x28/0x30
        RIP: 0010:[<ffffffff8145b062>]  [<ffffffff8145b062>] __clear_user+0x42/0x70
        RSP: 0018:ffff880079c4fd38  EFLAGS: 00010202
        RAX: 0000000000000000 RBX: 0000000000000138 RCX: 0000000000000138
        RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000061b640
        RBP: ffff880079c4fd48 R08: 0000002198feefd7 R09: ffffffff82a40928
        R10: 0000000000000001 R11: 0000000000000000 R12: 000000000061b640
        R13: 0000000000000000 R14: ffff880079c50000 R15: ffff8800791d7400
         [<ffffffff8145b043>] ? __clear_user+0x23/0x70
         [<ffffffff8145b0fb>] clear_user+0x2b/0x40
         [<ffffffff812fbda2>] load_elf_binary+0x1472/0x1750
         [<ffffffff8129a591>] search_binary_handler+0xa1/0x200
         [<ffffffff8129b69b>] do_execveat_common.isra.36+0x6cb/0x9f0
         [<ffffffff8129b5f3>] ? do_execveat_common.isra.36+0x623/0x9f0
         [<ffffffff8129bcaa>] SyS_execve+0x3a/0x50
         [<ffffffff81003f5c>] do_syscall_64+0x6c/0x1e0
         [<ffffffff818afa3f>] entry_SYSCALL64_slow_path+0x25/0x25
        RIP: 0033:[<00007fd2e2f2e537>]  [<00007fd2e2f2e537>] 0x7fd2e2f2e537
        RSP: 002b:00007ffc449c5fc8  EFLAGS: 00000246
        RAX: ffffffffffffffda RBX: 00007ffc449c8860 RCX: 00007fd2e2f2e537
        RDX: 000000000127cc40 RSI: 00007ffc449c8860 RDI: 00007ffc449c6029
        RBP: 00007ffc449c60b0 R08: 65726f632d667265 R09: 00007ffc449c5e20
        R10: 00000000000005a7 R11: 0000000000000246 R12: 000000000127cc40
        R13: 000000000127ce05 R14: 00007ffc449c6029 R15: 000000000127ce01
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5cc2c512ec82cfba00dd22467644d4ed751a48c0.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3b3fa11b
    • J
      x86/dumpstack: Print stack identifier on its own line · 79439d8e
      Josh Poimboeuf 提交于
      show_trace_log_lvl() prints the stack id (e.g. "<IRQ>") without a
      newline so that any stack address printed after it will appear on the
      same line.  That causes the first stack address to be vertically
      misaligned with the rest, making it visually cluttered and slightly
      confusing:
      
        Call Trace:
         <IRQ> [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI> [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      It will look worse once we start printing pt_regs registers found in the
      middle of the stack:
      
        <IRQ> RIP: 0010:[<ffffffff8189c6c3>]  [<ffffffff8189c6c3>] _raw_spin_unlock_irq+0x33/0x60
        RSP: 0018:ffff88007876f720  EFLAGS: 00000206
        RAX: ffff8800786caa40 RBX: ffff88007d5da140 RCX: 0000000000000007
        ...
      
      Improve readability by adding a newline to the stack name:
      
        Call Trace:
         <IRQ>
         [<ffffffff814431c3>] dump_stack+0x86/0xc3
         [<ffffffff8100828b>] perf_callchain_kernel+0x14b/0x160
         [<ffffffff811e915f>] get_perf_callchain+0x15f/0x2b0
         ...
         <EOI>
         [<ffffffff8189c6c3>] ? _raw_spin_unlock_irq+0x33/0x60
         [<ffffffff810e1c84>] finish_task_switch+0xb4/0x250
         [<ffffffff8106f7dc>] do_async_page_fault+0x2c/0xa0
      
      Now that "continued" lines are no longer needed, we can also remove the
      hack of using the empty string (aka KERN_CONT) and replace it with
      KERN_DEFAULT.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/9bdd6dee2c74555d45500939fcc155997dc7889e.1476973742.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      79439d8e
  15. 21 9月, 2016 1 次提交
    • J
      x86/dumpstack: Fix show_stack() task pointer regression · 71f5443e
      Josh Poimboeuf 提交于
      With the following commit:
      
        e18bcccd ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
      
      The task pointer argument to show_stack_log_lvl() in show_stack() was
      inadvertently changed to 'current'.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: byungchul.park@lge.com
      Cc: fweisbec@gmail.com
      Cc: keescook@chromium.org
      Cc: linux-tip-commits@vger.kernel.org
      Cc: luto@amacapital.net
      Cc: nilayvaish@gmail.com
      Cc: rostedt@goodmis.org
      Cc: tip-bot for Josh Poimboeuf <tipbot@zytor.com>
      Fixes: e18bcccd ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
      Link: http://lkml.kernel.org/r/20160920155340.yhewlx7vmgmov5fb@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>
      71f5443e
  16. 20 9月, 2016 2 次提交
    • J
      x86/dumpstack: Remove dump_trace() and related callbacks · c8fe4609
      Josh Poimboeuf 提交于
      All previous users of dump_trace() have been converted to use the new
      unwind interfaces, so we can remove it and the related
      print_context_stack() and print_context_stack_bp() callback functions.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5b97da3572b40b5a4d8e185cf2429308d0987a13.1474045023.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c8fe4609
    • J
      x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder · e18bcccd
      Josh Poimboeuf 提交于
      Convert show_trace_log_lvl() to use the new unwinder.  dump_trace() has
      been deprecated.
      
      show_trace_log_lvl() is special compared to other users of the unwinder.
      It's the only place where both reliable *and* unreliable addresses are
      needed.  With frame pointers enabled, most callers of the unwinder don't
      want to know about unreliable addresses.  But in this case, when we're
      dumping the stack to the console because something presumably went
      wrong, the unreliable addresses are useful:
      
      - They show stale data on the stack which can provide useful clues.
      
      - If something goes wrong with the unwinder, or if frame pointers are
        corrupt or missing, all the stack addresses still get shown.
      
      So in order to show all addresses on the stack, and at the same time
      figure out which addresses are reliable, we have to do the scanning and
      the unwinding in parallel.
      
      The scanning is done with the help of get_stack_info() to traverse the
      stacks.  The unwinding is done separately by the new unwinder.
      
      In theory we could simplify show_trace_log_lvl() by instead pushing some
      of this logic into the unwind code.  But then we would need some kind of
      "fake" frame logic in the unwinder which would add a lot of complexity
      and wouldn't be worth it in order to support only one user.
      
      Another benefit of this approach is that once we have a DWARF unwinder,
      we should be able to just plug it in with minimal impact to this code.
      
      Another change here is that callers of show_trace_log_lvl() don't need
      to provide the 'bp' argument.  The unwinder already finds the relevant
      frame pointer by unwinding until it reaches the first frame after the
      provided stack pointer.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/703b5998604c712a1f801874b43f35d6dac52ede.1474045023.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e18bcccd
  17. 16 9月, 2016 1 次提交
    • J
      x86/dumpstack: Remove NULL task pointer convention · 81539169
      Josh Poimboeuf 提交于
      show_stack_log_lvl() and friends allow a NULL pointer for the
      task_struct to indicate the current task.  This creates confusion and
      can cause sneaky bugs.
      
      Instead require the caller to pass 'current' directly.
      
      This only changes the internal workings of the dumpstack code.  The
      dump_trace() and show_stack() interfaces still allow a NULL task
      pointer.  Those interfaces should also probably be fixed as well.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      81539169
  18. 15 9月, 2016 1 次提交
    • J
      x86/dumpstack: Add get_stack_info() interface · cb76c939
      Josh Poimboeuf 提交于
      valid_stack_ptr() is buggy: it assumes that all stacks are of size
      THREAD_SIZE, which is not true for exception stacks.  So the
      walk_stack() callbacks will need to know the location of the beginning
      of the stack as well as the end.
      
      Another issue is that in general the various features of a stack (type,
      size, next stack pointer, description string) are scattered around in
      various places throughout the stack dump code.
      
      Encapsulate all that information in a single place with a new stack_info
      struct and a get_stack_info() interface.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cb76c939
  19. 08 9月, 2016 3 次提交
  20. 24 8月, 2016 3 次提交
  21. 19 8月, 2016 1 次提交
  22. 15 7月, 2016 2 次提交
  23. 08 7月, 2016 1 次提交
    • B
      x86/dumpstack: Add show_stack_regs() and use it · 81c2949f
      Borislav Petkov 提交于
      Add a helper to dump supplied pt_regs and use it in the MSR exception
      handling code to have precise stack traces pointing to the actual
      function causing the MSR access exception and not the stack frame of the
      exception handler itself.
      
      The new output looks like this:
      
       unchecked MSR access error: RDMSR from 0xdeadbeef at rIP: 0xffffffff8102ddb6 (early_init_intel+0x16/0x3a0)
        00000000756e6547 ffffffff81c03f68 ffffffff81dd0940 ffffffff81c03f10
        ffffffff81d42e65 0000000001000000 ffffffff81c03f58 ffffffff81d3e5a3
        0000800000000000 ffffffff81800080 ffffffffffffffff 0000000000000000
       Call Trace:
        [<ffffffff81d42e65>] early_cpu_init+0xe7/0x136
        [<ffffffff81d3e5a3>] setup_arch+0xa5/0x9df
        [<ffffffff81d38bb9>] start_kernel+0x9f/0x43a
        [<ffffffff81d38294>] x86_64_start_reservations+0x2f/0x31
        [<ffffffff81d383fe>] x86_64_start_kernel+0x168/0x176
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1467671487-10344-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      81c2949f
  24. 25 6月, 2016 1 次提交
    • L
      x86: fix up a few misc stack pointer vs thread_info confusions · aca9c293
      Linus Torvalds 提交于
      As the actual pointer value is the same for the thread stack allocation
      and the thread_info, code that confused the two worked fine, but will
      break when the thread info is moved away from the stack allocation.  It
      also looks very confusing.
      
      For example, the kprobe code wanted to know the current top of stack.
      To do that, it used this:
      
      	(unsigned long)current_thread_info() + THREAD_SIZE
      
      which did indeed give the correct value.  But it's not only a fairly
      nonsensical expression, it's also rather complex, especially since we
      actually have this:
      
      	static inline unsigned long current_top_of_stack(void)
      
      which not only gives us the value we are interested in, but happens to
      be how "current_thread_info()" is currently defined as:
      
      	(struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
      
      so using current_thread_info() to figure out the top of the stack really
      is a very round-about thing to do.
      
      The other cases are just simpler confusion about task_thread_info() vs
      task_stack_page(), which currently return the same pointer - but if you
      want the stack page, you really should be using the latter one.
      
      And there was one entirely unused assignment of the current stack to a
      thread_info pointer.
      
      All cleaned up to make more sense today, and make it easier to move the
      thread_info away from the stack in the future.
      
      No semantic changes.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aca9c293