1. 17 12月, 2017 26 次提交
    • A
      x86/entry: Remap the TSS into the CPU entry area · 72f5e08d
      Andy Lutomirski 提交于
      This has a secondary purpose: it puts the entry stack into a region
      with a well-controlled layout.  A subsequent patch will take
      advantage of this to streamline the SYSCALL entry code to be able to
      find it more easily.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bpetkov@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      72f5e08d
    • A
      x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct · 1a935bc3
      Andy Lutomirski 提交于
      SYSENTER_stack should have reliable overflow detection, which
      means that it needs to be at the bottom of a page, not the top.
      Move it to the beginning of struct tss_struct and page-align it.
      
      Also add an assertion to make sure that the fixed hardware TSS
      doesn't cross a page boundary.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.881827433@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1a935bc3
    • A
      x86/dumpstack: Handle stack overflow on all stacks · 6e60e583
      Andy Lutomirski 提交于
      We currently special-case stack overflow on the task stack.  We're
      going to start putting special stacks in the fixmap with a custom
      layout, so they'll have guard pages, too.  Teach the unwinder to be
      able to unwind an overflow of any of the stacks.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.802057305@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6e60e583
    • A
      x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss · 7fb983b4
      Andy Lutomirski 提交于
      A future patch will move SYSENTER_stack to the beginning of cpu_tss
      to help detect overflow.  Before this can happen, fix several code
      paths that hardcode assumptions about the old layout.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.722425540@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7fb983b4
    • A
      x86/kasan/64: Teach KASAN about the cpu_entry_area · 21506525
      Andy Lutomirski 提交于
      The cpu_entry_area will contain stacks.  Make sure that KASAN has
      appropriate shadow mappings for them.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: kasan-dev@googlegroups.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.642806442@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      21506525
    • A
      x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area · ef8813ab
      Andy Lutomirski 提交于
      Currently, the GDT is an ad-hoc array of pages, one per CPU, in the
      fixmap.  Generalize it to be an array of a new 'struct cpu_entry_area'
      so that we can cleanly add new things to it.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.563271721@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ef8813ab
    • A
      x86/entry/gdt: Put per-CPU GDT remaps in ascending order · aaeed3ae
      Andy Lutomirski 提交于
      We currently have CPU 0's GDT at the top of the GDT range and
      higher-numbered CPUs at lower addresses.  This happens because the
      fixmap is upside down (index 0 is the top of the fixmap).
      
      Flip it so that GDTs are in ascending order by virtual address.
      This will simplify a future patch that will generalize the GDT
      remap to contain multiple pages.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.471561421@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aaeed3ae
    • A
      x86/dumpstack: Add get_stack_info() support for the SYSENTER stack · 33a2f1a6
      Andy Lutomirski 提交于
      get_stack_info() doesn't currently know about the SYSENTER stack, so
      unwinding will fail if we entered the kernel on the SYSENTER stack
      and haven't fully switched off.  Teach get_stack_info() about the
      SYSENTER stack.
      
      With future patches applied that run part of the entry code on the
      SYSENTER stack and introduce an intentional BUG(), I would get:
      
        PANIC: double fault, error_code: 0x0
        ...
        RIP: 0010:do_error_trap+0x33/0x1c0
        ...
        Call Trace:
        Code: ...
      
      With this patch, I get:
      
        PANIC: double fault, error_code: 0x0
        ...
        Call Trace:
         <SYSENTER>
         ? async_page_fault+0x36/0x60
         ? invalid_op+0x22/0x40
         ? async_page_fault+0x36/0x60
         ? sync_regs+0x3c/0x40
         ? sync_regs+0x2e/0x40
         ? error_entry+0x6c/0xd0
         ? async_page_fault+0x36/0x60
         </SYSENTER>
        Code: ...
      
      which is a lot more informative.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      33a2f1a6
    • A
      x86/entry/64: Allocate and enable the SYSENTER stack · 1a79797b
      Andy Lutomirski 提交于
      This will simplify future changes that want scratch variables early in
      the SYSENTER handler -- they'll be able to spill registers to the
      stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.
      
      This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
      stack space even without IA32 emulation.
      
      As far as I can tell, the reason that this wasn't done from day 1 is
      that we use IST for #DB and #BP, which is IMO rather nasty and causes
      a lot more problems than it solves.  But, since #DB uses IST, we don't
      actually need a real stack for SYSENTER (because SYSENTER with TF set
      will invoke #DB on the IST stack rather than the SYSENTER stack).
      
      I want to remove IST usage from these vectors some day, and this patch
      is a prerequisite for that as well.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      1a79797b
    • A
      x86/irq/64: Print the offending IP in the stack overflow warning · 4f3789e7
      Andy Lutomirski 提交于
      In case something goes wrong with unwind (not unlikely in case of
      overflow), print the offending IP where we detected the overflow.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.231677119@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4f3789e7
    • A
      x86/irq: Remove an old outdated comment about context tracking races · 6669a692
      Andy Lutomirski 提交于
      That race has been fixed and code cleaned up for a while now.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.150551639@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6669a692
    • J
      x86/unwinder: Handle stack overflows more gracefully · b02fcf9b
      Josh Poimboeuf 提交于
      There are at least two unwinder bugs hindering the debugging of
      stack-overflow crashes:
      
      - It doesn't deal gracefully with the case where the stack overflows and
        the stack pointer itself isn't on a valid stack but the
        to-be-dereferenced data *is*.
      
      - The ORC oops dump code doesn't know how to print partial pt_regs, for the
        case where if we get an interrupt/exception in *early* entry code
        before the full pt_regs have been saved.
      
      Fix both issues.
      
      http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@trebleSigned-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bpetkov@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b02fcf9b
    • A
      x86/unwinder/orc: Dont bail on stack overflow · d3a09104
      Andy Lutomirski 提交于
      If the stack overflows into a guard page and the ORC unwinder should work
      well: by construction, there can't be any meaningful data in the guard page
      because no writes to the guard page will have succeeded.
      
      But there is a bug that prevents unwinding from working correctly: if the
      starting register state has RSP pointing into a stack guard page, the ORC
      unwinder bails out immediately.
      
      Instead of bailing out immediately check whether the next page up is a
      valid check page and if so analyze that. As a result the ORC unwinder will
      start the unwind.
      
      Tested by intentionally overflowing the task stack.  The result is an
      accurate call trace instead of a trace consisting purely of '?' entries.
      
      There are a few other bugs that are triggered if the unwinder encounters a
      stack overflow after the first step, but they are outside the scope of this
      fix.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150604.991389777@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d3a09104
    • B
      x86/entry/64/paravirt: Use paravirt-safe macro to access eflags · e17f8234
      Boris Ostrovsky 提交于
      Commit 1d3e53e8 ("x86/entry/64: Refactor IRQ stacks and make them
      NMI-safe") added DEBUG_ENTRY_ASSERT_IRQS_OFF macro that acceses eflags
      using 'pushfq' instruction when testing for IF bit. On PV Xen guests
      looking at IF flag directly will always see it set, resulting in 'ud2'.
      
      Introduce SAVE_FLAGS() macro that will use appropriate save_fl pv op when
      running paravirt.
      Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NJuergen Gross <jgross@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e17f8234
    • A
      x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow · 2aeb0736
      Andrey Ryabinin 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          d17a1d97: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      The KASAN shadow is currently mapped using vmemmap_populate() since that
      provides a semi-convenient way to map pages into init_top_pgt.  However,
      since that no longer zeroes the mapped pages, it is not suitable for
      KASAN, which requires zeroed shadow memory.
      
      Add kasan_populate_shadow() interface and use it instead of
      vmemmap_populate().  Besides, this allows us to take advantage of
      gigantic pages and use them to populate the shadow, which should save us
      some memory wasted on page tables and reduce TLB pressure.
      
      Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: NPavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Bob Picco <bob.picco@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      2aeb0736
    • W
      locking/barriers: Convert users of lockless_dereference() to READ_ONCE() · 3382290e
      Will Deacon 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          506458ef ("locking/barriers: Convert users of lockless_dereference() to READ_ONCE()")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it
      can be used instead of lockless_dereference() without any change in
      semantics.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3382290e
    • W
      locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE() · c2bc6608
      Will Deacon 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          76ebbe78 ("locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      In preparation for the removal of lockless_dereference(), which is the
      same as READ_ONCE() on all architectures other than Alpha, add an
      implicit smp_read_barrier_depends() to READ_ONCE() so that it can be
      used to head dependency chains on all architectures.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1508840570-22169-3-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c2bc6608
    • D
      bpf: fix build issues on um due to mising bpf_perf_event.h · ab95477e
      Daniel Borkmann 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          a23f06f0 ("bpf: fix build issues on um due to mising bpf_perf_event.h")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      Since c895f6f7 ("bpf: correct broken uapi for
      BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build
      on i386 or x86_64:
      
        [...]
          CC      init/main.o
        In file included from ../include/linux/perf_event.h:18:0,
                         from ../include/linux/trace_events.h:10,
                         from ../include/trace/syscall.h:7,
                         from ../include/linux/syscalls.h:82,
                         from ../init/main.c:20:
        ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error:
        asm/bpf_perf_event.h: No such file or directory #include
        <asm/bpf_perf_event.h>
        [...]
      
      Lets add missing bpf_perf_event.h also to um arch. This seems
      to be the only one still missing.
      
      Fixes: c895f6f7 ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type")
      Reported-by: NRandy Dunlap <rdunlap@infradead.org>
      Suggested-by: NRichard Weinberger <richard@sigma-star.at>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Tested-by: NRandy Dunlap <rdunlap@infradead.org>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Richard Weinberger <richard@sigma-star.at>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ab95477e
    • A
      perf/x86: Enable free running PEBS for REGS_USER/INTR · 2fe1bc1f
      Andi Kleen 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          a47ba4d7 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      Currently free running PEBS is disabled when user or interrupt
      registers are requested. Most of the registers are actually
      available in the PEBS record and can be supported.
      
      So we just need to check for the supported registers and then
      allow it: it is all except for the segment register.
      
      For user registers this only works when the counter is limited
      to ring 3 only, so this also needs to be checked.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2fe1bc1f
    • R
      x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD · f2dbad36
      Rudolf Marek 提交于
      [ Note, this is a Git cherry-pick of the following commit:
      
          2b67799bdf25 ("x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      The latest AMD AMD64 Architecture Programmer's Manual
      adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]).
      
      If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES
      / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers,
      thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs.
      Signed-Off-By: NRudolf Marek <r.marek@assembler.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Tested-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.czSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f2dbad36
    • R
      x86/cpufeature: Add User-Mode Instruction Prevention definitions · a8b4db56
      Ricardo Neri 提交于
      [ Note, this is a Git cherry-pick of the following commit: (limited to the cpufeatures.h file)
      
          3522c2a6 ("x86/cpufeature: Add User-Mode Instruction Prevention definitions")
      
        ... for easier x86 PTI code testing and back-porting. ]
      
      User-Mode Instruction Prevention is a security feature present in new
      Intel processors that, when set, prevents the execution of a subset of
      instructions if such instructions are executed in user mode (CPL > 0).
      Attempting to execute such instructions causes a general protection
      exception.
      
      The subset of instructions comprises:
      
       * SGDT - Store Global Descriptor Table
       * SIDT - Store Interrupt Descriptor Table
       * SLDT - Store Local Descriptor Table
       * SMSW - Store Machine Status Word
       * STR  - Store Task Register
      
      This feature is also added to the list of disabled-features to allow
      a cleaner handling of build-time configuration.
      Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chen Yucong <slaoub@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: ricardo.neri@intel.com
      Link: http://lkml.kernel.org/r/1509935277-22138-7-git-send-email-ricardo.neri-calderon@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a8b4db56
    • I
      Merge commit 'upstream-x86-virt' into WIP.x86/mm · e5d77a73
      Ingo Molnar 提交于
      Merge a minimal set of virt cleanups, for a base for the MM isolation patches.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e5d77a73
    • I
      2ec077c1
    • I
      Merge branch 'upstream-x86-selftests' into WIP.x86/pti.base · 650400b2
      Ingo Molnar 提交于
      Conflicts:
      	arch/x86/kernel/cpu/Makefile
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      650400b2
    • I
      Merge commit 'upstream-x86-entry' into WIP.x86/mm · 0fd2e9c5
      Ingo Molnar 提交于
      Pull in a minimal set of v4.15 entry code changes, for a base for the MM isolation patches.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      0fd2e9c5
    • I
      drivers/misc/intel/pti: Rename the header file to free up the namespace · 1784f914
      Ingo Molnar 提交于
      We'd like to use the 'PTI' acronym for 'Page Table Isolation' - free up the
      namespace by renaming the <linux/pti.h> driver header to <linux/intel-pti.h>.
      
      (Also standardize the header guard name while at it.)
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: J Freyensee <james_p_freyensee@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1784f914
  2. 13 11月, 2017 3 次提交
    • L
      Linux 4.14 · bebc6082
      Linus Torvalds 提交于
      bebc6082
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 152bbb43
      Linus Torvalds 提交于
      Pull x86 fixes from Thomas Gleixner:
       "A set of small fixes:
      
         - make KGDB work again which got broken by the conversion of WARN()
           to #UD. The WARN fixup needs to run before the notifier callchain,
           otherwise KGDB tries to handle it and crashes.
      
         - disable KASAN in the ORC unwinder to prevent false positive KASAN
           warnings
      
         - prevent default mapping above 47bit when 5 level page tables are
           enabled
      
         - make the delay calibration optimization work correctly, which had
           the conditionals the wrong way around and was operating on data
           which was not yet updated.
      
         - remove the bogus X86_TRAP_BP trap init from the default IDT init
           table, which broke 32bit int3 handling by overwriting the correct
           int3 setup.
      
         - replace this_cpu* with boot_cpu_data access in the preemptible
           oprofile init code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/debug: Handle warnings before the notifier chain, to fix KGDB crash
        x86/mm: Fix ELF_ET_DYN_BASE for 5-level paging
        x86/idt: Remove X86_TRAP_BP initialization in idt_setup_traps()
        x86/oprofile/ppro: Do not use __this_cpu*() in preemptible context
        x86/unwind: Disable KASAN checking in the ORC unwinder
        x86/smpboot: Make optimization of delay calibration work correctly
      152bbb43
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 69581c74
      Linus Torvalds 提交于
      Pull perf tool fixes from Thomas Gleixner:
       "A small set of fixes for perf tool:
      
         - synchronize the i915 drm header to avoid the 'out of date' warning
      
         - make sure that perf trace cleans up its temporary files on exit
      
         - unbreak the build with newer flex versions
      
         - add missing braces in the eBPF parsing rules"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tooling/headers: Sync the tools/include/uapi/drm/i915_drm.h UAPI header
        perf trace: Call machine__exit() at exit
        perf tools: Fix eBPF event specification parsing
        perf tools: Add "reject" option for parse-events.l
      69581c74
  3. 12 11月, 2017 1 次提交
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b3954568
      Linus Torvalds 提交于
      Pull networking fixes from David Miller:
      
       1) Use after free in vlan, from Cong Wang.
      
       2) Handle NAPI poll with a zero budget properly in mlx5 driver, from
          Saeed Mahameed.
      
       3) If DMA mapping fails in mlx5 driver, NULL out page, from Inbar
          Karmy.
      
       4) Handle overrun in RX FIFO of sun4i CAN driver, from Gerhard
          Bertelsmann.
      
       5) Missing return in mdb and vlan prepare phase of DSA layer, from
          Vivien Didelot.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        vlan: fix a use-after-free in vlan_device_event()
        net: dsa: return after vlan prepare phase
        net: dsa: return after mdb prepare phase
        can: ifi: Fix transmitter delay calculation
        tcp: fix tcp_fastretrans_alert warning
        tcp: gso: avoid refcount_t warning from tcp_gso_segment()
        can: peak: Add support for new PCIe/M2 CAN FD interfaces
        can: sun4i: handle overrun in RX FIFO
        can: c_can: don't indicate triple sampling support for D_CAN
        net/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs
        net/mlx5e: Set page to null in case dma mapping fails
        net/mlx5e: Fix napi poll with zero budget
        net/mlx5: Cancel health poll before sending panic teardown command
        net/mlx5: Loop over temp list to release delay events
        rds: ib: Fix NULL pointer dereference in debug code
      b3954568
  4. 11 11月, 2017 10 次提交