1. 16 August 2022 (1 commit)
    • x86/entry: Fix entry_INT80_compat for Xen PV guests · 5b9f0c4d
      Authored by Juergen Gross
      Commit
      
        c89191ce ("x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS")
      
      missed one use of SWAPGS in entry_INT80_compat(). With the macro
      removed, the assembler silently treated the capitalized SWAPGS as the
      bare "swapgs" instruction, since it accepts instructions in capital
      letters, too.
      
      This in turn leads to splats in Xen PV guests like:
      
        [   36.145223] general protection fault, maybe for address 0x2d: 0000 [#1] PREEMPT SMP NOPTI
        [   36.145794] CPU: 2 PID: 1847 Comm: ld-linux.so.2 Not tainted 5.19.1-1-default #1 \
      	  openSUSE Tumbleweed f3b44bfb672cdb9f235aff53b57724eba8b9411b
        [   36.146608] Hardware name: HP ProLiant ML350p Gen8, BIOS P72 11/14/2013
        [   36.148126] RIP: e030:entry_INT80_compat+0x3/0xa3
      
      Fix that by open coding this single instance of the SWAPGS macro.
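
      For illustration, the open-coded form is essentially the following
      sketch (relying on the kernel's ALTERNATIVE patching; on Xen PV the
      instruction is patched out at boot):

      	/* swapgs on bare metal, nothing on Xen PV guests */
      	ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV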
      
      Fixes: c89191ce ("x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS")
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Cc: <stable@vger.kernel.org> # 5.19
      Link: https://lore.kernel.org/r/20220816071137.4893-1-jgross@suse.com
  2. 27 June 2022 (3 commits)
  3. 20 May 2022 (2 commits)
    • x86/entry: Fixup objtool/ibt validation · ce656528
      Authored by Peter Zijlstra
      Commit
      
        47f33de4 ("x86/sev: Mark the code returning to user space as syscall gap")
      
      added a bunch of text references without annotating them, resulting in a
      spree of objtool complaints:
      
        vmlinux.o: warning: objtool: vc_switch_off_ist+0x77: relocation to !ENDBR: entry_SYSCALL_64+0x15c
        vmlinux.o: warning: objtool: vc_switch_off_ist+0x8f: relocation to !ENDBR: entry_SYSCALL_compat+0xa5
        vmlinux.o: warning: objtool: vc_switch_off_ist+0x97: relocation to !ENDBR: .entry.text+0x21ea
        vmlinux.o: warning: objtool: vc_switch_off_ist+0xef: relocation to !ENDBR: .entry.text+0x162
        vmlinux.o: warning: objtool: __sev_es_ist_enter+0x60: relocation to !ENDBR: entry_SYSCALL_64+0x15c
        vmlinux.o: warning: objtool: __sev_es_ist_enter+0x6c: relocation to !ENDBR: .entry.text+0x162
        vmlinux.o: warning: objtool: __sev_es_ist_enter+0x8a: relocation to !ENDBR: entry_SYSCALL_compat+0xa5
        vmlinux.o: warning: objtool: __sev_es_ist_enter+0xc1: relocation to !ENDBR: .entry.text+0x21ea
      
      Since these text references are used to compare against IP and are not
      indirect call targets, they don't need ENDBR, so annotate them away.
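
      Schematically, the fix hangs the annotation off the inner labels the
      commit above introduced (a sketch assuming the label names added by
      47f33de4):

      	SYM_INNER_LABEL(entry_SYSRETQ_unsafe_stack, SYM_L_GLOBAL)
      		ANNOTATE_NOENDBR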
      
      Fixes: 47f33de4 ("x86/sev: Mark the code returning to user space as syscall gap")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lore.kernel.org/r/20220520082604.GQ2578@worktop.programming.kicks-ass.net
    • x86/entry: Fix register corruption in compat syscall · 036c07c0
      Authored by Josh Poimboeuf
      A panic was reported in the init process on AMD:
      
        Run /sbin/init as init process
        init[1]: segfault at f7fd5ca0 ip 00000000f7f5bbc7 sp 00000000ffa06aa0 error 7 in libc.so[f7f51000+4e000]
        Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00
        Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
        CPU: 1 PID: 1 Comm: init Tainted: G        W         5.18.0-rc7-next-20220519 #1
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
        Call Trace:
         <TASK>
         dump_stack_lvl+0x57/0x7d
         panic+0x10f/0x28d
         do_exit.cold+0x18/0x48
         do_group_exit+0x2e/0xb0
         get_signal+0xb6d/0xb80
         arch_do_signal_or_restart+0x31/0x760
         ? show_opcodes.cold+0x1c/0x21
         ? force_sig_fault+0x49/0x70
         exit_to_user_mode_prepare+0x131/0x1a0
         irqentry_exit_to_user_mode+0x5/0x30
         asm_exc_page_fault+0x27/0x30
        RIP: 0023:0xf7f5bbc7
        Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00
        RSP: 002b:00000000ffa06aa0 EFLAGS: 00000217
        RAX: 00000000f7fd5ca0 RBX: 000000000000000c RCX: 0000000000001000
        RDX: 0000000000000001 RSI: 00000000f7fd5b60 RDI: 00000000f7fd5b60
        RBP: 00000000f7fd1c1c R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
        R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
         </TASK>
      
      The task's CX register got corrupted by commit 8c42819b ("x86/entry:
      Use PUSH_AND_CLEAR_REGS for compat"), which overlooked the fact that
      compat SYSCALL apparently stores the user's CX value in BP.
      
      Before that commit, CX was saved from its stashed value in BP:
      
      	pushq   %rbp                    /* pt_regs->cx (stashed in bp) */
      
      But then it got changed to:
      
      	pushq	%rcx			/* pt_regs->cx */
      
      So the wrong value got saved and later restored back to the user.  Fix
      it by once again pushing the value stashed in BP for regs->cx.
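
      A minimal sketch of the corrected compat entry (assuming the fix
      routes the stashed BP value through the register-saving macro; the
      parameter spelling here is illustrative):

      	/* push user CX from its stash in BP into pt_regs->cx */
      	PUSH_AND_CLEAR_REGS rcx=%rbp rax=$-ENOSYS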
      
      Fixes: 8c42819b ("x86/entry: Use PUSH_AND_CLEAR_REGS for compat")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Link: https://lkml.kernel.org/r/b5a26592c9dd60bbacdf97974a7433fd802a5593.1652985970.git.jpoimboe@kernel.org
  4. 19 May 2022 (1 commit)
  5. 06 May 2022 (2 commits)
  6. 03 May 2022 (1 commit)
  7. 15 March 2022 (3 commits)
  8. 08 March 2021 (1 commit)
  9. 05 July 2020 (1 commit)
  10. 01 July 2020 (2 commits)
  11. 11 June 2020 (3 commits)
  12. 18 October 2019 (3 commits)
    • x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* · 6dcc5627
      Authored by Jiri Slaby
      These are all functions which are invoked from elsewhere, so annotate
      them as global using the new SYM_FUNC_START and replace their
      ENDPROCs with SYM_FUNC_END.
      
      Make sure ENTRY/ENDPROC is not defined on X86_64, given these were the
      last users.
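
      Schematically, each conversion follows this pattern (illustrative
      function name):

      	-ENTRY(copy_page)
      	-	...
      	-ENDPROC(copy_page)
      	+SYM_FUNC_START(copy_page)
      	+	...
      	+SYM_FUNC_END(copy_page)
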
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernate]
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits]
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au> [crypto]
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Shevchenko <andy@infradead.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Armijn Hemel <armijn@tjaldur.nl>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Enrico Weigelt <info@metux.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-efi <linux-efi@vger.kernel.org>
      Cc: linux-efi@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: platform-driver-x86@vger.kernel.org
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Wei Huang <wei@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com>
      Link: https://lkml.kernel.org/r/20191011115108.12392-25-jslaby@suse.cz
    • x86/asm/64: Change all ENTRY+END to SYM_CODE_* · bc7b11c0
      Authored by Jiri Slaby
      Change all assembly code which is marked using END (and not ENDPROC),
      switching all of it to the appropriate new annotations SYM_CODE_START
      and SYM_CODE_END.
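
      The pattern of the conversion (using the 64-bit syscall entry as an
      example):

      	-ENTRY(entry_SYSCALL_64)
      	-END(entry_SYSCALL_64)
      	+SYM_CODE_START(entry_SYSCALL_64)
      	+SYM_CODE_END(entry_SYSCALL_64)
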
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> [xen bits]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cao jin <caoj.fnst@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Maran Wilson <maran.wilson@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20191011115108.12392-24-jslaby@suse.cz
    • x86/asm: Use SYM_INNER_LABEL instead of GLOBAL · 26ba4e57
      Authored by Jiri Slaby
      The GLOBAL macro had several meanings and is going away. Convert all the
      inner function labels marked with GLOBAL to use SYM_INNER_LABEL instead.
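
      The pattern of the conversion (one of the affected labels shown as an
      example):

      	-GLOBAL(entry_SYSCALL_64_after_hwframe)
      	+SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
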
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-arch@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191011115108.12392-18-jslaby@suse.cz
  13. 18 January 2019 (1 commit)
  14. 05 September 2018 (1 commit)
    • x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls · afaef01c
      Authored by Alexander Popov
      The STACKLEAK feature (initially developed by PaX Team) has the following
      benefits:
      
      1. Reduces the information that can be revealed through kernel stack leak
         bugs. The idea of erasing the thread stack at the end of syscalls is
         similar to CONFIG_PAGE_POISONING and memzero_explicit() in kernel
         crypto, which all comply with FDP_RIP.2 (Full Residual Information
         Protection) of the Common Criteria standard.
      
      2. Blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
         CVE-2010-2963). Bugs of that kind should eventually be eliminated by
         improving C compilers, which might take a long time.
      
      This commit introduces the code filling the used part of the kernel
      stack with a poison value before returning to userspace. Full
      STACKLEAK feature also contains the gcc plugin which comes in a
      separate commit.
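
      Conceptually, the erasing amounts to the following sketch (the
      register use, the lowest_stack symbol and the poison value are
      illustrative, not the exact kernel code):

      	movq	$-0xBEEF, %rax			/* assumed poison value */
      	movq	lowest_stack(%rip), %rdi	/* deepest stack use, hypothetical symbol */
      .Lpoison:
      	movq	%rax, (%rdi)			/* overwrite one stack slot */
      	addq	$8, %rdi
      	cmpq	%rsp, %rdi			/* stop below the live stack */
      	jb	.Lpoison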
      
      The STACKLEAK feature is ported from grsecurity/PaX. More information at:
        https://grsecurity.net/
        https://pax.grsecurity.net/
      
      This code is modified from Brad Spengler/PaX Team's code in the last
      public patch of grsecurity/PaX based on our understanding of the code.
      Changes or omissions from the original code are ours and don't reflect
      the original grsecurity/PaX code.
      
      Performance impact:
      
      Hardware: Intel Core i7-4770, 16 GB RAM
      
      Test #1: building the Linux kernel on a single core
              0.91% slowdown
      
      Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
              4.2% slowdown
      
      So the STACKLEAK description in Kconfig includes: "The tradeoff is the
      performance impact: on a single CPU system kernel compilation sees a 1%
      slowdown, other systems and workloads may vary and you are advised to
      test this feature on your expected workload before deploying it".
      Signed-off-by: Alexander Popov <alex.popov@linux.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
  15. 27 June 2018 (1 commit)
  16. 27 April 2018 (1 commit)
    • x86/entry/64/compat: Preserve r8-r11 in int $0x80 · 8bb2610b
      Authored by Andy Lutomirski
      32-bit user code that uses int $0x80 doesn't care about r8-r11.  There is,
      however, some 64-bit user code that intentionally uses int $0x80 to invoke
      32-bit system calls.  From what I've seen, basically all such code assumes
      that r8-r15 are all preserved, but the kernel clobbers r8-r11.  Since I
      doubt that there's any code that depends on int $0x80 zeroing r8-r11,
      change the kernel to preserve them.
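
      For example, 64-bit user code along these lines (a sketch using the
      32-bit getpid, syscall number 20, and assuming CONFIG_IA32_EMULATION)
      can now rely on r8 surviving the call:

      	movq	$0x1122334455667788, %r8
      	movl	$20, %eax			/* 32-bit __NR_getpid */
      	int	$0x80
      	cmpq	$0x1122334455667788, %r8	/* preserved after this change */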
      
      I suspect that very little user code is broken by the old clobber, since
      r8-r11 are only rarely allocated by gcc, and they're clobbered by function
      calls, so the only way we'd see a problem is if the same function that
      invokes int $0x80 also spills something important to one of these
      registers.
      
      The current behavior seems to date back to the historical commit
      "[PATCH] x86-64 merge for 2.6.4".  Before that, all regs were
      preserved.  I can't find any explanation of why this change was made.
      
      Also update the test_syscall_vdso_32 testcase to verify the new
      behavior, and strengthen the test to make sure that the kernel doesn't
      accidentally permute r8..r15.
      Suggested-by: Denys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Link: https://lkml.kernel.org/r/d4c4d9985fbe64f8c9e19291886453914b48caee.1523975710.git.luto@kernel.org
  17. 05 April 2018 (1 commit)
    • syscalls/x86: Extend register clearing on syscall entry to lower registers · 6dc936f1
      Authored by Dominik Brodowski
      To reduce the chance that random user space content leaks down the call
      chain in registers, also clear lower registers on syscall entry:
      
      For 64-bit syscalls, extend the register clearing in PUSH_AND_CLEAR_REGS
      to %dx and %cx. This should not hurt at all, even for the other callers
      of that macro. We do not need to clear %rdi and %rsi for syscall entry,
      as those registers are used to pass the parameters to do_syscall_64().
      
      For the 32-bit compat syscalls, do_int80_syscall_32() and
      do_fast_syscall_32() each only take one parameter. Therefore, extend the
      register clearing to %dx, %cx, and %si in entry_SYSCALL_compat and
      entry_INT80_compat.
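
      Schematically, the affected entry paths then push and clear like this
      (an illustrative fragment, not the exact diff):

      	pushq	%rdx		/* pt_regs->dx */
      	xorl	%edx, %edx	/* nospec dx */
      	pushq	%rcx		/* pt_regs->cx */
      	xorl	%ecx, %ecx	/* nospec cx */
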
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180405095307.3730-8-linux@dominikbrodowski.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  18. 07 March 2018 (2 commits)
  19. 21 February 2018 (1 commit)
  20. 17 February 2018 (1 commit)
    • x86/entry/64: Use 'xorl' for faster register clearing · ced5d0bf
      Authored by Dominik Brodowski
      On some x86 CPU microarchitectures using 'xorq' to clear general-purpose
      registers is slower than 'xorl'. As 'xorl' is sufficient to clear all
      64 bits of these registers due to zero-extension [*], switch the x86
      64-bit entry code to use 'xorl'.
      
      No change in functionality and no change in code size.
      
      [*] According to the Intel 64 and IA-32 Architectures Software Developer's
          Manual, section 3.4.1.1, the result of a 32-bit operation is "zero-
          extended to a 64-bit result in the destination general-purpose
          register." The AMD64 Architecture Programmer's Manual Volume 3,
          Appendix B.1, describes the same behaviour.
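
      A two-instruction demonstration of that zero-extension (AT&T syntax):

      	movq	$-1, %rax	/* rax = 0xffffffffffffffff */
      	xorl	%eax, %eax	/* all 64 bits of rax are now 0 */
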
      Suggested-by: Denys Vlasenko <dvlasenk@redhat.com>
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180214175924.23065-3-linux@dominikbrodowski.net
      [ Improved on the changelog a bit. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  21. 06 February 2018 (1 commit)
  22. 04 January 2018 (1 commit)
  23. 24 December 2017 (2 commits)
    • x86/mm: Use/Fix PCID to optimize user/kernel switches · 6fd166aa
      Authored by Peter Zijlstra
      We can use PCID to retain the TLBs across CR3 switches, including those
      now part of the user/kernel switch. This increases performance of kernel
      entry/exit at the cost of more expensive/complicated TLB flushing.
      
      Now that we have two address spaces, one for kernel and one for user space,
      we need two PCIDs per mm. We use the top PCID bit to indicate a user PCID
      (just like we use the PFN LSB for the PGD). Since we do TLB invalidation
      from kernel space, the existing code will only invalidate the kernel
      PCID; we augment that by marking the corresponding user PCID invalid,
      and upon switching back to userspace, use a flushing CR3 write for the
      switch.
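
      A sketch of the resulting user CR3 value (the bit positions are the
      assumed PTI constants, shown only for illustration):

      	/*
      	 * bit 63    - no-flush hint for the CR3 write
      	 * bit 12    - selects the user half of the PGD
      	 * bit 11    - marks the PCID as a user PCID
      	 * bits 0-10 - the per-mm kernel PCID
      	 */
      	orq	$((1 << 12) | (1 << 11)), %rax	/* kernel CR3 -> user CR3 */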
      
      In order to access the user_pcid_flush_mask we use PER_CPU storage,
      which means the previously established SWAPGS vs CR3 ordering is now
      mandatory.
      
      Having to do this memory access does require additional registers.
      Most sites have a functioning stack and we can spill one (RAX); sites
      without a functional stack need to otherwise provide the second
      scratch register.
      
      Note: PCID is generally available on Intel Sandybridge and later CPUs.
      Note: Up until this point TLB flushing was broken in this series.
      
      Based-on-code-from: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching · 8a09317b
      Authored by Dave Hansen
      PAGE_TABLE_ISOLATION needs to switch to a different CR3 value when it
      enters the kernel and switch back when it exits.  This essentially needs to
      be done before leaving assembly code.
      
      This is extra challenging because the switching context is tricky: the
      registers that can be clobbered can vary.  It is also hard to store things
      on the stack because there is an established ABI (ptregs) or the stack is
      entirely unsafe to use.
      
      Establish a set of macros that allow changing to the user and kernel CR3
      values.
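
      A simplified sketch of the kernel-direction macro (the real one also
      handles PCIDs and feature patching; the mask bits are illustrative):

      	.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
      		mov	%cr3, \scratch_reg
      		andq	$(~((1 << 12) | (1 << 11))), \scratch_reg
      		mov	\scratch_reg, %cr3
      	.endm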
      
      Interactions with SWAPGS:
      
        Previous versions of the PAGE_TABLE_ISOLATION code relied on having
        per-CPU scratch space to save/restore a register that can be used for the
        CR3 MOV.  The %GS register is used to index into our per-CPU space, so
        SWAPGS *had* to be done before the CR3 switch.  That scratch space is gone
        now, but the semantic that SWAPGS must be done before the CR3 MOV is
        retained.  This is good to keep because it is not that hard to do and it
        allows us to do things like adding per-CPU debugging information.
      
      What this does in the NMI code is worth pointing out.  NMIs can interrupt
      *any* context and they can also be nested with NMIs interrupting other
      NMIs.  The comments below ".Lnmi_from_kernel" explain the format of the
      stack during this situation.  Changing the format of this stack is hard.
      Instead of storing the old CR3 value on the stack, this depends on the
      *regular* register save/restore mechanism and then uses %r14 to keep CR3
      during the NMI.  It is callee-saved and will not be clobbered by the C NMI
      handlers that get called.
      
      [ PeterZ: ESPFIX optimization ]
      
      Based-on-code-from: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
  24. 17 December 2017 (2 commits)
    • x86/entry/64: Use a per-CPU trampoline stack for IDT entries · 7f2590a1
      Authored by Andy Lutomirski
      Historically, IDT entries from usermode have always gone directly
      to the running task's kernel stack.  Rearrange it so that we enter on
      a per-CPU trampoline stack and then manually switch to the task's stack.
      This touches a couple of extra cachelines, but it gives us a chance
      to run some code before we touch the kernel stack.
      
      The asm isn't exactly beautiful, but I think that fully refactoring
      it can wait.
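
      Roughly, the stack switch looks like this sketch (illustrative; the
      real code also copies the rest of the hardware frame and the saved
      registers across):

      	movq	%rsp, %rdi	/* %rdi = trampoline stack */
      	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
      	pushq	6*8(%rdi)	/* copy the hardware frame: ss */
      	pushq	5*8(%rdi)	/* ... user rsp, eflags, cs, rip follow */
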
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Allocate and enable the SYSENTER stack · 1a79797b
      Authored by Andy Lutomirski
      This will simplify future changes that want scratch variables early in
      the SYSENTER handler -- they'll be able to spill registers to the
      stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.
      
      This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
      stack space even without IA32 emulation.
      
      As far as I can tell, the reason that this wasn't done from day 1 is
      that we use IST for #DB and #BP, which is IMO rather nasty and causes
      a lot more problems than it solves.  But, since #DB uses IST, we don't
      actually need a real stack for SYSENTER (because SYSENTER with TF set
      will invoke #DB on the IST stack rather than the SYSENTER stack).
      
      I want to remove IST usage from these vectors some day, and this patch
      is a prerequisite for that as well.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  25. 02 November 2017 (2 commits)
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Authored by Greg Kroah-Hartman
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
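
      For example, the tag at the top of a source file reads:

      	/* SPDX-License-Identifier: GPL-2.0 */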
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier should be applied
      to a file was done in a spreadsheet of side-by-side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/entry/64: Move SWAPGS into the common IRET-to-usermode path · 8a055d7f
      Authored by Andy Lutomirski
      All of the code paths that ended up doing IRET to usermode did
      SWAPGS immediately beforehand.  Move the SWAPGS into the common
      code.
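
      Schematically, after this change the exiting paths jump to one shared
      label instead of doing their own SWAPGS (a sketch):

      	GLOBAL(swapgs_restore_regs_and_return_to_usermode)
      		SWAPGS
      		/* fall through into the common restore + IRET code */
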
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>