1. 08 Dec, 2018 1 commit
  2. 07 Dec, 2018 2 commits
  3. 06 Dec, 2018 1 commit
    • A
      kprobes/x86: Blacklist non-attachable interrupt functions · a50480cb
      Andrea Righi committed
      These interrupt functions are already non-attachable by kprobes.
      Blacklist them explicitly so that they can show up in
      /sys/kernel/debug/kprobes/blacklist and tools like BCC can use this
      additional information.
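
As a rough illustration of how a tool might consume that file, the sketch below parses blacklist entries. It assumes each line has the form `0x<start>-0x<end><TAB><symbol>`. The helper names `parse_blacklist_line` and `load_blacklist` and the sample address range are invented for the example, and reading the real file typically needs root and a mounted debugfs.

```python
def parse_blacklist_line(line):
    """Parse one entry, assumed to look like '0x<start>-0x<end>\\t<symbol>'."""
    addr_range, symbol = line.split(None, 1)
    start, end = (int(part, 16) for part in addr_range.split("-"))
    return start, end, symbol.strip()


def load_blacklist(path="/sys/kernel/debug/kprobes/blacklist"):
    """Return {symbol: (start, end)} for every entry in the blacklist file."""
    entries = {}
    with open(path) as f:
        for line in f:
            if line.strip():
                start, end, symbol = parse_blacklist_line(line)
                entries[symbol] = (start, end)
    return entries


# Illustrative sample line; the address range is invented.
sample = "0xffffffff81a00000-0xffffffff81a00010\tdo_int3"
print(parse_blacklist_line(sample))
```

A tracer could call `load_blacklist()` once at startup and skip any symbol found in the result before trying to attach a probe.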
      Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lkml.kernel.org/r/20181206095648.GA8249@Dell
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a50480cb
  4. 05 Dec, 2018 4 commits
  5. 04 Dec, 2018 2 commits
    • M
      kprobes/x86: Fix instruction patching corruption when copying more than one... · 43a1b0cb
      Masami Hiramatsu committed
      kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction
      
      After copy_optimized_instructions() copies several instructions
      to the working buffer it tries to fix up the real RIP address, but it
      adjusts the RIP-relative instruction with an incorrect RIP address
      for the 2nd and subsequent instructions due to a bug in the logic.
      
      This will break the kernel pretty badly (with likely outcomes such as
      a kernel freeze, a crash, or worse) because probed instructions can refer
      to the wrong data.
      
      For example putting kprobes on cpumask_next() typically hits this bug.
      
      cpumask_next() is normally like below if CONFIG_CPUMASK_OFFSTACK=y
      (in this case nr_cpumask_bits is an alias of nr_cpu_ids):
      
       <cpumask_next>:
      	48 89 f0		mov    %rsi,%rax
      	8b 35 7b fb e2 00	mov    0xe2fb7b(%rip),%esi # ffffffff82db9e64 <nr_cpu_ids>
      	55			push   %rbp
      ...
      
      If we put a kprobe on it and it gets jump-optimized, it gets
      patched by the kprobes code like this:
      
       <cpumask_next>:
      	e9 95 7d 07 1e		jmpq   0xffffffffa000207a
      	7b fb			jnp    0xffffffff81f8a2e2 <cpumask_next+2>
      	e2 00			loop   0xffffffff81f8a2e9 <cpumask_next+9>
      	55			push   %rbp
      
      This shows that the first two MOV instructions were copied to a
      trampoline buffer at 0xffffffffa000207a.
      
      Here is the disassembled result of the trampoline, skipping
      the optprobe template instructions:
      
      	# Dump of assembly code from 0xffffffffa000207a to 0xffffffffa00020ea:
      
      	54			push   %rsp
      	...
      	48 83 c4 08		add    $0x8,%rsp
      	9d			popfq
      	48 89 f0		mov    %rsi,%rax
      	8b 35 82 7d db e2	mov    -0x1d24827e(%rip),%esi # 0xffffffff82db9e67 <nr_cpu_ids+3>
      
      This dump shows that the second MOV accesses *(nr_cpu_ids+3) instead of
      the original *nr_cpu_ids. This leads to a kernel freeze because
      cpumask_next() always returns 0 and for_each_cpu() never ends.
      
      Fix this by adding 'len' correctly to the real RIP address while
      copying.
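
The arithmetic behind the dumps above can be replayed as a small sketch. It assumes the buggy fix-up treated the second instruction as if it sat at the start of the copy (offset 0) rather than after the first MOV. `COPY_BASE` is not stated in the changelog; it is derived here from the trampoline dump. The helper names are made up for illustration.

```python
def to_s32(value):
    """Interpret the low 32 bits of value as a signed disp32."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


CPUMASK_NEXT = 0xffffffff81f8a2e0  # cpumask_next+9 is 0xffffffff81f8a2e9 in the dump
FIRST_LEN = 3                      # 48 89 f0: mov %rsi,%rax
SECOND_LEN = 6                     # 8b 35 <disp32>: mov disp(%rip),%esi
ORIG_DISP = 0xe2fb7b               # displacement in the original instruction
COPY_BASE = 0xffffffffa00020dc     # derived: where the first copied MOV lands

# RIP-relative operands are relative to the address of the *next* instruction.
nr_cpu_ids = CPUMASK_NEXT + FIRST_LEN + SECOND_LEN + ORIG_DISP


def fixup(offset_in_copy):
    """disp32 computed if the copier believes the insn is at COPY_BASE + offset."""
    return to_s32(nr_cpu_ids - (COPY_BASE + offset_in_copy + SECOND_LEN))


def effective_target(disp):
    """Address actually read when the copied insn runs at COPY_BASE + FIRST_LEN."""
    return (COPY_BASE + FIRST_LEN + SECOND_LEN + disp) & 0xFFFFFFFFFFFFFFFF


buggy = fixup(0)          # running length not added (assumed bug model)
fixed = fixup(FIRST_LEN)  # running length added

print(hex(nr_cpu_ids))
print(hex(buggy), "misses by", effective_target(buggy) - nr_cpu_ids, "bytes")
print(hex(fixed), "misses by", effective_target(fixed) - nr_cpu_ids, "bytes")
```

Under this model the miss equals the length of the first copied instruction, which matches the `nr_cpu_ids+3` seen in the trampoline dump.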
      
      [ mingo: Improved the changelog. ]
      Reported-by: Michael Rodin <michael@rodin.online>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org # v4.15+
      Fixes: 63fef14f ("kprobes/x86: Make insn buffer always ROX and use text_poke()")
      Link: http://lkml.kernel.org/r/153504457253.22602.1314289671019919596.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      43a1b0cb
    • S
      bpf: powerpc64: optimize JIT passes for bpf function calls · 025dceb0
      Sandipan Das committed
      Once the JITed images for each function in a multi-function program
      are generated after the first three JIT passes, we only need to fix
      the target address for the branch instruction corresponding to each
      bpf-to-bpf function call.
      
      This introduces the following optimizations for reducing the work
      done by the JIT compiler when handling multi-function programs:
      
        [1] Instead of doing two extra passes to fix the bpf function calls,
            do just one as that would be sufficient.
      
        [2] During the extra pass, only overwrite the instruction sequences
            for the bpf-to-bpf function calls as everything else would still
            remain exactly the same. This also reduces the number of writes
            to the JITed image.
      
        [3] Do not regenerate the prologue and the epilogue during the extra
            pass as that would be redundant.
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      025dceb0
  6. 03 Dec, 2018 3 commits
    • J
      x86/boot: Clear RSDP address in boot_params for broken loaders · 182ddd16
      Juergen Gross committed
      Gunnar Krueger reported a systemd-boot failure and bisected it down to:
      
        e6e094e0 ("x86/acpi, x86/boot: Take RSDP address from boot params if available")
      
      In case a broken boot loader doesn't clear its 'struct boot_params', clear
      rsdp_addr in sanitize_boot_params().
      Reported-by: Gunnar Krueger <taijian@posteo.de>
      Tested-by: Gunnar Krueger <taijian@posteo.de>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: sstabellini@kernel.org
      Fixes: e6e094e0 ("x86/acpi, x86/boot: Take RSDP address from boot params if available")
      Link: http://lkml.kernel.org/r/20181203103811.17056-1-jgross@suse.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      182ddd16
    • G
      csky: bugfix tlb_get_pgd error. · 63e19c82
      Guo Ren committed
      It's wrong to mask/unmask the highest bit in addr to translate the vaddr
      to paddr. We should use PAGE_OFFSET and PHYS_OFFSET.
      
      Wrong implementation:
        return ((get_pgd()|(1<<31)) - PHYS_OFFSET) & ~1;
      
      When PHYS_OFFSET=0xc0000000 and get_pgd() returns 0xe0000000, it'll
      return 0x60000000. It's wrong and should be 0xa0000000.
      
      Now correct it to:
        return ((get_pgd() - PHYS_OFFSET) & ~1) + PAGE_OFFSET;
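
The corrected expression can be checked with plain integer arithmetic. This is only a sketch: `PAGE_OFFSET = 0x80000000` is an assumption (the message does not state it), chosen because it reproduces the expected 0xa0000000, and `pgd_to_vaddr` is a made-up name. The result is masked to 32 bits to mimic a 32-bit unsigned long.

```python
PHYS_OFFSET = 0xc0000000  # from the example in the commit message
PAGE_OFFSET = 0x80000000  # assumption: not stated in the commit message


def pgd_to_vaddr(pgd):
    """Corrected translation: rebase from PHYS_OFFSET to PAGE_OFFSET, clearing bit 0."""
    return (((pgd - PHYS_OFFSET) & ~1) + PAGE_OFFSET) & 0xFFFFFFFF


print(hex(pgd_to_vaddr(0xe0000000)))
```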
      Signed-off-by: Guo Ren <ren_guo@c-sky.com>
      63e19c82
    • H
      parisc: Enable -ffunction-sections for modules on 32-bit kernel · 1e8249b8
      Helge Deller committed
      Frank Scheiner reported that since kernel 4.18 he has been seeing sysfs
      warnings when loading modules on a 32-bit kernel. Here is one such example:
      
       sysfs: cannot create duplicate filename '/module/nfs/sections/.text'
       CPU: 0 PID: 98 Comm: modprobe Not tainted 4.18.0-2-parisc #1 Debian 4.18.10-2
       Backtrace:
        [<1017ce2c>] show_stack+0x3c/0x50
        [<107a7210>] dump_stack+0x28/0x38
        [<103f900c>] sysfs_warn_dup+0x88/0xac
        [<103f8b1c>] sysfs_add_file_mode_ns+0x164/0x1d0
        [<103f9e70>] internal_create_group+0x11c/0x304
        [<103fa0a0>] sysfs_create_group+0x48/0x60
        [<1022abe8>] load_module.constprop.35+0x1f9c/0x23b8
        [<1022b278>] sys_finit_module+0xd0/0x11c
        [<101831dc>] syscall_exit+0x0/0x14
      
      This warning is triggered because, due to commit 24b6c225
      ("parisc: Build kernel without -ffunction-sections"), we now get multiple .text
      sections in the kernel modules, for which sysfs_create_group() can't create
      multiple virtual files.
      
      This patch works around the problem by re-enabling the -ffunction-sections
      compiler option for modules, while keeping it disabled for the non-module
      kernel code.
      Reported-by: Frank Scheiner <frank.scheiner@web.de>
      Fixes: 24b6c225 ("parisc: Build kernel without -ffunction-sections")
      Cc: <stable@vger.kernel.org> # v4.18+
      Signed-off-by: Helge Deller <deller@gmx.de>
      1e8249b8
  7. 01 Dec, 2018 2 commits
    • J
      ARC: io.h: Implement reads{x}()/writes{x}() · 10d44343
      Jose Abreu committed
      Some ARC CPUs do not support unaligned loads/stores. Currently, the generic
      implementation of reads{b/w/l}()/writes{b/w/l}() is used with ARC.
      This can cause some drivers to malfunction, as the generic functions do a
      plain dereference of a pointer that can be unaligned.
      
      Let's use the {get/put}_unaligned() helpers instead of a plain dereference
      of the pointer to fix this. The helpers allow data to be loaded from and
      stored to an unaligned address whilst preserving the CPU internal alignment.
      According to [1], the use of these helpers is costly in terms of
      performance, so we added an initial check for an already aligned buffer so
      that the helpers can be avoided when possible.
      
      [1] Documentation/unaligned-memory-access.txt
      
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Tested-by: Vitor Soares <soares@synopsys.com>
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      10d44343
    • K
      ARC: change defconfig defaults to ARCv2 · b7cc40c3
      Kevin Hilman committed
      Change the default defconfig (used with 'make defconfig') to the ARCv2
      nsim_hs_defconfig, and also switch the default Kconfig ISA selection to
      ARCv2.
      
      This allows several default defconfigs (e.g. make defconfig, make
      allnoconfig, make tinyconfig) to all work with ARCv2 by default.
      
      Note that since we change the default architecture from ARCompact to ARCv2,
      the architecture type must be stated explicitly in ARCompact
      defconfigs; otherwise ARCv2 will be implied and binaries will be
      generated for ARCv2.
      
      Cc: <stable@vger.kernel.org> # 4.4.x
      Signed-off-by: Kevin Hilman <khilman@baylibre.com>
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      b7cc40c3
  8. 30 Nov, 2018 6 commits
    • Y
      x86/earlyprintk/efi: Fix infinite loop on some screen widths · 79c2206d
      YiFei Zhu committed
      An affected screen resolution is 1366 x 768, whose width is not
      divisible by 8, the default font width. On such screens, when longer
      lines are earlyprintk'ed, overflow-to-next-line can never trigger,
      because the left-most x-coordinate of the next character is always less
      than the screen width. Earlyprintk then loops forever, trying to
      print the rest of the string but unable to because the line is
      full.
      
      This patch makes the trigger consider the right-most x-coordinate,
      instead of left-most, as the value to compare against the screen
      width threshold.
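
The difference between the two triggers can be seen with a few lines of arithmetic. This is a simplified model rather than the kernel code: the exact comparisons (`>=` on the left edge, `>` on the right edge) and all function names are assumptions made for illustration.

```python
FONT_WIDTH = 8  # default font width named in the commit message


def x_after_full_line(screen_width):
    """x-coordinate reached after printing as many whole characters as fit."""
    return (screen_width // FONT_WIDTH) * FONT_WIDTH


def wraps_left_edge(x, screen_width):
    """Old trigger (assumed form): compare the left-most x-coordinate."""
    return x >= screen_width


def wraps_right_edge(x, screen_width):
    """New trigger (assumed form): compare the right-most x-coordinate."""
    return x + FONT_WIDTH > screen_width


for width in (1024, 1366):
    x = x_after_full_line(width)
    print(width, x, wraps_left_edge(x, width), wraps_right_edge(x, width))
```

For a width that is a multiple of 8, both checks fire once the line is full. For 1366 the cursor stops at 1360, where only the right-edge check fires.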
      Signed-off-by: YiFei Zhu <zhuyifei1999@gmail.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Snowberg <eric.snowberg@oracle.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-12-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      79c2206d
    • E
      x86/efi: Allocate e820 buffer before calling efi_exit_boot_service · b84a64fa
      Eric Snowberg committed
      The following commit:
      
        d6493401 ("x86/efi: Use efi_exit_boot_services()")
      
      introduced a regression on systems with large memory maps causing them
      to hang on boot. The first "goto get_map" that was removed from
      exit_boot() ensured there was enough room for the memory map when
      efi_call_early(exit_boot_services) was called. This happens when
      (nr_desc > ARRAY_SIZE(params->e820_table)).
      
      Chain of events:
      
        exit_boot()
          efi_exit_boot_services()
            efi_get_memory_map                  <- at this point the mm can't grow over 8 desc
            priv_func()
              exit_boot_func()
                allocate_e820ext()              <- new mm grows over 8 desc from e820 alloc
            efi_call_early(exit_boot_services)  <- mm key doesn't match so retry
            efi_call_early(get_memory_map)      <- not enough room for new mm
            system hangs
      
      This patch allocates the e820 buffer before calling efi_exit_boot_services()
      and fixes the regression.
      
       [ mingo: minor cleanliness edits. ]
      Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-2-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b84a64fa
    • I
      Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" · 12366410
      Igor Druzhinin committed
      This reverts commit b3cf8528.
      
      That commit unintentionally broke Xen balloon memory hotplug with
      "hotplug_unpopulated" set to 1. Because the "System RAM" resource
      was assigned under a new "Unusable memory" resource in the IO/Mem tree,
      any attempt to online this memory would fail due to general kernel
      restrictions on having "System RAM" resources as 1st level only.
      
      The original issue that the reverted commit tried to work around, fa564ad9
      ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
      60-7f)"), was also amended by the follow-up 03a55173 ("x86/PCI: Move
      and shrink AMD 64-bit window to avoid conflict"), which made the
      original fix to Xen ballooning unnecessary.
      Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      12366410
    • J
      xen/x86: add diagnostic printout to xen_mc_flush() in case of error · a7b40310
      Juergen Gross committed
      Failure of an element of a Xen multicall is signalled via a WARN()
      only if the kernel is compiled with MC_DEBUG. It is impossible to
      know which element failed and why it did so.
      
      Change that by printing the related information even without MC_DEBUG,
      albeit in a more limited form (e.g. without information about which
      caller produced the failing element).
      
      Move the printing out of the switch statement in order to have the
      same information for a single call.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      a7b40310
    • M
      arm64: ftrace: Fix to enable syscall events on arm64 · 874bfc6e
      Masami Hiramatsu committed
      Since commit 4378a7d4 ("arm64: implement syscall wrappers")
      introduced the "__arm64_" prefix to all syscall wrapper symbols in
      sys_call_table, the syscall tracer cannot find the corresponding
      metadata from the syscall name. As a result, we have no syscall
      ftrace events on arm64 kernels, and some bpf testcases fail
      on arm64.
      
      To fix this issue, introduce a custom
      arch_syscall_match_sym_name() which skips the first 8 bytes when
      comparing the syscall and symbol names.
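
The idea of skipping the prefix can be sketched in a few lines. This is a model of the comparison only, not the kernel implementation; `syscall_match_sym_name` and the `sys_read`/`sys_write` sample names are chosen for illustration.

```python
ARCH_PREFIX = "__arm64_"  # 8 bytes, hence "skips first 8 bytes"


def syscall_match_sym_name(sym, name):
    """Compare a sys_call_table symbol with a syscall metadata name,
    ignoring the first len(ARCH_PREFIX) characters of the symbol."""
    return sym[len(ARCH_PREFIX):] == name


print(len(ARCH_PREFIX))
print(syscall_match_sym_name("__arm64_sys_read", "sys_read"))
print(syscall_match_sym_name("__arm64_sys_read", "sys_write"))
```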
      
      Fixes: 4378a7d4 ("arm64: implement syscall wrappers")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      874bfc6e
    • C
      arm64: Add workaround for Cortex-A76 erratum 1286807 · ce8c80c5
      Catalin Marinas committed
      On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual address
      for a cacheable mapping of a location is being accessed by a core while
      another core is remapping the virtual address to a new physical page
      using the recommended break-before-make sequence, then under very rare
      circumstances TLBI+DSB completes before a read using the translation
      being invalidated has been observed by other observers. The workaround
      repeats the TLBI+DSB operation and is shared with the Qualcomm Falkor
      erratum 1009.
      Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      ce8c80c5
  9. 28 Nov, 2018 19 commits
    • T
      x86/speculation: Provide IBPB always command line options · 55a97402
      Thomas Gleixner committed
      Provide the possibility to enable IBPB always in combination with 'prctl'
      and 'seccomp'.
      
      Add the extra command line options and rework the IBPB selection to
      evaluate the command instead of the mode selected by the STIBP switch case.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.de
      55a97402
    • T
      x86/speculation: Add seccomp Spectre v2 user space protection mode · 6b3e64c2
      Thomas Gleixner committed
      If 'prctl' mode of user space protection from spectre v2 is selected
      on the kernel command-line, STIBP and IBPB are applied on tasks which
      restrict their indirect branch speculation via prctl.
      
      SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it
      makes sense to prevent spectre v2 user space to user space attacks as
      well.
      
      The Intel mitigation guide documents how STIBP works:
          
         Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor
         prevents the predicted targets of indirect branches on any logical
         processor of that core from being controlled by software that executes
         (or executed previously) on another logical processor of the same core.
      
      Ergo setting STIBP protects the task itself from being attacked from a task
      running on a different hyper-thread and protects the tasks running on
      different hyper-threads from being attacked.
      
      While the document suggests that the branch predictors are shielded between
      the logical processors, the observed performance regressions suggest that
      STIBP simply disables the branch predictor more or less completely. Of
      course the document wording is vague, but the fact that there is also no
      requirement for issuing IBPB when STIBP is used points clearly in that
      direction. The kernel still issues IBPB even when STIBP is used until Intel
      clarifies the whole mechanism.
      
      IBPB is issued when the task switches out, so malicious sandbox code cannot
      mistrain the branch predictor for the next user space task on the same
      logical processor.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.de
      
      6b3e64c2
    • T
      x86/speculation: Enable prctl mode for spectre_v2_user · 7cc765a6
      Thomas Gleixner committed
      Now that all prerequisites are in place:
      
       - Add the prctl command line option
      
       - Default the 'auto' mode to 'prctl'
      
       - When SMT state changes, update the static key which controls the
         conditional STIBP evaluation on context switch.
      
       - At init update the static key which controls the conditional IBPB
         evaluation on context switch.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.de
      
      7cc765a6
    • T
      x86/speculation: Add prctl() control for indirect branch speculation · 9137bb27
      Thomas Gleixner committed
      Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
      PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
      indirect branch speculation via STIBP and IBPB.
      
      Invocations:
       Check indirect branch speculation status with
       - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
      
       Enable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
      
       Disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
      
       Force disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
      
      See Documentation/userspace-api/spec_ctrl.rst.
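
The invocations above can also be exercised from a script. The sketch below goes through `ctypes` instead of C. The numeric constants are the values from `<linux/prctl.h>`, while `decode_status` and `get_indirect_branch_status` are helper names invented for this example. The query only works on Linux kernels that support this prctl, so the module-level code decodes a sample value instead of calling into the kernel.

```python
import ctypes
import os

# Values from the Linux UAPI header <linux/prctl.h>.
PR_GET_SPECULATION_CTRL = 52
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_INDIRECT_BRANCH = 1
PR_SPEC_NOT_AFFECTED = 0
PR_SPEC_PRCTL = 1 << 0
PR_SPEC_ENABLE = 1 << 1
PR_SPEC_DISABLE = 1 << 2
PR_SPEC_FORCE_DISABLE = 1 << 3

_BITS = [
    (PR_SPEC_PRCTL, "PR_SPEC_PRCTL"),
    (PR_SPEC_ENABLE, "PR_SPEC_ENABLE"),
    (PR_SPEC_DISABLE, "PR_SPEC_DISABLE"),
    (PR_SPEC_FORCE_DISABLE, "PR_SPEC_FORCE_DISABLE"),
]


def decode_status(value):
    """Turn a PR_GET_SPECULATION_CTRL return value into flag names."""
    return [name for bit, name in _BITS if value & bit] or ["PR_SPEC_NOT_AFFECTED"]


def get_indirect_branch_status():
    """Query the calling task's indirect branch speculation state (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)  # unused arguments are passed as 0, as shown above
    ret = libc.prctl(PR_GET_SPECULATION_CTRL,
                     ctypes.c_ulong(PR_SPEC_INDIRECT_BRANCH), zero, zero, zero)
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ret


# Decoding a sample value; no kernel call is made here.
print(decode_status(PR_SPEC_PRCTL | PR_SPEC_ENABLE))
```

On a supporting kernel, `decode_status(get_indirect_branch_status())` reports the current state. Setting it would pass `PR_SET_SPECULATION_CTRL` with one of the `PR_SPEC_*` control values in the same way.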
      Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.de
      9137bb27
    • T
      x86/speculation: Prepare arch_smt_update() for PRCTL mode · 6893a959
      Thomas Gleixner committed
      The upcoming fine grained per task STIBP control needs to be updated on CPU
      hotplug as well.
      
      Split out the code which controls the strict mode so the prctl control code
      can be added later. Mark the SMP function call argument __unused while at it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.759457117@linutronix.de
      
      6893a959
    • T
      x86/speculation: Prevent stale SPEC_CTRL msr content · 6d991ba5
      Thomas Gleixner committed
      The seccomp speculation control operates on all tasks of a process, but
      only the current task of a process can update the MSR immediately. For the
      other threads the update is deferred to the next context switch.
      
      This creates the following situation with Process A and B:
      
      Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2
      does not have the speculation control TIF bit set. Process B task 1 has the
      speculation control TIF bit set.
      
      CPU0					CPU1
      					MSR bit is set
      					ProcB.T1 schedules out
      					ProcA.T2 schedules in
      					MSR bit is cleared
      ProcA.T1
        seccomp_update()
        set TIF bit on ProcA.T2
      					ProcB.T1 schedules in
      					MSR is not updated  <-- FAIL
      
      This happens because the context switch code tries to avoid the MSR update
      if the speculation control TIF bits of the incoming and the outgoing task
      are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks
      scheduling back and forth on CPU1, which keeps the MSR stale forever.
      
      In theory this could be remedied by IPIs, but chasing the remote task which
      could be migrated is complex and full of races.
      
      The straightforward solution is to avoid the asynchronous update of the TIF
      bit and defer it to the next context switch. The speculation control state
      is stored in task_struct::atomic_flags by the prctl and seccomp updates
      already.
      
      Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the
      atomic_flags. Check the bit on context switch and force a synchronous
      update of the speculation control if set. Use the same mechanism for
      updating the current task.
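
The race and its remedy can be replayed with a toy model. This is a deliberately simplified sketch, not kernel code: the `Task` and `Cpu` classes, their attributes and `remote_seccomp_update` are invented for illustration, with `force_update` standing in for TIF_SPEC_FORCE_UPDATE and `wanted` for the state kept in atomic_flags.

```python
class Task:
    def __init__(self, name, tif_spec=False):
        self.name = name
        self.tif_spec = tif_spec   # speculation control TIF bit
        self.wanted = tif_spec     # state recorded in atomic_flags
        self.force_update = False  # models TIF_SPEC_FORCE_UPDATE


class Cpu:
    def __init__(self, current, with_fix):
        self.current = current
        self.msr_bit = current.tif_spec
        self.with_fix = with_fix

    def switch_to(self, nxt):
        prev = self.current
        if self.with_fix and nxt.force_update:
            # Deferred update: sync the TIF bit, then write the MSR unconditionally.
            nxt.force_update = False
            nxt.tif_spec = nxt.wanted
            self.msr_bit = nxt.tif_spec
        elif prev.tif_spec != nxt.tif_spec:
            # Optimisation: only touch the MSR when the TIF bits differ.
            self.msr_bit = nxt.tif_spec
        self.current = nxt


def remote_seccomp_update(task, with_fix):
    """Another CPU enables speculation control for `task`."""
    task.wanted = True
    if with_fix:
        task.force_update = True  # the TIF bit itself changes at the next switch-in
    else:
        task.tif_spec = True      # asynchronous TIF update, MSR left untouched


def run(with_fix):
    a_t2 = Task("ProcA.T2", tif_spec=False)
    b_t1 = Task("ProcB.T1", tif_spec=True)
    cpu1 = Cpu(b_t1, with_fix)             # MSR bit is set
    cpu1.switch_to(a_t2)                   # MSR bit is cleared
    remote_seccomp_update(a_t2, with_fix)  # ProcA.T1 on CPU0
    cpu1.switch_to(b_t1)                   # ProcB.T1 schedules in
    return cpu1.msr_bit == b_t1.tif_spec   # does the MSR match the running task?


print("without fix, MSR matches ProcB.T1:", run(False))
print("with fix, MSR matches ProcB.T1:", run(True))
```

In the unfixed run the two TIF bits compare equal at the last switch, so the MSR write is skipped and the MSR stays cleared. With the deferred update the bits still differ at that point and the MSR is written.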
      Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.de
      6d991ba5
    • T
      x86/speculation: Split out TIF update · e6da8bb6
      Thomas Gleixner committed
      The update of the TIF_SSBD flag and the conditional speculation control MSR
      update is done in the ssb_prctl_set() function directly. The upcoming prctl
      support for controlling indirect branch speculation via STIBP needs the
      same mechanism.
      
      Split the code out and make it reusable. Reword the comment about updates
      for other tasks.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.652305076@linutronix.de
      e6da8bb6
    • T
      x86/speculation: Prepare for conditional IBPB in switch_mm() · 4c71a2b6
      Thomas Gleixner committed
      The IBPB speculation barrier is issued from switch_mm() when the kernel
      switches to a user space task with a different mm than the user space task
      which ran last on the same CPU.
      
      An additional optimization is to avoid IBPB when the incoming task can be
      ptraced by the outgoing task. This optimization only works when switching
      directly between two user space tasks. When switching from a kernel task to
      a user space task the optimization fails because the previous task cannot
      be accessed anymore. So for quite some scenarios the optimization is just
      adding overhead.
      
      The upcoming conditional IBPB support will issue IBPB only for user space
      tasks which have the TIF_SPEC_IB bit set. This requires handling the
      following cases:
      
        1) Switch from a user space task (potential attacker) which has
           TIF_SPEC_IB set to a user space task (potential victim) which has
           TIF_SPEC_IB not set.
      
        2) Switch from a user space task (potential attacker) which has
           TIF_SPEC_IB not set to a user space task (potential victim) which has
           TIF_SPEC_IB set.
      
      This needs to be optimized for the case where the IBPB can be avoided when
      only kernel threads ran in between user space tasks which belong to the
      same process.
      
      The current check whether two tasks belong to the same context uses the
      task's context id. While correct, it's simpler to use the mm pointer because
      it allows mangling the TIF_SPEC_IB bit into it. The context id based
      mechanism requires extra storage, which creates worse code.
      
      When a task is scheduled out its TIF_SPEC_IB bit is mangled as bit 0 into
      the per CPU storage which is used to track the last user space mm which was
      running on a CPU. This bit can be used together with the TIF_SPEC_IB bit of
      the incoming task to make the decision whether IBPB needs to be issued or
      not to cover the two cases above.
      
      As conditional IBPB is going to be the default, remove the dubious ptrace
      check for the IBPB always case and simply issue IBPB always when the
      process changes.
      
      Move the storage to a different place in the struct as the original one
      created a hole.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.466447057@linutronix.de
      4c71a2b6
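The pointer mangling and the resulting IBPB decision from the commit above can be sketched in user space. This is a simplified model, not the kernel code: `struct mm`, `struct task`, `mangle()`, `cond_ibpb()` and the counter are hypothetical names, a single variable stands in for the per-CPU storage, and the sketch assumes that `cond_ibpb()` is only called when switching to a user space task, so kernel threads running in between leave the stored value untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of an mm pointer is free because the structure is aligned. */
#define LAST_USER_MM_IBPB ((uintptr_t)0x1)

struct mm { int placeholder; };

struct task {
    struct mm *mm;
    bool spec_ib;            /* models the TIF_SPEC_IB bit */
};

static uintptr_t last_user_mm_ibpb; /* per-CPU in the kernel; one CPU here */
static unsigned int ibpb_count;     /* counts simulated barriers */

/* Fold the task's TIF_SPEC_IB state into bit 0 of its mm pointer. */
static uintptr_t mangle(const struct task *t)
{
    return (uintptr_t)t->mm | (t->spec_ib ? LAST_USER_MM_IBPB : 0);
}

/* Called when switching to a user space task. */
static void cond_ibpb(const struct task *next)
{
    uintptr_t next_mm = mangle(next);
    uintptr_t prev_mm = last_user_mm_ibpb;

    /* Issue the barrier only if the mm (or the bit) changed and at least
     * one side has the bit set; this covers both cases listed above. */
    if (next_mm != prev_mm && ((next_mm | prev_mm) & LAST_USER_MM_IBPB))
        ibpb_count++;

    last_user_mm_ibpb = next_mm;
}
```

A single comparison covers both the "same process" shortcut and the two attacker/victim cases, which is the point of storing the bit alongside the pointer.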
    • T
      x86/speculation: Avoid __switch_to_xtra() calls · 5635d999
      Thomas Gleixner committed
      The TIF_SPEC_IB bit does not need to be evaluated in the decision to invoke
      __switch_to_xtra() when:
      
       - CONFIG_SMP is disabled
      
       - The conditional STIBP mode is disabled
      
      The TIF_SPEC_IB bit still controls IBPB in both cases so the TIF work mask
      checks might invoke __switch_to_xtra() for nothing if TIF_SPEC_IB is the
      only set bit in the work masks.
      
      Optimize it out by masking the bit at compile time for CONFIG_SMP=n and at
      run time when the static key controlling the conditional STIBP mode is
      disabled.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.374062201@linutronix.de
      
      5635d999
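A rough illustration of the masking described in the commit above, written as a stand-alone model. The bit values, the `SKETCH_SMP` macro, the `cond_stibp_enabled` flag (standing in for the static key) and `needs_switch_to_xtra()` are assumptions made for this sketch and do not mirror the kernel's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values for this sketch. */
#define TIF_IO_BITMAP (1UL << 0)
#define TIF_SPEC_IB   (1UL << 1)

#ifndef SKETCH_SMP
#define SKETCH_SMP 1             /* stands in for CONFIG_SMP */
#endif

/* Compile-time masking: without SMP the bit never enters the work mask. */
#if SKETCH_SMP
#define WORK_CTXSW (TIF_IO_BITMAP | TIF_SPEC_IB)
#else
#define WORK_CTXSW (TIF_IO_BITMAP)
#endif

/* Stands in for the static key controlling conditional STIBP. */
static bool cond_stibp_enabled;

/* Decide whether the slow path has any work to do for this switch. */
static bool needs_switch_to_xtra(unsigned long prev_tif, unsigned long next_tif)
{
    unsigned long mask = WORK_CTXSW;

    /* Run-time masking: ignore the bit while conditional STIBP is off. */
    if (!cond_stibp_enabled)
        mask &= ~TIF_SPEC_IB;

    return ((prev_tif | next_tif) & mask) != 0;
}
```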
    • T
      x86/process: Consolidate and simplify switch_to_xtra() code · ff16701a
      Thomas Gleixner committed
      Move the conditional invocation of __switch_to_xtra() into an inline
      function so the logic can be shared between 32 and 64 bit.
      
      Stop handing the TSS pointer through and retrieve the pointer directly
      in the bitmap handling function. Use this_cpu_ptr() instead of the
      per_cpu() indirection.
      
      This is a preparatory change so integration of conditional indirect branch
      speculation optimization happens only in one place.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.280855518@linutronix.de
      ff16701a
    • T
      x86/speculation: Prepare for per task indirect branch speculation control · 5bfbe3ad
      Tim Chen committed
      To avoid the overhead of STIBP always on, it's necessary to allow per task
      control of STIBP.
      
      Add a new task flag TIF_SPEC_IB and evaluate it during context switch if
      SMT is active and flag evaluation is enabled by the speculation control
      code. Add the conditional evaluation to x86_virt_spec_ctrl() as well so the
      guest/host switch works properly.
      
      This has no effect because TIF_SPEC_IB cannot be set yet and the static key
      which controls evaluation is off. Preparatory patch for adding the control
      code.
      
      [ tglx: Simplify the context switch logic and make the TIF evaluation
              depend on SMP=y and on the static key controlling the conditional
              update. Rename it to TIF_SPEC_IB because it controls both STIBP and
              IBPB ]
      Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.176917199@linutronix.de
      
      5bfbe3ad
    • T
      x86/speculation: Add command line control for indirect branch speculation · fa1202ef
      Thomas Gleixner committed
      Add command line control for user space indirect branch speculation
      mitigations. The new option is: spectre_v2_user=
      
      The initial options are:
      
          - on:    Unconditionally enabled
          - off:   Unconditionally disabled
          - auto:  Kernel selects mitigation (default off for now)
      
      When the spectre_v2= command line argument is either 'on' or 'off' this
      implies that the application to application control follows that state even
      if a contradicting spectre_v2_user= argument is supplied.
      Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.082720373@linutronix.de
      fa1202ef
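The precedence rule in the commit above (a forced spectre_v2= setting overrides spectre_v2_user=) can be sketched as a small stand-alone parser. The enum names and `parse_spectre_v2_user()` are hypothetical and chosen for this illustration; treating a missing or unrecognized argument as "auto" is an assumption of the sketch, not something the commit message states.

```c
#include <string.h>

/* Hypothetical command values for this sketch. */
enum v2_cmd   { V2_CMD_AUTO, V2_CMD_ON, V2_CMD_OFF };
enum user_cmd { USER_CMD_AUTO, USER_CMD_ON, USER_CMD_OFF };

/* Resolve the application-to-application mitigation mode.
 * v2:  parsed state of the spectre_v2= argument
 * arg: value given to spectre_v2_user=, or NULL if absent */
static enum user_cmd parse_spectre_v2_user(enum v2_cmd v2, const char *arg)
{
    /* A forced spectre_v2= state wins over any spectre_v2_user= value. */
    if (v2 == V2_CMD_ON)
        return USER_CMD_ON;
    if (v2 == V2_CMD_OFF)
        return USER_CMD_OFF;

    if (arg == NULL)
        return USER_CMD_AUTO;
    if (strcmp(arg, "on") == 0)
        return USER_CMD_ON;
    if (strcmp(arg, "off") == 0)
        return USER_CMD_OFF;

    /* "auto" and, by assumption, anything unrecognized. */
    return USER_CMD_AUTO;
}
```

For instance, booting with `spectre_v2=off spectre_v2_user=on` resolves to the disabled state in this model, matching the rule that contradicting arguments follow spectre_v2=.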
    • T
      x86/speculation: Unify conditional spectre v2 print functions · 495d470e
      Thomas Gleixner committed
      There is no point in having two functions and a conditional at the call
      site.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.986890749@linutronix.de
      
      495d470e
    • T
      x86/speculataion: Mark command line parser data __initdata · 30ba72a9
      Thomas Gleixner committed
      No point in keeping that around.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.893886356@linutronix.de
      30ba72a9
    • T
      x86/speculation: Mark string arrays const correctly · 8770709f
      Thomas Gleixner committed
      checkpatch.pl muttered when reshuffling the code:
       WARNING: static const char * array should probably be static const char * const
      
      Fix up all the string arrays.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.800018931@linutronix.de
      8770709f
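The checkpatch.pl warning quoted in the commit above concerns the difference between a const-pointee array and a fully const array. A minimal sketch, using a made-up `mode_names` table and `mode_name()` helper that are not taken from the kernel source:

```c
#include <stddef.h>

/* Only the characters are const; the pointers themselves could still be
 * reassigned at run time, so the array lives in writable data. */
static const char *loose_names[] = { "off", "on", "auto" };

/* Both the characters and the pointers are const; the whole table can be
 * placed in read-only data, which is what checkpatch.pl asks for. */
static const char * const mode_names[] = { "off", "on", "auto" };

/* Bounds-checked lookup into the fully const table. */
static const char *mode_name(size_t idx)
{
    if (idx >= sizeof(mode_names) / sizeof(mode_names[0]))
        return "unknown";
    return mode_names[idx];
}

/* Reassigning an entry is allowed for the loose table; the same statement
 * on mode_names would be rejected by the compiler. */
static void retarget_loose(size_t idx, const char *s)
{
    if (idx < sizeof(loose_names) / sizeof(loose_names[0]))
        loose_names[idx] = s;
}
```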
    • T
      x86/speculation: Reorder the spec_v2 code · 15d6b7aa
      Thomas Gleixner committed
      Reorder the code so it is better grouped. No functional change.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.707122879@linutronix.de
      
      15d6b7aa
    • T
      x86/l1tf: Show actual SMT state · 130d6f94
      Thomas Gleixner committed
      Use the now exposed real SMT state, not the SMT sysfs control knob
      state. This reflects the state of the system when the mitigation status is
      queried.
      
      This does not change the warning in the VMX launch code. There the
      dependency on the control knob makes sense because siblings could be
      brought online anytime after launching the VM.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.613357354@linutronix.de
      
      130d6f94
    • T
      x86/speculation: Rework SMT state change · a74cfffb
      Thomas Gleixner committed
      arch_smt_update() is only called when the sysfs SMT control knob is
      changed. This means that when SMT is enabled in the sysfs control knob the
      system is considered to have SMT active even if all siblings are offline.
      
      To allow fine-grained control of the speculation mitigations, the actual SMT
      state is more interesting than the fact that siblings could be enabled.
      
      Rework the code, so arch_smt_update() is invoked from each individual CPU
      hotplug function, and simplify the update function while at it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de
      
      a74cfffb
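The difference between the control knob and the actual SMT state, as described in the commit above, can be modeled in a few lines. This is a toy simulation: the fixed topology (four CPUs, two threads per core), the `cpu_online` array and the `*_sketch()` functions are assumptions made here for illustration and do not reflect how the kernel tracks topology.

```c
#include <stdbool.h>

#define NR_CPUS          4   /* hypothetical topology: 2 cores ... */
#define THREADS_PER_CORE 2   /* ... with 2 sibling threads each    */

static bool cpu_online[NR_CPUS];
static bool smt_active;      /* actual SMT state, not the knob */

/* Recompute the actual SMT state: SMT is active only if some core has
 * more than one sibling thread online. */
static void arch_smt_update_sketch(void)
{
    smt_active = false;
    for (int core = 0; core < NR_CPUS; core += THREADS_PER_CORE) {
        unsigned int online = 0;

        for (int t = 0; t < THREADS_PER_CORE; t++)
            online += cpu_online[core + t] ? 1u : 0u;
        if (online > 1)
            smt_active = true;
    }
}

/* Every hotplug operation triggers the update, not only knob changes. */
static void cpu_up_sketch(int cpu)
{
    cpu_online[cpu] = true;
    arch_smt_update_sketch();
}

static void cpu_down_sketch(int cpu)
{
    cpu_online[cpu] = false;
    arch_smt_update_sketch();
}
```

With one thread online per core the model reports SMT as inactive even though siblings could be brought up, which is the distinction the rework is after.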
    • T
      x86/Kconfig: Select SCHED_SMT if SMP enabled · dbe73364
      Thomas Gleixner committed
      CONFIG_SCHED_SMT is enabled by all distros, so there is no real point in
      having it configurable. The runtime overhead in the core scheduler code is
      minimal because the actual SMT scheduling parts are conditional on a static
      key.
      
      This allows exposing the scheduler's SMT state static key to the
      speculation control code. Alternatively the scheduler's static key could be
      made always available when CONFIG_SMP is enabled, but that's just adding an
      unused static key to every other architecture for nothing.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.337452245@linutronix.de
      
      dbe73364