1. 01 12月, 2019 2 次提交
    • W
      x86/speculation: Fix redundant MDS mitigation message · ed7a3dde
      Waiman Long 提交于
      commit cd5a2aa89e847bdda7b62029d94e95488d73f6b2 upstream.
      
      Since MDS and TAA mitigations are inter-related for processors that are
      affected by both vulnerabilities, the followiing confusing messages can
      be printed in the kernel log:
      
        MDS: Vulnerable
        MDS: Mitigation: Clear CPU buffers
      
      To avoid the first incorrect message, defer the printing of MDS
      mitigation after the TAA mitigation selection has been done. However,
      that has the side effect of printing TAA mitigation first before MDS
      mitigation.
      
       [ bp: Check box is affected/mitigations are disabled first before
         printing and massage. ]
      Suggested-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Mark Gross <mgross@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191115161445.30809-3-longman@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed7a3dde
    • W
      x86/speculation: Fix incorrect MDS/TAA mitigation status · 0af5ae26
      Waiman Long 提交于
      commit 64870ed1b12e235cfca3f6c6da75b542c973ff78 upstream.
      
      For MDS vulnerable processors with TSX support, enabling either MDS or
      TAA mitigations will enable the use of VERW to flush internal processor
      buffers at the right code path. IOW, they are either both mitigated
      or both not. However, if the command line options are inconsistent,
      the vulnerabilites sysfs files may not report the mitigation status
      correctly.
      
      For example, with only the "mds=off" option:
      
        vulnerabilities/mds:Vulnerable; SMT vulnerable
        vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable
      
      The mds vulnerabilities file has wrong status in this case. Similarly,
      the taa vulnerability file will be wrong with mds mitigation on, but
      taa off.
      
      Change taa_select_mitigation() to sync up the two mitigation status
      and have them turned off if both "mds=off" and "tsx_async_abort=off"
      are present.
      
      Update documentation to emphasize the fact that both "mds=off" and
      "tsx_async_abort=off" have to be specified together for processors that
      are affected by both TAA and MDS to be effective.
      
       [ bp: Massage and add kernel-parameters.txt change too. ]
      
      Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort")
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Mark Gross <mgross@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0af5ae26
  2. 13 11月, 2019 5 次提交
  3. 07 8月, 2019 2 次提交
    • T
      x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS · b88241ae
      Thomas Gleixner 提交于
      commit f36cf386e3fec258a341d446915862eded3e13d8 upstream
      
      Intel provided the following information:
      
       On all current Atom processors, instructions that use a segment register
       value (e.g. a load or store) will not speculatively execute before the
       last writer of that segment retires. Thus they will not use a
       speculatively written segment value.
      
      That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
      entry paths can be excluded from the extra LFENCE if PTI is disabled.
      
      Create a separate bug flag for the through SWAPGS speculation and mark all
      out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
      are excluded from the whole mitigation mess anyway.
      Reported-by: NAndrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NTyler Hicks <tyhicks@canonical.com>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b88241ae
    • J
      x86/speculation: Enable Spectre v1 swapgs mitigations · 23e7a7b3
      Josh Poimboeuf 提交于
      commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream
      
      The previous commit added macro calls in the entry code which mitigate the
      Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
      enabled.  Enable those features where applicable.
      
      The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
      
      There are different features which can affect the risk of attack:
      
      - When FSGSBASE is enabled, unprivileged users are able to place any
        value in GS, using the wrgsbase instruction.  This means they can
        write a GS value which points to any value in kernel space, which can
        be useful with the following gadget in an interrupt/exception/NMI
        handler:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	// dependent load or store based on the value of %reg
      	// for example: mov %(reg1), %reg2
      
        If an interrupt is coming from user space, and the entry code
        speculatively skips the swapgs (due to user branch mistraining), it
        may speculatively execute the GS-based load and a subsequent dependent
        load or store, exposing the kernel data to an L1 side channel leak.
      
        Note that, on Intel, a similar attack exists in the above gadget when
        coming from kernel space, if the swapgs gets speculatively executed to
        switch back to the user GS.  On AMD, this variant isn't possible
        because swapgs is serializing with respect to future GS-based
        accesses.
      
        NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
      	doesn't exist quite yet.
      
      - When FSGSBASE is disabled, the issue is mitigated somewhat because
        unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
        restricts GS values to user space addresses only.  That means the
        gadget would need an additional step, since the target kernel address
        needs to be read from user space first.  Something like:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	mov (%reg1), %reg2
      	// dependent load or store based on the value of %reg2
      	// for example: mov %(reg2), %reg3
      
        It's difficult to audit for this gadget in all the handlers, so while
        there are no known instances of it, it's entirely possible that it
        exists somewhere (or could be introduced in the future).  Without
        tooling to analyze all such code paths, consider it vulnerable.
      
        Effects of SMAP on the !FSGSBASE case:
      
        - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
          susceptible to Meltdown), the kernel is prevented from speculatively
          reading user space memory, even L1 cached values.  This effectively
          disables the !FSGSBASE attack vector.
      
        - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
          still prevents the kernel from speculatively reading user space
          memory.  But it does *not* prevent the kernel from reading the
          user value from L1, if it has already been cached.  This is probably
          only a small hurdle for an attacker to overcome.
      
      Thanks to Dave Hansen for contributing the speculative_smap() function.
      
      Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
      is serializing on AMD.
      
      [ tglx: Fixed the USER fence decision and polished the comment as suggested
        	by Dave Hansen ]
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23e7a7b3
  4. 31 7月, 2019 1 次提交
  5. 03 7月, 2019 1 次提交
  6. 15 5月, 2019 14 次提交
  7. 27 4月, 2019 1 次提交
  8. 13 2月, 2019 1 次提交
    • J
      cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM · 97a7fa90
      Josh Poimboeuf 提交于
      commit b284909abad48b07d3071a9fc9b5692b3e64914b upstream.
      
      With the following commit:
      
        73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      
      ... the hotplug code attempted to detect when SMT was disabled by BIOS,
      in which case it reported SMT as permanently disabled.  However, that
      code broke a virt hotplug scenario, where the guest is booted with only
      primary CPU threads, and a sibling is brought online later.
      
      The problem is that there doesn't seem to be a way to reliably
      distinguish between the HW "SMT disabled by BIOS" case and the virt
      "sibling not yet brought online" case.  So the above-mentioned commit
      was a bit misguided, as it permanently disabled SMT for both cases,
      preventing future virt sibling hotplugs.
      
      Going back and reviewing the original problems which were attempted to
      be solved by that commit, when SMT was disabled in BIOS:
      
        1) /sys/devices/system/cpu/smt/control showed "on" instead of
           "notsupported"; and
      
        2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.
      
      I'd propose that we instead consider #1 above to not actually be a
      problem.  Because, at least in the virt case, it's possible that SMT
      wasn't disabled by BIOS and a sibling thread could be brought online
      later.  So it makes sense to just always default the smt control to "on"
      to allow for that possibility (assuming cpuid indicates that the CPU
      supports SMT).
      
      The real problem is #2, which has a simple fix: change vmx_vm_init() to
      query the actual current SMT state -- i.e., whether any siblings are
      currently online -- instead of looking at the SMT "control" sysfs value.
      
      So fix it by:
      
        a) reverting the original "fix" and its followup fix:
      
           73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
           bc2d8d26 ("cpu/hotplug: Fix SMT supported evaluation")
      
           and
      
        b) changing vmx_vm_init() to query the actual current SMT state --
           instead of the sysfs control value -- to determine whether the L1TF
           warning is needed.  This also requires the 'sched_smt_present'
           variable to exported, instead of 'cpu_smt_control'.
      
      Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
      Reported-by: NIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      97a7fa90
  9. 17 1月, 2019 1 次提交
    • W
      x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE · 4bef2bac
      WANG Chao 提交于
      commit e4f358916d528d479c3c12bd2fd03f2d5a576380 upstream.
      
      Commit
      
        4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
      
      replaced the RETPOLINE define with CONFIG_RETPOLINE checks. Remove the
      remaining pieces.
      
       [ bp: Massage commit message. ]
      
      Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
      Signed-off-by: NWANG Chao <chao.wang@ucloud.cn>
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NZhenzhong Duan <zhenzhong.duan@oracle.com>
      Reviewed-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: linux-kbuild@vger.kernel.org
      Cc: srinivas.eeda@oracle.com
      Cc: stable <stable@vger.kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20181210163725.95977-1-chao.wang@ucloud.cnSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4bef2bac
  10. 10 1月, 2019 1 次提交
  11. 06 12月, 2018 11 次提交
    • T
      x86/speculation: Provide IBPB always command line options · 9f3baace
      Thomas Gleixner 提交于
      commit 55a974021ec952ee460dc31ca08722158639de72 upstream
      
      Provide the possibility to enable IBPB always in combination with 'prctl'
      and 'seccomp'.
      
      Add the extra command line options and rework the IBPB selection to
      evaluate the command instead of the mode selected by the STIPB switch case.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.144047038@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f3baace
    • T
      x86/speculation: Add seccomp Spectre v2 user space protection mode · d1ec2354
      Thomas Gleixner 提交于
      commit 6b3e64c237c072797a9ec918654a60e3a46488e2 upstream
      
      If 'prctl' mode of user space protection from spectre v2 is selected
      on the kernel command-line, STIBP and IBPB are applied on tasks which
      restrict their indirect branch speculation via prctl.
      
      SECCOMP enables the SSBD mitigation for sandboxed tasks already, so it
      makes sense to prevent spectre v2 user space to user space attacks as
      well.
      
      The Intel mitigation guide documents how STIPB works:
          
         Setting bit 1 (STIBP) of the IA32_SPEC_CTRL MSR on a logical processor
         prevents the predicted targets of indirect branches on any logical
         processor of that core from being controlled by software that executes
         (or executed previously) on another logical processor of the same core.
      
      Ergo setting STIBP protects the task itself from being attacked from a task
      running on a different hyper-thread and protects the tasks running on
      different hyper-threads from being attacked.
      
      While the document suggests that the branch predictors are shielded between
      the logical processors, the observed performance regressions suggest that
      STIBP simply disables the branch predictor more or less completely. Of
      course the document wording is vague, but the fact that there is also no
      requirement for issuing IBPB when STIBP is used points clearly in that
      direction. The kernel still issues IBPB even when STIBP is used until Intel
      clarifies the whole mechanism.
      
      IBPB is issued when the task switches out, so malicious sandbox code cannot
      mistrain the branch predictor for the next user space task on the same
      logical processor.
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185006.051663132@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1ec2354
    • T
      x86/speculation: Enable prctl mode for spectre_v2_user · 7b62ef14
      Thomas Gleixner 提交于
      commit 7cc765a67d8e04ef7d772425ca5a2a1e2b894c15 upstream
      
      Now that all prerequisites are in place:
      
       - Add the prctl command line option
      
       - Default the 'auto' mode to 'prctl'
      
       - When SMT state changes, update the static key which controls the
         conditional STIBP evaluation on context switch.
      
       - At init update the static key which controls the conditional IBPB
         evaluation on context switch.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.958421388@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b62ef14
    • T
      x86/speculation: Add prctl() control for indirect branch speculation · 238ba6e7
      Thomas Gleixner 提交于
      commit 9137bb27e60e554dab694eafa4cca241fa3a694f upstream
      
      Add the PR_SPEC_INDIRECT_BRANCH option for the PR_GET_SPECULATION_CTRL and
      PR_SET_SPECULATION_CTRL prctls to allow fine grained per task control of
      indirect branch speculation via STIBP and IBPB.
      
      Invocations:
       Check indirect branch speculation status with
       - prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0, 0, 0);
      
       Enable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
      
       Disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
      
       Force disable indirect branch speculation with
       - prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
      
      See Documentation/userspace-api/spec_ctrl.rst.
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.866780996@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      238ba6e7
    • T
      x86/speculation: Prepare arch_smt_update() for PRCTL mode · f67fafb8
      Thomas Gleixner 提交于
      commit 6893a959d7fdebbab5f5aa112c277d5a44435ba1 upstream
      
      The upcoming fine grained per task STIBP control needs to be updated on CPU
      hotplug as well.
      
      Split out the code which controls the strict mode so the prctl control code
      can be added later. Mark the SMP function call argument __unused while at it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.759457117@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f67fafb8
    • T
      x86/speculation: Prevent stale SPEC_CTRL msr content · e8412401
      Thomas Gleixner 提交于
      commit 6d991ba509ebcfcc908e009d1db51972a4f7a064 upstream
      
      The seccomp speculation control operates on all tasks of a process, but
      only the current task of a process can update the MSR immediately. For the
      other threads the update is deferred to the next context switch.
      
      This creates the following situation with Process A and B:
      
      Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2
      does not have the speculation control TIF bit set. Process B task 1 has the
      speculation control TIF bit set.
      
      CPU0					CPU1
      					MSR bit is set
      					ProcB.T1 schedules out
      					ProcA.T2 schedules in
      					MSR bit is cleared
      ProcA.T1
        seccomp_update()
        set TIF bit on ProcA.T2
      					ProcB.T1 schedules in
      					MSR is not updated  <-- FAIL
      
      This happens because the context switch code tries to avoid the MSR update
      if the speculation control TIF bits of the incoming and the outgoing task
      are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks
      scheduling back and forth on CPU1, which keeps the MSR stale forever.
      
      In theory this could be remedied by IPIs, but chasing the remote task which
      could be migrated is complex and full of races.
      
      The straight forward solution is to avoid the asychronous update of the TIF
      bit and defer it to the next context switch. The speculation control state
      is stored in task_struct::atomic_flags by the prctl and seccomp updates
      already.
      
      Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the
      atomic_flags. Check the bit on context switch and force a synchronous
      update of the speculation control if set. Use the same mechanism for
      updating the current task.
      Reported-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8412401
    • T
      x86/speculation: Split out TIF update · 59028be1
      Thomas Gleixner 提交于
      commit e6da8bb6f9abb2628381904b24163c770e630bac upstream
      
      The update of the TIF_SSBD flag and the conditional speculation control MSR
      update is done in the ssb_prctl_set() function directly. The upcoming prctl
      support for controlling indirect branch speculation via STIBP needs the
      same mechanism.
      
      Split the code out and make it reusable. Reword the comment about updates
      for other tasks.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.652305076@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      59028be1
    • T
      x86/speculation: Prepare for conditional IBPB in switch_mm() · a1788815
      Thomas Gleixner 提交于
      commit 4c71a2b6fd7e42814aa68a6dec88abf3b42ea573 upstream
      
      The IBPB speculation barrier is issued from switch_mm() when the kernel
      switches to a user space task with a different mm than the user space task
      which ran last on the same CPU.
      
      An additional optimization is to avoid IBPB when the incoming task can be
      ptraced by the outgoing task. This optimization only works when switching
      directly between two user space tasks. When switching from a kernel task to
      a user space task the optimization fails because the previous task cannot
      be accessed anymore. So for quite some scenarios the optimization is just
      adding overhead.
      
      The upcoming conditional IBPB support will issue IBPB only for user space
      tasks which have the TIF_SPEC_IB bit set. This requires to handle the
      following cases:
      
        1) Switch from a user space task (potential attacker) which has
           TIF_SPEC_IB set to a user space task (potential victim) which has
           TIF_SPEC_IB not set.
      
        2) Switch from a user space task (potential attacker) which has
           TIF_SPEC_IB not set to a user space task (potential victim) which has
           TIF_SPEC_IB set.
      
      This needs to be optimized for the case where the IBPB can be avoided when
      only kernel threads ran in between user space tasks which belong to the
      same process.
      
      The current check whether two tasks belong to the same context is using the
      tasks context id. While correct, it's simpler to use the mm pointer because
      it allows to mangle the TIF_SPEC_IB bit into it. The context id based
      mechanism requires extra storage, which creates worse code.
      
      When a task is scheduled out its TIF_SPEC_IB bit is mangled as bit 0 into
      the per CPU storage which is used to track the last user space mm which was
      running on a CPU. This bit can be used together with the TIF_SPEC_IB bit of
      the incoming task to make the decision whether IBPB needs to be issued or
      not to cover the two cases above.
      
      As conditional IBPB is going to be the default, remove the dubious ptrace
      check for the IBPB always case and simply issue IBPB always when the
      process changes.
      
      Move the storage to a different place in the struct as the original one
      created a hole.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.466447057@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1788815
    • T
      x86/speculation: Prepare for per task indirect branch speculation control · 69985a2c
      Tim Chen 提交于
      commit 5bfbe3ad5840d941b89bcac54b821ba14f50a0ba upstream
      
      To avoid the overhead of STIBP always on, it's necessary to allow per task
      control of STIBP.
      
      Add a new task flag TIF_SPEC_IB and evaluate it during context switch if
      SMT is active and flag evaluation is enabled by the speculation control
      code. Add the conditional evaluation to x86_virt_spec_ctrl() as well so the
      guest/host switch works properly.
      
      This has no effect because TIF_SPEC_IB cannot be set yet and the static key
      which controls evaluation is off. Preparatory patch for adding the control
      code.
      
      [ tglx: Simplify the context switch logic and make the TIF evaluation
        	depend on SMP=y and on the static key controlling the conditional
        	update. Rename it to TIF_SPEC_IB because it controls both STIBP and
        	IBPB ]
      Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.176917199@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69985a2c
    • T
      x86/speculation: Add command line control for indirect branch speculation · 71187543
      Thomas Gleixner 提交于
      commit fa1202ef224391b6f5b26cdd44cc50495e8fab54 upstream
      
      Add command line control for user space indirect branch speculation
      mitigations. The new option is: spectre_v2_user=
      
      The initial options are:
      
          -  on:   Unconditionally enabled
          - off:   Unconditionally disabled
          -auto:   Kernel selects mitigation (default off for now)
      
      When the spectre_v2= command line argument is either 'on' or 'off' this
      implies that the application to application control follows that state even
      if a contradicting spectre_v2_user= argument is supplied.
      Originally-by: NTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185005.082720373@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      71187543
    • T
      x86/speculation: Unify conditional spectre v2 print functions · 8a34c706
      Thomas Gleixner 提交于
      commit 495d470e9828500e0155027f230449ac5e29c025 upstream
      
      There is no point in having two functions and a conditional at the call
      site.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NIngo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Casey Schaufler <casey.schaufler@intel.com>
      Cc: Asit Mallick <asit.k.mallick@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Waiman Long <longman9394@gmail.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Dave Stewart <david.c.stewart@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20181125185004.986890749@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a34c706