1. 19 Jan 2018 (3 commits)
  2. 18 Jan 2018 (1 commit)
    • x86/mm: Rework wbinvd, hlt operation in stop_this_cpu() · f23d74f6
      Committed by Tom Lendacky
      Some issues have been reported with the for loop in stop_this_cpu() that
      issues the 'wbinvd; hlt' sequence.  Reverting this sequence to halt()
      has been shown to resolve the issue.
      
      However, the wbinvd is needed when running with SME.  The reason for the
      wbinvd is to prevent cache flush races between encrypted and non-encrypted
      entries that have the same physical address.  This can occur when
      kexec'ing from memory encryption active to inactive or vice-versa.  The
      important thing is to avoid memory references outside of kernel text
      (such as stack usage), so the native_*() functions are needed since
      they expand to inline asm sequences.  So instead of reverting the
      change, rework the sequence.
      
      Move the wbinvd instruction outside of the for loop as native_wbinvd()
      and make its execution conditional on X86_FEATURE_SME.  In the for loop,
      change the asm 'wbinvd; hlt' sequence back to a halt sequence but use
      the native_halt() call.
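
      A minimal sketch of the reworked sequence, close to the patch but
      abridged (the real stop_this_cpu() also disables interrupts and the
      local APIC, and clears the CPU's online bit first):

        static void stop_this_cpu_sketch(void)
        {
                /*
                 * Flush caches once, outside the loop, and only when SME
                 * is supported: this avoids the race between encrypted
                 * and non-encrypted cache entries across a kexec.
                 */
                if (boot_cpu_has(X86_FEATURE_SME))
                        native_wbinvd();

                for (;;)
                        native_halt();  /* inline asm, no memory references */
        }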
      
      Fixes: bba4ed01 ("x86/mm, kexec: Allow kexec to be used with SME")
      Reported-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Yu Chen <yu.c.chen@intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: kexec@lists.infradead.org
      Cc: ebiederm@redhat.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180117234141.21184.44067.stgit@tlendack-t1.amdoffice.net
  3. 17 Jan 2018 (3 commits)
  4. 16 Jan 2018 (1 commit)
  5. 15 Jan 2018 (2 commits)
    • x86/retpoline: Fill RSB on context switch for affected CPUs · c995efd5
      Committed by David Woodhouse
      On context switch from a shallow call stack to a deeper one, as the CPU
      does 'ret' up the deeper side it may encounter RSB entries (predictions for
      where the 'ret' goes to) which were populated in userspace.
      
      This is problematic if neither SMEP nor KPTI (the latter of which marks
      userspace pages as NX for the kernel) are active, as malicious code in
      userspace may then be executed speculatively.
      
      Overwrite the CPU's return prediction stack with calls which are predicted
      to return to an infinite loop, to "capture" speculation if this
      happens. This is required both for retpoline, and also in conjunction with
      IBRS for !SMEP && !KPTI.
      
      On Skylake+ the problem is slightly different, and an *underflow* of the
      RSB may cause errant branch predictions to occur. So there it's not so much
      overwrite, as *filling* the RSB to attempt to prevent it getting
      empty. This is only a partial solution for Skylake+ since there are many
      other conditions which may result in the RSB becoming empty. The full
      solution on Skylake+ is to use IBRS, which will prevent the problem even
      when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
      required on context switch.
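
      A hedged sketch of the stuffing idea on x86-64: each 'call' deposits
      a return prediction whose target is a tight capture loop, so any
      speculative 'ret' consuming the entry spins harmlessly. The series'
      actual macro, __FILL_RETURN_BUFFER, unrolls this 32 times in
      assembler; this is illustration only:

        static inline void rsb_fill_sketch(void)
        {
                asm volatile(
                        "       call 1f\n"       /* push one RSB entry  */
                        "2:     pause\n"         /* speculation trap    */
                        "       lfence\n"
                        "       jmp 2b\n"
                        "1:     add $8, %%rsp\n" /* undo the call push  */
                        ::: "memory");
        }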
      
      [ tglx: Added missing vendor check and slightly massaged comments and
        	changelog ]
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
    • x86/idt: Mark IDT tables __initconst · 327867fa
      Committed by Andi Kleen
      const variables must use __initconst, not __initdata.
      
      Fix this up for the IDT tables, which got it consistently wrong.
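
      Illustrative before/after, using an abridged idt_data table:

        /* Before -- wrong: a const object in __initdata causes a section
         * attribute conflict (.init.data is writable, the object is not): */
        static const struct idt_data early_idts[] __initdata  = { /* ... */ };

        /* After -- right: read-only init-time data goes to .init.rodata: */
        static const struct idt_data early_idts[] __initconst = { /* ... */ };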
      
      Fixes: 16bc18d8 ("x86/idt: Move 32-bit idt_descr to C code")
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20171222001821.2157-7-andi@firstfloor.org
  6. 14 Jan 2018 (4 commits)
  7. 12 Jan 2018 (5 commits)
    • x86/retpoline/irq32: Convert assembler indirect jumps · 7614e913
      Committed by Andi Kleen
      Convert all indirect jumps in 32-bit irq inline asm code to use
      non-speculative sequences.
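
      The conversion pattern, sketched with the CALL_NOSPEC/THUNK_TARGET
      macros this series adds in asm/nospec-branch.h ('fn' is a
      hypothetical function pointer, not the exact hunk):

        /* Before: a plain indirect call, open to branch target injection. */
        asm volatile("call *%0" : : "r" (fn));

        /* After: expands to a retpoline thunk when CONFIG_RETPOLINE=y,
         * and to a normal indirect call otherwise. */
        asm volatile(CALL_NOSPEC : : THUNK_TARGET(fn));
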
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.uk
    • x86/retpoline/ftrace: Convert ftrace assembler indirect jumps · 9351803b
      Committed by David Woodhouse
      Convert all indirect jumps in ftrace assembler code to use non-speculative
      sequences when CONFIG_RETPOLINE is enabled.
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.uk
    • x86/spectre: Add boot time option to select Spectre v2 mitigation · da285121
      Committed by David Woodhouse
      Add a spectre_v2= option to select the mitigation used for the indirect
      branch speculation vulnerability.
      
      Currently, the only option available is retpoline, in its various forms.
      This will be expanded to cover the new IBRS/IBPB microcode features.
      
      The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
      control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
      serializing instruction, which is indicated by the LFENCE_RDTSC feature.
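
      The values the patch documents in kernel-parameters.txt, paraphrased:

        spectre_v2=on                  unconditionally enable
        spectre_v2=off                 unconditionally disable
        spectre_v2=auto                kernel detects vulnerability (default)
        spectre_v2=retpoline           replace indirect branches
        spectre_v2=retpoline,generic   google's original retpoline
        spectre_v2=retpoline,amd       AMD-specific minimal thunk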
      
      [ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
        	integration becomes simple ]
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk
    • x86/retpoline: Add initial retpoline support · 76b04384
      Committed by David Woodhouse
      Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
      the corresponding thunks. Provide assembler macros for invoking the thunks
      in the same way that GCC does, from native and inline assembler.
      
      This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
      some circumstances, IBRS microcode features may be used instead, and the
      retpoline can be disabled.
      
      On AMD CPUs if lfence is serialising, the retpoline can be dramatically
      simplified to a simple "lfence; jmp *\reg". A future patch, after it has
      been verified that lfence really is serialising in all circumstances, can
      enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
      to X86_FEATURE_RETPOLINE.
      
      Do not align the retpoline in the altinstr section, because there is no
      guarantee that it stays aligned when it's copied over the oldinstr during
      alternative patching.
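
      The shape of one thunk, rendered here as a hedged C sketch; the real
      per-register thunks (__x86_indirect_thunk_\reg) are pure assembler:

        /* Jump to 'target' with no indirect branch the BTB can abuse.
         * The asm never falls through: 'ret' transfers to target. */
        static inline void retpoline_jmp_sketch(void *target)
        {
                asm volatile(
                        "       call 1f\n"         /* return addr = trap loop */
                        "2:     pause\n"
                        "       lfence\n"
                        "       jmp 2b\n"
                        "1:     mov %0, (%%rsp)\n" /* overwrite return addr   */
                        "       ret\n"             /* 'return' into target    */
                        : : "r" (target) : "memory");
        }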
      
      [ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
      [ tglx: Put actual function CALL/JMP in front of the macros, convert to
        	symbolic labels ]
      [ dwmw2: Convert back to numeric labels, merge objtool fixes ]
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
    • x86/pti: Make unpoison of pgd for trusted boot work for real · 445b69e3
      Committed by Dave Hansen
      The initial fix for trusted boot and PTI potentially misses the pgd clearing
      if pud_alloc() sets a PGD.  It probably works in *practice* because for two
      adjacent calls to map_tboot_page() that share a PGD entry, the first will
      clear NX, *then* allocate and set the PGD (without NX clear).  The second
      call will *not* allocate but will clear the NX bit.
      
      Defer the NX clearing to a point after it is known that all top-level
      allocations have occurred.  Add a comment to clarify why.
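
      A sketch of the resulting ordering in map_tboot_page(), abridged
      (the p4d/pud/pmd/pte allocations and set_pte_at() are elided):

        pgd = pgd_offset(&tboot_mm, vaddr);
        /* ... p4d/pud/pmd/pte allocation and set_pte_at() ... */

        /*
         * Clear the PTI NX poison only now: pud_alloc() above may have
         * installed a fresh, still-poisoned PGD entry, so clearing NX
         * any earlier could be undone by the allocation.
         */
        pgd->pgd &= ~_PAGE_NX;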
      
      [ tglx: Massaged changelog ]
      
      Fixes: 262b6b30 ("x86/tboot: Unbreak tboot with PTI enabled")
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: "Tim Chen" <tim.c.chen@linux.intel.com>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: peterz@infradead.org
      Cc: ning.sun@intel.com
      Cc: tboot-devel@lists.sourceforge.net
      Cc: andi@firstfloor.org
      Cc: luto@kernel.org
      Cc: law@redhat.com
      Cc: pbonzini@redhat.com
      Cc: torvalds@linux-foundation.org
      Cc: gregkh@linux-foundation.org
      Cc: dwmw@amazon.co.uk
      Cc: nickc@redhat.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com
  8. 11 Jan 2018 (1 commit)
  9. 09 Jan 2018 (3 commits)
  10. 08 Jan 2018 (1 commit)
  11. 07 Jan 2018 (1 commit)
  12. 06 Jan 2018 (1 commit)
  13. 05 Jan 2018 (1 commit)
  14. 04 Jan 2018 (1 commit)
  15. 03 Jan 2018 (4 commits)
  16. 31 Dec 2017 (3 commits)
  17. 30 Dec 2017 (3 commits)
    • genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI · bc976233
      Committed by Thomas Gleixner
      The new reservation mode for interrupts assigns a dummy vector when the
      interrupt is allocated and assigns a real vector when the interrupt is
      requested. The reservation mode prevents vector pressure when devices with
      a large amount of queues/interrupts are initialized, but only a minimal
      subset of those queues/interrupts is actually used.
      
      This mode has an issue with MSI interrupts which cannot be masked. If the
      driver is not careful or the hardware emits an interrupt before the device
      irq is requested by the driver, then the interrupt ends up on the dummy
      vector as a spurious interrupt which can cause malfunction of the device or
      in the worst case a lockup of the machine.
      
      Change the logic for the reservation mode so that the early activation of
      MSI interrupts checks whether:
      
       - the device is a PCI/MSI device
       - the reservation mode of the underlying irqdomain is activated
       - PCI/MSI masking is globally enabled
       - the PCI/MSI device uses either MSI-X, which supports masking, or
         MSI with the maskbit supported.
      
      If one of those conditions is false, then clear the reservation mode flag
      in the irq data of the interrupt and invoke irq_domain_activate_irq() with
      the reserve argument cleared. In the x86 vector code, clear the can_reserve
      flag in the vector allocation data so a subsequent free_irq() won't create
      the same situation again. The interrupt stays assigned to a real vector
      until pci_disable_msi() is invoked and all allocations are undone.
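
      Paraphrased shape of the check the patch adds (close to its
      msi_check_reservation_mode(), abridged):

        static bool msi_check_reservation_mode(struct irq_domain *domain,
                                               struct msi_domain_info *info,
                                               struct device *dev)
        {
                struct msi_desc *desc;

                if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
                        return false;
                if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
                        return false;
                if (IS_ENABLED(CONFIG_PCI_MSI) && pci_msi_ignore_mask)
                        return false;

                /* MSI-X always supports masking; MSI only with maskbit. */
                desc = first_msi_entry(dev);
                return desc->msi_attrib.is_msix || desc->msi_attrib.maskbit;
        }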
      
      Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
      Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
    • genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Committed by Thomas Gleixner
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
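
      The signature change, paraphrased:

        /* Before: 'early' said nothing about what the flag controls. */
        int irq_domain_activate_irq(struct irq_data *irq_data, bool early);

        /* After: same behaviour, the name now matches the semantics. */
        int irq_domain_activate_irq(struct irq_data *irq_data, bool reserve);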
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
    • x86/vector: Use IRQD_CAN_RESERVE flag · 945f50a5
      Committed by Thomas Gleixner
      Set the new CAN_RESERVE flag when the initial reservation for an interrupt
      happens. The flag is used in a subsequent patch to disable reservation mode
      for a certain class of MSI devices.
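
      A hedged sketch of the flag handling, using the real irqd helpers
      (the 'reserve' condition name is illustrative, not the exact code):

        /* At initial reservation time, record that this irq may sit on
         * the reservation vector until it is actually requested. */
        if (reserve)
                irqd_set_can_reserve(irqd);
        else
                irqd_clr_can_reserve(irqd);

      Later code can then test irqd_can_reserve(irqd) to decide whether
      reservation mode applies to this interrupt.
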
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      
  18. 29 Dec 2017 (1 commit)
    • x86/apic: Switch all APICs to Fixed delivery mode · a31e58e1
      Committed by Thomas Gleixner
      Some of the APIC incarnations are operating in lowest priority delivery
      mode. This worked as long as the vector management code allocated the same
      vector on all possible CPUs for each interrupt.
      
      Lowest priority delivery mode does not necessarily respect the affinity
      setting and may redirect to some other online CPU. This was documented
      somewhere in the old code, and the conversion to single target delivery
      failed to update the delivery mode of the affected APIC drivers, which
      results in spurious interrupts on some of the affected CPU/Chipset
      combinations.
      
      Switch the APIC drivers over to Fixed delivery mode and remove all
      leftovers of lowest priority delivery mode.
      
      Switching to Fixed delivery mode is not a problem on these CPUs because the
      kernel already uses Fixed delivery mode for IPIs, because the SDM
      explicitly forbids lowest priority mode for IPIs. The reason is
      obvious: If the irq routing does not honor destination targets in lowest
      prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be
      a fatal problem in many cases.
      
      As a consequence of this change, the apic::irq_delivery_mode field is now
      pointless, but this needs to be cleaned up in a separate patch.
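
      The per-driver change, repeated across the affected apic structs
      (abridged example):

        static struct apic apic_flat = {
                .name                   = "flat",
                /* was: .irq_delivery_mode = dest_LowestPrio, */
                .irq_delivery_mode      = dest_Fixed,
                /* ... */
        };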
      
      Fixes: fdba46ff ("x86/apic: Get rid of multi CPU affinity")
      Reported-by: vcaputo@pengaru.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: vcaputo@pengaru.com
      Cc: Pavel Machek <pavel@ucw.cz>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos
  19. 28 Dec 2017 (1 commit)