  1. 24 January 2018 · 1 commit
  2. 21 January 2018 · 3 commits
    • x86: Use __nostackprotector for sme_encrypt_kernel · 91cfc88c
      Committed by Laura Abbott
      Commit bacf6b49 ("x86/mm: Use a struct to reduce parameters for SME
      PGD mapping") moved some parameters into a structure.
      
      The structure was large enough to trigger the stack-protector canary in
      sme_encrypt_kernel(); stack protection doesn't work this early in boot,
      causing reboots.
      
      Mark sme_encrypt_kernel() appropriately so that it does not use the
      canary.
      
      Fixes: bacf6b49 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping")
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      91cfc88c
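      A minimal userspace sketch of the mechanism, assuming the attribute-based
      definition the kernel used at the time (the __nostackprotector macro
      wrapping GCC's optimize attribute); the demo function and its contents
      are illustrative:
      
        /* build: gcc -fstack-protector-strong demo.c -o demo */
        #include <stdio.h>
        #include <string.h>
        
        /* Assumed definition, modeled on the kernel macro of that era. */
        #define __nostackprotector __attribute__((__optimize__("no-stack-protector")))
        
        static void __nostackprotector early_func(void)
        {
                /* A frame large enough that -fstack-protector-strong would
                 * normally insert a canary check; with the attribute, no
                 * canary is emitted, so the function is safe to run before
                 * the canary value has been initialized. */
                char buf[128];
                memset(buf, 0, sizeof(buf));
                printf("no canary in this frame: %d\n", buf[0]);
        }
        
        int main(void)
        {
                early_func();
                return 0;
        }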
    • alpha/PCI: Fix noname IRQ level detection · 86be8993
      Committed by Lorenzo Pieralisi
      The conversion of the alpha architecture PCI host bridge legacy IRQ
      mapping/swizzling to the new PCI host bridge map/swizzle hooks, carried
      out through:
      
      commit 0e4c2eeb ("alpha/PCI: Replace pci_fixup_irqs() call with
      host bridge IRQ mapping hooks")
      
      means that device IRQs are now allocated through pci_assign_irq() in
      pci_device_probe(), i.e. only when a driver matching the device is
      found and the device is probed through that driver.
      
      Alpha noname platforms require IRQ level programming to be carried out
      in sio_fixup_irq_levels(), which is called from noname_init_pci(), a
      platform hook invoked from a subsys_initcall.
      
      In noname_init_pci(), present IRQs are detected through
      sio_collect_irq_levels(), which checks the struct pci_dev->irq number
      to determine whether an IRQ has been allocated for the device.
      
      By the time sio_collect_irq_levels() is called, some devices may still
      not have a matching driver loaded (e.g. loadable modules), so their IRQ
      allocation is still pending - which means that sio_collect_irq_levels()
      does not program the correct IRQ level for those devices, breaking
      their IRQ handling once the device driver is actually loaded and the
      device is probed.
      
      Fix the issue by adding code to the noname map_irq() function
      (noname_map_irq()) so that, while mapping/swizzling the IRQ line, it
      also ensures that the correct IRQ level programming is executed at
      platform level (see the sketch below).
      
      Fixes: 0e4c2eeb ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks")
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: stable@vger.kernel.org # 4.14
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Matt Turner <mattst88@gmail.com>
      86be8993
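      A hedged sketch of the shape of the fix, using the helpers named in the
      commit message; noname_irq_lookup() and the way the level mask is built
      are illustrative assumptions, not the literal patch:
      
        static int __init
        noname_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
        {
                /* hypothetical stand-in for the existing lookup/swizzle */
                int irq = noname_irq_lookup(slot, pin);
        
                /* Do not rely only on the one-shot scan in noname_init_pci(),
                 * which runs before module drivers are loaded: program the
                 * level for this IRQ now that it is known. */
                if (irq >= 0)
                        sio_fixup_irq_levels(sio_collect_irq_levels() |
                                             (1U << irq));
        
                return irq;
        }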
    • KVM: s390: wire up bpb feature · 35b3fde6
      Committed by Christian Borntraeger
      The new firmware interfaces for branch prediction behaviour changes
      are transparently available for the guest. Nevertheless, there is
      new state attached that should be migrated and properly reset.
      Provide a mechanism for handling reset, migration and VSIE.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      [Changed capability number to 152. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      35b3fde6
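      A small userspace sketch of how a VMM might probe the new capability;
      KVM_CAP_S390_BPB and the number 152 come from the commit message, the
      rest is standard KVM ioctl usage:
      
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/ioctl.h>
        #include <linux/kvm.h>
        
        #ifndef KVM_CAP_S390_BPB
        #define KVM_CAP_S390_BPB 152    /* per the maintainer note above */
        #endif
        
        int main(void)
        {
                int kvm = open("/dev/kvm", O_RDWR);
                if (kvm < 0) {
                        perror("open /dev/kvm");
                        return 1;
                }
                /* A positive result means the kernel migrates/resets the bpb
                 * state and the feature can be exposed to the guest. */
                int bpb = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_S390_BPB);
                printf("KVM_CAP_S390_BPB: %s\n", bpb > 0 ? "yes" : "no");
                return 0;
        }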
  3. 20 January 2018 · 1 commit
  4. 19 January 2018 · 8 commits
  5. 18 January 2018 · 9 commits
    • x86/mm: Rework wbinvd, hlt operation in stop_this_cpu() · f23d74f6
      Committed by Tom Lendacky
      Some issues have been reported with the for loop in stop_this_cpu() that
      issues the 'wbinvd; hlt' sequence.  Reverting this sequence to halt()
      has been shown to resolve the issue.
      
      However, the wbinvd is needed when running with SME.  The reason for the
      wbinvd is to prevent cache flush races between encrypted and non-encrypted
      entries that have the same physical address.  This can occur when
      kexec'ing from memory encryption active to inactive or vice-versa.  The
      important thing is to avoid memory references outside of kernel text
      (such as stack usage), so the native_*() functions are needed, since
      these expand to inline asm sequences.  So instead of reverting the
      change, rework the sequence (see the sketch below).
      
      Move the wbinvd instruction outside of the for loop as native_wbinvd()
      and make its execution conditional on X86_FEATURE_SME.  In the for loop,
      change the asm 'wbinvd; hlt' sequence back to a halt sequence but use
      the native_halt() call.
      
      Fixes: bba4ed01 ("x86/mm, kexec: Allow kexec to be used with SME")
      Reported-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Dave Young <dyoung@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Yu Chen <yu.c.chen@intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: kexec@lists.infradead.org
      Cc: ebiederm@redhat.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180117234141.21184.44067.stgit@tlendack-t1.amdoffice.net
      f23d74f6
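      A sketch of the reworked sequence, following the shape given in the
      changelog (the rest of stop_this_cpu() is omitted):
      
        /* Flush caches once, outside the loop, and only when SME is
         * supported: this prevents encrypted and unencrypted cache entries
         * for the same physical address from racing across a kexec. */
        if (boot_cpu_has(X86_FEATURE_SME))
                native_wbinvd();
        
        for (;;) {
                /* native_halt() expands to inline asm, so no memory
                 * references outside kernel text (e.g. stack accesses)
                 * occur after the wbinvd above. */
                native_halt();
        }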
    • ARM: net: bpf: clarify tail_call index · 091f0248
      Committed by Russell King
      As per 90caccdd ("bpf: fix bpf_tail_call() x64 JIT"), the index used
      for array lookup is defined to be 32-bit wide. Update a misleading
      comment that suggests it is 64-bit wide.
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      091f0248
    • ARM: net: bpf: fix LDX instructions · ec19e02b
      Committed by Russell King
      When the source and destination register are identical, our JIT does not
      generate correct code, which leads to kernel oopses.
      
      Fix this by (a) generating more efficient code, and (b) using the
      temporary register earlier when the load would overwrite the address
      register (see the sketch below).
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      ec19e02b
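      An illustrative sketch of point (b); the emitter names are modeled on
      the ARM JIT but are assumptions, and the case shown is the 64-bit load,
      which takes two 32-bit loads on ARM:
      
        /* BPF_LDX | BPF_DW: two 32-bit loads.  If the first load writes the
         * register holding the address, the second load would use a
         * clobbered base, so move the address into the temporary first. */
        if (rd_lo == rn || rd_hi == rn) {
                emit(ARM_MOV_R(tmp, rn), ctx);      /* preserve the address */
                rn = tmp;
        }
        emit(ARM_LDR_I(rd_lo, rn, off), ctx);       /* low 32 bits */
        emit(ARM_LDR_I(rd_hi, rn, off + 4), ctx);   /* high 32 bits */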
    • ARM: net: bpf: fix register saving · 02088d9b
      Committed by Russell King
      When an eBPF program tail-calls another eBPF program, it enters it after
      the prologue to avoid complex stack manipulations.  This can lead to
      kernel oopses and similar failures.
      
      Resolve this by always using a fixed stack layout and a CPU frame-pointer
      register, and by using the frame pointer when reloading registers before
      returning (sketched below).
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      02088d9b
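      A hedged sketch of the prologue idea: anchor a CPU frame pointer over a
      fixed layout so the epilogue can restore registers from known offsets
      even when a tail call skipped the prologue (names are modeled on the
      ARM JIT and are not guaranteed to match the patch):
      
        /* Fixed layout: callee-saved registers are pushed first, the frame
         * pointer is anchored, then a constant-size scratch area is
         * reserved.  A tail-called program inherits this exact layout. */
        emit(ARM_PUSH(CALLEE_PUSH_MASK), ctx);           /* r4-r9, fp, lr */
        emit(ARM_MOV_R(ARM_FP, ARM_SP), ctx);            /* frame pointer */
        emit(ARM_SUB_I(ARM_SP, ARM_SP, ctx->stack_size), ctx);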
    • ARM: net: bpf: correct stack layout documentation · 0005e55a
      Committed by Russell King
      The stack layout documentation incorrectly suggests that the BPF JIT
      scratch space starts immediately below BPF_FP. This is not correct,
      so let's fix the documentation to reflect reality.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      0005e55a
    • ARM: net: bpf: move stack documentation · 70ec3a6c
      Committed by Russell King
      Move the stack documentation towards the top of the file, where it's
      relevant for things like the register layout.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      70ec3a6c
    • ARM: net: bpf: fix stack alignment · d1220efd
      Committed by Russell King
      As per 2dede2d8 ("ARM EABI: stack pointer must be 64-bit aligned
      after a CPU exception") the stack should be aligned to a 64-bit boundary
      on EABI systems.  Ensure that the eBPF JIT appropriately aligns the
      stack.
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      d1220efd
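      The likely shape of the fix, assuming the JIT computes its stack
      reservation up front; the constant name and the ALIGN() use follow
      common kernel practice rather than quoting the patch:
      
        /* EABI: SP must be 64-bit aligned at public interfaces, so round
         * the JIT's stack reservation up to an 8-byte multiple. */
        #define STACK_ALIGNMENT 8
        
        ctx->stack_size = ALIGN(scratch_size + reg_save_size, STACK_ALIGNMENT);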
    • ARM: net: bpf: fix tail call jumps · f4483f2c
      Committed by Russell King
      When a tail call fails, it is documented that the tail call should
      continue execution at the following instruction.  An example tail call
      sequence is:
      
        12: (85) call bpf_tail_call#12
        13: (b7) r0 = 0
        14: (95) exit
      
      The ARM code generated for the tail call in this case ends up branching
      to instruction 14 instead of instruction 13, resulting in the BPF filter
      returning a non-zero value:
      
        178:	ldr	r8, [sp, #588]	; insn 12
        17c:	ldr	r6, [r8, r6]
        180:	ldr	r8, [sp, #580]
        184:	cmp	r8, r6
        188:	bcs	0x1e8
        18c:	ldr	r6, [sp, #524]
        190:	ldr	r7, [sp, #528]
        194:	cmp	r7, #0
        198:	cmpeq	r6, #32
        19c:	bhi	0x1e8
        1a0:	adds	r6, r6, #1
        1a4:	adc	r7, r7, #0
        1a8:	str	r6, [sp, #524]
        1ac:	str	r7, [sp, #528]
        1b0:	mov	r6, #104
        1b4:	ldr	r8, [sp, #588]
        1b8:	add	r6, r8, r6
        1bc:	ldr	r8, [sp, #580]
        1c0:	lsl	r7, r8, #2
        1c4:	ldr	r6, [r6, r7]
        1c8:	cmp	r6, #0
        1cc:	beq	0x1e8
        1d0:	mov	r8, #32
        1d4:	ldr	r6, [r6, r8]
        1d8:	add	r6, r6, #44
        1dc:	bx	r6
        1e0:	mov	r0, #0		; insn 13
        1e4:	mov	r1, #0
        1e8:	add	sp, sp, #596	; insn 14
        1ec:	pop	{r4, r5, r6, r7, r8, sl, pc}
      
      For other sequences, the tail call could end up branching midway through
      the following BPF instructions, or maybe off the end of the function,
      leading to unknown behaviours.
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      f4483f2c
    • ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUs · e9062481
      Committed by Russell King
      Avoid the 'bx' instruction on CPUs that have no support for Thumb and
      thus do not implement this instruction by moving the generation of this
      opcode to a separate function that selects between:
      
      	bx	reg
      
      and
      
      	mov	pc, reg
      
      according to the capabilities of the CPU.
      
      Fixes: 39c13c20 ("arm: eBPF JIT compiler")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      e9062481
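      A sketch of the helper described above, selecting the opcode from the
      CPU's capabilities; the HWCAP_THUMB test is an assumption about how
      "support for Thumb" is detected:
      
        /* Emit an indirect jump to tgt_reg: 'bx' where Thumb exists,
         * otherwise the traditional 'mov pc, reg'. */
        static inline void emit_bx_r(u8 tgt_reg, struct jit_ctx *ctx)
        {
                if (elf_hwcap & HWCAP_THUMB)
                        emit(ARM_BX(tgt_reg), ctx);
                else
                        emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
        }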
  6. 17 January 2018 · 11 commits
  7. 16 January 2018 · 5 commits
  8. 15 January 2018 · 2 commits
    • x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros · 28d437d5
      Committed by Tom Lendacky
      The PAUSE instruction is currently used in the retpoline and RSB filling
      macros as a speculation trap.  The use of PAUSE was originally suggested
      because it showed a very, very small difference in the amount of
      cycles/time used to execute the retpoline as compared to LFENCE.  On AMD,
      the PAUSE instruction is not a serializing instruction, so the pause/jmp
      loop will use excess power as it is speculated over while waiting for the
      return to mispredict to the correct target.
      
      The RSB filling macro is applicable to AMD, and, if software is unable to
      verify that LFENCE is serializing on AMD (possible when running under a
      hypervisor), the generic retpoline support will be used, so it too is
      applicable to AMD.  Keep the current usage of PAUSE for Intel, but add an
      LFENCE instruction to the speculation trap for AMD.
      
      The same sequence has been adopted by GCC for the GCC generated retpolines.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@alien8.de>
      Acked-by: David Woodhouse <dwmw@amazon.co.uk>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Kees Cook <keescook@google.com>
      Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net
      28d437d5
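      A simplified assembly sketch of the speculation trap after the change,
      in the spirit of the kernel's retpoline macro (label names are
      illustrative):
      
        	call	.Ldo_rop
        .Lspec_trap:
        	pause
        	lfence			/* added: stops the speculative pause/jmp loop on AMD */
        	jmp	.Lspec_trap
        .Ldo_rop:
        	mov	%rax, (%rsp)	/* overwrite the return address so 'ret' goes to the target */
        	ret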
    • x86/retpoline: Fill RSB on context switch for affected CPUs · c995efd5
      Committed by David Woodhouse
      On context switch from a shallow call stack to a deeper one, as the CPU
      does 'ret' up the deeper side it may encounter RSB entries (predictions for
      where the 'ret' goes to) which were populated in userspace.
      
      This is problematic if neither SMEP nor KPTI (the latter of which marks
      userspace pages as NX for the kernel) is active, as malicious code in
      userspace may then be executed speculatively.
      
      Overwrite the CPU's return prediction stack with calls which are predicted
      to return to an infinite loop, to "capture" speculation if this
      happens. This is required both for retpoline, and also in conjunction with
      IBRS for !SMEP && !KPTI.
      
      On Skylake+ the problem is slightly different, and an *underflow* of the
      RSB may cause errant branch predictions to occur. So there it's not so much
      overwrite, as *filling* the RSB to attempt to prevent it getting
      empty. This is only a partial solution for Skylake+ since there are many
      other conditions which may result in the RSB becoming empty. The full
      solution on Skylake+ is to use IBRS, which will prevent the problem even
      when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
      required on context switch.
      
      [ tglx: Added missing vendor check and slightly massaged comments and
        	changelog ]
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
      c995efd5
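      A hedged sketch of the selection logic implied by the changelog: fill
      the RSB on context switch when neither SMEP nor KPTI protects against
      user-populated RSB entries, or on Skylake-era CPUs where underflow is
      the concern (helper and feature-bit names are modeled on the kernel's,
      but the exact condition is an assumption):
      
        /* In the spectre_v2 mitigation selection code: */
        if ((!boot_cpu_has(X86_FEATURE_PTI) &&
             !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
                setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
                pr_info("Filling RSB on context switch\n");
        }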