1. 15 Aug 2019, 6 commits
    • x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS · f5d27995
      Thomas Gleixner committed
      commit f36cf386e3fec258a341d446915862eded3e13d8 upstream.
      
      Intel provided the following information:
      
       On all current Atom processors, instructions that use a segment register
       value (e.g. a load or store) will not speculatively execute before the
       last writer of that segment retires. Thus they will not use a
       speculatively written segment value.
      
      That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
      entry paths can be excluded from the extra LFENCE if PTI is disabled.
      
      Create a separate bug flag for the through SWAPGS speculation and mark all
      out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
      are excluded from the whole mitigation mess anyway.
      Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      f5d27995
    • x86/entry/64: Use JMP instead of JMPQ · 49e1ce98
      Josh Poimboeuf committed
      commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream.
      
      Somehow the swapgs mitigation entry code patch ended up with a JMPQ
      instruction instead of JMP, where only the short jump is needed.  Some
      assembler versions apparently fail to optimize JMPQ into a two-byte JMP
      when possible, instead always using a 7-byte JMP with relocation.  For
      some reason that makes the entry code explode with a #GP during boot.
      
      Change it back to "JMP" as originally intended.
      
      Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      49e1ce98
    • x86/speculation: Enable Spectre v1 swapgs mitigations · 12351a6c
      Josh Poimboeuf committed
      commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream.
      
      The previous commit added macro calls in the entry code which mitigate the
      Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
      enabled.  Enable those features where applicable.
      
      The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
      
      There are different features which can affect the risk of attack:
      
      - When FSGSBASE is enabled, unprivileged users are able to place any
        value in GS, using the wrgsbase instruction.  This means they can
        write a GS value which points to any value in kernel space, which can
        be useful with the following gadget in an interrupt/exception/NMI
        handler:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	// dependent load or store based on the value of %reg1
      	// for example: mov (%reg1), %reg2
      
        If an interrupt is coming from user space, and the entry code
        speculatively skips the swapgs (due to user branch mistraining), it
        may speculatively execute the GS-based load and a subsequent dependent
        load or store, exposing the kernel data to an L1 side channel leak.
      
        Note that, on Intel, a similar attack exists in the above gadget when
        coming from kernel space, if the swapgs gets speculatively executed to
        switch back to the user GS.  On AMD, this variant isn't possible
        because swapgs is serializing with respect to future GS-based
        accesses.
      
        NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
      	doesn't exist quite yet.
      
      - When FSGSBASE is disabled, the issue is mitigated somewhat because
        unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
        restricts GS values to user space addresses only.  That means the
        gadget would need an additional step, since the target kernel address
        needs to be read from user space first.  Something like:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	mov (%reg1), %reg2
      	// dependent load or store based on the value of %reg2
      	// for example: mov (%reg2), %reg3
      
        It's difficult to audit for this gadget in all the handlers, so while
        there are no known instances of it, it's entirely possible that it
        exists somewhere (or could be introduced in the future).  Without
        tooling to analyze all such code paths, consider it vulnerable.
      
        Effects of SMAP on the !FSGSBASE case:
      
        - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
          susceptible to Meltdown), the kernel is prevented from speculatively
          reading user space memory, even L1 cached values.  This effectively
          disables the !FSGSBASE attack vector.
      
        - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
          still prevents the kernel from speculatively reading user space
          memory.  But it does *not* prevent the kernel from reading the
          user value from L1, if it has already been cached.  This is probably
          only a small hurdle for an attacker to overcome.
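      
      Putting the above together, a hedged sketch of the enablement decision
      (not the verbatim upstream function; the command-line helper is
      hypothetical):
      
      	static void __init enable_swapgs_fences(void)
      	{
      		if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) ||
      		    nospectre_v1_requested())	/* hypothetical cmdline check */
      			return;
      
      		/*
      		 * User entry paths are already serialized by the PTI CR3 write,
      		 * so the extra LFENCE is only needed when PTI is disabled.
      		 */
      		if (!boot_cpu_has(X86_FEATURE_PTI))
      			setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_USER);
      
      		/* Kernel entry paths never switch CR3 and always need the fence. */
      		setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_KERNEL);
      	}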
      
      Thanks to Dave Hansen for contributing the speculative_smap() function.
      
      Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
      is serializing on AMD.
      
      [ tglx: Fixed the USER fence decision and polished the comment as suggested
        	by Dave Hansen ]
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      12351a6c
    • x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations · 06e45d3a
      Josh Poimboeuf committed
      commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream.
      
      Spectre v1 isn't only about array bounds checks.  It can affect any
      conditional checks.  The kernel entry code interrupt, exception, and NMI
      handlers all have conditional swapgs checks.  Those may be problematic in
      the context of Spectre v1, as kernel code can speculatively run with a user
      GS.
      
      For example:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg
      	mov (%reg), %reg1
      
      When coming from user space, the CPU can speculatively skip the swapgs, and
      then do a speculative percpu load using the user GS value.  So the user can
      speculatively force a read of any kernel value.  If a gadget exists which
      uses the percpu value as an address in another load/store, then the
      contents of the kernel value may become visible via an L1 side channel
      attack.
      
      A similar attack exists when coming from kernel space.  The CPU can
      speculatively do the swapgs, causing the user GS to get used for the rest
      of the speculative window.
      
      The mitigation is similar to a traditional Spectre v1 mitigation, except:
      
        a) index masking isn't possible because the index (percpu offset)
           isn't user-controlled; and
      
        b) an lfence is needed in both the "from user" swapgs path and the
           "from kernel" non-swapgs path (because of the two attacks described
           above).
      
      The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
      CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
      lfences can be skipped in those cases.
      
      On the other hand, the kernel entry swapgs paths don't depend on PTI.
      
      To avoid unnecessary lfences for the user entry case, create two separate
      features for alternative patching:
      
        X86_FEATURE_FENCE_SWAPGS_USER
        X86_FEATURE_FENCE_SWAPGS_KERNEL
      
      Use these features in entry code to patch in lfences where needed.
      
      The features aren't enabled yet, so there's no functional change.
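      
      As a C-level illustration of the alternatives idea (hedged: the real
      change patches the assembly entry code with macros; this just expresses
      the same "LFENCE only if the feature bit is set" pattern through the
      kernel's alternative() macro):
      
      	static __always_inline void fence_swapgs_kernel_entry(void)
      	{
      		/* Replaced by a real LFENCE at boot if the feature bit is set,
      		 * left as NOPs otherwise. */
      		alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL);
      	}
      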
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      06e45d3a
    • x86/cpufeatures: Combine word 11 and 12 into a new scattered features word · 74f65670
      Fenghua Yu committed
      commit acec0ce081de0c36459eea91647faf99296445a3 upstream.
      
      It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
      whole feature bits words. To better utilize feature words, re-define
      word 11 to host scattered features and move the four X86_FEATURE_CQM_*
      features into Linux defined word 11. More scattered features can be
      added in word 11 in the future.
      
      Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
      Linux-defined leaf.
      
      Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
      name in the next patch when CPUID.7.1:EAX occupies word 12.
      
      Maximum number of RMID and cache occupancy scale are retrieved from
      CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
      code into a separate function.
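      
      A hedged sketch of what that carved-out function can look like (CPUID.0xf.1
      field usage as described above; not necessarily the verbatim upstream code):
      
      	static void init_cqm(struct cpuinfo_x86 *c)
      	{
      		u32 eax, ebx, ecx, edx;
      
      		if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
      			c->x86_cache_max_rmid  = -1;
      			c->x86_cache_occ_scale = -1;
      			return;
      		}
      
      		/* CPUID.0xf.1: ECX = maximum RMID, EBX = occupancy scaling factor */
      		cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
      		c->x86_cache_max_rmid  = ecx;
      		c->x86_cache_occ_scale = ebx;
      	}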
      
      KVM doesn't support resctrl now. So it's safe to move the
      X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86 <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      74f65670
    • x86/cpufeatures: Carve out CQM features retrieval · deaf49f3
      Borislav Petkov committed
      commit 45fc56e629caa451467e7664fbd4c797c434a6c4 upstream.
      
      ... into a separate function for better readability. Split out from a
      patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
      sole code movement separate for easy review.
      
      No functional changes.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: x86@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      deaf49f3
  2. 06 Aug 2019, 2 commits
    • x86: uaccess: Inhibit speculation past access_ok() in user_access_begin() · e28b4155
      Will Deacon committed
      commit 6e693b3ffecb0b478c7050b44a4842854154f715 upstream.
      
      Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
      makes the access_ok() check part of the user_access_begin() preceding a
      series of 'unsafe' accesses.  This has the desirable effect of ensuring
      that all 'unsafe' accesses have been range-checked, without having to
      pick through all of the callsites to verify whether the appropriate
      checking has been made.
      
      However, the consolidated range check does not inhibit speculation, so
      it is still up to the caller to ensure that they are not susceptible to
      any speculative side-channel attacks for user addresses that ultimately
      fail the access_ok() check.
      
      This is an oversight, so use __uaccess_begin_nospec() to ensure that
      speculation is inhibited until the access_ok() check has passed.
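      
      A hedged sketch of the resulting x86-flavoured helper (the exact
      access_ok() argument list differs between kernel versions):
      
      	#define user_access_begin(ptr, len) ({				\
      		bool __uab_ok = access_ok(ptr, len);			\
      		if (__uab_ok)						\
      			__uaccess_begin_nospec(); /* STAC + barrier_nospec */ \
      		__uab_ok;						\
      	})
      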
      Reported-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      e28b4155
    • make 'user_access_begin()' do 'access_ok()' · aba1b548
      Linus Torvalds committed
      commit 594cc251fdd0d231d342d88b2fdff4bc42fb0690 upstream.
      
      Originally, the rule used to be that you'd have to do access_ok()
      separately, and then user_access_begin() before actually doing the
      direct (optimized) user access.
      
      But experience has shown that people then decide not to do access_ok()
      at all, and instead rely on it being implied by other operations or
      similar.  Which makes it very hard to verify that the access has
      actually been range-checked.
      
      If you use the unsafe direct user accesses, hardware features (either
      SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
      Access Never - on ARM) do force you to use user_access_begin().  But
      nothing really forces the range check.
      
      By putting the range check into user_access_begin(), we actually force
      people to do the right thing (tm), and the range check will be visible
      near the actual accesses.  We have way too long a history of people
      trying to avoid them.
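      
      A hedged usage sketch of the resulting pattern (illustrative only; the
      range check and the SMAP/PAN "open" now both happen in the one
      user_access_begin() call):
      
      	static int put_one_value(u32 __user *uptr, u32 value)
      	{
      		if (!user_access_begin(uptr, sizeof(*uptr)))
      			return -EFAULT;
      		unsafe_put_user(value, uptr, efault);
      		user_access_end();
      		return 0;
      
      	efault:
      		user_access_end();
      		return -EFAULT;
      	}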
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      [ Shile: fix the following conflicts by adding dummy arguments ]
      Conflicts:
      	kernel/compat.c
      	kernel/exit.c
      Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
      aba1b548
  3. 10 Jul 2019, 1 commit
  4. 05 Jul 2019, 4 commits
  5. 03 Jul 2019, 9 commits
    • arm64: insn: Fix ldadd instruction encoding · 3919d91f
      Jean-Philippe Brucker committed
      commit c5e2edeb01ae9ffbdde95bdcdb6d3614ba1eb195 upstream.
      
      GCC 8.1.0 reports that the ldadd instruction encoding, recently added to
      insn.c, doesn't match the mask and couldn't possibly be identified:
      
       linux/arch/arm64/include/asm/insn.h: In function 'aarch64_insn_is_ldadd':
       linux/arch/arm64/include/asm/insn.h:280:257: warning: bitwise comparison always evaluates to false [-Wtautological-compare]
      
      Bits [31:30] normally encode the size of the instruction (1 to 8 bytes)
      and the current instruction value only encodes the 4- and 8-byte
      variants. At the moment only the BPF JIT needs this instruction, and
      doesn't require the 1- and 2-byte variants, but to be consistent with
      our other ldr and str instruction encodings, clear the size field in the
      insn value.
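      
      The pitfall generalises beyond this particular encoding; a hedged,
      self-contained illustration (the mask/value relationship is the point,
      not the real aarch64 numbers):
      
      	#include <stdbool.h>
      	#include <stdint.h>
      
      	/* If 'value' has bits set outside 'mask', (insn & mask) == value can
      	 * never hold -- exactly what GCC's tautological-compare warning says.
      	 * The fix is to define 'value' with the don't-care size field
      	 * (bits [31:30]) cleared, so every set bit in it is covered by 'mask'. */
      	static bool insn_matches(uint32_t insn, uint32_t mask, uint32_t value)
      	{
      		return (insn & mask) == value;
      	}
      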
      
      Fixes: 34b8ab091f9ef57a ("bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd")
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Reported-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3919d91f
    • bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd · 4423a82c
      Daniel Borkmann committed
      commit 34b8ab091f9ef57a2bb3c8c8359a0a03a8abf2f9 upstream.
      
      Since the ARMv8.1 supplement introduced LSE atomic instructions back in 2016,
      let's add support for STADD and use that in favor of the LDXR / STXR loop for
      the XADD mapping if available. STADD is encoded as an alias for LDADD with
      XZR as the destination register, therefore add LDADD to the instruction
      encoder along with STADD as special case and use it in the JIT for CPUs
      that advertise LSE atomics in CPUID register. If immediate offset in the
      BPF XADD insn is 0, then use dst register directly instead of temporary
      one.
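      
      For reference, a hedged C inline-assembly rendering of the 32-bit case
      (illustrative only; the JIT emits the machine-code encoding directly, and
      building this form needs an LSE-capable -march setting such as armv8.1-a):
      
      	static inline void lse_add32(int *ctr, int val)
      	{
      		/* STADD is LDADD with the destination register set to WZR/XZR,
      		 * i.e. the old memory value is discarded. */
      		asm volatile("stadd %w[v], %[c]"
      			     : [c] "+Q" (*ctr)
      			     : [v] "r" (val)
      			     : "memory");
      	}
      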
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4423a82c
    • arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg() · 436869e0
      Will Deacon committed
      commit 8e4e0ac02b449297b86498ac24db5786ddd9f647 upstream.
      
      Returning an error code from futex_atomic_cmpxchg_inatomic() indicates
      that the caller should not make any use of *uval, and should instead act
      upon on the value of the error code. Although this is implemented
      correctly in our futex code, we needlessly copy uninitialised stack to
      *uval in the error case, which can easily be avoided.
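      
      A hedged sketch of the pattern being fixed (names are illustrative, not
      the actual arm64 futex code):
      
      	static int futex_cmpxchg_user(u32 *uval, u32 __user *uaddr,
      				      u32 oldval, u32 newval)
      	{
      		u32 val;	/* deliberately left uninitialised */
      		int ret;
      
      		ret = do_cmpxchg_inatomic(&val, uaddr, oldval, newval); /* hypothetical helper */
      		if (!ret)
      			*uval = val;	/* copy out only on success */
      
      		return ret;
      	}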
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      436869e0
    • irqchip/mips-gic: Use the correct local interrupt map registers · c22cea5a
      Paul Burton committed
      commit 6d4d367d0e9ffab4d64a3436256a6a052dc1195d upstream.
      
      The MIPS GIC contains a block of registers used to map local interrupts
      to a particular CPU interrupt pin. Since these registers are found at a
      consecutive range of addresses we access them using an index, via the
      (read|write)_gic_v[lo]_map accessor functions. We currently use values
      from enum mips_gic_local_interrupt as those indices.
      
      Unfortunately whilst enum mips_gic_local_interrupt provides the correct
      offsets for bits in the pending & mask registers, the ordering of the
      map registers is subtly different... Compared with the ordering of
      pending & mask bits, the map registers move the FDC from the end of the
      list to index 3, after the timer interrupt. As a result, the performance
      counter & software interrupts are at indices 4-6 rather than
      indices 3-5.
      
      Notably this causes problems with performance counter interrupts being
      incorrectly mapped on some systems, and presumably will also cause
      problems for FDC interrupts.
      
      Introduce a function to map from enum mips_gic_local_interrupt to the
      index of the corresponding map register, and use it to ensure we access
      the map registers for the correct interrupts.
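      
      A hedged sketch of such a mapping helper, following the index shuffle
      described above (not necessarily the exact driver code):
      
      	static unsigned int gic_local_int_to_map_index(unsigned int intr)
      	{
      		/* WD, Compare and Timer map 1:1. */
      		if (intr <= GIC_LOCAL_INT_TIMER)
      			return intr;
      
      		/* The map registers place FDC right after the timer... */
      		if (intr == GIC_LOCAL_INT_FDC)
      			return GIC_LOCAL_INT_TIMER + 1;
      
      		/* ...pushing perf counter & software interrupts up by one. */
      		return intr + 1;
      	}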
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Fixes: a0dc5cb5 ("irqchip: mips-gic: Simplify gic_local_irq_domain_map()")
      Fixes: da61fcf9 ("irqchip: mips-gic: Use irq_cpu_online to (un)mask all-VP(E) IRQs")
      Reported-and-tested-by: Archer Yan <ayan@wavecomp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c22cea5a
    • KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT · 01a02a98
      Sean Christopherson committed
      commit b6b80c78af838bef17501416d5d383fedab0010a upstream.
      
      SVM's Nested Page Tables (NPT) reuses x86 paging for the host-controlled
      page walk.  For 32-bit KVM, this means PAE paging is used even when TDP
      is enabled, i.e. the PAE root array needs to be allocated.
      
      Fixes: ee6268ba ("KVM: x86: Skip pae_root shadow allocation if tdp enabled")
      Cc: stable@vger.kernel.org
      Reported-by: Jiri Palecek <jpalecek@web.de>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Jiri Palecek <jpalecek@web.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      01a02a98
    • x86/resctrl: Prevent possible overrun during bitmap operations · 32746032
      Reinette Chatre committed
      commit 32f010deab575199df4ebe7b6aec20c17bb7eccd upstream.
      
      While the DOC at the beginning of lib/bitmap.c explicitly states that
      "The number of valid bits in a given bitmap does _not_ need to be an
      exact multiple of BITS_PER_LONG.", some of the bitmap operations do
      indeed access BITS_PER_LONG portions of the provided bitmap no matter
      the size of the provided bitmap.
      
      For example, if find_first_bit() is provided with an 8 bit bitmap the
      operation will access BITS_PER_LONG bits from the provided bitmap. While
      the operation ensures that these extra bits do not affect the result,
      the memory is still accessed.
      
      The capacity bitmasks (CBMs) are typically stored in u32 since they
      can never exceed 32 bits. A few instances exist where a bitmap_*
      operation is performed on a CBM by simply pointing the bitmap operation
      to the stored u32 value.
      
      The consequence of this pattern is that some bitmap_* operations will
      access out-of-bounds memory when interacting with the provided CBM.
      
      This same issue has previously been addressed with commit 49e00eee
      ("x86/intel_rdt: Fix out-of-bounds memory access in CBM tests")
      but at that time not all instances of the issue were fixed.
      
      Fix this by using an unsigned long to store the capacity bitmask data
      that is passed to bitmap functions.
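      
      A hedged sketch of the resulting pattern (illustrative, not a verbatim
      hunk from the patch):
      
      	static unsigned int cbm_first_bit(u32 cbm, unsigned int cbm_len)
      	{
      		/* Previously the u32 was handed to the bitmap helper directly,
      		 * e.g. find_first_bit((unsigned long *)&cbm, cbm_len), which may
      		 * read a full BITS_PER_LONG chunk past the end of the u32. */
      		unsigned long tmp = cbm;
      
      		return find_first_bit(&tmp, cbm_len);
      	}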
      
      Fixes: e6519011 ("x86/intel_rdt: Introduce "bit_usage" to display cache allocations details")
      Fixes: f4e80d67 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information")
      Fixes: 95f0b77e ("x86/intel_rdt: Initialize new resource group with sane defaults")
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/58c9b6081fd9bf599af0dfc01a6fdd335768efef.1560975645.git.reinette.chatre@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      32746032
    • x86/microcode: Fix the microcode load on CPU hotplug for real · 1746dc52
      Thomas Gleixner committed
      commit 5423f5ce5ca410b3646f355279e4e937d452e622 upstream.
      
      A recent change moved the microcode loader hotplug callback into the early
      startup phase which is running with interrupts disabled. It missed that
      the callbacks invoke sysfs functions which might sleep causing nice 'might
      sleep' splats with proper debugging enabled.
      
      Split the callbacks and only load the microcode in the early startup phase
      and move the sysfs handling back into the later threaded and preemptible
      bringup phase where it was before.
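      
      A hedged sketch of the resulting split (function and attribute names are
      illustrative, not the verbatim driver code):
      
      	/* Early STARTING-phase callback: interrupts disabled, must not sleep. */
      	static int mc_cpu_starting(unsigned int cpu)
      	{
      		reload_microcode_on(cpu);	/* hypothetical loader call */
      		return 0;
      	}
      
      	/* Later ONLINE-phase callback: preemptible, so sysfs work is safe here. */
      	static int mc_cpu_online(unsigned int cpu)
      	{
      		/* mc_attr_group: the driver's existing attribute group (illustrative) */
      		return sysfs_create_group(&get_cpu_device(cpu)->kobj,
      					  &mc_attr_group);
      	}
      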
      
      Fixes: 78f4e932f776 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906182228350.1766@nanos.tec.linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1746dc52
    • x86/speculation: Allow guests to use SSBD even if host does not · 690049ed
      Alejandro Jimenez committed
      commit c1f7fec1eb6a2c86d01bc22afce772c743451d88 upstream.
      
      The bits set in x86_spec_ctrl_mask are used to calculate the guest's value
      of SPEC_CTRL that is written to the MSR before VMENTRY, and control which
      mitigations the guest can enable.  In the case of SSBD, unless the host has
      enabled SSBD always on mode (by passing "spec_store_bypass_disable=on" in
      the kernel parameters), the SSBD bit is not set in the mask and the guest
      can not properly enable the SSBD always on mitigation mode.
      
      This has been confirmed by running the SSBD PoC on a guest using the SSBD
      always on mitigation mode (booted with kernel parameter
      "spec_store_bypass_disable=on"), and verifying that the guest is vulnerable
      unless the host is also using SSBD always on mode. In addition, the guest
      OS incorrectly reports the SSB vulnerability as mitigated.
      
      Always set the SSBD bit in x86_spec_ctrl_mask when the host CPU supports
      it, allowing the guest to use SSBD whether or not the host has chosen to
      enable the mitigation in any of its modes.
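      
      A hedged sketch of the gist (close to, but not claimed to be, the verbatim
      hunk):
      
      	static void __init update_guest_spec_ctrl_mask(void)
      	{
      		/* Expose SSBD in the guest-usable SPEC_CTRL mask whenever the CPU
      		 * supports it, regardless of the host's own mitigation mode. */
      		if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
      		    static_cpu_has(X86_FEATURE_AMD_SSBD))
      			x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
      	}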
      
      Fixes: be6fcb54 ("x86/bugs: Rework spec_ctrl base and mask logic")
      Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
      Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: bp@alien8.de
      Cc: rkrcmar@redhat.com
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1560187210-11054-1-git-send-email-alejandro.j.jimenez@oracle.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      690049ed
    • arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS · 85a3b1ef
      Nathan Chancellor committed
      commit fa63da2ab046b885a7f70291aafc4e8ce015429b upstream.
      
      This is a GCC-only option, which warns about ABI changes within GCC, so
      unconditionally adding it breaks Clang with tons of:
      
      warning: unknown warning option '-Wno-psabi' [-Wunknown-warning-option]
      
      and link time failures:
      
      ld.lld: error: undefined symbol: __efistub___stack_chk_guard
      >>> referenced by arm-stub.c:73
      (/home/nathan/cbl/linux/drivers/firmware/efi/libstub/arm-stub.c:73)
      >>>               arm-stub.stub.o:(__efistub_install_memreserve_table)
      in archive ./drivers/firmware/efi/libstub/lib.a
      
      These failures come from the lack of -fno-stack-protector, which is
      added via cc-option in drivers/firmware/efi/libstub/Makefile. When an
      unknown flag is added to KBUILD_CFLAGS, clang will noisily warn that it
      is ignoring the option like above, unlike gcc, who will just error.
      
      $ echo "int main() { return 0; }" > tmp.c
      
      $ clang -Wno-psabi tmp.c; echo $?
      warning: unknown warning option '-Wno-psabi' [-Wunknown-warning-option]
      1 warning generated.
      0
      
      $ gcc -Wsometimes-uninitialized tmp.c; echo $?
      gcc: error: unrecognized command line option
      ‘-Wsometimes-uninitialized’; did you mean ‘-Wmaybe-uninitialized’?
      1
      
      For cc-option to work properly with clang and behave like gcc, -Werror
      is needed, which was done in commit c3f0d0bc ("kbuild, LLVMLinux:
      Add -Werror to cc-option to support clang").
      
      $ clang -Werror -Wno-psabi tmp.c; echo $?
      error: unknown warning option '-Wno-psabi'
      [-Werror,-Wunknown-warning-option]
      1
      
      As a consequence of this, when an unknown flag is unconditionally added
      to KBUILD_CFLAGS, it will cause cc-option to always fail and those flags
      will never get added:
      
      $ clang -Werror -Wno-psabi -fno-stack-protector tmp.c; echo $?
      error: unknown warning option '-Wno-psabi'
      [-Werror,-Wunknown-warning-option]
      1
      
      This can be seen when compiling the whole kernel as some warnings that
      are normally disabled (see below) show up. The full list of flags
      missing from drivers/firmware/efi/libstub are the following (gathered
      from diffing .arm64-stub.o.cmd):
      
      -fno-delete-null-pointer-checks
      -Wno-address-of-packed-member
      -Wframe-larger-than=2048
      -Wno-unused-const-variable
      -fno-strict-overflow
      -fno-merge-all-constants
      -fno-stack-check
      -Werror=date-time
      -Werror=incompatible-pointer-types
      -ffreestanding
      -fno-stack-protector
      
      Use cc-disable-warning so that it gets disabled for GCC and does nothing
      for Clang.
      
      Fixes: ebcc5928c5d9 ("arm64: Silence gcc warnings about arch ABI drift")
      Link: https://github.com/ClangBuiltLinux/linux/issues/511
      Reported-by: Qian Cai <cai@lca.pw>
      Acked-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      85a3b1ef
  6. 25 Jun 2019, 18 commits