1. 29 Aug, 2019 9 commits
    • x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h · e063b03b
      Committed by Tom Lendacky
      commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24 upstream.
      
      There have been reports of RDRAND issues after resuming from suspend on
      some AMD family 15h and family 16h systems. This issue stems from a BIOS
      not performing the proper steps during resume to ensure RDRAND continues
      to function properly.
      
      RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
      reset by clearing MSR C001_1004[62]. Once it is cleared, any software that
      checks for RDRAND support using CPUID, including the kernel, will believe
      that RDRAND is not supported.
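
      The bit positions named above can be illustrated with a small user-space
      model. This is only a sketch: the macro and function names are
      hypothetical, and model_cpuid_ecx() merely imitates the described hardware
      behaviour instead of executing CPUID or touching an MSR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID Fn0000_0001 ECX bit 30 reports RDRAND support (from the text). */
#define CPUID_ECX_RDRAND     (UINT32_C(1) << 30)
/* Bit 62 of MSR C001_1004 gates that CPUID bit (macro name is hypothetical). */
#define MSR_C0011004_RDRAND  (UINT64_C(1) << 62)

/* True if the given CPUID(1).ECX value advertises RDRAND. */
static inline bool cpuid_has_rdrand(uint32_t ecx)
{
	return (ecx & CPUID_ECX_RDRAND) != 0;
}

/* Return the MSR value with bit 62 cleared; all other bits are preserved. */
static inline uint64_t msr_hide_rdrand(uint64_t msr)
{
	return msr & ~MSR_C0011004_RDRAND;
}

/*
 * Toy model of the behaviour described above: the RDRAND bit is visible
 * in ECX only while MSR bit 62 is set.
 */
static inline uint32_t model_cpuid_ecx(uint32_t raw_ecx, uint64_t msr)
{
	if (!(msr & MSR_C0011004_RDRAND))
		raw_ecx &= ~CPUID_ECX_RDRAND;
	return raw_ecx;
}

/* Usage: after hiding, CPUID-based detection reports "not supported". */
void rdrand_example(void)
{
	uint64_t msr = UINT64_MAX;
	uint32_t ecx = CPUID_ECX_RDRAND | UINT32_C(1);

	assert(cpuid_has_rdrand(model_cpuid_ecx(ecx, msr)));
	msr = msr_hide_rdrand(msr);
	assert(!cpuid_has_rdrand(model_cpuid_ecx(ecx, msr)));
	/* Unrelated ECX bits are left alone. */
	assert(model_cpuid_ecx(ecx, msr) == UINT32_C(1));
}
```

      Because only the advertised bit changes, code that detects RDRAND through
      CPUID stops using it, while the instruction itself keeps working.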
      
      Update the CPU initialization to clear the RDRAND CPUID bit for any family
      15h and 16h processor that supports RDRAND. If it is known that the family
      15h or family 16h system does not have an RDRAND resume issue or that the
      system will not be placed in suspend, the "rdrand=force" kernel parameter
      can be used to stop the clearing of the RDRAND CPUID bit.
      
      Additionally, update the suspend and resume path to save and restore the
      MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
      place after resuming from suspend.
      
      Note that clearing the RDRAND CPUID bit does not prevent a processor
      that normally supports the RDRAND instruction from executing it. So any
      code that determined the support based on family and model won't #UD.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
      Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "x86@kernel.org" <x86@kernel.org>
      Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/apic: Handle missing global clockevent gracefully · 685e598e
      Committed by Thomas Gleixner
      commit f897e60a12f0b9146357780d317879bce2a877dc upstream.
      
      Some newer machines do not advertise legacy timers. The kernel can handle
      that situation if the TSC and the CPU frequency are enumerated by CPUID or
      MSRs and the CPU supports TSC deadline timer. If the CPU does not support
      TSC deadline timer the local APIC timer frequency has to be known as well.
      
      Some Ryzen machines do not advertise legacy timers, but there is no
      reliable way to determine the bus frequency which feeds the local APIC
      timer when the machine allows overclocking of that frequency.
      
      As there is no legacy timer, the local APIC timer calibration crashes due
      to a NULL pointer dereference when accessing the global clockevent device,
      which is not installed.
      
      Switch the calibration loop to a non-interrupt-based one, which polls
      either TSC (if frequency is known) or jiffies. The latter requires a global
      clockevent. As the machines which do not have a global clockevent installed
      have a known TSC frequency, this is a non-issue. For older machines where
      TSC frequency is not known, there is no known case where the legacy timers
      do not exist, as that would have been reported long ago.
      Reported-by: Daniel Drake <drake@endlessm.com>
      Reported-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Daniel Drake <drake@endlessm.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
      Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386 · f9747104
      Committed by Sean Christopherson
      commit b63f20a778c88b6a04458ed6ffc69da953d3a109 upstream.
      
      Use 'lea' instead of 'add' when adjusting %rsp in CALL_NOSPEC so as to
      avoid clobbering flags.
      
      KVM's emulator makes indirect calls into a jump table of sorts, where
      the destination of the CALL_NOSPEC is a small blob of code that performs
      fast emulation by executing the target instruction with fixed operands.
      
        adcb_al_dl:
           0x000339f8 <+0>:   adc    %dl,%al
           0x000339fa <+2>:   ret
      
      A major motivation for doing fast emulation is to leverage the CPU to
      handle consumption and manipulation of arithmetic flags, i.e. RFLAGS is
      both an input and output to the target of CALL_NOSPEC.  Clobbering flags
      results in all sorts of incorrect emulation, e.g. Jcc instructions often
      take the wrong path.  Sans the nops...
      
        asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
           0x0003595a <+58>:  mov    0xc0(%ebx),%eax
           0x00035960 <+64>:  mov    0x60(%ebx),%edx
           0x00035963 <+67>:  mov    0x90(%ebx),%ecx
           0x00035969 <+73>:  push   %edi
           0x0003596a <+74>:  popf
           0x0003596b <+75>:  call   *%esi
           0x000359a0 <+128>: pushf
           0x000359a1 <+129>: pop    %edi
           0x000359a2 <+130>: mov    %eax,0xc0(%ebx)
           0x000359b1 <+145>: mov    %edx,0x60(%ebx)
      
        ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
           0x000359a8 <+136>: mov    -0x10(%ebp),%eax
           0x000359ab <+139>: and    $0x8d5,%edi
           0x000359b4 <+148>: and    $0xfffff72a,%eax
           0x000359b9 <+153>: or     %eax,%edi
           0x000359bd <+157>: mov    %edi,0x4(%ebx)
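
      The flag merge in the listing above can be restated as a standalone C
      sketch. The mask value 0x8d5 is taken from the disassembly; the function
      names and the sample values are illustrative assumptions, not KVM code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Arithmetic flags CF(0) PF(2) AF(4) ZF(6) SF(7) OF(11).
 * Their OR is 0x8d5, the constant in 'and $0x8d5,%edi' above.
 */
#define EFLAGS_MASK UINT32_C(0x8d5)

/* Take the arithmetic flags from 'flags' and everything else from the context. */
static inline uint32_t merge_eflags(uint32_t ctxt_eflags, uint32_t flags)
{
	return (ctxt_eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
}

void merge_eflags_example(void)
{
	/* The inverted mask is the 0xfffff72a constant in the listing. */
	assert((uint32_t)~EFLAGS_MASK == UINT32_C(0xfffff72a));
	/* IF (bit 9) survives from the context; CF and ZF come from the call. */
	assert(merge_eflags(0x202, 0x041) == 0x243);
	/* If the call path lost CF and ZF, the merged value loses them too. */
	assert(merge_eflags(0x202, 0x000) == 0x202);
}
```

      Since whatever is in the flags after the call is copied into the guest's
      view, any flag modified by the call sequence itself ends up in the
      emulated state.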
      
      For the most part this has gone unnoticed as emulation of guest code
      that can trigger fast emulation is effectively limited to MMIO when
      running on modern hardware, and MMIO is rarely, if ever, accessed by
      instructions that affect or consume flags.
      
      Breakage is almost instantaneous when running with unrestricted guest
      disabled, in which case KVM must emulate all instructions when the guest
      has invalid state, e.g. when the guest is in Big Real Mode during early
      BIOS.
      
      Fixes: 776b043848fd2 ("x86/retpoline: Add initial retpoline support")
      Fixes: 1a29b5b7 ("KVM: x86: Make indirect calls in emulator speculation safe")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190822211122.27579-1-sean.j.christopherson@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm: Don't write junk to CP15 registers on reset · ef61b790
      Committed by Marc Zyngier
      [ Upstream commit c69509c70aa45a8c4954c88c629a64acf4ee4a36 ]
      
      At the moment, the way we reset CP15 registers is mildly insane:
      We write junk to them, call the reset functions, and then check that
      we have something else in them.
      
      The "fun" thing is that this can happen while the guest is running
      (PSCI, for example). If anything in KVM has to evaluate the state
      of a CP15 register while junk is in there, bad things may happen.
      
      Let's stop doing that. Instead, we track that we have called a
      reset function for that register, and assume that the reset
      function has done something.
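
      The tracking approach described above might look roughly like the
      following sketch. All types and names here are hypothetical stand-ins
      for the KVM structures; the point is only that coverage is recorded in a
      bitmap instead of being inferred from placeholder values written to the
      registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_REGS 8	/* hypothetical register count */

struct vcpu {
	uint64_t regs[NR_REGS];
};

struct reg_desc {
	size_t index;		/* slot in vcpu->regs */
	void (*reset)(struct vcpu *vcpu, const struct reg_desc *r);
	uint64_t reset_val;
};

static void reset_to_val(struct vcpu *vcpu, const struct reg_desc *r)
{
	vcpu->regs[r->index] = r->reset_val;
}

/*
 * Run every reset handler and record which registers were covered.
 * No junk is ever written to vcpu->regs. Returns true if all NR_REGS
 * registers had a reset function called for them.
 */
bool reset_all(struct vcpu *vcpu, const struct reg_desc *table, size_t n)
{
	uint32_t done = 0;	/* one bit per register */

	for (size_t i = 0; i < n; i++) {
		if (table[i].reset && table[i].index < NR_REGS) {
			table[i].reset(vcpu, &table[i]);
			done |= UINT32_C(1) << table[i].index;
		}
	}
	return done == (UINT32_C(1) << NR_REGS) - 1;
}

void reset_example(void)
{
	struct vcpu v = { .regs = { 1, 2, 3, 4, 5, 6, 7, 8 } };
	struct reg_desc table[NR_REGS];

	for (size_t i = 0; i < NR_REGS; i++)
		table[i] = (struct reg_desc){ i, reset_to_val, 0x100 + i };

	assert(reset_all(&v, table, NR_REGS));
	assert(v.regs[3] == 0x103);
	/* Dropping one table entry leaves a register uncovered. */
	assert(!reset_all(&v, table, NR_REGS - 1));
}
```

      A reader of vcpu->regs during the reset only ever sees the old value or
      the real reset value, never a placeholder.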
      
      In the end, the very need of this reset check is pretty dubious,
      as it doesn't check everything (a lot of the CP15 regs live outside
      of the cp15_regs[] array). It may well be axed in the near future.
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: arm64: Don't write junk to sysregs on reset · d5cb5b49
      Committed by Marc Zyngier
      [ Upstream commit 03fdfb2690099c19160a3f2c5b77db60b3afeded ]
      
      At the moment, the way we reset system registers is mildly insane:
      We write junk to them, call the reset functions, and then check that
      we have something else in them.
      
      The "fun" thing is that this can happen while the guest is running
      (PSCI, for example). If anything in KVM has to evaluate the state
      of a system register while junk is in there, bad things may happen.
      
      Let's stop doing that. Instead, we track that we have called a
      reset function for that register, and assume that the reset
      function has done something. This requires fixing a couple of
      sysreg definitions in the trap table.
      
      In the end, the very need of this reset check is pretty dubious,
      as it doesn't check everything (a lot of the sysregs live outside of
      the sys_regs[] array). It may well be axed in the near future.
      Tested-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/lib/cpu: Address missing prototypes warning · 923de016
      Committed by Valdis Klētnieks
      [ Upstream commit 04f5bda84b0712d6f172556a7e8dca9ded5e73b9 ]
      
      When building with W=1, warnings about missing prototypes are emitted:
      
        CC      arch/x86/lib/cpu.o
      arch/x86/lib/cpu.c:5:14: warning: no previous prototype for 'x86_family' [-Wmissing-prototypes]
          5 | unsigned int x86_family(unsigned int sig)
            |              ^~~~~~~~~~
      arch/x86/lib/cpu.c:18:14: warning: no previous prototype for 'x86_model' [-Wmissing-prototypes]
         18 | unsigned int x86_model(unsigned int sig)
            |              ^~~~~~~~~
      arch/x86/lib/cpu.c:33:14: warning: no previous prototype for 'x86_stepping' [-Wmissing-prototypes]
         33 | unsigned int x86_stepping(unsigned int sig)
            |              ^~~~~~~~~~~~
      
      Add the proper include file so the prototypes are there.
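
      To see why a visible prototype silences the warning, here is a
      self-contained sketch. In the kernel the declarations come from a header;
      they are written inline here so the example compiles on its own. The
      function bodies follow the usual CPUID signature decoding and are an
      assumption for illustration, not a quote of arch/x86/lib/cpu.c.

```c
#include <assert.h>

/* Prototypes: with these in scope, -Wmissing-prototypes has nothing to report. */
unsigned int x86_family(unsigned int sig);
unsigned int x86_model(unsigned int sig);
unsigned int x86_stepping(unsigned int sig);

/* Base family is bits 11:8; family 0xf adds the extended family (bits 27:20). */
unsigned int x86_family(unsigned int sig)
{
	unsigned int fam = (sig >> 8) & 0xf;

	if (fam == 0xf)
		fam += (sig >> 20) & 0xff;
	return fam;
}

/* Base model is bits 7:4; the extended model (bits 19:16) forms the high nibble. */
unsigned int x86_model(unsigned int sig)
{
	unsigned int model = (sig >> 4) & 0xf;

	if (x86_family(sig) >= 0x6)
		model += ((sig >> 16) & 0xf) << 4;
	return model;
}

/* Stepping is bits 3:0. */
unsigned int x86_stepping(unsigned int sig)
{
	return sig & 0xf;
}

void x86_sig_example(void);
void x86_sig_example(void)
{
	/* 0x906ea decodes to family 6, model 0x9e, stepping 0xa. */
	assert(x86_family(0x906ea) == 0x6);
	assert(x86_model(0x906ea) == 0x9e);
	assert(x86_stepping(0x906ea) == 0xa);
	/* 0x600f12 decodes to family 0x15 (0xf + extended family 6). */
	assert(x86_family(0x600f12) == 0x15);
}
```

      Compiling this with -Wmissing-prototypes and then deleting the three
      declarations at the top reproduces the class of warning quoted above.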
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/42513.1565234837@turing-police
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • s390: put _stext and _etext into .text section · db106f69
      Committed by Vasily Gorbik
      [ Upstream commit 24350fdadbdec780406a1ef988e6cd3875e374a8 ]
      
      Perf relies on _etext and _stext symbols being one of 't', 'T', 'v' or
      'V'. Put them into .text section to guarantee that.
      
      Also move the padding to the page boundary inside .text, with the effect
      that the .text section is now padded with nops rather than 0's, which
      apparently was the original intention of specifying the 0x0700 fill
      expression.
      Reported-by: Thomas Richter <tmricht@linux.ibm.com>
      Tested-by: Thomas Richter <tmricht@linux.ibm.com>
      Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mips: fix cacheinfo · 70b4edd7
      Committed by Vladimir Kondratiev
      [ Upstream commit b8bea8a5e5d942e62203416ab41edecaed4fda02 ]
      
      Because CONFIG_OF is defined for MIPS, cacheinfo attempts to fill in
      information from DT, ignoring data filled in by the architecture routine.
      This leads to the following error being reported:
      
       cacheinfo: Unable to detect cache hierarchy for CPU 0
      
      A way to fix this is provided in
      commit fac51482 ("drivers: base: cacheinfo: fix x86 with
       CONFIG_OF enabled")
      
      Use the same mechanism to report that cacheinfo is set by the
      architecture-specific function.
      Signed-off-by: Vladimir Kondratiev <vladimir.kondratiev@linux.intel.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • MIPS: kernel: only use i8253 clocksource with periodic clockevent · 326175aa
      Committed by Thomas Bogendoerfer
      [ Upstream commit a07e3324538a989b7cdbf2c679be6a7f9df2544f ]
      
      The i8253 clocksource needs a free-running timer, so it can only
      be used if the i8253 clockevent is set up as periodic.
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 25 Aug, 2019 8 commits
  3. 16 Aug, 2019 8 commits
  4. 07 Aug, 2019 15 commits
    • x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS · b88241ae
      Committed by Thomas Gleixner
      commit f36cf386e3fec258a341d446915862eded3e13d8 upstream
      
      Intel provided the following information:
      
       On all current Atom processors, instructions that use a segment register
       value (e.g. a load or store) will not speculatively execute before the
       last writer of that segment retires. Thus they will not use a
       speculatively written segment value.
      
      That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
      entry paths can be excluded from the extra LFENCE if PTI is disabled.
      
      Create a separate bug flag for speculation through SWAPGS and mark all
      out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
      are excluded from the whole mitigation mess anyway.
      Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/entry/64: Use JMP instead of JMPQ · 931b6bfe
      Committed by Josh Poimboeuf
      commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream
      
      Somehow the swapgs mitigation entry code patch ended up with a JMPQ
      instruction instead of JMP, where only the short jump is needed.  Some
      assembler versions apparently fail to optimize JMPQ into a two-byte JMP
      when possible, instead always using a 7-byte JMP with relocation.  For
      some reason that makes the entry code explode with a #GP during boot.
      
      Change it back to "JMP" as originally intended.
      
      Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/speculation: Enable Spectre v1 swapgs mitigations · 23e7a7b3
      Committed by Josh Poimboeuf
      commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream
      
      The previous commit added macro calls in the entry code which mitigate the
      Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
      enabled.  Enable those features where applicable.
      
      The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
      
      There are different features which can affect the risk of attack:
      
      - When FSGSBASE is enabled, unprivileged users are able to place any
        value in GS, using the wrgsbase instruction.  This means they can
        write a GS value which points to any value in kernel space, which can
        be useful with the following gadget in an interrupt/exception/NMI
        handler:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	// dependent load or store based on the value of %reg1
      	// for example: mov (%reg1), %reg2
      
        If an interrupt is coming from user space, and the entry code
        speculatively skips the swapgs (due to user branch mistraining), it
        may speculatively execute the GS-based load and a subsequent dependent
        load or store, exposing the kernel data to an L1 side channel leak.
      
        Note that, on Intel, a similar attack exists in the above gadget when
        coming from kernel space, if the swapgs gets speculatively executed to
        switch back to the user GS.  On AMD, this variant isn't possible
        because swapgs is serializing with respect to future GS-based
        accesses.
      
        NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
      	doesn't exist quite yet.
      
      - When FSGSBASE is disabled, the issue is mitigated somewhat because
        unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
        restricts GS values to user space addresses only.  That means the
        gadget would need an additional step, since the target kernel address
        needs to be read from user space first.  Something like:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	mov (%reg1), %reg2
      	// dependent load or store based on the value of %reg2
      	// for example: mov (%reg2), %reg3
      
        It's difficult to audit for this gadget in all the handlers, so while
        there are no known instances of it, it's entirely possible that it
        exists somewhere (or could be introduced in the future).  Without
        tooling to analyze all such code paths, consider it vulnerable.
      
        Effects of SMAP on the !FSGSBASE case:
      
        - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
          susceptible to Meltdown), the kernel is prevented from speculatively
          reading user space memory, even L1 cached values.  This effectively
          disables the !FSGSBASE attack vector.
      
        - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
          still prevents the kernel from speculatively reading user space
          memory.  But it does *not* prevent the kernel from reading the
          user value from L1, if it has already been cached.  This is probably
          only a small hurdle for an attacker to overcome.
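
      The decision logic described in this message can be sketched as a small
      C model. It is an illustration built from the text, not the kernel's
      selection code: the struct, its field names and the helper names are
      hypothetical. The PTI rule comes from the companion "Prepare entry code"
      message, which notes that the serializing CR3 write on user entry makes
      the user-path lfence unnecessary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the CPU and kernel state relevant here. */
struct cpu_state {
	bool smap;	/* SMAP is enabled */
	bool meltdown;	/* CPU is susceptible to Meltdown (no RDCL_NO) */
	bool pti;	/* page table isolation is enabled */
};

struct fences {
	bool user;	/* lfence in the "from user" swapgs path */
	bool kernel;	/* lfence in the "from kernel" non-swapgs path */
};

/*
 * SMAP stops speculative reads of user memory only on CPUs that are not
 * susceptible to Meltdown; otherwise L1-cached user values stay readable.
 */
static inline bool smap_blocks_speculation(const struct cpu_state *c)
{
	return c->smap && !c->meltdown;
}

struct fences select_swapgs_fences(const struct cpu_state *c)
{
	struct fences f = { false, false };

	if (smap_blocks_speculation(c))
		return f;	/* the !FSGSBASE attack vector is closed */

	/* With PTI, the CR3 write on user entry already serializes. */
	f.user = !c->pti;
	/* The kernel entry path does not depend on PTI. */
	f.kernel = true;
	return f;
}

void swapgs_fence_example(void)
{
	struct cpu_state safe = { true, false, false };
	struct cpu_state melt = { true, true, true };
	struct cpu_state bare = { false, false, false };

	assert(!select_swapgs_fences(&safe).user);
	assert(!select_swapgs_fences(&safe).kernel);
	assert(!select_swapgs_fences(&melt).user);
	assert(select_swapgs_fences(&melt).kernel);
	assert(select_swapgs_fences(&bare).user);
	assert(select_swapgs_fences(&bare).kernel);
}
```

      The model deliberately ignores the FSGSBASE case, which the message says
      does not exist yet, as well as the "nospectre_v1" and "mitigations=off"
      switches.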
      
      Thanks to Dave Hansen for contributing the speculative_smap() function.
      
      Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
      is serializing on AMD.
      
      [ tglx: Fixed the USER fence decision and polished the comment as suggested
        	by Dave Hansen ]
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations · befb822c
      Committed by Josh Poimboeuf
      commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream
      
      Spectre v1 isn't only about array bounds checks.  It can affect any
      conditional checks.  The kernel entry code interrupt, exception, and NMI
      handlers all have conditional swapgs checks.  Those may be problematic in
      the context of Spectre v1, as kernel code can speculatively run with a user
      GS.
      
      For example:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg
      	mov (%reg), %reg1
      
      When coming from user space, the CPU can speculatively skip the swapgs, and
      then do a speculative percpu load using the user GS value.  So the user can
      speculatively force a read of any kernel value.  If a gadget exists which
      uses the percpu value as an address in another load/store, then the
      contents of the kernel value may become visible via an L1 side channel
      attack.
      
      A similar attack exists when coming from kernel space.  The CPU can
      speculatively do the swapgs, causing the user GS to get used for the rest
      of the speculative window.
      
      The mitigation is similar to a traditional Spectre v1 mitigation, except:
      
        a) index masking isn't possible, because the index (percpu offset)
           isn't user-controlled; and
      
        b) an lfence is needed in both the "from user" swapgs path and the
           "from kernel" non-swapgs path (because of the two attacks described
           above).
      
      The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
      CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
      lfences can be skipped in those cases.
      
      On the other hand, the kernel entry swapgs paths don't depend on PTI.
      
      To avoid unnecessary lfences for the user entry case, create two separate
      features for alternative patching:
      
        X86_FEATURE_FENCE_SWAPGS_USER
        X86_FEATURE_FENCE_SWAPGS_KERNEL
      
      Use these features in entry code to patch in lfences where needed.
      
      The features aren't enabled yet, so there's no functional change.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/cpufeatures: Combine word 11 and 12 into a new scattered features word · b5dd7f61
      Committed by Fenghua Yu
      commit acec0ce081de0c36459eea91647faf99296445a3 upstream
      
      It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
      whole feature-bit words. To better utilize feature words, re-define
      word 11 to host scattered features and move the four X86_FEATURE_CQM_*
      features into Linux defined word 11. More scattered features can be
      added in word 11 in the future.
      
      Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
      Linux-defined leaf.
      
      Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
      name in the next patch when CPUID.7.1:EAX occupies word 12.
      
      Maximum number of RMID and cache occupancy scale are retrieved from
      CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
      code into a separate function.
      
      KVM doesn't support resctrl now. So it's safe to move the
      X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86 <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/cpufeatures: Carve out CQM features retrieval · 16ad0b63
      Committed by Borislav Petkov
      commit 45fc56e629caa451467e7664fbd4c797c434a6c4 upstream
      
      ... into a separate function for better readability. Split out from a
      patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
      sole code movement separate for easy review.
      
      No functional changes.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: x86@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/vdso: Prevent segfaults due to hoisted vclock reads · 3732a473
      Committed by Andy Lutomirski
      commit ff17bbe0bb405ad8b36e55815d381841f9fdeebc upstream.
      
      GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock
      pages before the vclock mode checks.  This creates a path through
      vclock_gettime() in which no vclock is enabled at all (due to disabled
      TSC on old CPUs, for example) but the pvclock or hvclock page is
      nevertheless read.  This will segfault on bare metal.
      
      This fixes commit 459e3a21535a ("gcc-9: properly declare the
      {pv,hv}clock_page storage") in the sense that, before that commit, GCC
      didn't seem to generate the offending code.  There was nothing wrong
      with that commit per se, and -stable maintainers should backport this to
      all supported kernels regardless of whether the offending commit was
      present, since the same crash could just as easily be triggered by the
      phase of the moon.
      
      On GCC 9.1.1, this doesn't seem to affect the generated code at all, so
      I'm not too concerned about performance regressions from this fix.
      
      Cc: stable@vger.kernel.org
      Cc: x86@kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Reported-by: Duncan Roe <duncan_roe@optusnet.com.au>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gcc-9: properly declare the {pv,hv}clock_page storage · 8320768d
      Committed by Linus Torvalds
      commit 459e3a21535ae3c7a9a123650e54f5c882b8fcbf upstream.
      
      The pvclock_page and hvclock_page variables are (as the name implies)
      addresses to pages, created by the linker script.
      
      But we declared them as just "extern u8" variables, which _works_, but
      now that gcc does some more bounds checking, it causes warnings like
      
          warning: array subscript 1 is outside array bounds of ‘u8[1]’
      
      when we then access more than one byte from those variables.
      
      Fix this by simply making the declaration of the variables match
      reality, which makes the compiler happy too.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARC: enable uboot support unconditionally · 89f3896b
      Committed by Eugeniy Paltsev
      commit 493a2f812446e92bcb1e69a77381b4d39808d730 upstream.
      
      After reworking the U-Boot args handling code and adding a paranoid
      arguments check, we can eliminate CONFIG_ARC_UBOOT_SUPPORT and
      enable U-Boot support unconditionally.
      
      For the JTAG case we can assume that core registers will come up with a
      reset value of 0, or in the worst case we rely on the user passing
      '-on=clear_regs' to the Metaware debugger.
      
      Cc: stable@vger.kernel.org
      Tested-by: Corentin LABBE <clabbe@baylibre.com>
      Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG} · 8dfef0f4
      Committed by Will Deacon
      commit 147b9635e6347104b91f48ca9dca61eb0fbf2a54 upstream.
      
      If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
      their architecturally maximum values, which defeats the use of
      FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
      machines.
      
      Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
      saturate at zero.
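
      The difference between the two policies can be shown with a short
      sketch. The function names are hypothetical and only mirror the policy
      names in the text; this is not the arm64 cpufeature code itself.

```c
#include <assert.h>
#include <stdint.h>

/* "Higher is safe": pick the larger of two field values. */
static inline uint64_t higher_safe(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * "Higher or zero is safe": zero stands for the architectural maximum,
 * so it wins over any non-zero value; otherwise the larger value wins.
 */
static inline uint64_t higher_or_zero_safe(uint64_t a, uint64_t b)
{
	if (a == 0 || b == 0)
		return 0;
	return higher_safe(a, b);
}

void ftr_example(void)
{
	/* The plain policy wrongly prefers 4 over 0 (which means "maximum"). */
	assert(higher_safe(0, 4) == 4);
	/* The new policy saturates at zero. */
	assert(higher_or_zero_safe(0, 4) == 0);
	assert(higher_or_zero_safe(4, 0) == 0);
	/* Non-zero values behave as before. */
	assert(higher_or_zero_safe(3, 4) == 4);
}
```

      On a heterogeneous machine where one CPU reports 0b0000 and another a
      non-zero value, the sanitised field therefore ends up as zero.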
      
      Fixes: 3c739b57 ("arm64: Keep track of CPU feature registers")
      Cc: <stable@vger.kernel.org> # 4.4.x-
      Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: compat: Allow single-byte watchpoints on all addresses · 2bddc985
      Committed by Will Deacon
      commit 849adec41203ac5837c40c2d7e08490ffdef3c2c upstream.
      
      Commit d968d2b8 ("ARM: 7497/1: hw_breakpoint: allow single-byte
      watchpoints on all addresses") changed the validation requirements for
      hardware watchpoints on arch/arm/. Update our compat layer to implement
      the same relaxation.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • parisc: Fix build of compressed kernel even with debug enabled · 5f80ac50
      Committed by Helge Deller
      commit 3fe6c873af2f2247544debdbe51ec29f690a2ccf upstream.
      
      With debug info enabled (CONFIG_DEBUG_INFO=y) the resulting vmlinux may
      get so huge that we need to increase the start address of the
      decompression text section; otherwise one will face a linker error.
      Reported-by: Sven Schnelle <svens@stackframe.org>
      Tested-by: Sven Schnelle <svens@stackframe.org>
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86, boot: Remove multiple copy of static function sanitize_boot_params() · 84ce0452
      Committed by Zhenzhong Duan
      [ Upstream commit 8c5477e8046ca139bac250386c08453da37ec1ae ]
      
      Kernel build warns:
       'sanitize_boot_params' defined but not used [-Wunused-function]
      
      in the following files:
        arch/x86/boot/compressed/cmdline.c
        arch/x86/boot/compressed/error.c
        arch/x86/boot/compressed/early_serial_console.c
        arch/x86/boot/compressed/acpi.c
      
      That's because they each include misc.h, which includes a definition of
      sanitize_boot_params() via bootparam_utils.h.
      
      Remove the inclusion from misc.h and have the C files that need it
      include bootparam_utils.h directly.
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@oracle.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      84ce0452
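The warning described in the commit above can be reproduced in miniature. The sketch below is illustrative only: scrub_params() and reset_params() are hypothetical names, and the marked region stands in for a header that defines a non-inline static function. Every file that includes such a header gets its own copy of the function, and a file that never calls it is flagged by -Wunused-function (enabled by -Wall in GCC).

```c
#include <stddef.h>

/* ---- stand-in for a shared header (hypothetical contents) ---- */
static void scrub_params(unsigned char *buf, size_t len)
{
	/* Zero the buffer byte by byte. */
	for (size_t i = 0; i < len; i++)
		buf[i] = 0;
}
/* ---- end of header stand-in ---- */

/*
 * This file calls the helper, so no warning is emitted here. Removing
 * this caller while keeping the definition above would leave the static
 * function unused and trigger -Wunused-function.
 */
void reset_params(unsigned char *buf, size_t len)
{
	scrub_params(buf, len);
}
```

Including the header only from the files that call the helper, as the commit does, keeps the definition out of translation units that have no use for it.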
    • J
      x86/paravirt: Fix callee-saved function ELF sizes · 740e0167
      Josh Poimboeuf committed
      [ Upstream commit 083db6764821996526970e42d09c1ab2f4155dd4 ]
      
      The __raw_callee_save_*() functions have an ELF symbol size of zero,
      which confuses objtool and other tools.
      
      Fixes a bunch of warnings like the following:
      
        arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pte_val() is missing an ELF size annotation
        arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pgd_val() is missing an ELF size annotation
        arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pte() is missing an ELF size annotation
        arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pgd() is missing an ELF size annotation
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/afa6d49bb07497ca62e4fc3b27a2d0cece545b4e.1563413318.git.jpoimboe@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      740e0167
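To illustrate the kind of condition behind the warnings quoted above, here is a small sketch that counts function symbols whose recorded size is zero. It does not reproduce objtool's implementation: struct sym_info and count_unsized_funcs() are hypothetical, and only the numeric value of the function symbol type (2, matching STT_FUNC in the ELF specification) is taken from the standard.

```c
#include <stddef.h>
#include <stdint.h>

#define SYM_TYPE_FUNC 2	/* same value as STT_FUNC in the ELF specification */

/* Hypothetical, simplified view of one symbol table entry. */
struct sym_info {
	const char *name;
	unsigned char type;	/* symbol type, e.g. SYM_TYPE_FUNC */
	uint64_t size;		/* symbol size in bytes, 0 if not annotated */
};

/* Count function symbols that carry no size annotation. */
static size_t count_unsized_funcs(const struct sym_info *syms, size_t n)
{
	size_t missing = 0;

	for (size_t i = 0; i < n; i++) {
		if (syms[i].type == SYM_TYPE_FUNC && syms[i].size == 0)
			missing++;
	}
	return missing;
}
```

A function emitted from assembly without a size annotation would show up in such a count, while data objects of size zero are ignored.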
    • J
      x86/kvm: Don't call kvm_spurious_fault() from .fixup · ba5c072f
      Josh Poimboeuf committed
      [ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ]
      
      After making a change to improve objtool's sibling call detection, it
      started showing the following warning:
      
        arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame
      
      The problem is the ____kvm_handle_fault_on_reboot() macro.  It does a
      fake call by pushing a fake RIP and doing a jump.  That tricks the
      unwinder into printing the function which triggered the exception,
      rather than the .fixup code.
      
      Instead of the hack to make it look like the original function made the
      call, just change the macro so that the original function actually does
      make the call.  This allows removal of the hack, and also makes objtool
      happy.
      
      I triggered a vmx instruction exception and verified that the stack
      trace is still sane:
      
        kernel BUG at arch/x86/kvm/x86.c:358!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16
        Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017
        RIP: 0010:kvm_spurious_fault+0x5/0x10
        Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
        RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246
        RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000
        RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0
        RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000
        R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0
        R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000
        FS:  00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        PKRU: 55555554
        Call Trace:
         loaded_vmcs_init+0x4f/0xe0
         alloc_loaded_vmcs+0x38/0xd0
         vmx_create_vcpu+0xf7/0x600
         kvm_vm_ioctl+0x5e9/0x980
         ? __switch_to_asm+0x40/0x70
         ? __switch_to_asm+0x34/0x70
         ? __switch_to_asm+0x40/0x70
         ? __switch_to_asm+0x34/0x70
         ? free_one_page+0x13f/0x4e0
         do_vfs_ioctl+0xa4/0x630
         ksys_ioctl+0x60/0x90
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x55/0x1c0
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7fa349b1ee5b
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ba5c072f