1. 20 Aug, 2019 1 commit
    • x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h · c49a0a80
      Authored by Tom Lendacky
      There have been reports of RDRAND issues after resuming from suspend on
      some AMD family 15h and family 16h systems. This issue stems from a BIOS
      not performing the proper steps during resume to ensure RDRAND continues
      to function properly.
      
      RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
      reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
      support using CPUID, including the kernel, will believe that RDRAND is
      not supported.
      
      Update the CPU initialization to clear the RDRAND CPUID bit for any family
      15h and 16h processor that supports RDRAND. If it is known that the family
      15h or family 16h system does not have an RDRAND resume issue or that the
      system will not be placed in suspend, the "rdrand=force" kernel parameter
      can be used to stop the clearing of the RDRAND CPUID bit.
      
      Additionally, update the suspend and resume path to save and restore the
      MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
      place after resuming from suspend.
      
      Note that clearing the RDRAND CPUID bit does not prevent a processor
      that normally supports the RDRAND instruction from executing it. So any
      code that determines RDRAND support based on family and model won't #UD.
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
      Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "x86@kernel.org" <x86@kernel.org>
      Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
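The two bit positions named in the commit message above can be illustrated with a small userspace sketch. It works on plain integers and does not read CPUID or the MSR; the macro and helper names are hypothetical and are not the kernel's identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the commit message. */
#define CPUID_1_ECX_RDRAND_BIT   30  /* CPUID Fn00000001_ECX[30] */
#define MSR_C001_1004_RDRAND_BIT 62  /* MSR C001_1004[62] */

/* Hypothetical helper: does a CPUID leaf 1 ECX value advertise RDRAND? */
static bool ecx_has_rdrand(uint32_t ecx)
{
	return (ecx >> CPUID_1_ECX_RDRAND_BIT) & 1u;
}

/*
 * Hypothetical helper: return the MSR value with bit 62 cleared.
 * Per the commit message, clearing this bit resets the RDRAND CPUID bit.
 */
static uint64_t msr_hide_rdrand(uint64_t msr)
{
	return msr & ~(UINT64_C(1) << MSR_C001_1004_RDRAND_BIT);
}
```

Software that only tests the ECX bit would then treat RDRAND as absent, even though, as the message notes, the instruction itself still executes.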
  2. 19 Aug, 2019 2 commits
  3. 17 Aug, 2019 1 commit
  4. 16 Aug, 2019 1 commit
  5. 13 Aug, 2019 2 commits
    • x86/fpu/math-emu: Address fallthrough warnings · 91be2587
      Authored by Thomas Gleixner
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/errors.c: In function ‘FPU_printall’:
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/errors.c:187:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
          tagi = FPU_Special(r);
          ~~~~~^~~~~~~~~~~~~~~~
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/errors.c:188:3: note: here
         case TAG_Valid:
         ^~~~
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/fpu_trig.c: In function ‘fyl2xp1’:
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/fpu_trig.c:1353:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
          if (denormal_operand() < 0)
             ^
      /home/tglx/work/kernel/linus/linux/arch/x86/math-emu/fpu_trig.c:1356:3: note: here
         case TAG_Zero:
      
      Remove the pointless 'break;' after 'continue;' while at it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
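As a minimal sketch of the kind of change such warnings call for, the switch below marks an intentional fall-through with a comment, which GCC accepts at -Wimplicit-fallthrough=3. The enum and function are hypothetical and are not taken from the math-emu sources.

```c
/* Hypothetical tags for illustration; not the math-emu TAG_* values. */
enum sketch_tag { SKETCH_SPECIAL, SKETCH_VALID, SKETCH_ZERO };

static int sketch_classify(enum sketch_tag t)
{
	int score = 0;

	switch (t) {
	case SKETCH_SPECIAL:
		score += 10;
		/* fall through */
	case SKETCH_VALID:
		/* Reached directly, or by falling through from SKETCH_SPECIAL. */
		score += 1;
		break;
	case SKETCH_ZERO:
		break;
	}
	return score;
}
```

Without the comment, GCC would report "this statement may fall through" at the `score += 10;` line when the warning is enabled.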
    • x86/apic/32: Fix yet another implicit fallthrough warning · 5785675d
      Authored by Borislav Petkov
      Fix
      
        arch/x86/kernel/apic/probe_32.c: In function ‘default_setup_apic_routing’:
        arch/x86/kernel/apic/probe_32.c:146:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
            if (!APIC_XAPIC(version)) {
               ^
        arch/x86/kernel/apic/probe_32.c:151:3: note: here
         case X86_VENDOR_HYGON:
         ^~~~
      
      for 32-bit builds.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190811154036.29805-1-bp@alien8.de
  6. 12 Aug, 2019 1 commit
    • x86/umwait: Fix error handling in umwait_init() · e7409258
      Authored by Fenghua Yu
      Currently, failure of cpuhp_setup_state() is ignored and the syscore ops
      and the control interfaces can still be added even after the failure.
      Ignoring the error causes a few issues:
      
      1. The CPUs may have different values in the IA32_UMWAIT_CONTROL
         MSR because there is no way to roll back the control MSR on
         the CPUs which already set the MSR before the failure.
      
      2. If the sysfs interface is added successfully, there will be a mismatch
         between the global control value and the control MSR:
         - The interface shows the default global control value. But,
           the control MSR is not set to the value because the CPU online
           function, which is supposed to set the MSR to the value,
           is not installed.
         - If the sysadmin changes the global control value through
           the interface, the control MSR on all current online CPUs is
           set to the new value. But, the control MSR on newly onlined CPUs
           after the value change will not be set to the new value due to
           lack of the CPU online function.
      
      3. On resume from suspend/hibernation, the boot CPU restores the control
         MSR to the global control value through the syscore ops. But, the
         control MSR on all APs is not set due to lack of the CPU online
         function.
      
      To solve the issues and enforce consistent behavior on the failure
      of the CPU hotplug setup, make the following changes:
      
      1. Cache the original control MSR value which is configured by
         hardware or BIOS before kernel boot. This value is likely to
         be 0. But it could be a different number as well. Cache the
         control MSR only once before the MSR is changed.
      2. Add the CPU offline function so that the MSR is restored to the
         original control value on all CPUs on the failure.
      3. On the failure, exit from umwait_init() so that the syscore ops
         and the control interfaces are not added.
      Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1565401237-60936-1-git-send-email-fenghua.yu@intel.com
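The three changes listed in the commit message above (cache the original value, restore it on failure, bail out before adding interfaces) can be sketched as a userspace simulation. This is an illustration under stated assumptions, not the kernel code: the per-CPU MSR is a plain array, the explicit loop stands in for the CPU hotplug machinery, and every `sim_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SIM_CPUS 4

static uint32_t sim_ctrl_msr[NR_SIM_CPUS]; /* simulated per-CPU control MSR */
static uint32_t sim_orig_ctrl;             /* value found before any change */
static bool sim_interfaces_added;          /* stands in for syscore ops + sysfs */

/* Simulated online callback: set the MSR, failing on fail_cpu (-1: never fail). */
static int sim_cpu_online(int cpu, uint32_t val, int fail_cpu)
{
	assert(cpu >= 0 && cpu < NR_SIM_CPUS);
	if (cpu == fail_cpu)
		return -1;
	sim_ctrl_msr[cpu] = val;
	return 0;
}

/* Simulated offline callback: restore the cached original value. */
static void sim_cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_SIM_CPUS);
	sim_ctrl_msr[cpu] = sim_orig_ctrl;
}

static int sim_umwait_init(uint32_t val, int fail_cpu)
{
	int cpu, ret;

	sim_interfaces_added = false;
	sim_orig_ctrl = sim_ctrl_msr[0];	/* change 1: cache once, before writing */

	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++) {
		ret = sim_cpu_online(cpu, val, fail_cpu);
		if (ret) {
			while (--cpu >= 0)	/* change 2: roll back CPUs already set */
				sim_cpu_offline(cpu);
			return ret;		/* change 3: add no interfaces */
		}
	}
	sim_interfaces_added = true;
	return 0;
}
```

After a failed call every simulated MSR holds the original value again, so no CPU is left with a setting that differs from the others.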
  7. 08 Aug, 2019 3 commits
  8. 07 Aug, 2019 2 commits
  9. 05 Aug, 2019 5 commits
    • x86: kvm: remove useless calls to kvm_para_available · 57b76bdb
      Authored by Paolo Bonzini
      Most code in arch/x86/kernel/kvm.c is called through x86_hyper_kvm, and thus only
      runs if KVM has been detected.  There is no need to check again for the CPUID
      base.
      
      Cc: Sergio Lopez <slp@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: no need to check return value of debugfs_create functions · 3e7093d0
      Authored by Greg KH
      When calling debugfs functions, there is no need to ever check the
      return value.  The function can work or not, but the code logic should
      never do something different based on this.
      
      Also, when doing this, change kvm_arch_create_vcpu_debugfs() to return
      void instead of an integer, as we should not care at all whether this
      function actually does anything or not.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <x86@kernel.org>
      Cc: <kvm@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: remove kvm_arch_has_vcpu_debugfs() · 741cbbae
      Authored by Paolo Bonzini
      There is no need for this function as all arches have to implement
      kvm_arch_create_vcpu_debugfs() no matter what.  A #define symbol
      lets us actually simplify the code.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Fix leak vCPU's VMCS value into other pCPU · 17e433b5
      Authored by Wanpeng Li
      After commit d73eb57b (KVM: Boost vCPUs that are delivering interrupts), a
      five-year-old bug is exposed. Running the ebizzy benchmark in three 80-vCPU
      VMs on one 80-pCPU Skylake server produces many rcu_sched stall warnings
      in the VMs after stress testing:
      
       INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073)
       Call Trace:
         flush_tlb_mm_range+0x68/0x140
         tlb_flush_mmu.part.75+0x37/0xe0
         tlb_finish_mmu+0x55/0x60
         zap_page_range+0x142/0x190
         SyS_madvise+0x3cd/0x9c0
         system_call_fastpath+0x1c/0x21
      
      swait_active() remains true until finish_swait() is called in
      kvm_vcpu_block(), so voluntarily preempted vCPUs are taken into account
      by the kvm_vcpu_on_spin() loop. This greatly increases the probability
      that the condition kvm_arch_vcpu_runnable(vcpu) is checked and found
      true. When APICv is enabled, the yield-candidate vCPU's VMCS RVI field
      then leaks (via vmx_sync_pir_to_irr()) into the current VMCS of the vCPU
      that is spinning on a taken lock.
      
      This patch fixes it by conservatively checking a subset of events.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Marc Zyngier <Marc.Zyngier@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 98f4a146 (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop)
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Don't need to wakeup vCPU twice after timer fire · a48d06f9
      Authored by Wanpeng Li
      kvm_set_pending_timer() already takes care of waking up the sleeping vCPU
      that has a pending timer, so there is no need to check this again in
      apic_timer_expired().
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  10. 31 Jul, 2019 1 commit
  11. 29 Jul, 2019 1 commit
  12. 26 Jul, 2019 1 commit
    • perf/x86/intel: Mark expected switch fall-throughs · 7b26b91d
      Authored by Gustavo A. R. Silva
      In preparation for enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warnings:
      
      arch/x86/events/intel/core.c: In function ‘intel_pmu_init’:
      arch/x86/events/intel/core.c:4959:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
         pmem = true;
         ~~~~~^~~~~~
      arch/x86/events/intel/core.c:4960:2: note: here
        case INTEL_FAM6_SKYLAKE_MOBILE:
        ^~~~
      arch/x86/events/intel/core.c:5008:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
         pmem = true;
         ~~~~~^~~~~~
      arch/x86/events/intel/core.c:5009:2: note: here
        case INTEL_FAM6_ICELAKE_MOBILE:
        ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
  13. 25 Jul, 2019 7 commits
  14. 24 Jul, 2019 3 commits
    • KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption · 266e85a5
      Authored by Wanpeng Li
      Commit 11752adb (locking/pvqspinlock: Implement hybrid PV queued/unfair locks)
      introduces hybrid PV queued/unfair locks:
       - queued mode (no starvation)
       - unfair mode (good performance on not heavily contended locks)
      The lock waiter goes into the unfair mode especially in VMs with over-committed
      vCPUs, since increasing over-commitment increases the likelihood that the queue
      head vCPU has been preempted and is not actively spinning.
      
      However, rescheduling the queue head vCPU in a timely manner so that it can
      acquire the lock still gives better performance than depending only on lock
      stealing in over-subscribed scenarios.
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
                   vanilla     boosting    improved
       1VM          23520        25040         6%
       2VM           8000        13600        70%
       3VM           3100         5400        74%
      
      The lock holder vCPU yields to the queue head vCPU on unlock, to boost a queue
      head vCPU that was either involuntarily preempted or voluntarily halted after
      failing to acquire the lock following a short spin in the guest.
      
      Cc: Waiman Long <longman@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
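The "improved" column of the ebizzy table in the commit message above can be reproduced from the other two columns. The helper below is a hypothetical illustration that assumes the percentage is the relative gain over vanilla, truncated to a whole number.

```c
#include <assert.h>

/* Relative gain of boosted over vanilla, in percent, truncated toward zero. */
static long improvement_pct(long vanilla, long boosted)
{
	assert(vanilla > 0);
	return (boosted - vanilla) * 100 / vanilla;
}
```

For example, `improvement_pct(8000, 13600)` evaluates to 70, matching the 2VM row.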
    • x86/entry/32: Pass cr2 to do_async_page_fault() · b8f70953
      Authored by Matt Mullins
      Commit a0d14b89 ("x86/mm, tracing: Fix CR2 corruption") added the
      address parameter to do_async_page_fault(), but does not pass it from the
      32-bit entry point.  To plumb it through, factor out
      common_exception_read_cr2 in the same fashion as common_exception, and use
      it from both page_fault and async_page_fault.
      
      For a 32-bit KVM guest, this fixes:
      
        Run /sbin/init as init process
        Starting init: /sbin/init exists but couldn't execute it (error -14)
      
      Fixes: a0d14b89 ("x86/mm, tracing: Fix CR2 corruption")
      Signed-off-by: Matt Mullins <mmullins@fb.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20190724042058.24506-1-mmullins@fb.com
    • Documentation: move Documentation/virtual to Documentation/virt · 2f5947df
      Authored by Christoph Hellwig
      Renaming docs seems to be en vogue at the moment, so fix one of the
      grossly misnamed directories.  We usually never use "virtual" as
      a shortcut for virtualization in the kernel, but always virt,
      as seen in the virt/ top-level directory.  Fix up the documentation
      to match that.
      
      Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  15. 22 Jul, 2019 9 commits