1. 03 3月, 2020 1 次提交
    • H
      KVM: SVM: Fix the svm vmexit code for WRMSR · aaca2100
      Haiwei Li 提交于
      In svm, exit_code for MSR writes is not EXIT_REASON_MSR_WRITE which
      belongs to vmx.
      
      According to amd manual, SVM_EXIT_MSR(7ch) is the exit_code of VMEXIT_MSR
      due to RDMSR or WRMSR access to protected MSR. Additionally, the processor
      indicates in the VMCB's EXITINFO1 whether a RDMSR(EXITINFO1=0) or
      WRMSR(EXITINFO1=1) was intercepted.
      Signed-off-by: NHaiwei Li <lihaiwei@tencent.com>
      Fixes: 1e9e2622 ("KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpath", 2019-11-21)
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      aaca2100
  2. 28 2月, 2020 2 次提交
  3. 23 2月, 2020 2 次提交
  4. 22 2月, 2020 3 次提交
  5. 12 2月, 2020 1 次提交
    • P
      KVM: x86: do not reset microcode version on INIT or RESET · bab0c318
      Paolo Bonzini 提交于
      Do not initialize the microcode version at RESET or INIT, only on vCPU
      creation.   Microcode updates are not lost during INIT, and exact
      behavior across a warm RESET is not specified by the architecture.
      
      Since we do not support a microcode update directly from the hypervisor,
      but only as a result of userspace setting the microcode version MSR,
      it's simpler for userspace if we do nothing in KVM and let userspace
      emulate behavior for RESET as it sees fit.
      
      Userspace can tie the fix to the availability of MSR_IA32_UCODE_REV in
      the list of emulated MSRs.
      Reported-by: NAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      bab0c318
  6. 05 2月, 2020 13 次提交
  7. 28 1月, 2020 1 次提交
    • J
      kvm/svm: PKU not currently supported · a47970ed
      John Allen 提交于
      Current SVM implementation does not have support for handling PKU. Guests
      running on a host with future AMD cpus that support the feature will read
      garbage from the PKRU register and will hit segmentation faults on boot as
      memory is getting marked as protected that should not be. Ensure that cpuid
      from SVM does not advertise the feature.
      Signed-off-by: NJohn Allen <john.allen@amd.com>
      Cc: stable@vger.kernel.org
      Fixes: 0556cbdc ("x86/pkeys: Don't check if PKRU is zero before writing it")
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a47970ed
  8. 24 1月, 2020 5 次提交
  9. 21 1月, 2020 4 次提交
    • T
      KVM: SVM: Override default MMIO mask if memory encryption is enabled · 52918ed5
      Tom Lendacky 提交于
      The KVM MMIO support uses bit 51 as the reserved bit to cause nested page
      faults when a guest performs MMIO. The AMD memory encryption support uses
      a CPUID function to define the encryption bit position. Given this, it is
      possible that these bits can conflict.
      
      Use svm_hardware_setup() to override the MMIO mask if memory encryption
      support is enabled. Various checks are performed to ensure that the mask
      is properly defined and rsvd_bits() is used to generate the new mask (as
      was done prior to the change that necessitated this patch).
      
      Fixes: 28a1f3ac ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")
      Suggested-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      52918ed5
    • S
      KVM: x86: Refactor and rename bit() to feature_bit() macro · 87382003
      Sean Christopherson 提交于
      Rename bit() to __feature_bit() to give it a more descriptive name, and
      add a macro, feature_bit(), to stuff the X68_FEATURE_ prefix to keep
      line lengths manageable for code that hardcodes the bit to be retrieved.
      
      No functional change intended.
      
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      87382003
    • S
      KVM: x86: Drop special XSAVE handling from guest_cpuid_has() · 96be4e06
      Sean Christopherson 提交于
      Now that KVM prevents setting host-reserved CR4 bits, drop the dedicated
      XSAVE check in guest_cpuid_has() in favor of open coding similar checks
      in the SVM/VMX XSAVES enabling flows.
      
      Note, checking boot_cpu_has(X86_FEATURE_XSAVE) in the XSAVES flows is
      technically redundant with respect to the CR4 reserved bit checks, e.g.
      XSAVES #UDs if CR4.OSXSAVE=0 and arch.xsaves_enabled is consumed if and
      only if CR4.OXSAVE=1 in guest.  Keep (add?) the explicit boot_cpu_has()
      checks to help document KVM's usage of arch.xsaves_enabled.
      
      No functional change intended.
      Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      96be4e06
    • W
      KVM: VMX: FIXED+PHYSICAL mode single target IPI fastpath · 1e9e2622
      Wanpeng Li 提交于
      ICR and TSCDEADLINE MSRs write cause the main MSRs write vmexits in our
      product observation, multicast IPIs are not as common as unicast IPI like
      RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR etc.
      
      This patch introduce a mechanism to handle certain performance-critical
      WRMSRs in a very early stage of KVM VMExit handler.
      
      This mechanism is specifically used for accelerating writes to x2APIC ICR
      that attempt to send a virtual IPI with physical destination-mode, fixed
      delivery-mode and single target. Which was found as one of the main causes
      of VMExits for Linux workloads.
      
      The reason this mechanism significantly reduce the latency of such virtual
      IPIs is by sending the physical IPI to the target vCPU in a very early stage
      of KVM VMExit handler, before host interrupts are enabled and before expensive
      operations such as reacquiring KVM’s SRCU lock.
      Latency is reduced even more when KVM is able to use APICv posted-interrupt
      mechanism (which allows to deliver the virtual IPI directly to target vCPU
      without the need to kick it to host).
      
      Testing on Xeon Skylake server:
      
      The virtual IPI latency from sender send to receiver receive reduces
      more than 200+ cpu cycles.
      Reviewed-by: NLiran Alon <liran.alon@oracle.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      1e9e2622
  10. 09 1月, 2020 1 次提交
  11. 27 11月, 2019 1 次提交
  12. 15 11月, 2019 2 次提交
    • L
      KVM: SVM: Remove check if APICv enabled in SVM update_cr8_intercept() handler · 49d654d8
      Liran Alon 提交于
      This check is unnecessary as x86 update_cr8_intercept() which calls
      this VMX/SVM specific callback already performs this check.
      Reviewed-by: NJoao Martins <joao.m.martins@oracle.com>
      Signed-off-by: NLiran Alon <liran.alon@oracle.com>
      Reviewed-by: NJim Mattson <jmattson@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      49d654d8
    • A
      KVM: retpolines: x86: eliminate retpoline from svm.c exit handlers · 3dcb2a3f
      Andrea Arcangeli 提交于
      It's enough to check the exit value and issue a direct call to avoid
      the retpoline for all the common vmexit reasons.
      
      After this commit is applied, here the most common retpolines executed
      under a high resolution timer workload in the guest on a SVM host:
      
      [..]
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get_update_offsets_now+70
          hrtimer_interrupt+131
          smp_apic_timer_interrupt+106
          apic_timer_interrupt+15
          start_sw_timer+359
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 1940
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_r12+33
          force_qs_rnp+217
          rcu_gp_kthread+1270
          kthread+268
          ret_from_fork+34
      ]: 4644
      @[]: 25095
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          lapic_next_event+28
          clockevents_program_event+148
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41474
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          clockevents_program_event+148
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41474
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          clockevents_program_event+84
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41887
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          lapic_next_event+28
          clockevents_program_event+148
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42723
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          clockevents_program_event+148
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42766
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          clockevents_program_event+84
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42848
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          start_sw_timer+279
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 499845
      
      @total: 1780243
      
      SVM has no TSC based programmable preemption timer so it is invoking
      ktime_get() frequently.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3dcb2a3f
  13. 31 10月, 2019 1 次提交
    • P
      KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active · 9167ab79
      Paolo Bonzini 提交于
      VMX already does so if the host has SMEP, in order to support the combination of
      CR0.WP=1 and CR4.SMEP=1.  However, it is perfectly safe to always do so, and in
      fact VMX already ends up running with EFER.NXE=1 on old processors that lack the
      "load EFER" controls, because it may help avoiding a slow MSR write.  Removing
      all the conditionals simplifies the code.
      
      SVM does not have similar code, but it should since recent AMD processors do
      support SMEP.  So this patch also makes the code for the two vendors more similar
      while fixing NPT=0, CR0.WP=1 and CR4.SMEP=1 on AMD processors.
      
      Cc: stable@vger.kernel.org
      Cc: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9167ab79
  14. 23 10月, 2019 1 次提交
  15. 22 10月, 2019 2 次提交