1. 20 1月, 2022 1 次提交
    • S
      KVM: VMX: Reject KVM_RUN if emulation is required with pending exception · fc4fad79
      Sean Christopherson 提交于
      Reject KVM_RUN if emulation is required (because VMX is running without
      unrestricted guest) and an exception is pending, as KVM doesn't support
      emulating exceptions except when emulating real mode via vm86.  The vCPU
      is hosed either way, but letting KVM_RUN proceed triggers a WARN due to
      the impossible condition.  Alternatively, the WARN could be removed, but
      then userspace and/or KVM bugs would result in the vCPU silently running
      in a bad state, which isn't very friendly to users.
      
      Originally, the bug was hit by syzkaller with a nested guest as that
      doesn't require kvm_intel.unrestricted_guest=0.  That particular flavor
      is likely fixed by commit cd0e615c ("KVM: nVMX: Synthesize
      TRIPLE_FAULT for L2 if emulation is required"), but it's trivial to
      trigger the WARN with a non-nested guest, and userspace can likely force
      bad state via ioctls() for a nested guest as well.
      
      Checking for the impossible condition needs to be deferred until KVM_RUN
      because KVM can't force specific ordering between ioctls.  E.g. clearing
      exception.pending in KVM_SET_SREGS doesn't prevent userspace from setting
      it in KVM_SET_VCPU_EVENTS, and disallowing KVM_SET_VCPU_EVENTS with
      emulation_required would prevent userspace from queuing an exception and
      then stuffing sregs.  Note, if KVM were to try and detect/prevent the
      condition prior to KVM_RUN, handle_invalid_guest_state() and/or
      handle_emulation_failure() would need to be modified to clear the pending
      exception prior to exiting to userspace.
      
       ------------[ cut here ]------------
       WARNING: CPU: 6 PID: 137812 at arch/x86/kvm/vmx/vmx.c:1623 vmx_queue_exception+0x14f/0x160 [kvm_intel]
       CPU: 6 PID: 137812 Comm: vmx_invalid_nes Not tainted 5.15.2-7cc36c3e14ae-pop #279
       Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
       RIP: 0010:vmx_queue_exception+0x14f/0x160 [kvm_intel]
       Code: <0f> 0b e9 fd fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
       RSP: 0018:ffffa45c83577d38 EFLAGS: 00010202
       RAX: 0000000000000003 RBX: 0000000080000006 RCX: 0000000000000006
       RDX: 0000000000000000 RSI: 0000000000010002 RDI: ffff9916af734000
       RBP: ffff9916af734000 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000006
       R13: 0000000000000000 R14: ffff9916af734038 R15: 0000000000000000
       FS:  00007f1e1a47c740(0000) GS:ffff99188fb80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f1e1a6a8008 CR3: 000000026f83b005 CR4: 00000000001726e0
       Call Trace:
        kvm_arch_vcpu_ioctl_run+0x13a2/0x1f20 [kvm]
        kvm_vcpu_ioctl+0x279/0x690 [kvm]
        __x64_sys_ioctl+0x83/0xb0
        do_syscall_64+0x3b/0xc0
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Reported-by: syzbot+82112403ace4cbd780d8@syzkaller.appspotmail.com
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211228232437.1875318-2-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      fc4fad79
  2. 18 1月, 2022 1 次提交
    • L
      KVM: x86: Making the module parameter of vPMU more common · 4732f244
      Like Xu 提交于
      The new module parameter to control PMU virtualization should apply
      to Intel as well as AMD, for situations where userspace is not trusted.
      If the module parameter allows PMU virtualization, there could be a
      new KVM_CAP or guest CPUID bits whereby userspace can enable/disable
      PMU virtualization on a per-VM basis.
      
      If the module parameter does not allow PMU virtualization, there
      should be no userspace override, since we have no precedent for
      authorizing that kind of override. If it's false, other counter-based
      profiling features (such as LBR including the associated CPUID bits
      if any) will not be exposed.
      
      Change its name from "pmu" to "enable_pmu" as we have temporary
      variables with the same name in our code like "struct kvm_pmu *pmu".
      
      Fixes: b1d66dad ("KVM: x86/svm: Add module param to control PMU virtualization")
      Suggested-by : Jim Mattson <jmattson@google.com>
      Signed-off-by: NLike Xu <likexu@tencent.com>
      Message-Id: <20220111073823.21885-1-likexu@tencent.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4732f244
  3. 07 1月, 2022 1 次提交
    • M
      KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests · 405329fc
      Michael Roth 提交于
      Normally guests will set up CR3 themselves, but some guests, such as
      kselftests, and potentially CONFIG_PVH guests, rely on being booted
      with paging enabled and CR3 initialized to a pre-allocated page table.
      
      Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest
      VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this
      is too late, since it will have switched over to using the VMSA page
      prior to that point, with the VMSA CR3 copied from the VMCB initial
      CR3 value: 0.
      
      Address this by sync'ing the CR3 value into the VMCB save area
      immediately when KVM_SET_SREGS* is issued so it will find it's way into
      the initial VMSA.
      Suggested-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NMichael Roth <michael.roth@amd.com>
      Message-Id: <20211216171358.61140-10-michael.roth@amd.com>
      [Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the
       new hook. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      405329fc
  4. 20 12月, 2021 1 次提交
    • M
      KVM: x86: Always set kvm_run->if_flag · c5063551
      Marc Orr 提交于
      The kvm_run struct's if_flag is a part of the userspace/kernel API. The
      SEV-ES patches failed to set this flag because it's no longer needed by
      QEMU (according to the comment in the source code). However, other
      hypervisors may make use of this flag. Therefore, set the flag for
      guests with encrypted registers (i.e., with guest_state_protected set).
      
      Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
      Signed-off-by: NMarc Orr <marcorr@google.com>
      Message-Id: <20211209155257.128747-1-marcorr@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com>
      c5063551
  5. 08 12月, 2021 11 次提交
  6. 30 11月, 2021 1 次提交
    • P
      KVM: x86: check PIR even for vCPUs with disabled APICv · 37c4dbf3
      Paolo Bonzini 提交于
      The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
      if APICv is disabled on the vCPU that receives it.  In that case, the
      interrupt will just cause a vmexit and leave the ON bit set together
      with the PIR bit corresponding to the interrupt.
      
      Right now, the interrupt would not be delivered until APICv is re-enabled.
      However, fixing this is just a matter of always doing the PIR->IRR
      synchronization, even if the vCPU has temporarily disabled APICv.
      
      This is not a problem for performance, or if anything it is an
      improvement.  First, in the common case where vcpu->arch.apicv_active is
      true, one fewer check has to be performed.  Second, static_call_cond will
      elide the function call if APICv is not present or disabled.  Finally,
      in the case for AMD hardware we can remove the sync_pir_to_irr callback:
      it is only needed for apic_has_interrupt_for_ppr, and that function
      already has a fallback for !APICv.
      
      Cc: stable@vger.kernel.org
      Co-developed-by: NSean Christopherson <seanjc@google.com>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      37c4dbf3
  7. 11 11月, 2021 3 次提交
    • V
      KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid() · 796c83c5
      Vipin Sharma 提交于
      Handle #GP on INVPCID due to an invalid type in the common switch
      statement instead of relying on the callers (VMX and SVM) to manually
      validate the type.
      
      Unlike INVVPID and INVEPT, INVPCID is not explicitly documented to check
      the type before reading the operand from memory, so deferring the
      type validity check until after that point is architecturally allowed.
      Signed-off-by: NVipin Sharma <vipinsh@google.com>
      Reviewed-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211109174426.2350547-3-vipinsh@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      796c83c5
    • P
      KVM: SEV: Add support for SEV intra host migration · b5663931
      Peter Gonda 提交于
      For SEV to work with intra host migration, contents of the SEV info struct
      such as the ASID (used to index the encryption key in the AMD SP) and
      the list of memory regions need to be transferred to the target VM.
      This change adds a commands for a target VMM to get a source SEV VM's sev
      info.
      Signed-off-by: NPeter Gonda <pgonda@google.com>
      Suggested-by: NSean Christopherson <seanjc@google.com>
      Reviewed-by: NMarc Orr <marcorr@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Message-Id: <20211021174303.385706-3-pgonda@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b5663931
    • P
      KVM: SEV: Refactor out sev_es_state struct · b67a4cc3
      Peter Gonda 提交于
      Move SEV-ES vCPU metadata into new sev_es_state struct from vcpu_svm.
      Signed-off-by: NPeter Gonda <pgonda@google.com>
      Suggested-by: NTom Lendacky <thomas.lendacky@amd.com>
      Acked-by: NTom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: NSean Christopherson <seanjc@google.com>
      Cc: Marc Orr <marcorr@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Message-Id: <20211021174303.385706-2-pgonda@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b67a4cc3
  8. 25 10月, 2021 1 次提交
  9. 23 10月, 2021 1 次提交
  10. 22 10月, 2021 2 次提交
    • S
      KVM: x86: Move SVM's APICv sanity check to common x86 · ee49a893
      Sean Christopherson 提交于
      Move SVM's assertion that vCPU's APICv state is consistent with its VM's
      state out of svm_vcpu_run() and into x86's common inner run loop.  The
      assertion and underlying logic is not unique to SVM, it's just that SVM
      has more inhibiting conditions and thus is more likely to run headfirst
      into any KVM bugs.
      
      Add relevant comments to document exactly why the update path has unusual
      ordering between the update the kick, why said ordering is safe, and also
      the basic rules behind the assertion in the run loop.
      
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211022004927.1448382-3-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ee49a893
    • S
      KVM: x86: Add vendor name to kvm_x86_ops, use it for error messages · 9dadfc4a
      Sean Christopherson 提交于
      Paul pointed out the error messages when KVM fails to load are unhelpful
      in understanding exactly what went wrong if userspace probes the "wrong"
      module.
      
      Add a mandatory kvm_x86_ops field to track vendor module names, kvm_intel
      and kvm_amd, and use the name for relevant error message when KVM fails
      to load so that the user knows which module failed to load.
      
      Opportunistically tweak the "disabled by bios" error message to clarify
      that _support_ was disabled, not that the module itself was magically
      disabled by BIOS.
      Suggested-by: NPaul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20211018183929.897461-2-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9dadfc4a
  11. 04 10月, 2021 1 次提交
  12. 01 10月, 2021 3 次提交
  13. 30 9月, 2021 2 次提交
  14. 23 9月, 2021 2 次提交
  15. 22 9月, 2021 3 次提交
  16. 21 8月, 2021 6 次提交