1. 09 2月, 2021 2 次提交
  2. 04 2月, 2021 12 次提交
    • S
      KVM: x86: SEV: Treat C-bit as legal GPA bit regardless of vCPU mode · ca29e145
      Sean Christopherson 提交于
      Rename cr3_lm_rsvd_bits to reserved_gpa_bits, and use it for all GPA
      legality checks.  AMD's APM states:
      
        If the C-bit is an address bit, this bit is masked from the guest
        physical address when it is translated through the nested page tables.
      
      Thus, any access that can conceivably be run through NPT should ignore
      the C-bit when checking for validity.
      
      For features that KVM emulates in software, e.g. MTRRs, there is no
      clear direction in the APM for how the C-bit should be handled.  For
      such cases, follow the SME behavior inasmuch as possible, since SEV is
      is essentially a VM-specific variant of SME.  For SME, the APM states:
      
        In this case the upper physical address bits are treated as reserved
        when the feature is enabled except where otherwise indicated.
      
      Collecting the various relavant SME snippets in the APM and cross-
      referencing the omissions with Linux kernel code, this leaves MTTRs and
      APIC_BASE as the only flows that KVM emulates that should _not_ ignore
      the C-bit.
      
      Note, this means the reserved bit checks in the page tables are
      technically broken.  This will be remedied in a future patch.
      
      Although the page table checks are technically broken, in practice, it's
      all but guaranteed to be irrelevant.  NPT is required for SEV, i.e.
      shadowing page tables isn't needed in the common case.  Theoretically,
      the checks could be in play for nested NPT, but it's extremely unlikely
      that anyone is running nested VMs on SEV, as doing so would require L1
      to expose sensitive data to L0, e.g. the entire VMCB.  And if anyone is
      running nested VMs, L0 can't read the guest's encrypted memory, i.e. L1
      would need to put its NPT in shared memory, in which case the C-bit will
      never be set.  Or, L1 could use shadow paging, but again, if L0 needs to
      read page tables, e.g. to load PDPTRs, the memory can't be encrypted if
      L1 has any expectation of L0 doing the right thing.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20210204000117.3303214-8-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ca29e145
    • P
      KVM: x86: move kvm_inject_gp up from kvm_set_xcr to callers · bbefd4fc
      Paolo Bonzini 提交于
      Push the injection of #GP up to the callers, so that they can just use
      kvm_complete_insn_gp.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      bbefd4fc
    • K
      KVM: SVM: Replace hard-coded value with #define · 04548ed0
      Krish Sadhukhan 提交于
      Replace the hard-coded value for bit# 1 in EFLAGS, with the available
      #define.
      Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com>
      Message-Id: <20210203012842.101447-2-krish.sadhukhan@oracle.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      04548ed0
    • M
      KVM: SVM: use .prepare_guest_switch() to handle CPU register save/setup · a7fc06dd
      Michael Roth 提交于
      Currently we save host state like user-visible host MSRs, and do some
      initial guest register setup for MSR_TSC_AUX and MSR_AMD64_TSC_RATIO
      in svm_vcpu_load(). Defer this until just before we enter the guest by
      moving the handling to kvm_x86_ops.prepare_guest_switch() similarly to
      how it is done for the VMX implementation.
      
      Additionally, since handling of saving/restoring host user MSRs is the
      same both with/without SEV-ES enabled, move that handling to common
      code.
      Suggested-by: NSean Christopherson <seanjc@google.com>
      Signed-off-by: NMichael Roth <michael.roth@amd.com>
      Message-Id: <20210202190126.2185715-4-michael.roth@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      a7fc06dd
    • M
      KVM: SVM: remove uneeded fields from host_save_users_msrs · 553cc15f
      Michael Roth 提交于
      Now that the set of host user MSRs that need to be individually
      saved/restored are the same with/without SEV-ES, we can drop the
      .sev_es_restored flag and just iterate through the list unconditionally
      for both cases. A subsequent patch can then move these loops to a
      common path.
      Signed-off-by: NMichael Roth <michael.roth@amd.com>
      Message-Id: <20210202190126.2185715-3-michael.roth@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      553cc15f
    • M
      KVM: SVM: use vmsave/vmload for saving/restoring additional host state · e79b91bb
      Michael Roth 提交于
      Using a guest workload which simply issues 'hlt' in a tight loop to
      generate VMEXITs, it was observed (on a recent EPYC processor) that a
      significant amount of the VMEXIT overhead measured on the host was the
      result of MSR reads/writes in svm_vcpu_load/svm_vcpu_put according to
      perf:
      
        67.49%--kvm_arch_vcpu_ioctl_run
                |
                |--23.13%--vcpu_put
                |          kvm_arch_vcpu_put
                |          |
                |          |--21.31%--native_write_msr
                |          |
                |           --1.27%--svm_set_cr4
                |
                |--16.11%--vcpu_load
                |          |
                |           --15.58%--kvm_arch_vcpu_load
                |                     |
                |                     |--13.97%--svm_set_cr4
                |                     |          |
                |                     |          |--12.64%--native_read_msr
      
      Most of these MSRs relate to 'syscall'/'sysenter' and segment bases, and
      can be saved/restored using 'vmsave'/'vmload' instructions rather than
      explicit MSR reads/writes. In doing so there is a significant reduction
      in the svm_vcpu_load/svm_vcpu_put overhead measured for the above
      workload:
      
        50.92%--kvm_arch_vcpu_ioctl_run
                |
                |--19.28%--disable_nmi_singlestep
                |
                |--13.68%--vcpu_load
                |          kvm_arch_vcpu_load
                |          |
                |          |--9.19%--svm_set_cr4
                |          |          |
                |          |           --6.44%--native_read_msr
                |          |
                |           --3.55%--native_write_msr
                |
                |--6.05%--kvm_inject_nmi
                |--2.80%--kvm_sev_es_mmio_read
                |--2.19%--vcpu_put
                |          |
                |           --1.25%--kvm_arch_vcpu_put
                |                     native_write_msr
      
      Quantifying this further, if we look at the raw cycle counts for a
      normal iteration of the above workload (according to 'rdtscp'),
      kvm_arch_vcpu_ioctl_run() takes ~4600 cycles from start to finish with
      the current behavior. Using 'vmsave'/'vmload', this is reduced to
      ~2800 cycles, a savings of 39%.
      
      While this approach doesn't seem to manifest in any noticeable
      improvement for more realistic workloads like UnixBench, netperf, and
      kernel builds, likely due to their exit paths generally involving IO
      with comparatively high latencies, it does improve overall overhead
      of KVM_RUN significantly, which may still be noticeable for certain
      situations. It also simplifies some aspects of the code.
      
      With this change, explicit save/restore is no longer needed for the
      following host MSRs, since they are documented[1] as being part of the
      VMCB State Save Area:
      
        MSR_STAR, MSR_LSTAR, MSR_CSTAR,
        MSR_SYSCALL_MASK, MSR_KERNEL_GS_BASE,
        MSR_IA32_SYSENTER_CS,
        MSR_IA32_SYSENTER_ESP,
        MSR_IA32_SYSENTER_EIP,
        MSR_FS_BASE, MSR_GS_BASE
      
      and only the following MSR needs individual handling in
      svm_vcpu_put/svm_vcpu_load:
      
        MSR_TSC_AUX
      
      We could drop the host_save_user_msrs array/loop and instead handle
      MSR read/write of MSR_TSC_AUX directly, but we leave that for now as
      a potential follow-up.
      
      Since 'vmsave'/'vmload' also handles the LDTR and FS/GS segment
      registers (and associated hidden state)[2], some of the code
      previously used to handle this is no longer needed, so we drop it
      as well.
      
      The first public release of the SVM spec[3] also documents the same
      handling for the host state in question, so we make these changes
      unconditionally.
      
      Also worth noting is that we 'vmsave' to the same page that is
      subsequently used by 'vmrun' to record some host additional state. This
      is okay, since, in accordance with the spec[2], the additional state
      written to the page by 'vmrun' does not overwrite any fields written by
      'vmsave'. This has also been confirmed through testing (for the above
      CPU, at least).
      
      [1] AMD64 Architecture Programmer's Manual, Rev 3.33, Volume 2, Appendix B, Table B-2
      [2] AMD64 Architecture Programmer's Manual, Rev 3.31, Volume 3, Chapter 4, VMSAVE/VMLOAD
      [3] Secure Virtual Machine Architecture Reference Manual, Rev 3.01
      Suggested-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NMichael Roth <michael.roth@amd.com>
      Message-Id: <20210202190126.2185715-2-michael.roth@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e79b91bb
    • S
      KVM: SVM: Use asm goto to handle unexpected #UD on SVM instructions · 35a78319
      Sean Christopherson 提交于
      Add svm_asm*() macros, a la the existing vmx_asm*() macros, to handle
      faults on SVM instructions instead of using the generic __ex(), a.k.a.
      __kvm_handle_fault_on_reboot().  Using asm goto generates slightly
      better code as it eliminates the in-line JMP+CALL sequences that are
      needed by __kvm_handle_fault_on_reboot() to avoid triggering BUG()
      from fixup (which generates bad stack traces).
      
      Using SVM specific macros also drops the last user of __ex() and the
      the last asm linkage to kvm_spurious_fault(), and adds a helper for
      VMSAVE, which may gain an addition call site in the future (as part
      of optimizing the SVM context switching).
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20201231002702.22237077-8-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      35a78319
    • J
      KVM: X86: prepend vmx/svm prefix to additional kvm_x86_ops functions · b6a7cc35
      Jason Baron 提交于
      A subsequent patch introduces macros in preparation for simplifying the
      definition for vmx_x86_ops and svm_x86_ops. Making the naming more uniform
      expands the coverage of the macros. Add vmx/svm prefix to the following
      functions: update_exception_bitmap(), enable_nmi_window(),
      enable_irq_window(), update_cr8_intercept and enable_smi_window().
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: NJason Baron <jbaron@akamai.com>
      Message-Id: <ed594696f8e2c2b2bfc747504cee9bbb2a269300.1610680941.git.jbaron@akamai.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      b6a7cc35
    • W
      KVM: SVM: Fix #GP handling for doubly-nested virtualization · 14c2bf81
      Wei Huang 提交于
      Under the case of nested on nested (L0, L1, L2 are all hypervisors),
      we do not support emulation of the vVMLOAD/VMSAVE feature, the
      L0 hypervisor can inject the proper #VMEXIT to inform L1 of what is
      happening and L1 can avoid invoking the #GP workaround.  For this
      reason we turns on guest VM's X86_FEATURE_SVME_ADDR_CHK bit for KVM
      running inside VM to receive the notification and change behavior.
      
      Similarly we check if vcpu is under guest mode before emulating the
      vmware-backdoor instructions. For the case of nested on nested, we
      let the guest handle it.
      Co-developed-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NWei Huang <wei.huang2@amd.com>
      Tested-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210126081831.570253-5-wei.huang2@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      14c2bf81
    • W
      KVM: SVM: Add support for SVM instruction address check change · 3b9c723e
      Wei Huang 提交于
      New AMD CPUs have a change that checks #VMEXIT intercept on special SVM
      instructions before checking their EAX against reserved memory region.
      This change is indicated by CPUID_0x8000000A_EDX[28]. If it is 1, #VMEXIT
      is triggered before #GP. KVM doesn't need to intercept and emulate #GP
      faults as #GP is supposed to be triggered.
      Co-developed-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NBandan Das <bsd@redhat.com>
      Signed-off-by: NWei Huang <wei.huang2@amd.com>
      Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210126081831.570253-4-wei.huang2@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3b9c723e
    • B
      KVM: SVM: Add emulation support for #GP triggered by SVM instructions · 82a11e9c
      Bandan Das 提交于
      While running SVM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
      CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
      before checking VMCB's instruction intercept. If EAX falls into such
      memory areas, #GP is triggered before VMEXIT. This causes problem under
      nested virtualization. To solve this problem, KVM needs to trap #GP and
      check the instructions triggering #GP. For VM execution instructions,
      KVM emulates these instructions.
      Co-developed-by: NWei Huang <wei.huang2@amd.com>
      Signed-off-by: NWei Huang <wei.huang2@amd.com>
      Signed-off-by: NBandan Das <bsd@redhat.com>
      Message-Id: <20210126081831.570253-3-wei.huang2@amd.com>
      [Conditionally enable #GP intercept. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      82a11e9c
    • C
      KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW · 9a3ecd5e
      Chenyi Qiang 提交于
      DR6_INIT contains the 1-reserved bits as well as the bit that is cleared
      to 0 when the condition (e.g. RTM) happens. The value can be used to
      initialize dr6 and also be the XOR mask between the #DB exit
      qualification (or payload) and DR6.
      
      Concerning that DR6_INIT is used as initial value only once, rename it
      to DR6_ACTIVE_LOW and apply it in other places, which would make the
      incoming changes for bus lock debug exception more simple.
      Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com>
      Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com>
      [Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      9a3ecd5e
  3. 03 2月, 2021 1 次提交
    • S
      KVM: SVM: Treat SVM as unsupported when running as an SEV guest · ccd85d90
      Sean Christopherson 提交于
      Don't let KVM load when running as an SEV guest, regardless of what
      CPUID says.  Memory is encrypted with a key that is not accessible to
      the host (L0), thus it's impossible for L0 to emulate SVM, e.g. it'll
      see garbage when reading the VMCB.
      
      Technically, KVM could decrypt all memory that needs to be accessible to
      the L0 and use shadow paging so that L0 does not need to shadow NPT, but
      exposing such information to L0 largely defeats the purpose of running as
      an SEV guest.  This can always be revisited if someone comes up with a
      use case for running VMs inside SEV guests.
      
      Note, VMLOAD, VMRUN, etc... will also #GP on GPAs with C-bit set, i.e. KVM
      is doomed even if the SEV guest is debuggable and the hypervisor is willing
      to decrypt the VMCB.  This may or may not be fixed on CPUs that have the
      SVME_ADDR_CHK fix.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NSean Christopherson <seanjc@google.com>
      Message-Id: <20210202212017.2486595-1-seanjc@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ccd85d90
  4. 26 1月, 2021 1 次提交
    • L
      kvm: tracing: Fix unmatched kvm_entry and kvm_exit events · d95df951
      Lorenzo Brescia 提交于
      On VMX, if we exit and then re-enter immediately without leaving
      the vmx_vcpu_run() function, the kvm_entry event is not logged.
      That means we will see one (or more) kvm_exit, without its (their)
      corresponding kvm_entry, as shown here:
      
       CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
       CPU-1979 [002] 89.871218: kvm_exit:  reason MSR_WRITE
       CPU-1979 [002] 89.871259: kvm_exit:  reason MSR_WRITE
      
      It also seems possible for a kvm_entry event to be logged, but then
      we leave vmx_vcpu_run() right away (if vmx->emulation_required is
      true). In this case, we will have a spurious kvm_entry event in the
      trace.
      
      Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run()
      (where trace_kvm_exit() already is).
      
      A trace obtained with this patch applied looks like this:
      
       CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395392: kvm_exit:  reason MSR_WRITE
       CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395503: kvm_exit:  reason EXTERNAL_INTERRUPT
      
      Of course, not calling trace_kvm_entry() in common x86 code any
      longer means that we need to adjust the SVM side of things too.
      Signed-off-by: NLorenzo Brescia <lorenzo.brescia@edu.unito.it>
      Signed-off-by: NDario Faggioli <dfaggioli@suse.com>
      Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d95df951
  5. 08 1月, 2021 2 次提交
    • T
      KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2
      Tom Lendacky 提交于
      Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
      where the guest vCPU register state is updated and then the vCPU is VMRUN
      to begin execution of the AP. For an SEV-ES guest, this won't work because
      the guest register state is encrypted.
      
      Following the GHCB specification, the hypervisor must not alter the guest
      register state, so KVM must track an AP/vCPU boot. Should the guest want
      to park the AP, it must use the AP Reset Hold exit event in place of, for
      example, a HLT loop.
      
      First AP boot (first INIT-SIPI-SIPI sequence):
        Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
        support. It is up to the guest to transfer control of the AP to the
        proper location.
      
      Subsequent AP boot:
        KVM will expect to receive an AP Reset Hold exit event indicating that
        the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
        awaken it. When the AP Reset Hold exit event is received, KVM will place
        the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
        sequence, KVM will make the vCPU runnable. It is again up to the guest
        to then transfer control of the AP to the proper location.
      
        To differentiate between an actual HLT and an AP Reset Hold, a new MP
        state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
        placed in upon receiving the AP Reset Hold exit event. Additionally, to
        communicate the AP Reset Hold exit event up to userspace (if needed), a
        new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.
      
      A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
      to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
      original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
      a new function that, for non SEV-ES guests, invokes the original SIPI
      delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
      implements the logic above.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      647daca2
    • U
      KVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c · 52782d5b
      Uros Bizjak 提交于
      Commit 16809ecd moved __svm_vcpu_run the prototype to svm.h,
      but forgot to remove the original from svm.c.
      
      Fixes: 16809ecd ("KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests")
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
      Message-Id: <20201220200339.65115-1-ubizjak@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      52782d5b
  6. 15 12月, 2020 22 次提交
    • T
      KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests · 16809ecd
      Tom Lendacky 提交于
      The run sequence is different for an SEV-ES guest compared to a legacy or
      even an SEV guest. The guest vCPU register state of an SEV-ES guest will
      be restored on VMRUN and saved on VMEXIT. There is no need to restore the
      guest registers directly and through VMLOAD before VMRUN and no need to
      save the guest registers directly and through VMSAVE on VMEXIT.
      
      Update the svm_vcpu_run() function to skip register state saving and
      restoring and provide an alternative function for running an SEV-ES guest
      in vmenter.S
      
      Additionally, certain host state is restored across an SEV-ES VMRUN. As
      a result certain register states are not required to be restored upon
      VMEXIT (e.g. FS, GS, etc.), so only do that if the guest is not an SEV-ES
      guest.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <fb1c66d32f2194e171b95fc1a8affd6d326e10c1.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      16809ecd
    • T
      KVM: SVM: Provide support for SEV-ES vCPU loading · 86137773
      Tom Lendacky 提交于
      An SEV-ES vCPU requires additional VMCB vCPU load/put requirements. SEV-ES
      hardware will restore certain registers on VMEXIT, but not save them on
      VMRUN (see Table B-3 and Table B-4 of the AMD64 APM Volume 2), so make the
      following changes:
      
      General vCPU load changes:
        - During vCPU loading, perform a VMSAVE to the per-CPU SVM save area and
          save the current values of XCR0, XSS and PKRU to the per-CPU SVM save
          area as these registers will be restored on VMEXIT.
      
      General vCPU put changes:
        - Do not attempt to restore registers that SEV-ES hardware has already
          restored on VMEXIT.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <019390e9cb5e93cd73014fa5a040c17d42588733.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      86137773
    • T
      KVM: SVM: Provide support for SEV-ES vCPU creation/loading · 376c6d28
      Tom Lendacky 提交于
      An SEV-ES vCPU requires additional VMCB initialization requirements for
      vCPU creation and vCPU load/put requirements. This includes:
      
      General VMCB initialization changes:
        - Set a VMCB control bit to enable SEV-ES support on the vCPU.
        - Set the VMCB encrypted VM save area address.
        - CRx registers are part of the encrypted register state and cannot be
          updated. Remove the CRx register read and write intercepts and replace
          them with CRx register write traps to track the CRx register values.
        - Certain MSR values are part of the encrypted register state and cannot
          be updated. Remove certain MSR intercepts (EFER, CR_PAT, etc.).
        - Remove the #GP intercept (no support for "enable_vmware_backdoor").
        - Remove the XSETBV intercept since the hypervisor cannot modify XCR0.
      
      General vCPU creation changes:
        - Set the initial GHCB gpa value as per the GHCB specification.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <3a8aef366416eddd5556dfa3fdc212aafa1ad0a2.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      376c6d28
    • T
      KVM: SVM: Set the encryption mask for the SVM host save area · 85ca8be9
      Tom Lendacky 提交于
      The SVM host save area is used to restore some host state on VMEXIT of an
      SEV-ES guest. After allocating the save area, clear it and add the
      encryption mask to the SVM host save area physical address that is
      programmed into the VM_HSAVE_PA MSR.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <b77aa28af6d7f1a0cb545959e08d6dc75e0c3cba.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      85ca8be9
    • T
      KVM: SVM: Add NMI support for an SEV-ES guest · 4444dfe4
      Tom Lendacky 提交于
      The GHCB specification defines how NMIs are to be handled for an SEV-ES
      guest. To detect the completion of an NMI the hypervisor must not
      intercept the IRET instruction (because a #VC while running the NMI will
      issue an IRET) and, instead, must receive an NMI Complete exit event from
      the guest.
      
      Update the KVM support for detecting the completion of NMIs in the guest
      to follow the GHCB specification. When an SEV-ES guest is active, the
      IRET instruction will no longer be intercepted. Now, when the NMI Complete
      exit event is received, the iret_interception() function will be called
      to simulate the completion of the NMI.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <5ea3dd69b8d4396cefdc9048ebc1ab7caa70a847.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      4444dfe4
    • T
      KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest · ed02b213
      Tom Lendacky 提交于
      The guest FPU state is automatically restored on VMRUN and saved on VMEXIT
      by the hardware, so there is no reason to do this in KVM. Eliminate the
      allocation of the guest_fpu save area and key off that to skip operations
      related to the guest FPU state.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <173e429b4d0d962c6a443c4553ffdaf31b7665a4.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      ed02b213
    • T
      KVM: SVM: Do not report support for SMM for an SEV-ES guest · 5719455f
      Tom Lendacky 提交于
      SEV-ES guests do not currently support SMM. Update the has_emulated_msr()
      kvm_x86_ops function to take a struct kvm parameter so that the capability
      can be reported at a VM level.
      
      Since this op is also called during KVM initialization and before a struct
      kvm instance is available, comments will be added to each implementation
      of has_emulated_msr() to indicate the kvm parameter can be null.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <75de5138e33b945d2fb17f81ae507bda381808e3.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5719455f
    • T
      KVM: SVM: Add support for CR8 write traps for an SEV-ES guest · d1949b93
      Tom Lendacky 提交于
      For SEV-ES guests, the interception of control register write access
      is not recommended. Control register interception occurs prior to the
      control register being modified and the hypervisor is unable to modify
      the control register itself because the register is located in the
      encrypted register state.
      
      SEV-ES guests introduce new control register write traps. These traps
      provide intercept support of a control register write after the control
      register has been modified. The new control register value is provided in
      the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
      of the guest control registers.
      
      Add support to track the value of the guest CR8 register using the control
      register write trap so that the hypervisor understands the guest operating
      mode.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <5a01033f4c8b3106ca9374b7cadf8e33da852df1.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d1949b93
    • T
      KVM: SVM: Add support for CR4 write traps for an SEV-ES guest · 5b51cb13
      Tom Lendacky 提交于
      For SEV-ES guests, the interception of control register write access
      is not recommended. Control register interception occurs prior to the
      control register being modified and the hypervisor is unable to modify
      the control register itself because the register is located in the
      encrypted register state.
      
      SEV-ES guests introduce new control register write traps. These traps
      provide intercept support of a control register write after the control
      register has been modified. The new control register value is provided in
      the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
      of the guest control registers.
      
      Add support to track the value of the guest CR4 register using the control
      register write trap so that the hypervisor understands the guest operating
      mode.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <c3880bf2db8693aa26f648528fbc6e967ab46e25.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      5b51cb13
    • T
      KVM: SVM: Add support for CR0 write traps for an SEV-ES guest · f27ad38a
      Tom Lendacky 提交于
      For SEV-ES guests, the interception of control register write access
      is not recommended. Control register interception occurs prior to the
      control register being modified and the hypervisor is unable to modify
      the control register itself because the register is located in the
      encrypted register state.
      
      SEV-ES support introduces new control register write traps. These traps
      provide intercept support of a control register write after the control
      register has been modified. The new control register value is provided in
      the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
      of the guest control registers.
      
      Add support to track the value of the guest CR0 register using the control
      register write trap so that the hypervisor understands the guest operating
      mode.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <182c9baf99df7e40ad9617ff90b84542705ef0d7.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f27ad38a
    • T
      KVM: SVM: Add support for EFER write traps for an SEV-ES guest · 2985afbc
      Tom Lendacky 提交于
      For SEV-ES guests, the interception of EFER write access is not
      recommended. EFER interception occurs prior to EFER being modified and
      the hypervisor is unable to modify EFER itself because the register is
      located in the encrypted register state.
      
      SEV-ES support introduces a new EFER write trap. This trap provides
      intercept support of an EFER write after it has been modified. The new
      EFER value is provided in the VMCB EXITINFO1 field, allowing the
      hypervisor to track the setting of the guest EFER.
      
      Add support to track the value of the guest EFER value using the EFER
      write trap so that the hypervisor understands the guest operating mode.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <8993149352a3a87cd0625b3b61bfd31ab28977e1.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      2985afbc
    • T
      KVM: SVM: Support string IO operations for an SEV-ES guest · 7ed9abfe
      Tom Lendacky 提交于
      For an SEV-ES guest, string-based port IO is performed to a shared
      (un-encrypted) page so that both the hypervisor and guest can read or
      write to it and each see the contents.
      
      For string-based port IO operations, invoke SEV-ES specific routines that
      can complete the operation using common KVM port IO support.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <9d61daf0ffda496703717218f415cdc8fd487100.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      7ed9abfe
    • T
      KVM: SVM: Add initial support for a VMGEXIT VMEXIT · 291bd20d
      Tom Lendacky 提交于
      SEV-ES adds a new VMEXIT reason code, VMGEXIT. Initial support for a
      VMGEXIT includes mapping the GHCB based on the guest GPA, which is
      obtained from a new VMCB field, and then validating the required inputs
      for the VMGEXIT exit reason.
      
      Since many of the VMGEXIT exit reasons correspond to existing VMEXIT
      reasons, the information from the GHCB is copied into the VMCB control
      exit code areas and KVM register areas. The standard exit handlers are
      invoked, similar to standard VMEXIT processing. Before restarting the
      vCPU, the GHCB is updated with any registers that have been updated by
      the hypervisor.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <c6a4ed4294a369bd75c44d03bd7ce0f0c3840e50.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      291bd20d
    • T
      KVM: SVM: Prepare for SEV-ES exit handling in the sev.c file · e9093fd4
      Tom Lendacky 提交于
      This is a pre-patch to consolidate some exit handling code into callable
      functions. Follow-on patches for SEV-ES exit handling will then be able
      to use them from the sev.c file.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <5b8b0ffca8137f3e1e257f83df9f5c881c8a96a3.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e9093fd4
    • T
      KVM: SVM: Cannot re-initialize the VMCB after shutdown with SEV-ES · 8164a5ff
      Tom Lendacky 提交于
      When a SHUTDOWN VMEXIT is encountered, normally the VMCB is re-initialized
      so that the guest can be re-launched. But when a guest is running as an
      SEV-ES guest, the VMSA cannot be re-initialized because it has been
      encrypted. For now, just return -EINVAL to prevent a possible attempt at
      a guest reset.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <aa6506000f6f3a574de8dbcdab0707df844cb00c.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      8164a5ff
    • T
      KVM: SVM: Do not allow instruction emulation under SEV-ES · bc624d9f
      Tom Lendacky 提交于
      When a guest is running as an SEV-ES guest, it is not possible to emulate
      instructions. Add support to prevent instruction emulation.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <f6355ea3024fda0a3eb5eb99c6b62dca10d792bd.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      bc624d9f
    • T
      KVM: SVM: Prevent debugging under SEV-ES · 8d4846b9
      Tom Lendacky 提交于
      Since the guest register state of an SEV-ES guest is encrypted, debugging
      is not supported. Update the code to prevent guest debugging when the
      guest has protected state.
      
      Additionally, an SEV-ES guest must only and always intercept DR7 reads and
      writes. Update set_dr_intercepts() and clr_dr_intercepts() to account for
      this.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <8db966fa2f9803d6454ce773863025d0e2e7f3cc.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      8d4846b9
    • T
      KVM: SVM: Add required changes to support intercepts under SEV-ES · f1c6366e
      Tom Lendacky 提交于
      When a guest is running under SEV-ES, the hypervisor cannot access the
      guest register state. There are numerous places in the KVM code where
      certain registers are accessed that are not allowed to be accessed (e.g.
      RIP, CR0, etc). Add checks to prevent register accesses and add intercept
      update support at various points within the KVM code.
      
      Also, when handling a VMGEXIT, exceptions are passed back through the
      GHCB. Since the RDMSR/WRMSR intercepts (may) inject a #GP on error,
      update the SVM intercepts to handle this for SEV-ES guests.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      [Redo MSR part using the .complete_emulated_msr callback. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f1c6366e
    • P
      KVM: x86: introduce complete_emulated_msr callback · f9a4d621
      Paolo Bonzini 提交于
      This will be used by SEV-ES to inject MSR failure via the GHCB.
      Reviewed-by: NTom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f9a4d621
    • T
      KVM: SVM: Add support for the SEV-ES VMSA · add5e2f0
      Tom Lendacky 提交于
      Allocate a page during vCPU creation to be used as the encrypted VM save
      area (VMSA) for the SEV-ES guest. Provide a flag in the kvm_vcpu_arch
      structure that indicates whether the guest state is protected.
      
      When freeing a VMSA page that has been encrypted, the cache contents must
      be flushed using the MSR_AMD64_VM_PAGE_FLUSH before freeing the page.
      
      [ i386 build warnings ]
      Reported-by: Nkernel test robot <lkp@intel.com>
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <fde272b17eec804f3b9db18c131262fe074015c5.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      add5e2f0
    • T
      KVM: SVM: Add support for SEV-ES capability in KVM · 916391a2
      Tom Lendacky 提交于
      Add support to KVM for determining if a system is capable of supporting
      SEV-ES as well as determining if a guest is an SEV-ES guest.
      Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e66792323982c822350e40c7a1cf67ea2978a70b.1607620209.git.thomas.lendacky@amd.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      916391a2
    • U
      KVM/VMX/SVM: Move kvm_machine_check function to x86.h · 3f1a18b9
      Uros Bizjak 提交于
      Move kvm_machine_check to x86.h to avoid two exact copies
      of the same function in kvm.c and svm.c.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: NUros Bizjak <ubizjak@gmail.com>
      Message-Id: <20201029135600.122392-1-ubizjak@gmail.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3f1a18b9