- 22 Sep 2021, 1 commit
-
-
Submitted by Maxim Levitsky

Currently, KVM_REQ_GET_NESTED_STATE_PAGES on SVM only reloads the PDPTRs and the MSR bitmap. The former is not really needed for SMM, as the SMM exit code reloads them again from SMRAM's CR3, and the latter happens to work only because the MSR bitmap isn't modified while in SMM. Still, it is better to be consistent with VMX.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 21 Aug 2021, 3 commits
-
-
Submitted by Maxim Levitsky

APIC base relocation is not supported anyway and won't work correctly, so just drop the code that handles it and keep the AVIC MMIO bar at the default APIC base.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210810205251.424103-17-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Maxim Levitsky

Now that kvm_request_apicv_update doesn't need to drop the kvm->srcu lock, we can call kvm_request_apicv_update directly.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210810205251.424103-13-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Maxim Levitsky

Thanks to the previous patches, it is now possible to keep the APICv memslot always enabled; it will be invisible to the guest when APICv is inhibited. This code is based on a suggestion from Sean Christopherson: https://lkml.org/lkml/2021/7/19/2970

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210810205251.424103-9-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 26 Jul 2021, 2 commits
-
-
Submitted by Vitaly Kuznetsov

Make the svm_copy_vmrun_state()/svm_copy_vmloadsave_state() interfaces match 'memcpy(dest, src)' to avoid any confusion. No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210719090322.625277-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
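A minimal sketch of the resulting destination-first convention; the parameter names are assumptions based on the commit's description, not verified signatures:

```c
/* Both helpers now take the destination first, mirroring memcpy(dest, src). */
void svm_copy_vmrun_state(struct vmcb_save_area *to_save,
			  struct vmcb_save_area *from_save);
void svm_copy_vmloadsave_state(struct vmcb *to_vmcb,
			       struct vmcb *from_vmcb);
```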
-
Submitted by Vitaly Kuznetsov

To match svm_copy_vmrun_state(), rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state(). Opportunistically add missing braces to the 'else' branch in vmload_vmsave_interception(). No functional change intended.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210716144104.465269-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 15 Jul 2021, 3 commits
-
-
Submitted by Vitaly Kuznetsov

If the VM was migrated while in SMM, no nested state was saved/restored, and therefore svm_leave_smm has to load both the save and control areas of vmcb12. The save area is already loaded from the HSAVE area, so now load the control area as well from vmcb12.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210628104425.391276-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Vitaly Kuznetsov

Separate the code setting non-VMLOAD-VMSAVE state from svm_set_nested_state() into its own function. This is going to be re-used from svm_enter_smm()/svm_leave_smm().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210628104425.391276-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Maxim Levitsky

In theory there are no side effects of not intercepting #SMI, because then #SMI becomes transparent to the OS and to KVM. Plus, an observation on recent Zen2 CPUs reveals that these CPUs ignore #SMI interception and never deliver #SMI VMexits. This is also useful for testing nested KVM, to verify that L1 handles #SMIs correctly when L1 doesn't intercept #SMI. Finally, the default remains the same: SMIs are intercepted by default, so this patch has no effect unless a non-default module parameter value is used.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210707125100.677203-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
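A hedged sketch of how a module parameter can gate #SMI interception; the 'intercept_smi' name matches this patch series, but the surrounding code is illustrative only:

```c
/* Default true: SMIs remain intercepted unless the admin opts out. */
static bool intercept_smi = true;
module_param(intercept_smi, bool, 0444);

static void svm_init_smi_intercept(struct vcpu_svm *svm)
{
	if (intercept_smi)
		svm_set_intercept(svm, INTERCEPT_SMI);
	else
		svm_clr_intercept(svm, INTERCEPT_SMI);
}
```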
-
- 18 Jun 2021, 3 commits
-
-
Submitted by Vineeth Pillai

Enlightened MSR-Bitmap as per the TLFS: "The L1 hypervisor may collaborate with the L0 hypervisor to make MSR accesses more efficient. It can enable enlightened MSR bitmaps by setting the corresponding field in the enlightened VMCS to 1. When enabled, the L0 hypervisor does not monitor the MSR bitmaps for changes. Instead, the L1 hypervisor must invalidate the corresponding clean field after making changes to one of the MSR bitmaps." Enable this for SVM. Related VMX changes: commit ceef7d10 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")

Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Message-Id: <87df0710f95d28b91cc4ea014fc4d71056eebbee.1622730232.git.viremana@linux.microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
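A rough sketch of the invalidation step the TLFS describes: after L1 modifies an MSR bitmap, it must clear the corresponding clean bit so that L0 re-reads the bitmap. The helper name here is hypothetical; clean bit 31 is the software-reserved bit added by the next commit:

```c
/* Hypothetical helper: mark the MSR bitmap dirty for the L0 hypervisor. */
static void hv_invalidate_msrpm_clean_bit(struct vmcb *vmcb)
{
	vmcb->control.clean &= ~BIT(31);	/* software-reserved clean bit */
}
```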
-
Submitted by Vineeth Pillai

SVM added support for certain reserved fields to be used by software or a hypervisor. Add the following reserved fields:

- VMCB offset 0x3e0 - 0x3ff
- Clean bit 31
- SVM intercept exit code 0xf0000000

Later patches will make use of this for supporting Hyper-V nested virtualization enhancements.

Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Message-Id: <a1f17a43a8e9e751a1a9cc0281649d71bdbf721b.1622730232.git.viremana@linux.microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
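A sketch of how the three software-reserved values above might be captured as definitions; the identifier names are assumptions, while the raw values come straight from the commit message:

```c
#define VMCB_SW_RESERVED_START	0x3e0		/* VMCB offsets 0x3e0 - 0x3ff */
#define VMCB_SW_RESERVED_END	0x3ff
#define VMCB_CLEAN_SW_BIT	31		/* software-usable clean bit   */
#define SVM_EXIT_SW		0xf0000000	/* software-defined exit code  */
```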
-
Submitted by Vitaly Kuznetsov

Unify VMX and SVM code by moving APICv/AVIC enablement tracking to a common 'enable_apicv' variable. Note: unlike APICv, AVIC is disabled by default. No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210609150911.1471882c-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 25 May 2021, 1 commit
-
-
Submitted by Paolo Bonzini

Make it consistent with kvm_intel.enable_apicv.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 10 May 2021, 1 commit
-
-
Submitted by Brijesh Singh

The guest and the hypervisor contain separate macros to get and set the GHCB MSR protocol and NAE event fields. Consolidate the GHCB protocol definitions and helper macros in one place. Leave the supported protocol version defines in separate files, to keep the guest and the hypervisor flexible enough to support different GHCB versions in the same release. There is no functional change intended.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20210427111636.1207-3-brijesh.singh@amd.com
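A hedged sketch of the kind of shared helper macros this consolidation produces, following the GHCB MSR protocol's split between a low info field and the data above it; treat the names and the 12-bit width as assumptions:

```c
#define GHCB_DATA_LOW		12
#define GHCB_MSR_INFO_MASK	(BIT_ULL(GHCB_DATA_LOW) - 1)

#define GHCB_DATA(v)		(((unsigned long)(v)) & ~GHCB_MSR_INFO_MASK)
#define GHCB_MSR_INFO(v)	((v) & GHCB_MSR_INFO_MASK)
```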
-
- 07 May 2021, 1 commit
-
-
Submitted by Tom Lendacky

When an SEV-ES guest is running, the GHCB is unmapped as part of the vCPU run support. However, kvm_vcpu_unmap() triggers an RCU dereference warning with CONFIG_PROVE_LOCKING=y because the SRCU lock is released before invoking the vCPU run support. Move the GHCB unmapping into the prepare_guest_switch callback, which is invoked while still holding the SRCU lock, eliminating the RCU dereference warning.

Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <b2f9b79d15166f2c3e4375c0d9bc3268b7696455.1620332081.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 26 Apr 2021, 5 commits
-
-
Submitted by Sean Christopherson

Replace calls to svm_sev_enabled() with direct checks on sev_enabled, or, in the case of svm_mem_enc_op, simply drop the call to svm_sev_enabled(). This effectively replaces checks against a valid max_sev_asid with checks against sev_enabled. sev_enabled is forced off by sev_hardware_setup() if max_sev_asid is invalid, all call sites are guaranteed to run after sev_hardware_setup(), and all of the checks care about SEV being fully enabled (as opposed to intentionally handling the scenario where max_sev_asid is valid but SEV enabling fails due to OOM).

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422021125.3417167-14-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson

Move the allocation of the SEV VMCB array to sev.c to help pave the way toward encapsulating SEV enabling wholly within sev.c. No functional change intended.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422021125.3417167-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Add a reverse-CPUID entry for the memory encryption word, 0x8000001F.EAX, and use it to override the supported CPUID flags reported to userspace. Masking the reported CPUID flags avoids over-reporting KVM support; e.g., without the mask, a SEV-SNP capable CPU may incorrectly advertise SNP support to userspace. Clear SEV/SEV-ES if their corresponding module parameters are disabled, and clear the memory encryption leaf completely if SEV is not fully supported in KVM. Advertise SME_COHERENT in addition to SEV and SEV-ES, as the guest can use SME_COHERENT to avoid CLFLUSH operations. Explicitly omit SME and VM_PAGE_FLUSH from the reporting. These features are used by KVM, but are not exposed to the guest; e.g., guest access to related MSRs will fault.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422021125.3417167-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
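A sketch of what such a reverse-CPUID table entry looks like; the enum name is an assumption, but the leaf/register pairing follows the commit text:

```c
static const struct cpuid_reg reverse_cpuid[] = {
	/* ... existing entries ... */
	/* Map the memory encryption word to leaf 0x8000001F, register EAX. */
	[CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX},
};
```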
-
Submitted by Sean Christopherson

Unconditionally invoke sev_hardware_setup() when configuring SVM, and handle clearing the module params/variables 'sev' and 'sev_es' in sev_hardware_setup(). This allows making said variables static within sev.c and reduces the odds of a collision with guest code, e.g. the guest side of things has already laid claim to 'sev_enabled'.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210422021125.3417167-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson

Use KVM's "user return MSRs" framework to defer restoring the host's MSR_TSC_AUX until the CPU returns to userspace. Add/improve comments to clarify why MSR_TSC_AUX is intercepted on both RDMSR and WRMSR, and why it's safe for KVM to keep the guest's value loaded even if KVM is scheduled out.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210423223404.3860547-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
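An illustrative sketch of deferring the host MSR_TSC_AUX restore via the user-return MSR framework; kvm_add_user_return_msr()/kvm_set_user_return_msr() are the framework's entry points, while the slot variable and function names are assumptions:

```c
/* Registered once at module setup; returns the slot index. */
static int tsc_aux_uret_slot;

static void svm_setup_uret_msrs(void)
{
	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
}

/* On guest WRMSR: keep the guest value loaded; the host value is
 * restored lazily when the CPU returns to userspace. */
static int svm_set_tsc_aux(struct kvm_vcpu *vcpu, u64 data)
{
	return kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
}
```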
-
- 22 Apr 2021, 1 commit
-
-
Submitted by Nathan Tempelman
Add a capability for userspace to mirror SEV encryption context from one VM to another. On our side, this is intended to support a Migration Helper vCPU, but it can also be used generically to support other in-guest workloads scheduled by the host. The intention is for the primary guest and the mirror to have nearly identical memslots.

The primary benefits of this are that:

1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they can't accidentally clobber each other.
2) The VMs can have different memory-views, which is necessary for post-copy migration (the migration vCPUs on the target need to read and write to pages, when the primary guest would VMEXIT).

This does not change the threat model for AMD SEV. Any memory involved is still owned by the primary guest and its initial state is still attested to through the normal SEV_LAUNCH_* flows. If userspace wanted to circumvent SEV, they could achieve the same effect by simply attaching a vCPU to the primary VM. This patch deliberately leaves userspace in charge of the memslots for the mirror, as it already has the power to mess with them in the primary guest.

This patch does not support SEV-ES (much less SNP), as it does not handle handing off attested VMSAs to the mirror.

For additional context, we need a Migration Helper because SEV PSP migration is far too slow for our live migration on its own. Using an in-guest migrator lets us speed this up significantly.

Signed-off-by: Nathan Tempelman <natet@google.com>
Message-Id: <20210408223214.2582277-1-natet@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
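A hypothetical userspace usage of the new capability; KVM_CAP_VM_COPY_ENC_CONTEXT_FROM is the capability name from this series, while the fd variables are placeholders:

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Mirror the SEV encryption context of primary_vm_fd into mirror_vm_fd. */
static int mirror_sev_context(int mirror_vm_fd, int primary_vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_VM_COPY_ENC_CONTEXT_FROM,
		.args[0] = primary_vm_fd,
	};

	return ioctl(mirror_vm_fd, KVM_ENABLE_CAP, &cap);
}
```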
-
- 20 Apr 2021, 3 commits
-
-
Submitted by Krish Sadhukhan

Define the actual sizes of the IOPM and MSRPM tables so that the actual size can be used when initializing them and when checking the consistency of their physical addresses. These #defines are placed in svm.h so that they can be shared.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <20210412215611.110095-2-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
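A sketch of the shared definitions; the 12 KB IOPM and 8 KB MSRPM sizes match the APM's description of the tables, though the macro names here are assumptions:

```c
/* IOPM: 12 KB I/O permissions map; MSRPM: 8 KB MSR permissions map. */
#define IOPM_SIZE	(PAGE_SIZE * 3)
#define MSRPM_SIZE	(PAGE_SIZE * 2)
```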
-
Submitted by Sean Christopherson

Add a comment above the declaration of vcpu_svm.vmcb to call out that it is simply a shorthand for current_vmcb->ptr. The myriad accesses to svm->vmcb are quite confusing without this crucial detail. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210406171811.4043363-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
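The gist of the added comment, paraphrased in a sketch of the struct (other members elided):

```c
struct vcpu_svm {
	struct kvm_vcpu vcpu;

	/* vmcb is purely a shorthand for current_vmcb->ptr. */
	struct vmcb *vmcb;
	struct kvm_vmcb_info *current_vmcb;
	/* ... remaining members elided ... */
};
```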
-
Submitted by Sean Christopherson

Remove vmcb_pa from vcpu_svm and simply read current_vmcb->pa directly in the one path where it is consumed. Unlike svm->vmcb, use of the current vmcb's address is very limited, as evidenced by the fact that its use can be trimmed to a single dereference. Opportunistically add a comment about using vmcb01 for VMLOAD/VMSAVE; at first glance, using vmcb01 instead of vmcb_pa looks wrong. No functional change intended.

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210406171811.4043363-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 Apr 2021, 1 commit
-
-
Submitted by Maxim Levitsky

Currently, to support Intel->AMD migration, if the CPU vendor is GenuineIntel, we emulate the full 64-bit value for the MSR_IA32_SYSENTER_{EIP|ESP} MSRs, and we also emulate the sysenter/sysexit instructions in long mode. (The emulator does still refuse to emulate sysenter in 64-bit mode, on the grounds that the code for that wasn't tested and likely has no users.)

However, when virtual vmload/vmsave is enabled, the vmload instruction will update these 32-bit MSRs without triggering their MSR intercept, which will lead to stale values in KVM's shadow copy of these MSRs, which relies on the intercept to be up to date.

Fix/optimize this by doing the following:

1. Enable the MSR intercepts for the SYSENTER MSRs iff vendor=GenuineIntel. (This is both a tiny optimization and also ensures that in case the guest CPU vendor is AMD, the MSRs will be 32 bits wide, as AMD defined.)

2. Store only the high 32-bit part of these MSRs on interception, and combine it with the hardware MSR value on intercepted reads/writes iff vendor=GenuineIntel.

3. Disable vmload/vmsave virtualization if vendor=GenuineIntel. (It is somewhat insane to set vendor=GenuineIntel and still enable SVM for the guest, but well, whatever.) Then zero the high 32-bit parts when KVM intercepts and emulates vmload.

Thanks a lot to Paolo Bonzini for helping me fix this in the most correct way.

This patch fixes nested migration of 32-bit nested guests, which was broken because incorrect cached values of the SYSENTER MSRs were stored in the migration stream if L1 changed these MSRs with vmload prior to L2 entry.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210401111928.996871-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
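An illustrative read path for item 2 above: KVM stores only the high 32 bits and merges them with the hardware value when the guest vendor is Intel. Field and helper names follow the series but should be treated as assumptions:

```c
static u64 svm_read_sysenter_eip(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);
	u64 data = (u32)svm->vmcb01.ptr->save.sysenter_eip;

	/* Only an Intel guest sees the full 64-bit value. */
	if (guest_cpuid_is_intel(vcpu))
		data |= (u64)svm->sysenter_eip_hi << 32;
	return data;
}
```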
-
- 05 Apr 2021, 1 commit
-
-
Submitted by Vipin Sharma

Secure Encrypted Virtualization (SEV) and Secure Encrypted Virtualization - Encrypted State (SEV-ES) ASIDs are used to encrypt virtual machines on the AMD platform. These ASIDs are available in limited quantities on a host. Register their capacity and usage with the misc controller for tracking via cgroups.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Rientjes <rientjes@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 15 Mar 2021, 7 commits
-
-
Submitted by Cathy Avery

Use the vmcb12 control clean field to determine which vmcb12.save registers were marked dirty, in order to minimize register copies when switching from L1 to L2. Those vmcb12 registers marked as dirty need to be copied to L0's vmcb02, as they will be used to update the vmcb state cache for the L2 VMRUN. In the case where we have a different vmcb12 from the last L2 VMRUN, all vmcb12.save registers must be copied over to L2.save.

Tested:
- kvm-unit-tests
- kvm selftests
- Fedora L1 L2

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20210301200844.2000-1-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Sean Christopherson

Add a helper to consolidate boilerplate for nested VM-Exits that don't provide any data in exit_info_*. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210302174515.2812275-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Refactor the svm_exit_handlers API to pass @vcpu instead of @svm, to allow directly invoking common x86 exit handlers (in a future patch). Opportunistically convert an absurd number of instances of 'svm->vcpu' to direct uses of 'vcpu' to avoid pointless casting. No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210205005750.3841462-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Paolo Bonzini

Now that SVM is using separate vmcb01 and vmcb02 (and also uses the vmcb12 naming), we can give clearer names to functions that write to and read from those VMCBs. Likewise, variables and parameters can be renamed from nested_vmcb to vmcb12.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Cathy Avery

This patch moves the asid_generation from the vcpu to the vmcb, in order to track the ASID generation that was active the last time the vmcb was run. If sd->asid_generation changes between two runs, the old ASID is invalid and must be changed.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20210112164313.4204-3-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
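A sketch of the resulting check at run time; names follow the series but are assumptions:

```c
static void pre_svm_run(struct kvm_vcpu *vcpu)
{
	struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
	struct vcpu_svm *svm = to_svm(vcpu);

	/* The vmcb remembers the generation it last ran with; if the
	 * per-CPU generation has moved on, the old ASID is stale. */
	if (svm->current_vmcb->asid_generation != sd->asid_generation)
		new_asid(svm, sd);
}
```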
-
Submitted by Cathy Avery

This patch moves the physical CPU tracking from the vcpu to the vmcb in svm_switch_vmcb. If either vmcb01 or vmcb02 changes physical CPUs from one vmrun to the next, the vmcb's previous cpu is preserved for comparison with the current cpu, and the vmcb is marked dirty if they differ. This prevents the processor from using old cached data for a vmcb that may have been updated on a prior run on a different processor. It also moves the physical CPU check from svm_vcpu_load to pre_svm_run, as the check only needs to be done at run time.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20210112164313.4204-2-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Cathy Avery

svm->vmcb will now point to a separate vmcb for L1 (not nested) or L2 (nested). The main advantages are removing get_host_vmcb and hsave, in favor of concepts that are shared with VMX.

We no longer need to stash the L1 registers in hsave while L2 runs, but we need to copy the VMLOAD/VMSAVE registers from VMCB01 to VMCB02 and back. This more or less has the same cost, but code-wise nested_svm_vmloadsave can be reused.

This patch omits several optimizations that are possible:

- for simplicity there is some wholesale copying of vmcb.control areas which can go away.
- we should be able to better use the VMCB01 and VMCB02 clean bits.
- another possibility is to always use VMCB01 for VMLOAD and VMSAVE, thus avoiding the copy of VMLOAD/VMSAVE registers from VMCB01 to VMCB02 and back.

Tested:
- kvm-unit-tests
- kvm self tests
- Loaded fedora nested guest on fedora

Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20201011184818.3609-3-cavery@redhat.com>
[Fix conflicts; keep VMCB02 G_PAT up to date whenever guest writes the PAT MSR; do not copy CR4 over from VMCB01 as it is not needed anymore; add a few more comments. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 Feb 2021, 3 commits
-
-
Submitted by Michael Roth

Currently we save host state like user-visible host MSRs, and do some initial guest register setup for MSR_TSC_AUX and MSR_AMD64_TSC_RATIO, in svm_vcpu_load(). Defer this until just before we enter the guest by moving the handling to kvm_x86_ops.prepare_guest_switch(), similarly to how it is done for the VMX implementation. Additionally, since the handling of saving/restoring host user MSRs is the same both with and without SEV-ES enabled, move that handling to common code.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Michael Roth

Now that the set of host user MSRs that need to be individually saved/restored is the same with/without SEV-ES, we can drop the .sev_es_restored flag and just iterate through the list unconditionally for both cases. A subsequent patch can then move these loops to a common path.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-3-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Submitted by Michael Roth
Using a guest workload which simply issues 'hlt' in a tight loop to generate VMEXITs, it was observed (on a recent EPYC processor) that a significant amount of the VMEXIT overhead measured on the host was the result of MSR reads/writes in svm_vcpu_load/svm_vcpu_put according to perf:

  67.49%--kvm_arch_vcpu_ioctl_run
          |
          |--23.13%--vcpu_put
          |          kvm_arch_vcpu_put
          |          |
          |          |--21.31%--native_write_msr
          |          |
          |           --1.27%--svm_set_cr4
          |
          |--16.11%--vcpu_load
                     |
                      --15.58%--kvm_arch_vcpu_load
                                |
                                |--13.97%--svm_set_cr4
                                |          |
                                |          |--12.64%--native_read_msr

Most of these MSRs relate to 'syscall'/'sysenter' and segment bases, and can be saved/restored using 'vmsave'/'vmload' instructions rather than explicit MSR reads/writes. In doing so there is a significant reduction in the svm_vcpu_load/svm_vcpu_put overhead measured for the above workload:

  50.92%--kvm_arch_vcpu_ioctl_run
          |
          |--19.28%--disable_nmi_singlestep
          |
          |--13.68%--vcpu_load
          |          kvm_arch_vcpu_load
          |          |
          |          |--9.19%--svm_set_cr4
          |          |         |
          |          |          --6.44%--native_read_msr
          |          |
          |           --3.55%--native_write_msr
          |
          |--6.05%--kvm_inject_nmi
          |--2.80%--kvm_sev_es_mmio_read
          |--2.19%--vcpu_put
          |         |
          |          --1.25%--kvm_arch_vcpu_put
          |                    native_write_msr

Quantifying this further, if we look at the raw cycle counts for a normal iteration of the above workload (according to 'rdtscp'), kvm_arch_vcpu_ioctl_run() takes ~4600 cycles from start to finish with the current behavior. Using 'vmsave'/'vmload', this is reduced to ~2800 cycles, a savings of 39%.

While this approach doesn't seem to manifest in any noticeable improvement for more realistic workloads like UnixBench, netperf, and kernel builds, likely due to their exit paths generally involving IO with comparatively high latencies, it does improve overall overhead of KVM_RUN significantly, which may still be noticeable for certain situations. It also simplifies some aspects of the code.

With this change, explicit save/restore is no longer needed for the following host MSRs, since they are documented[1] as being part of the VMCB State Save Area:

  MSR_STAR, MSR_LSTAR, MSR_CSTAR, MSR_SYSCALL_MASK,
  MSR_KERNEL_GS_BASE, MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP,
  MSR_IA32_SYSENTER_EIP, MSR_FS_BASE, MSR_GS_BASE

and only the following MSR needs individual handling in svm_vcpu_put/svm_vcpu_load:

  MSR_TSC_AUX

We could drop the host_save_user_msrs array/loop and instead handle MSR read/write of MSR_TSC_AUX directly, but we leave that for now as a potential follow-up.

Since 'vmsave'/'vmload' also handles the LDTR and FS/GS segment registers (and associated hidden state)[2], some of the code previously used to handle this is no longer needed, so we drop it as well.

The first public release of the SVM spec[3] also documents the same handling for the host state in question, so we make these changes unconditionally.

Also worth noting is that we 'vmsave' to the same page that is subsequently used by 'vmrun' to record some host additional state. This is okay, since, in accordance with the spec[2], the additional state written to the page by 'vmrun' does not overwrite any fields written by 'vmsave'. This has also been confirmed through testing (for the above CPU, at least).

[1] AMD64 Architecture Programmer's Manual, Rev 3.33, Volume 2, Appendix B, Table B-2
[2] AMD64 Architecture Programmer's Manual, Rev 3.31, Volume 3, Chapter 4, VMSAVE/VMLOAD
[3] Secure Virtual Machine Architecture Reference Manual, Rev 3.01

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 Feb 2021, 1 commit
-
-
Submitted by Paolo Bonzini

If not in long mode, the low bits of CR3 are reserved but not enforced to be zero, so remove those checks. If in long mode, however, the MBZ bits extend down to the highest physical address bit of the guest, excluding the encryption bit. Make the checks consistent with the above, and match them between nested_vmcb_checks and KVM_SET_SREGS.

Cc: stable@vger.kernel.org
Fixes: 761e4169 ("KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests")
Fixes: a780a3ea ("KVM: X86: Fix reserved bits check for MOV to CR3")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
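A hedged sketch of the long-mode check described above, using KVM's illegal-GPA helper; whether this exact helper is used in this path is an assumption:

```c
/* In long mode, CR3 bits above the guest's MAXPHYADDR must be zero
 * (the encryption bit is excluded from the reserved-bit mask). */
static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long cr3)
{
	if (!(vcpu->arch.efer & EFER_LMA))
		return true;	/* low bits reserved but not enforced */

	return !kvm_vcpu_is_illegal_gpa(vcpu, cr3);
}
```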
-
- 08 Jan 2021, 1 commit
-
-
Submitted by Tom Lendacky
Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence, where the guest vCPU register state is updated and then the vCPU is VMRUN to begin execution of the AP. For an SEV-ES guest, this won't work because the guest register state is encrypted.

Following the GHCB specification, the hypervisor must not alter the guest register state, so KVM must track an AP/vCPU boot. Should the guest want to park the AP, it must use the AP Reset Hold exit event in place of, for example, a HLT loop.

First AP boot (first INIT-SIPI-SIPI sequence): Execute the AP (vCPU) as it was initialized and measured by the SEV-ES support. It is up to the guest to transfer control of the AP to the proper location.

Subsequent AP boot: KVM will expect to receive an AP Reset Hold exit event indicating that the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to awaken it. When the AP Reset Hold exit event is received, KVM will place the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI sequence, KVM will make the vCPU runnable. It is again up to the guest to then transfer control of the AP to the proper location.

To differentiate between an actual HLT and an AP Reset Hold, a new MP state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is placed in upon receiving the AP Reset Hold exit event. Additionally, to communicate the AP Reset Hold exit event up to userspace (if needed), a new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.

A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds a new function that, for non-SEV-ES guests, invokes the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests, implements the logic above.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
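A hypothetical userspace check for the new exit reason; the enum names come from the commit text, the handling shown is illustrative only:

```c
#include <linux/kvm.h>

static void handle_kvm_exit(struct kvm_run *run)
{
	if (run->exit_reason == KVM_EXIT_AP_RESET_HOLD) {
		/* The vCPU parked itself via AP Reset Hold; it becomes
		 * runnable again on the next INIT-SIPI-SIPI sequence. */
	}
}
```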
-
- 16 Dec 2020, 1 commit
-
-
Submitted by Tom Lendacky

The GHCB specification requires the hypervisor to save the address of an AP Jump Table so that, for example, vCPUs that have been parked by UEFI can be started by the OS. Provide support for the AP Jump Table set/get exit code.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 15 Dec 2020, 1 commit
-
-
Submitted by Tom Lendacky

The run sequence is different for an SEV-ES guest compared to a legacy or even an SEV guest. The guest vCPU register state of an SEV-ES guest will be restored on VMRUN and saved on VMEXIT. There is no need to restore the guest registers directly and through VMLOAD before VMRUN, and no need to save the guest registers directly and through VMSAVE on VMEXIT.

Update the svm_vcpu_run() function to skip register state saving and restoring, and provide an alternative function for running an SEV-ES guest in vmenter.S.

Additionally, certain host state is restored across an SEV-ES VMRUN. As a result, certain register states are not required to be restored upon VMEXIT (e.g. FS, GS, etc.), so only do that if the guest is not an SEV-ES guest.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <fb1c66d32f2194e171b95fc1a8affd6d326e10c1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
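An illustrative sketch of the run-path split; __svm_sev_es_vcpu_run is the alternative vmenter.S entry the commit describes, and the exact argument lists here are assumptions:

```c
if (sev_es_guest(vcpu->kvm)) {
	/* Guest state is restored/saved by hardware on VMRUN/VMEXIT. */
	__svm_sev_es_vcpu_run(svm->vmcb_pa);
} else {
	__svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&vcpu->arch.regs);
}
```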
-