Commits · fc4fad79fc3d8841562e2a85808079da5b4835f6 · openeuler / Kernel

20 Jan, 2022 · 1 commit

KVM: VMX: Reject KVM_RUN if emulation is required with pending exception · fc4fad79

Authored by Sean Christopherson on Dec 28, 2021

Reject KVM_RUN if emulation is required (because VMX is running without
unrestricted guest) and an exception is pending, as KVM doesn't support
emulating exceptions except when emulating real mode via vm86.  The vCPU
is hosed either way, but letting KVM_RUN proceed triggers a WARN due to
the impossible condition.  Alternatively, the WARN could be removed, but
then userspace and/or KVM bugs would result in the vCPU silently running
in a bad state, which isn't very friendly to users.

Originally, the bug was hit by syzkaller with a nested guest as that
doesn't require kvm_intel.unrestricted_guest=0.  That particular flavor
is likely fixed by commit cd0e615c ("KVM: nVMX: Synthesize
TRIPLE_FAULT for L2 if emulation is required"), but it's trivial to
trigger the WARN with a non-nested guest, and userspace can likely force
bad state via ioctls() for a nested guest as well.

Checking for the impossible condition needs to be deferred until KVM_RUN
because KVM can't force specific ordering between ioctls.  E.g. clearing
exception.pending in KVM_SET_SREGS doesn't prevent userspace from setting
it in KVM_SET_VCPU_EVENTS, and disallowing KVM_SET_VCPU_EVENTS with
emulation_required would prevent userspace from queuing an exception and
then stuffing sregs.  Note, if KVM were to try and detect/prevent the
condition prior to KVM_RUN, handle_invalid_guest_state() and/or
handle_emulation_failure() would need to be modified to clear the pending
exception prior to exiting to userspace.

 ------------[ cut here ]------------
 WARNING: CPU: 6 PID: 137812 at arch/x86/kvm/vmx/vmx.c:1623 vmx_queue_exception+0x14f/0x160 [kvm_intel]
 CPU: 6 PID: 137812 Comm: vmx_invalid_nes Not tainted 5.15.2-7cc36c3e14ae-pop #279
 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
 RIP: 0010:vmx_queue_exception+0x14f/0x160 [kvm_intel]
 Code: <0f> 0b e9 fd fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
 RSP: 0018:ffffa45c83577d38 EFLAGS: 00010202
 RAX: 0000000000000003 RBX: 0000000080000006 RCX: 0000000000000006
 RDX: 0000000000000000 RSI: 0000000000010002 RDI: ffff9916af734000
 RBP: ffff9916af734000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000006
 R13: 0000000000000000 R14: ffff9916af734038 R15: 0000000000000000
 FS:  00007f1e1a47c740(0000) GS:ffff99188fb80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f1e1a6a8008 CR3: 000000026f83b005 CR4: 00000000001726e0
 Call Trace:
  kvm_arch_vcpu_ioctl_run+0x13a2/0x1f20 [kvm]
  kvm_vcpu_ioctl+0x279/0x690 [kvm]
  __x64_sys_ioctl+0x83/0xb0
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Reported-by: syzbot+82112403ace4cbd780d8@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211228232437.1875318-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
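
Editorial sketch: the condition this commit rejects can be modelled in a few lines of standalone C. The struct `vcpu_model`, its fields, and `pre_run_check()` are hypothetical stand-ins chosen for illustration; they are not KVM's actual definitions, and the real code reports the failure to userspace rather than returning a bool.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the vCPU state relevant to the check. */
struct vcpu_model {
    bool emulation_required; /* guest state cannot be run without emulation */
    bool vm86_active;        /* real mode is being emulated via vm86 */
    bool exception_pending;  /* an exception is queued for injection */
};

/*
 * Return true if KVM_RUN may proceed, false if it must be rejected.
 * Emulating an exception is only supported in the vm86 real-mode case,
 * so "emulation required + pending exception" outside vm86 is refused.
 */
static bool pre_run_check(const struct vcpu_model *v)
{
    if (v->emulation_required && !v->vm86_active && v->exception_pending)
        return false;
    return true;
}
```

Because the check runs at KVM_RUN time, the order in which userspace issued the earlier ioctls does not matter; only the final combined state is examined.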

18 Jan, 2022 · 1 commit

KVM: x86: Making the module parameter of vPMU more common · 4732f244

Authored by Like Xu on Jan 11, 2022

The new module parameter to control PMU virtualization should apply
to Intel as well as AMD, for situations where userspace is not trusted.
If the module parameter allows PMU virtualization, there could be a
new KVM_CAP or guest CPUID bits whereby userspace can enable/disable
PMU virtualization on a per-VM basis.

If the module parameter does not allow PMU virtualization, there
should be no userspace override, since we have no precedent for
authorizing that kind of override. If it's false, other counter-based
profiling features (such as LBR including the associated CPUID bits
if any) will not be exposed.

Change its name from "pmu" to "enable_pmu" as we have temporary
variables with the same name in our code like "struct kvm_pmu *pmu".

Fixes: b1d66dad ("KVM: x86/svm: Add module param to control PMU virtualization")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220111073823.21885-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

07 Jan, 2022 · 1 commit

KVM: SVM: include CR3 in initial VMSA state for SEV-ES guests · 405329fc

Authored by Michael Roth on Dec 16, 2021

Normally guests will set up CR3 themselves, but some guests, such as
kselftests, and potentially CONFIG_PVH guests, rely on being booted
with paging enabled and CR3 initialized to a pre-allocated page table.

Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest
VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this
is too late, since it will have switched over to using the VMSA page
prior to that point, with the VMSA CR3 copied from the VMCB initial
CR3 value: 0.

Address this by sync'ing the CR3 value into the VMCB save area
immediately when KVM_SET_SREGS* is issued so it will find its way into
the initial VMSA.

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211216171358.61140-10-michael.roth@amd.com>
[Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the
 new hook. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Dec, 2021 · 1 commit

KVM: x86: Always set kvm_run->if_flag · c5063551

Authored by Marc Orr on Dec 09, 2021

The kvm_run struct's if_flag is a part of the userspace/kernel API. The
SEV-ES patches failed to set this flag because it's no longer needed by
QEMU (according to the comment in the source code). However, other
hypervisors may make use of this flag. Therefore, set the flag for
guests with encrypted registers (i.e., with guest_state_protected set).

Fixes: f1c6366e ("KVM: SVM: Add required changes to support intercepts under SEV-ES")
Signed-off-by: Marc Orr <marcorr@google.com>
Message-Id: <20211209155257.128747-1-marcorr@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
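
Editorial sketch: one way to picture "always set if_flag" is a helper that picks a source for the interrupt flag depending on whether guest registers are readable. Everything below is a hypothetical model; in particular the `hw_reported_if` field is an assumption standing in for whatever the hardware exposes for guests with protected state, and only the RFLAGS.IF bit position (bit 9) is architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_IF (UINT64_C(1) << 9) /* RFLAGS.IF is bit 9 */

/* Hypothetical, simplified vCPU model. */
struct vcpu_model {
    bool guest_state_protected; /* registers are encrypted (e.g. SEV-ES) */
    uint64_t rflags;            /* only readable when state is not protected */
    bool hw_reported_if;        /* assumed: IF as reported for protected guests */
};

/* Compute the value to publish as if_flag, for every kind of guest. */
static bool get_if_flag(const struct vcpu_model *v)
{
    if (v->guest_state_protected)
        return v->hw_reported_if;
    return (v->rflags & X86_EFLAGS_IF) != 0;
}
```

The point of the sketch is only that the flag is computed on both paths, so a userspace VMM that reads it never sees a stale value.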

08 Dec, 2021 · 11 commits

KVM: X86: Remove mmu parameter from load_pdptrs() · 2df4a5eb

Authored by Lai Jiangshan on Nov 24, 2021

It uses vcpu->arch.walk_mmu always; nested EPT does not have PDPTRs,
and nested NPT treats them like all other non-leaf page table levels
instead of caching them.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-11-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Allocate sd->save_area with __GFP_ZERO · 58356767

Authored by Lai Jiangshan on Nov 18, 2021

And remove clear_page() on it.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211118110814.2568-10-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Rename get_max_npt_level() to get_npt_level() · 1af4a119

Authored by Lai Jiangshan on Nov 18, 2021

It returns the only proper NPT level, so the "max" in the name
is not appropriate.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211118110814.2568-9-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: vmx, svm: clean up mass updates to regs_avail/regs_dirty bits · 41e68b69

Authored by Paolo Bonzini on Nov 26, 2021

Document the meaning of the three combinations of regs_avail and
regs_dirty.  Update regs_dirty just after writeback instead of
doing it later after vmexit.  After vmexit, instead, we clear the
regs_avail bits corresponding to lazily-loaded registers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
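
Editorial sketch: the commit message mentions three meaningful combinations of the two bitmaps without listing them. The model below encodes one plausible reading (not in cache, cached and clean, cached and needing writeback), with the fourth combination treated as invalid. The enum names and `classify_reg()` are hypothetical and the interpretation is an assumption, not a quotation of the kernel's documentation.

```c
#include <stdint.h>

/* Hypothetical names for the states a lazily-handled register can be in. */
enum reg_state {
    REG_IN_HW,        /* avail=0 dirty=0: value lives only in VMCS/VMCB */
    REG_CACHED_CLEAN, /* avail=1 dirty=0: cached copy matches hardware */
    REG_CACHED_DIRTY, /* avail=1 dirty=1: cached copy must be written back */
    REG_INVALID       /* avail=0 dirty=1: should never happen */
};

/* Classify register number reg from the two bitmaps. */
static enum reg_state classify_reg(uint32_t avail, uint32_t dirty, unsigned reg)
{
    int a = (avail >> reg) & 1u;
    int d = (dirty >> reg) & 1u;

    if (a && d)
        return REG_CACHED_DIRTY;
    if (a)
        return REG_CACHED_CLEAN;
    if (d)
        return REG_INVALID;
    return REG_IN_HW;
}
```

Under this reading, clearing a register's dirty bit right after writeback moves it from "cached and dirty" to "cached and clean", and clearing the avail bit after a vmexit moves lazily-loaded registers back to "in hardware".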

KVM: SVM: Remove references to VCPU_EXREG_CR3 · aec9c240

Authored by Lai Jiangshan on Nov 08, 2021

VCPU_EXREG_CR3 is never cleared from vcpu->arch.regs_avail or
vcpu->arch.regs_dirty in SVM; therefore, marking CR3 as available is
merely a NOP, and testing it will likewise always succeed.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-9-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Remove outdated comment in svm_load_mmu_pgd() · 8f29bf12

Authored by Lai Jiangshan on Nov 08, 2021

The comment was added in commit 689f3bf2 ("KVM: x86: unify
callbacks to load paging root"). Its related code was removed later,
and the comment has nothing to do with the next line of code.

So the comment should be removed too.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-8-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Track dirtiness of PDPTRs even if NPT is disabled · 40e49c4f

Authored by Lai Jiangshan on Nov 08, 2021

Use the same logic to handle the availability of VCPU_EXREG_PDPTR
as VMX, also removing a branch in svm_vcpu_run().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211108124407.12187-4-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/svm: Add module param to control PMU virtualization · b1d66dad

Authored by Like Xu on Nov 17, 2021

For Intel, the guest PMU can be disabled via clearing the PMU CPUID.
For AMD, all hw implementations support the base set of four
performance counters, with current mainstream hardware indicating
the presence of two additional counters via X86_FEATURE_PERFCTR_CORE.

In the virtualized world, the AMD guest driver may detect
the presence of at least one counter MSR. Most hypervisor
vendors would introduce a module param (like lbrv for svm)
to disable PMU for all guests.

Another proposal for per-VM control is to pass PMU disable information
via MSR_IA32_PERF_CAPABILITIES or one bit in CPUID Fn4000_00[FF:00].
Both methods require some guest-side changes, so a module
parameter, while perhaps not sufficiently granular, is practical enough.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211117080304.38989-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nSVM: introduce struct vmcb_ctrl_area_cached · 8fc78909

Authored by Emanuele Giuseppe Esposito on Nov 03, 2021

This structure will replace vmcb_control_area in
svm_nested_state, providing only the fields that are actually
used by the nested state. This avoids having and copying around
uninitialized fields. The cost of this, however, is that all
functions (in this case vmcb_is_intercept) expect the old
structure, so they need to be duplicated.

In addition, in svm_get_nested_state() user space expects a
vmcb_control_area struct, so we need to copy back all fields
in a temporary structure before copying it to userspace.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211103140527.752797-7-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nSVM: rename nested_load_control_from_vmcb12 in nested_copy_vmcb_control_to_cache · 7907160d

Authored by Emanuele Giuseppe Esposito on Nov 03, 2021

Following the same naming convention of the previous patch,
rename nested_load_control_from_vmcb12.
In addition, inline copy_vmcb_control_area as it is only called
by this function.

__nested_copy_vmcb_control_to_cache() works with vmcb_control_area
parameters and it will be useful in next patches, when we use
local variables instead of svm cached state.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20211103140527.752797-4-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nSVM: introduce svm->nested.save to cache save area before checks · f2740a8d

Authored by Emanuele Giuseppe Esposito on Nov 03, 2021

This is useful in the next patch, to keep a saved copy
of vmcb12 registers and pass it around more easily.

Instead of blindly copying everything, we just copy EFER, CR0, CR3, CR4,
DR6 and DR7 which are needed by the VMRUN checks.  If more fields will
need to be checked, it will be quite obvious to see that they must be added
in struct vmcb_save_area_cached and in nested_copy_vmcb_save_to_cache().

__nested_copy_vmcb_save_to_cache() takes a vmcb_save_area_cached
parameter, which is useful in order to save the state to a local
variable.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20211103140527.752797-3-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
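
Editorial sketch: the "copy only what the checks need" idea from this commit can be shown with two small structs. The struct and function names below are hypothetical models, and the extra `rip`/`rsp` fields merely stand in for the many save-area fields that are deliberately not cached; only the list EFER, CR0, CR3, CR4, DR6, DR7 comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the full save area (the real one is much larger). */
struct save_area_model {
    uint64_t efer, cr0, cr3, cr4, dr6, dr7;
    uint64_t rip, rsp; /* examples of fields that are NOT cached */
};

/* Hypothetical cached subset: just the fields the VMRUN checks look at. */
struct save_area_cached_model {
    uint64_t efer, cr0, cr3, cr4, dr6, dr7;
};

/* Copy the checked fields into a caller-provided cache (may be a local). */
static void copy_save_to_cache(struct save_area_cached_model *to,
                               const struct save_area_model *from)
{
    to->efer = from->efer;
    to->cr0 = from->cr0;
    to->cr3 = from->cr3;
    to->cr4 = from->cr4;
    to->dr6 = from->dr6;
    to->dr7 = from->dr7;
}
```

Taking the destination as a parameter mirrors the commit's remark that the helper can save the state to a local variable; validating the cached copy rather than guest memory also means the values cannot change between check and use.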

30 Nov, 2021 · 1 commit

KVM: x86: check PIR even for vCPUs with disabled APICv · 37c4dbf3

Authored by Paolo Bonzini on Nov 22, 2021

The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
if APICv is disabled on the vCPU that receives it.  In that case, the
interrupt will just cause a vmexit and leave the ON bit set together
with the PIR bit corresponding to the interrupt.

Right now, the interrupt would not be delivered until APICv is re-enabled.
However, fixing this is just a matter of always doing the PIR->IRR
synchronization, even if the vCPU has temporarily disabled APICv.

This is not a problem for performance, or if anything it is an
improvement.  First, in the common case where vcpu->arch.apicv_active is
true, one fewer check has to be performed.  Second, static_call_cond will
elide the function call if APICv is not present or disabled.  Finally,
in the case for AMD hardware we can remove the sync_pir_to_irr callback:
it is only needed for apic_has_interrupt_for_ppr, and that function
already has a fallback for !APICv.

Cc: stable@vger.kernel.org
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Message-Id: <20211123004311.2954158-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
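
Editorial sketch: the PIR->IRR synchronization discussed in this commit can be approximated with a non-atomic model. The struct `pi_desc_model` and both functions are hypothetical simplifications; real KVM uses atomic operations and vendor callbacks, which are omitted here. What the sketch is meant to show is that the merge does not consult any "APICv active" state.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_WORDS 8 /* 256 interrupt vectors, 32 per word */

/* Hypothetical, non-atomic model of a posted-interrupt descriptor. */
struct pi_desc_model {
    uint32_t pir[NR_WORDS]; /* posted-interrupt requests, one bit per vector */
    bool on;                /* "outstanding notification" (ON) bit */
};

/* Return the highest vector set in irr, or -1 if none is set. */
static int highest_vector(const uint32_t irr[NR_WORDS])
{
    for (int word = NR_WORDS - 1; word >= 0; word--) {
        for (int bit = 31; bit >= 0; bit--) {
            if (irr[word] & (UINT32_C(1) << bit))
                return word * 32 + bit;
        }
    }
    return -1;
}

/*
 * Merge posted interrupts into the IRR whenever ON is set, regardless of
 * whether APICv is currently active on the vCPU, then report the highest
 * pending vector.
 */
static int sync_pir_to_irr(struct pi_desc_model *pi, uint32_t irr[NR_WORDS])
{
    if (pi->on) {
        pi->on = false;
        for (int word = 0; word < NR_WORDS; word++) {
            irr[word] |= pi->pir[word];
            pi->pir[word] = 0;
        }
    }
    return highest_vector(irr);
}
```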

11 Nov, 2021 · 3 commits

KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid() · 796c83c5

Authored by Vipin Sharma on Nov 09, 2021

Handle #GP on INVPCID due to an invalid type in the common switch
statement instead of relying on the callers (VMX and SVM) to manually
validate the type.

Unlike INVVPID and INVEPT, INVPCID is not explicitly documented to check
the type before reading the operand from memory, so deferring the
type validity check until after that point is architecturally allowed.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211109174426.2350547-3-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
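
Editorial sketch: moving the type check into the common switch amounts to letting the `default` arm raise the fault. The function `handle_invpcid_model()` and its return convention are hypothetical; the four type values (0 to 3) are the architecturally defined INVPCID types.

```c
/* Architectural INVPCID types. */
enum {
    INVPCID_TYPE_INDIV_ADDR = 0,      /* one address in one PCID */
    INVPCID_TYPE_SINGLE_CTXT = 1,     /* everything for one PCID */
    INVPCID_TYPE_ALL_INCL_GLOBAL = 2, /* all PCIDs, including globals */
    INVPCID_TYPE_ALL_NON_GLOBAL = 3   /* all PCIDs, excluding globals */
};

#define MODEL_OK 0
#define MODEL_INJECT_GP (-1) /* hypothetical: caller injects #GP */

/*
 * Hypothetical common handler: valid types fall into their case arms
 * (the actual flushing is omitted), and any other type is rejected in
 * one place instead of in each vendor-specific caller.
 */
static int handle_invpcid_model(unsigned long type)
{
    switch (type) {
    case INVPCID_TYPE_INDIV_ADDR:
    case INVPCID_TYPE_SINGLE_CTXT:
    case INVPCID_TYPE_ALL_INCL_GLOBAL:
    case INVPCID_TYPE_ALL_NON_GLOBAL:
        return MODEL_OK;
    default:
        return MODEL_INJECT_GP;
    }
}
```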

KVM: SEV: Add support for SEV intra host migration · b5663931

Authored by Peter Gonda on Oct 21, 2021

For SEV to work with intra host migration, contents of the SEV info struct
such as the ASID (used to index the encryption key in the AMD SP) and
the list of memory regions need to be transferred to the target VM.
This change adds a command for a target VMM to get a source SEV VM's SEV
info.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-3-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SEV: Refactor out sev_es_state struct · b67a4cc3

Authored by Peter Gonda on Oct 21, 2021

Move SEV-ES vCPU metadata into new sev_es_state struct from vcpu_svm.

Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-2-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

25 Oct, 2021 · 1 commit

KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info · 0a62a031

Authored by David Edmondson on Sep 20, 2021

Extend the get_exit_info static call to provide the reason for the VM
exit. Modify relevant trace points to use this rather than extracting
the reason in the caller.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210920103737.2696756-3-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

23 Oct, 2021 · 1 commit

x86/kvm: Convert FPU handling to a single swap buffer · d69c1382

Authored by Thomas Gleixner on Oct 22, 2021

For the upcoming AMX support it's necessary to do a proper integration with
KVM. Currently KVM allocates two FPU structs which are used for saving the user
state of the vCPU thread and restoring the guest state when entering
vcpu_run() and doing the reverse operation before leaving vcpu_run().

With the new fpstate mechanism this can be reduced to one extra buffer by
swapping the fpstate pointer in current::thread::fpu. This makes the
upcoming support for AMX and XFD simpler because then fpstate information
(features, sizes, xfd) is always consistent and it does not require any
nasty workarounds.

Convert the KVM FPU code over to this new scheme.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211022185313.019454292@linutronix.de
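
Editorial sketch: the "single swap buffer" idea is, at its core, a pointer exchange. The three struct types and `swap_fpstate()` below are hypothetical models that ignore the actual register save and restore; they only illustrate that one extra buffer is enough when the current thread's fpstate pointer is swapped with the guest's on entry and swapped back on exit.

```c
/* Hypothetical FPU state buffer; contents are irrelevant to the sketch. */
struct fpstate_model {
    int owner; /* illustrative tag: 0 = user thread state, 1 = guest state */
};

/* Hypothetical model of the FPU context of the current thread. */
struct thread_fpu_model {
    struct fpstate_model *fpstate; /* the state currently considered live */
};

/* Hypothetical model of the per-vCPU guest FPU container. */
struct guest_fpu_model {
    struct fpstate_model *fpstate; /* the one extra buffer */
};

/* Exchange the live fpstate pointer with the guest container's pointer. */
static void swap_fpstate(struct thread_fpu_model *cur,
                         struct guest_fpu_model *guest)
{
    struct fpstate_model *tmp = cur->fpstate;

    cur->fpstate = guest->fpstate;
    guest->fpstate = tmp;
}
```

Calling the helper once before entering the guest and once after leaving it restores the original arrangement, so no second spare buffer is needed.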

22 Oct, 2021 · 2 commits

KVM: x86: Move SVM's APICv sanity check to common x86 · ee49a893

Authored by Sean Christopherson on Oct 21, 2021

Move SVM's assertion that vCPU's APICv state is consistent with its VM's
state out of svm_vcpu_run() and into x86's common inner run loop.  The
assertion and underlying logic is not unique to SVM, it's just that SVM
has more inhibiting conditions and thus is more likely to run headfirst
into any KVM bugs.

Add relevant comments to document exactly why the update path has unusual
ordering between the update and the kick, why said ordering is safe, and also
the basic rules behind the assertion in the run loop.

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211022004927.1448382-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Add vendor name to kvm_x86_ops, use it for error messages · 9dadfc4a

Authored by Sean Christopherson on Oct 18, 2021

Paul pointed out the error messages when KVM fails to load are unhelpful
in understanding exactly what went wrong if userspace probes the "wrong"
module.

Add a mandatory kvm_x86_ops field to track vendor module names, kvm_intel
and kvm_amd, and use the name for relevant error messages when KVM fails
to load so that the user knows which module failed to load.

Opportunistically tweak the "disabled by bios" error message to clarify
that _support_ was disabled, not that the module itself was magically
disabled by BIOS.

Suggested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211018183929.897461-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

04 Oct, 2021 · 1 commit

x86/sev: Replace occurrences of sev_active() with cc_platform_has() · 4d96f910

Authored by Tom Lendacky on Sep 08, 2021

Replace uses of sev_active() with the more generic cc_platform_has()
using CC_ATTR_GUEST_MEM_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_GUEST_MEM_ENCRYPT
can be updated, as required.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-7-bp@alien8.de

01 Oct, 2021 · 3 commits

KVM: x86: nSVM: implement nested TSC scaling · 5228eb96

Authored by Maxim Levitsky on Sep 14, 2021

This was tested by booting a nested guest with TSC=1GHz,
observing the clocks, and doing about 100 cycles of migration.

Note that a QEMU patch is needed to support migration because
of a new MSR that needs to be placed in the migration state.

The patch will be sent to the QEMU mailing list soon.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-14-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: add module param to control TSC scaling · f800650a

Authored by Maxim Levitsky on Sep 14, 2021

This makes it easy to simulate a CPU without this feature.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-13-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: don't set VMLOAD/VMSAVE intercepts on vCPU reset · 36e8194d

Authored by Paolo Bonzini on Sep 23, 2021

Commit adc2a237 ("KVM: nSVM: improve SYSENTER emulation on AMD")
made init_vmcb set vmload/vmsave intercepts unconditionally,
and relied on svm_vcpu_after_set_cpuid to clear them when possible.

However init_vmcb is also called when the vCPU is reset, and it is
not followed by another call to svm_vcpu_after_set_cpuid because
the CPUID is already set.  This mistake causes the VMSAVE/VMLOAD intercepts
to be set when they are not needed, and harms performance of the nested
guest.

Extract the relevant parts of svm_vcpu_after_set_cpuid so that they
can be called again on reset.

Fixes: adc2a237 ("KVM: nSVM: improve SYSENTER emulation on AMD")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

30 Sep, 2021 · 2 commits

KVM: x86: SVM: add module param to control LBR virtualization · 4c84926e

Authored by Maxim Levitsky on Sep 14, 2021

This is useful for debug and also makes it consistent with
the rest of the SVM optional features.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-9-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: Move RESET emulation to svm_vcpu_reset() · 9ebe530b

Authored by Sean Christopherson on Sep 20, 2021

Move RESET emulation for SVM vCPUs to svm_vcpu_reset(), and drop an extra
init_vmcb() from svm_create_vcpu() in the process.  Hopefully KVM will
someday expose a dedicated RESET ioctl(), and in the meantime separating
"create" from "RESET" is a nice cleanup.

Keep the call to svm_switch_vmcb() so that misuse of svm->vmcb at worst
breaks the guest, e.g. premature accesses don't cause a NULL pointer
dereference.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210921000303.400537-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

23 Sep, 2021 · 2 commits

KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround · d1cba6c9

Authored by Maxim Levitsky on Sep 14, 2021

GP SVM errata workaround made the #GP handler always emulate
the SVM instructions.

However these instructions #GP in case the operand is not 4K aligned,
but the workaround code didn't check this and we ended up
emulating these instructions anyway.

This is only an emulation accuracy check bug as there is no harm for
KVM to read/write unaligned vmcb images.

Fixes: 82a11e9c ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
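
Editorial sketch: a 4K alignment test on the instruction operand is a single mask operation. The helper name `is_4k_aligned()` is hypothetical, and the actual KVM check may validate more than alignment (for example the physical address width); this only illustrates the alignment part mentioned in the commit title.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of 4096, i.e. its low 12 bits are all zero. */
static bool is_4k_aligned(uint64_t addr)
{
    return (addr & UINT64_C(0xfff)) == 0;
}
```

With such a check in place, an unaligned operand would result in the #GP being delivered to the guest instead of the instruction being emulated.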

KVM: x86: nSVM: restore int_vector in svm_clear_vintr · aee77e11

Authored by Maxim Levitsky on Sep 14, 2021

In svm_clear_vintr we try to restore the virtual interrupt
injection that might be pending, but we fail to restore
the interrupt vector.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

22 Sep, 2021 · 3 commits

KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm · 136a55c0

Authored by Maxim Levitsky on Sep 22, 2021

Use return statements instead of nested if, and fix error
path to free all the maps that were allocated.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode · e85d3e7b

Authored by Maxim Levitsky on Sep 13, 2021

Currently KVM_REQ_GET_NESTED_STATE_PAGES on SVM only reloads the PDPTRs
and the MSR bitmap. The former is not really needed for SMM, as the SMM exit
code reloads them again from SMRAM's CR3, and the latter happens to work
since the MSR bitmap isn't modified while in SMM.

Still, it is better to be consistent with VMX.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit · e2e6e449

Authored by Maxim Levitsky on Sep 13, 2021

Otherwise guest entry code might see incorrect L1 state (e.g paging state).

Fixes: 37be407b ("KVM: nSVM: Fix L1 state corruption upon return from SMM")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

21 Aug, 2021 · 6 commits

KVM: SVM: Add 5-level page table support for SVM · 43e540cc

Authored by Wei Huang on Aug 18, 2021

When the 5-level page table is enabled on the host OS, the nested page table
for guest VMs must use 5-level as well. Update the get_npt_level() function
to reflect this requirement. Meanwhile, remove the code that
prevents the kvm-amd driver from being loaded when a 5-level page table is
detected.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Message-Id: <20210818165549.3771014-4-wei.huang2@amd.com>
[Tweak condition as suggested by Sean. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
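
Editorial sketch: the requirement that the NPT depth follow the host's paging mode reduces to a two-way choice. The function `npt_level_model()` is a hypothetical stand-in that assumes a 64-bit host and takes the host's 5-level state as a plain boolean; it is not the kernel's get_npt_level().

```c
#include <stdbool.h>

/*
 * Hypothetical model, 64-bit hosts only: the nested page table uses
 * 5 levels when the host itself runs with 5-level paging, else 4.
 */
static int npt_level_model(bool host_5level_paging)
{
    return host_5level_paging ? 5 : 4;
}
```

Since exactly one level is valid for a given host configuration, the function returns a required level rather than a maximum.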

KVM: x86: Allow CPU to force vendor-specific TDP level · 746700d2

Authored by Wei Huang on Aug 18, 2021

Future AMD CPUs will require a 5-level NPT if host CR4.LA57 is set.
To prevent kvm_mmu_get_tdp_level() from incorrectly changing NPT level
on behalf of CPUs, add a new parameter in kvm_configure_mmu() to force
a fixed TDP level.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Message-Id: <20210818165549.3771014-2-wei.huang2@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: split svm_handle_invalid_exit · 7a4bca85

Authored by Maxim Levitsky on Aug 11, 2021

Split the check for having a vmexit handler to svm_check_exit_valid,
and make svm_handle_invalid_exit only handle a vmexit that is
already not valid.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210811122927.900604-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: AVIC: drop unsupported AVIC base relocation code · 73143035

Authored by Maxim Levitsky on Aug 10, 2021

APIC base relocation is not supported anyway and won't work
correctly so just drop the code that handles it and keep AVIC
MMIO bar at the default APIC base.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210810205251.424103-17-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: move check for kvm_vcpu_apicv_active outside of avic_vcpu_{put|load} · bf5f6b9d

Authored by Maxim Levitsky on Aug 10, 2021

No functional change intended.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210810205251.424103-15-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: remove svm_toggle_avic_for_irq_window · 30eed56a

Authored by Maxim Levitsky on Aug 10, 2021

Now that kvm_request_apicv_update doesn't need to drop the kvm->srcu lock,
we can call kvm_request_apicv_update directly.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210810205251.424103-13-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
