- 02 8月, 2021 14 次提交
-
-
由 Sean Christopherson 提交于
Use the "internal" variants of setting segment registers when stuffing state on nested VM-Exit in order to skip the "emulation required" updates. VM-Exit must always go to protected mode, and all segments are mostly hardcoded (to valid values) on VM-Exit. The bits of the segments that aren't hardcoded are explicitly checked during VM-Enter, e.g. the selector RPLs must all be zero. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-30-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Don't refresh "emulation required" when stuffing segments during transitions to/from real mode when running without unrestricted guest. The checks are unnecessary as vmx_set_cr0() unconditionally rechecks "emulation required". They also happen to be broken, as enter_pmode() and enter_rmode() run with a stale vcpu->arch.cr0. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-29-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the long mode and EPT w/o unrestricted guest side effect processing down in vmx_set_cr0() so that the EPT && !URG case doesn't have to stuff vcpu->arch.cr0 early. This also fixes an oddity where CR0 might not be marked available, i.e. the early vcpu->arch.cr0 write would appear to be in danger of being overwritten, though that can't actually happen in the current code since CR0.TS is the only guest-owned bit, and CR0.TS is not read by vmx_set_cr4(). Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-28-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Tweak the logic for grabbing vmcs.GUEST_CR3 in vmx_cache_reg() to look directly at the execution controls, as opposed to effectively inferring the controls based on vCPUs. Inferring the controls isn't wrong, but it creates a very subtle dependency between the caching logic, the state of vcpu->arch.cr0 (via is_paging()), and the behavior of vmx_set_cr0(). Using the execution controls doesn't completely eliminate the dependency in vmx_set_cr0(), e.g. neglecting to cache CR3 before enabling interception would still break the guest, but it does reduce the code dependency and mostly eliminate the logical dependency (that CR3 loads are intercepted in certain scenarios). Eliminating the subtle read of vcpu->arch.cr0 will also allow for additional cleanup in vmx_set_cr0(). Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-26-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Keep CR3 load/store exiting enable as needed when running L2 in order to honor L1's desires. This fixes a largely theoretical bug where L1 could intercept CR3 but not CR0.PG and end up not getting the desired CR3 exits when L2 enables paging. In other words, the existing !is_paging() check inadvertantly handles the normal case for L2 where vmx_set_cr0() is called during VM-Enter, which is guaranteed to run with paging enabled, and thus will never clear the bits. Removing the !is_paging() check will also allow future consolidation and cleanup of the related code. From a performance perspective, this is all a nop, as the VMCS controls shadow will optimize away the VMWRITE when the controls are in the desired state. Add a comment explaining why CR3 is intercepted, with a big disclaimer about not querying the old CR3. Because vmx_set_cr0() is used for flows that are not directly tied to MOV CR3, e.g. vCPU RESET/INIT and nested VM-Enter, it's possible that is_paging() is not synchronized with CR3 load/store exiting. This is actually guaranteed in the current code, as KVM starts with CR3 interception disabled. Obviously that can be fixed, but there's no good reason to play whack-a-mole, and it tends to end poorly, e.g. descriptor table exiting for UMIP emulation attempted to be precise in the past and ended up botching the interception toggling. Fixes: fe3ef05c ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12") Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-25-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the CR0/CR3/CR4 shenanigans for EPT without unrestricted guest back into vmx_set_cr0(). This will allow a future patch to eliminate the rather gross stuffing of vcpu->arch.cr0 in the paging transition cases by snapshotting the old CR0. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-24-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Remove a bogus write to vcpu->arch.cr0 that immediately precedes vmx_set_cr0() during vCPU RESET/INIT. For RESET, this is a nop since the "old" CR0 value is meaningless. But for INIT, if the vCPU is coming from paging enabled mode, crushing vcpu->arch.cr0 will cause the various is_paging() checks in vmx_set_cr0() to get false negatives. For the exit_lmode() case, the false negative is benign as vmx_set_efer() is called immediately after vmx_set_cr0(). For EPT without unrestricted guest, the false negative will cause KVM to unnecessarily run with CR3 load/store exiting. But again, this is benign, albeit sub-optimal. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-23-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Opt-in to forcing CR0.WP=1 for shadow paging, and stop lying about WP being "always on" for unrestricted guest. In addition to making KVM a wee bit more honest, this paves the way for additional cleanup. No functional change intended. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-22-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the EDX initialization at vCPU RESET, which is now identical between VMX and SVM, into common code. No functional change intended. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-20-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Consolidate the APIC base RESET logic, which is currently spread out across both x86 and vendor code. For an in-kernel APIC, the vendor code is redundant. But for a userspace APIC, KVM relies on the vendor code to initialize vcpu->arch.apic_base. Hoist the vcpu->arch.apic_base initialization above the !apic check so that it applies to both flavors of APIC emulation, and delete the vendor code. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-19-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Write vcpu->arch.apic_base directly instead of bouncing through kvm_set_apic_base(). This is a glorified nop, and is a step towards cleaning up the mess that is local APIC creation. When using an in-kernel APIC, kvm_create_lapic() explicitly sets vcpu->arch.apic_base to MSR_IA32_APICBASE_ENABLE to avoid its own kvm_lapic_set_base() call in kvm_lapic_reset() from triggering state changes. That call during RESET exists purely to set apic->base_address to the default base value. As a result, by the time VMX gets control, the only missing piece is the BSP bit being set for the reset BSP. For a userspace APIC, there are no side effects to process (for the APIC). In both cases, the call to kvm_update_cpuid_runtime() is a nop because the vCPU hasn't yet been exposed to userspace, i.e. there can't be any CPUID entries. No functional change intended. Reviewed-by: NReiji Watanabe <reijiw@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-17-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop an explicit MMU reset when entering emulated real mode now that the vCPU INIT/RESET path correctly handles conditional MMU resets, e.g. if INIT arrives while the vCPU is in 64-bit mode. Note, while there are multiple other direct calls to vmx_set_cr0(), i.e. paths that change CR0 without invoking kvm_post_set_cr0(), only the INIT emulation can reach enter_rmode(). CLTS emulation only toggles CR.TS, VM-Exit (and late VM-Fail) emulation cannot architecturally transition to Real Mode, and VM-Enter to Real Mode is possible if and only if Unrestricted Guest is enabled (exposed to L1). This effectively reverts commit 8668a3c4 ("KVM: VMX: Reset mmu context when entering real mode") Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-8-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Set EDX at RESET/INIT based on the userspace-defined CPUID model when possible, i.e. when CPUID.0x1.EAX is defind by userspace. At RESET/INIT, all CPUs that support CPUID set EDX to the FMS enumerated in CPUID.0x1.EAX. If no CPUID match is found, fall back to KVM's default of 0x600 (Family '6'), which is the least awful approximation of KVM's virtual CPU model. Fixes: 6aa8b732 ("[PATCH] kvm: userspace interface") Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-5-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NIsaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <0e8760a26151f47dc47052b25ca8b84fffe0641e.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 6月, 2021 2 次提交
-
-
由 Maxim Levitsky 提交于
This better reflects the purpose of this variable on AMD, since on AMD the AVIC's memory slot can be enabled and disabled dynamically. Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210623113002.111448-4-mlevitsk@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Aaron Lewis 提交于
When the kvm_intel module unloads the module parameter 'allow_smaller_maxphyaddr' is not cleared because the backing variable is defined in the kvm module. As a result, if the module parameter's state was set before kvm_intel unloads, it will also be set when it reloads. Explicitly clear the state in vmx_exit() to prevent this from happening. Signed-off-by: NAaron Lewis <aaronlewis@google.com> Message-Id: <20210623203426.1891402-1-aaronlewis@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com>
-
- 24 6月, 2021 2 次提交
-
-
由 Sean Christopherson 提交于
Mark #ACs that won't be reinjected to the guest as wanted by L0 so that KVM handles split-lock #AC from L2 instead of forwarding the exception to L1. Split-lock #AC isn't yet virtualized, i.e. L1 will treat it like a regular #AC and do the wrong thing, e.g. reinject it into L2. Fixes: e6f8b6c1 ("KVM: VMX: Extend VMXs #AC interceptor to handle split lock #AC in guest") Cc: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210622172244.3561540-1-seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Jim Mattson 提交于
Failed VM-entry is often due to a faulty core. To help identify bad cores, print the id of the last logical processor that attempted VM-entry whenever dumping a VMCS or VMCB. Signed-off-by: NJim Mattson <jmattson@google.com> Message-Id: <20210621221648.1833148-1-jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 22 6月, 2021 1 次提交
-
-
由 Jim Mattson 提交于
As part of smaller maxphyaddr emulation, kvm needs to intercept present page faults to see if it needs to add the RSVD flag (bit 3) to the error code. However, there is no need to intercept page faults that already have the RSVD flag set. When setting up the page fault intercept, add the RSVD flag into the #PF error code mask field (but not the #PF error code match field) to skip the intercept when the RSVD flag is already set. Signed-off-by: NJim Mattson <jmattson@google.com> Message-Id: <20210618235941.1041604-1-jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 6月, 2021 11 次提交
-
-
由 Sean Christopherson 提交于
Refuse to load KVM if NX support is not available and EPT is not enabled. Shadow paging has assumed NX support since commit 9167ab79 ("KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active"), so for all intents and purposes this has been a de facto requirement for over a year. Do not require NX support if EPT is enabled purely because Intel CPUs let firmware disable NX support via MSR_IA32_MISC_ENABLES. If not for that, VMX (and KVM as a whole) could require NX support with minimal risk to breaking userspace. Fixes: 9167ab79 ("KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active") Signed-off-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Message-Id: <20210615164535.2146172-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Instead of checking 'vmx->nested.hv_evmcs' use '-1' in 'vmx->nested.hv_evmcs_vmptr' to indicate 'evmcs is not in use' state. This matches how we check 'vmx->nested.current_vmptr'. Introduce EVMPTR_INVALID and evmptr_is_valid() and use it instead of raw '-1' check as a preparation to adding other 'special' values. No functional change intended. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <20210526132026.270394-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vineeth Pillai 提交于
Currently the remote TLB flush logic is specific to VMX. Move it to a common place so that SVM can use it as well. Signed-off-by: NVineeth Pillai <viremana@linux.microsoft.com> Message-Id: <4f4e4ca19778437dae502f44363a38e99e3ef5d1.1622730232.git.viremana@linux.microsoft.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Krish Sadhukhan 提交于
Currently, the 'nested_run' statistic counts all guest-entry attempts, including those that fail during vmentry checks on Intel and during consistency checks on AMD. Convert this statistic to count only those guest-entries that make it past these state checks and make it to guest code. This will tell us the number of guest-entries that actually executed or tried to execute guest code. Signed-off-by: NKrish Sadhukhan <Krish.Sadhukhan@oracle.com> Message-Id: <20210609180340.104248-2-krish.sadhukhan@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that .post_leave_smm() is gone, drop "pre_" from the remaining helpers. The helpers aren't invoked purely before SMI/RSM processing, e.g. both helpers are invoked after state is snapshotted (from regs or SMRAM), and the RSM helper is invoked after some amount of register state has been stuffed. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210609185619.992058-10-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Now that APICv/AVIC enablement is kept in common 'enable_apicv' variable, there's no need to call kvm_apicv_init() from vendor specific code. No functional change intended. Reviewed-by: NSean Christopherson <seanjc@google.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210609150911.1471882c-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Unify VMX and SVM code by moving APICv/AVIC enablement tracking to common 'enable_apicv' variable. Note: unlike APICv, AVIC is disabled by default. No functional change intended. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210609150911.1471882c-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ilias Stamatis 提交于
Currently vmx_vcpu_load_vmcs() writes the TSC_MULTIPLIER field of the VMCS every time the VMCS is loaded. Instead of doing this, set this field from common code on initialization and whenever the scaling ratio changes. Additionally remove vmx->current_tsc_ratio. This field is redundant as vcpu->arch.tsc_scaling_ratio already tracks the current TSC scaling ratio. The vmx->current_tsc_ratio field is only used for avoiding unnecessary writes but it is no longer needed after removing the code from the VMCS load path. Suggested-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NIlias Stamatis <ilstam@amazon.com> Message-Id: <20210607105438.16541-1-ilstam@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ilias Stamatis 提交于
The write_l1_tsc_offset() callback has a misleading name. It does not set L1's TSC offset, it rather updates the current TSC offset which might be different if a nested guest is executing. Additionally, both the vmx and svm implementations use the same logic for calculating the current TSC before writing it to hardware. Rename the function and move the common logic to the caller. The vmx/svm specific code now merely sets the given offset to the corresponding hardware structure. Signed-off-by: NIlias Stamatis <ilstam@amazon.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210526184418.28881-9-ilstam@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ilias Stamatis 提交于
In order to implement as much of the nested TSC scaling logic as possible in common code, we need these vendor callbacks for retrieving the TSC offset and the TSC multiplier that L1 has set for L2. Signed-off-by: NIlias Stamatis <ilstam@amazon.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210526184418.28881-7-ilstam@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ilias Stamatis 提交于
Store L1's scaling ratio in the kvm_vcpu_arch struct like we already do for L1's TSC offset. This allows for easy save/restore when we enter and then exit the nested guest. Signed-off-by: NIlias Stamatis <ilstam@amazon.com> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210526184418.28881-3-ilstam@amazon.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 10 6月, 2021 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple of warnings by explicitly adding break statements instead of just letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org> Message-Id: <20210528200756.GA39320@embeddedor> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 5月, 2021 1 次提交
-
-
由 Yuan Yao 提交于
The kvm_get_linear_rip() handles x86/long mode cases well and has better readability, __kvm_set_rflags() also use the paired function kvm_is_linear_rip() to check the vcpu->arch.singlestep_rip set in kvm_arch_vcpu_ioctl_set_guest_debug(), so change the "CS.BASE + RIP" code in kvm_arch_vcpu_ioctl_set_guest_debug() and handle_exception_nmi() to this one. Signed-off-by: NYuan Yao <yuan.yao@intel.com> Message-Id: <20210526063828.1173-1-yuan.yao@linux.intel.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 5月, 2021 1 次提交
-
-
由 Marcelo Tosatti 提交于
For VMX, when a vcpu enters HLT emulation, pi_post_block will: 1) Add vcpu to per-cpu list of blocked vcpus. 2) Program the posted-interrupt descriptor "notification vector" to POSTED_INTR_WAKEUP_VECTOR With interrupt remapping, an interrupt will set the PIR bit for the vector programmed for the device on the CPU, test-and-set the ON bit on the posted interrupt descriptor, and if the ON bit is clear generate an interrupt for the notification vector. This way, the target CPU wakes upon a device interrupt and wakes up the target vcpu. Problem is that pi_post_block only programs the notification vector if kvm_arch_has_assigned_device() is true. Its possible for the following to happen: 1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false, notification vector is not programmed 2) device is assigned to VM 3) device interrupts vcpu V, sets ON bit (notification vector not programmed, so pcpu P remains in idle) 4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set, kvm_vcpu_kick is skipped 5) vcpu 0 busy spins on vcpu V's response for several seconds, until RCU watchdog NMIs all vCPUs. To fix this, use the start_assignment kvm_x86_ops callback to kick vcpus out of the halt loop, so the notification vector is properly reprogrammed to the wakeup vector. Reported-by: NPei Zhang <pezhang@redhat.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Message-Id: <20210526172014.GA29007@fuller.cnet> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 5月, 2021 7 次提交
-
-
由 Paolo Bonzini 提交于
Bus lock debug exception is an ability to notify the kernel by an #DB trap after the instruction acquires a bus lock and is executed when CPL>0. This allows the kernel to enforce user application throttling or mitigations. Existence of bus lock debug exception is enumerated via CPUID.(EAX=7,ECX=0).ECX[24]. Software can enable these exceptions by setting bit 2 of the MSR_IA32_DEBUGCTL. Expose the CPUID to guest and emulate the MSR handling when guest enables it. Support for this feature was originally developed by Xiaoyao Li and Chenyi Qiang, but code has since changed enough that this patch has nothing in common with theirs, except for this commit message. Co-developed-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NXiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: NChenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210202090433.13441-4-chenyi.qiang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Squish the Intel and AMD emulation of MSR_TSC_AUX together and tie it to the guest CPU model instead of the host CPU behavior. While not strictly necessary to avoid guest breakage, emulating cross-vendor "architecture" will provide consistent behavior for the guest, e.g. WRMSR fault behavior won't change if the vCPU is migrated to a host with divergent behavior. Note, the "new" kvm_is_supported_user_return_msr() checks do not add new functionality on either SVM or VMX. On SVM, the equivalent was "tsc_aux_uret_slot < 0", and on VMX the check was buried in the vmx_find_uret_msr() call at the find_uret_msr label. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-15-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Now that SVM and VMX both probe MSRs before "defining" user return slots for them, consolidate the code for probe+define into common x86 and eliminate the odd behavior of having the vendor code define the slot for a given MSR. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-14-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the host. Crushing the write mask to '0' has the same effect, but requires more mental gymnastics to understand. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-12-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop VMX's global list of user return MSRs now that VMX doesn't resort said list to isolate "active" MSRs, i.e. now that VMX's list and x86's list have the same MSRs in the same order. In addition to eliminating the redundant list, this will also allow moving more of the list management into common x86. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-11-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Explicitly flag a uret MSR as needing to be loaded into hardware instead of resorting the list of "active" MSRs and tracking how many MSRs in total need to be loaded. The only benefit to sorting the list is that the loop to load MSRs during vmx_prepare_switch_to_guest() doesn't need to iterate over all supported uret MRS, only those that are active. But that is a pointless optimization, as the most common case, running a 64-bit guest, will load the vast majority of MSRs. Not to mention that a single WRMSR is far more expensive than iterating over the list. Providing a stable list order obviates the need to track a given MSR's "slot" in the per-CPU list of user return MSRs; all lists simply use the same ordering. Future patches will take advantage of the stable order to further simplify the related code. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-10-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Configure the list of user return MSRs that are actually supported at module init instead of reprobing the list of possible MSRs every time a vCPU is created. Curating the list on a per-vCPU basis is pointless; KVM is completely hosed if the set of supported MSRs changes after module init, or if the set of MSRs differs per physical PCU. The per-vCPU lists also increase complexity (see __vmx_find_uret_msr()) and creates corner cases that _should_ be impossible, but theoretically exist in KVM, e.g. advertising RDTSCP to userspace without actually being able to virtualize RDTSCP if probing MSR_TSC_AUX fails. Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-9-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-