- 14 5月, 2020 7 次提交
-
-
由 Sean Christopherson 提交于
Skip the Indirect Branch Prediction Barrier that is triggered on a VMCS switch when temporarily loading vmcs02 to synchronize it to vmcs12, i.e. give copy_vmcs02_to_vmcs12_rare() the same treatment as vmx_switch_vmcs(). Make vmx_vcpu_load() static now that it's only referenced within vmx.c. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200506235850.22600-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Skip the Indirect Branch Prediction Barrier that is triggered on a VMCS switch when running with spectre_v2_user=on/auto if the switch is between two VMCSes in the same guest, i.e. between vmcs01 and vmcs02. The IBPB is intended to prevent one guest from attacking another, which is unnecessary in the nested case as it's the same guest from KVM's perspective. This all but eliminates the overhead observed for nested VMX transitions when running with CONFIG_RETPOLINE=y and spectre_v2_user=on/auto, which can be significant, e.g. roughly 3x on current systems. Reported-by: NAlexander Graf <graf@amazon.com> Cc: KarimAllah Raslan <karahmed@amazon.de> Cc: stable@vger.kernel.org Fixes: 15d45071 ("KVM/x86: Add IBPB support") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200501163117.4655-1-sean.j.christopherson@intel.com> [Invert direction of bool argument. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Check for an unblocked SMI in vmx_check_nested_events() so that pending SMIs are correctly prioritized over IRQs and NMIs when the latter events will trigger VM-Exit. This also fixes an issue where an SMI that was marked pending while processing a nested VM-Enter wouldn't trigger an immediate exit, i.e. would be incorrectly delayed until L2 happened to take a VM-Exit. Fixes: 64d60670 ("KVM: x86: stubs for SMM support") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423022550.15113-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Short circuit vmx_check_nested_events() if an unblocked IRQ/NMI is pending and needs to be injected into L2, priority between coincident events is not dependent on exiting behavior. Fixes: b6b8a145 ("KVM: nVMX: Rework interception of IRQs and NMIs") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423022550.15113-9-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Report NMIs as allowed when the vCPU is in L2 and L2 is being run with Exit-on-NMI enabled, as NMIs are always unblocked from L1's perspective in this case. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423022550.15113-7-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a kvm_x86_ops hook to detect a nested pending "hypervisor timer" and use it to effectively open a window for servicing the expired timer. Like pending SMIs on VMX, opening a window simply means requesting an immediate exit. This fixes a bug where an expired VMX preemption timer (for L2) will be delayed and/or lost if a pending exception is injected into L2. The pending exception is rightly prioritized by vmx_check_nested_events() and injected into L2, with the preemption timer left pending. Because no window opened, L2 is free to run uninterrupted. Fixes: f4124500 ("KVM: nVMX: Fully emulate preemption timer") Reported-by: NJim Mattson <jmattson@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Peter Shier <pshier@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423022550.15113-3-sean.j.christopherson@intel.com> [Check it in kvm_vcpu_has_events too, to ensure that the preemption timer is serviced promptly even if the vCPU is halted and L1 is not intercepting HLT. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Short circuit vmx_check_nested_events() if an exception is pending and needs to be injected into L2, priority between coincident events is not dependent on exiting behavior. This fixes a bug where a single-step #DB that is not intercepted by L1 is incorrectly dropped due to servicing a VMX Preemption Timer VM-Exit. Injected exceptions also need to be blocked if nested VM-Enter is pending or an exception was already injected, otherwise injecting the exception could overwrite an existing event injection from L1. Technically, this scenario should be impossible, i.e. KVM shouldn't inject its own exception during nested VM-Enter. This will be addressed in a future patch. Note, event priority between SMI, NMI and INTR is incorrect for L2, e.g. SMI should take priority over VM-Exit on NMI/INTR, and NMI that is injected into L2 should take priority over VM-Exit INTR. This will also be addressed in a future patch. Fixes: b6b8a145 ("KVM: nVMX: Rework interception of IRQs and NMIs") Reported-by: NJim Mattson <jmattson@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Peter Shier <pshier@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423022550.15113-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 5月, 2020 1 次提交
-
-
由 Sean Christopherson 提交于
Use BUG() in the impossible-to-hit default case when switching on the scope of INVEPT to squash a warning with clang 11 due to clang treating the BUG_ON() as conditional. >> arch/x86/kvm/vmx/nested.c:5246:3: warning: variable 'roots_to_free' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] BUG_ON(1); Reported-by: Nkbuild test robot <lkp@intel.com> Fixes: ce8fe7b7 ("KVM: nVMX: Free only the affected contexts when emulating INVEPT") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200504153506.28898-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 4月, 2020 1 次提交
-
-
由 Sean Christopherson 提交于
Use an unsigned long for 'exit_qual' in nested_vmx_reflect_vmexit(), the EXIT_QUALIFICATION field is naturally sized, not a 32-bit field. The bug is most easily observed by doing VMXON (or any VMX instruction) in L2 with a negative displacement, in which case dropping the upper bits on nested VM-Exit results in L1 calculating the wrong virtual address for the memory operand, e.g. "vmxon -0x8(%rbp)" yields: Unhandled cpu exception 14 #PF at ip 0000000000400553 rbp=0000000000537000 cr2=0000000100536ff8 Fixes: fbdd5025 ("KVM: nVMX: Move VM-Fail check out of nested_vmx_exit_reflected()") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200423001127.13490-1-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 24 4月, 2020 1 次提交
-
-
由 Sean Christopherson 提交于
Drop nested_vmx_l1_wants_exit()'s initialization of intr_info from vmx_get_intr_info() that was inadvertantly introduced along with the caching mechanism. EXIT_REASON_EXCEPTION_NMI, the only consumer of intr_info, populates the variable before using it. Fixes: bb53120d67cd ("KVM: VMX: Cache vmcs.EXIT_INTR_INFO using arch avail_reg flags") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200421075328.14458-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 4月, 2020 1 次提交
-
-
由 Paolo Bonzini 提交于
Clean up some of the patching of kvm_x86_ops, by moving kvm_x86_ops related to nested virtualization into a separate struct. As a result, these ops will always be non-NULL on VMX. This is not a problem: * check_nested_events is only called if is_guest_mode(vcpu) returns true * get_nested_state treats VMXOFF state the same as nested being disabled * set_nested_state fails if you attempt to set nested state while nesting is disabled * nested_enable_evmcs could already be called on a CPU without VMX enabled in CPUID. * nested_get_evmcs_version was fixed in the previous patch Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 4月, 2020 27 次提交
-
-
由 Sean Christopherson 提交于
Remove all references to cr3_target_value[0-3] and replace the fields in vmcs12 with "dead_space" to preserve the vmcs12 layout. KVM doesn't support emulating CR3-target values, despite a variety of code that implies otherwise, as KVM unconditionally reports '0' for the number of supported CR3-target values. This technically fixes a bug where KVM would incorrectly allow VMREAD and VMWRITE to nonexistent fields, i.e. cr3_target_value[0-3]. Per Intel's SDM, the number of supported CR3-target values reported in VMX_MISC also enumerates the existence of the associated VMCS fields: If a future implementation supports more than 4 CR3-target values, they will be encoded consecutively following the 4 encodings given here. Alternatively, the "bug" could be fixed by actually advertisting support for 4 CR3-target values, but that'd likely just enable kvm-unit-tests given that no one has complained about lack of support for going on ten years, e.g. KVM, Xen and HyperV don't use CR3-target values. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200416000739.9012-1-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Introduce a new "extended register" type, EXIT_INFO_2 (to pair with the nomenclature in .get_exit_info()), and use it to cache VMX's vmcs.EXIT_INTR_INFO. Drop a comment in vmx_recover_nmi_blocking() that is obsoleted by the generic caching mechanism. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415203454.8296-6-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Introduce a new "extended register" type, EXIT_INFO_1 (to pair with the nomenclature in .get_exit_info()), and use it to cache VMX's vmcs.EXIT_QUALIFICATION. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415203454.8296-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the call to vmx_segment_cache_clear() in vmx_switch_vmcs() now that the entire register cache is reset when switching the active VMCS, e.g. vmx_segment_cache_test_set() will reset the segment cache due to VCPU_EXREG_SEGMENTS being unavailable. Move vmx_segment_cache_clear() to vmx.c now that it's no longer invoked by the nested code. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415203454.8296-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Reset the per-vCPU available and dirty register masks when switching between vmcs01 and vmcs02, as the masks track state relative to the current VMCS. The stale masks don't cause problems in the current code base because the registers are either unconditionally written on nested transitions or, in the case of segment registers, have an additional tracker that is manually reset. Note, by dropping (previously implicitly, now explicitly) the dirty mask when switching the active VMCS, KVM is technically losing writes to the associated fields. But, the only regs that can be dirtied (RIP, RSP and PDPTRs) are unconditionally written on nested transitions, e.g. explicit writeback is a waste of cycles, and a WARN_ON would be rather pointless. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415203454.8296-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Invoke ept_save_pdptrs() when restoring L1's host state on a "late" VM-Fail if and only if PAE paging is enabled. This saves a CALL in the common case where L1 is a 64-bit host, and avoids incorrectly marking the PDPTRs as dirty. WARN if ept_save_pdptrs() is called with PAE disabled now that the nested usage pre-checks is_pae_paging(). Barring a bug in KVM's MMU, attempting to read the PDPTRs with PAE disabled is now impossible. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415203454.8296-2-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use "vm_exit_reason" for code related to injecting a nested VM-Exit to VM-Exits to make it clear that nested_vmx_vmexit() expects the full exit eason, not just the basic exit reason. The basic exit reason (bits 15:0 of vmcs.VM_EXIT_REASON) is colloquially referred to as simply "exit reason". Note, other flows, e.g. vmx_handle_exit(), are intentionally left as is. A future patch will convert vmx->exit_reason to a union + bit-field, and the exempted flows will interact with the unionized of "exit_reason". Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Explicitly check only the basic exit reason when emulating an external interrupt VM-Exit in nested_vmx_vmexit(). Checking the full exit reason doesn't currently cause problems, but only because the only exit reason modifier support by KVM is FAILED_VMENTRY, which is mutually exclusive with EXTERNAL_INTERRUPT. Future modifiers, e.g. ENCLAVE_MODE, will coexist with EXTERNAL_INTERRUPT. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-9-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Grab the exit reason from the vcpu struct in nested_vmx_reflect_vmexit() instead of having the exit reason explicitly passed from the caller. This fixes a discrepancy between VM-Fail and VM-Exit handling, as the VM-Fail case is already handled by checking vcpu_vmx, e.g. the exit reason previously passed on the stack is bogus if vmx->fail is set. Not taking the exit reason on the stack also avoids having to document that nested_vmx_reflect_vmexit() requires the full exit reason, as opposed to just the basic exit reason, which is not at all obvious since the only usages of the full exit reason are for tracing and way down in prepare_vmcs12() where it's propagated to vmcs12. No functional change intended. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-8-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Drop the WARN in nested_vmx_reflect_vmexit() that fires if KVM attempts to reflect an external interrupt. The WARN is blatantly impossible to hit now that nested_vmx_l0_wants_exit() is called from nested_vmx_reflect_vmexit() unconditionally returns true for EXTERNAL_INTERRUPT. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-7-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Split the logic that determines whether a nested VM-Exit is reflected into L1 into "L0 wants" and "L1 wants" to document the core control flow at a high level. If L0 wants the VM-Exit, e.g. because the exit is due to a hardware event that isn't passed through to L1, then KVM should handle the exit in L0 without considering L1's configuration. Then, if L0 doesn't want the exit, KVM needs to query L1's wants to determine whether or not L1 "caused" the exit, e.g. by setting an exiting control, versus the exit occurring due to an L0 setting, e.g. when L0 intercepts an action that L1 chose to pass-through. Note, this adds an extra read on vmcs.VM_EXIT_INTR_INFO for exception. This will be addressed in a future patch via a VMX-wide enhancement, rather than pile on another case where vmx->exit_intr_info is conditionally available. Suggested-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-6-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move the tracepoint for nested VM-Exits in preparation of splitting the reflection logic into L1 wants the exit vs. L0 always handles the exit. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-5-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Check for VM-Fail on nested VM-Enter in nested_vmx_reflect_vmexit() in preparation for separating nested_vmx_exit_reflected() into separate "L0 wants exit exit" and "L1 wants the exit" helpers. Explicitly set exit_intr_info and exit_qual to zero instead of reading them from vmcs02, as they are invalid on VM-Fail (and thankfully ignored by nested_vmx_vmexit() for nested VM-Fail). Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-4-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Uninline nested_vmx_reflect_vmexit() in preparation of refactoring nested_vmx_exit_reflected() to split up the reflection logic into more consumable chunks, e.g. VM-Fail vs. L1 wants the exit vs. L0 always handles the exit. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200415175519.14230-3-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Rename functions and variables in kvm_mmu_new_cr3() and related code to replace "cr3" with "pgd", i.e. continue the work started by commit 727a7e27 ("KVM: x86: rename set_cr3 callback and related flags to load_mmu_pgd"). kvm_mmu_new_cr3() and company are not always loading a new CR3, e.g. when nested EPT is enabled "cr3" is actually an EPTP. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-37-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add logic to handle_invept() to free only those roots that match the target EPT context when emulating a single-context INVEPT. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-36-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Unconditionally skip the TLB flush triggered when reusing a root for a nested transition as nested_vmx_transition_tlb_flush() ensures the TLB is flushed when needed, regardless of whether the MMU can reuse a cached root (or the last root). Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-35-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Skip the MMU sync when reusing a cached root if EPT is enabled or L1 enabled VPID for L2. If EPT is enabled, guest-physical mappings aren't flushed even if VPID is disabled, i.e. L1 can't expect stale TLB entries to be flushed if it has enabled EPT and L0 isn't shadowing PTEs (for L1 or L2) if L1 has EPT disabled. If VPID is enabled (and EPT is disabled), then L1 can't expect stale TLB entries to be flushed (for itself or L2). Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-34-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a separate "skip" override for MMU sync, a future change to avoid TLB flushes on nested VMX transitions may need to sync the MMU even if the TLB flush is unnecessary. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-32-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Defer reloading L1's APIC page by logging the need for a reload and processing it during nested VM-Exit instead of unconditionally reloading the APIC page on nested VM-Exit. This eliminates a TLB flush on the majority of VM-Exits as the APIC page rarely needs to be reloaded. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-28-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Flush only the current context, as opposed to all contexts, when requesting a TLB flush to handle the scenario where a L1 does not expect a TLB flush, but one is required because L1 and L2 shared an ASID. This occurs if EPT is disabled (no per-EPTP tag), VPID is enabled (hardware doesn't flush unconditionally) and vmcs02 does not have its own VPID due to exhaustion of available VPIDs. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-27-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add KVM_REQ_TLB_FLUSH_CURRENT to allow optimized TLB flushing of VMX's EPTP/VPID contexts[*] from the KVM MMU and/or in a deferred manner, e.g. to flush L2's context during nested VM-Enter. Convert KVM_REQ_TLB_FLUSH to KVM_REQ_TLB_FLUSH_CURRENT in flows where the flush is directly associated with vCPU-scoped instruction emulation, i.e. MOV CR3 and INVPCID. Add a comment in vmx_vcpu_load_vmcs() above its KVM_REQ_TLB_FLUSH to make it clear that it deliberately requests a flush of all contexts. Service any pending flush request on nested VM-Exit as it's possible a nested VM-Exit could occur after requesting a flush for L2. Add the same logic for nested VM-Enter even though it's _extremely_ unlikely for flush to be pending on nested VM-Enter, but theoretically possible (in the future) due to RSM (SMM) emulation. [*] Intel also has an Address Space Identifier (ASID) concept, e.g. EPTP+VPID+PCID == ASID, it's just not documented in the SDM because the rules of invalidation are different based on which piece of the ASID is being changed, i.e. whether the EPTP, VPID, or PCID context must be invalidated. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-25-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add a helper to determine whether or not a full TLB flush needs to be performed on nested VM-Enter/VM-Exit, as the logic is identical for both flows and needs a fairly beefy comment to boot. This also provides a common point to make future adjustments to the logic. Handle vpid12 changes the new helper as well even though it is specific to VM-Enter. The vpid12 logic is an extension of the flushing logic, and it's worth the extra bool parameter to provide a single location for the flushing logic. Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-24-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Move nested_get_vpid02() to vmx/nested.h so that a future patch can reference it from vmx.c to implement context-specific TLB flushing. No functional change intended. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-20-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use vpid_sync_vcpu_addr() to emulate the "individual address" variant of INVVPID now that said function handles the fallback case of the (host) CPU not supporting "individual address". Note, the "vpid == 0" checks in the vpid_sync_*() helpers aren't actually redundant with the "!operand.vpid" check in handle_invvpid(), as the vpid passed to vpid_sync_vcpu_addr() is a KVM (host) controlled value, i.e. vpid02 can be zero even if operand.vpid is non-zero. No functional change intended. Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-14-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Use vpid_sync_context() directly for flows that run if and only if enable_vpid=1, or more specifically, nested VMX flows that are gated by vmx->nested.msrs.secondary_ctls_high.SECONDARY_EXEC_ENABLE_VPID being set, which is allowed if and only if enable_vpid=1. Because these flows call __vmx_flush_tlb() with @invalidate_gpa=false, the if-statement that decides between INVEPT and INVVPID will always go down the INVVPID path, i.e. call vpid_sync_context() because "enable_ept && (invalidate_gpa || !enable_vpid)" always evaluates false. This helps pave the way toward removing @invalidate_gpa and @vpid from __vmx_flush_tlb() and its callers. Opportunstically drop unnecessary brackets in handle_invvpid() around an affected __vmx_flush_tlb()->vpid_sync_context() conversion. No functional change intended. Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-10-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
When injecting a page fault or EPT violation/misconfiguration, KVM is not syncing any shadow PTEs associated with the faulting address, including those in previous MMUs that are associated with L1's current EPTP (in a nested EPT scenario), nor is it flushing any hardware TLB entries. All this is done by kvm_mmu_invalidate_gva. Page faults that are either !PRESENT or RSVD are exempt from the flushing, as the CPU is not allowed to cache such translations. Signed-off-by: NJunaid Shahid <junaids@google.com> Co-developed-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-8-sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 4月, 2020 2 次提交
-
-
由 Junaid Shahid 提交于
Free all roots when emulating INVVPID for L1 and EPT is disabled, as outstanding changes to the page tables managed by L1 need to be recognized. Because L1 and L2 share an MMU when EPT is disabled, and because VPID is not tracked by the MMU role, all roots in the current MMU (root_mmu) need to be freed, otherwise a future nested VM-Enter or VM-Exit could do a fast CR3 switch (without a flush/sync) and consume stale SPTEs. Fixes: 5c614b35 ("KVM: nVMX: nested VPID emulation") Signed-off-by: NJunaid Shahid <junaids@google.com> [sean: ported to upstream KVM, reworded the comment and changelog] Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-5-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Free all L2 (guest_mmu) roots when emulating INVEPT for L1. Outstanding changes to the EPT tables managed by L1 need to be recognized, and relying on KVM to always flush L2's EPTP context on nested VM-Enter is dangerous. Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote TLB flush if necessary, e.g. if L1 has never entered L2 then there is nothing to be done. Nuking all L2 roots is overkill for the single-context variant, but it's the safe and easy bet. A more precise zap mechanism will be added in the future. Add a TODO to call out that KVM only needs to invalidate affected contexts. Fixes: 14c07ad8 ("x86/kvm/mmu: introduce guest_mmu") Reported-by: NJim Mattson <jmattson@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200320212833.3507-4-sean.j.christopherson@intel.com> Reviewed-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-