- 05 4月, 2018 1 次提交
-
-
由 Sean Christopherson 提交于
Remove the WARN_ON in handle_ept_misconfig() as it is unnecessary and causes false positives. Return the unmodified result of kvm_mmu_page_fault() instead of converting a system error code to KVM_EXIT_UNKNOWN so that userspace sees the error code of the actual failure, not a generic "we don't know what went wrong". * kvm_mmu_page_fault() will WARN if reserved bits are set in the SPTEs, i.e. it covers the case where an EPT misconfig occurred because of a KVM bug. * The WARN_ON will fire on any system error code that is hit while handling the fault, e.g. -ENOMEM from mmu_topup_memory_caches() while handling a legitmate MMIO EPT misconfig or -EFAULT from kvm_handle_bad_page() if the corresponding HVA is invalid. In either case, userspace should receive the original error code and firing a warning is incorrect behavior as KVM is operating as designed. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 4月, 2018 1 次提交
-
-
由 Sean Christopherson 提交于
Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter an exception in Protected Mode while emulating guest due to invalid guest state. Unlike Big RM, KVM doesn't support emulating exceptions in PM, i.e. PM exceptions are always injected via the VMCS. Because we will never do VMRESUME due to emulation_required, the exception is never realized and we'll keep emulating the faulting instruction over and over until we receive a signal. Exit to userspace iff there is a pending exception, i.e. don't exit simply on a requested event. The purpose of this check and exit is to aid in debugging a guest that is in all likelihood already doomed. Invalid guest state in PM is extremely limited in normal operation, e.g. it generally only occurs for a few instructions early in BIOS, and any exception at this time is all but guaranteed to be fatal. Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly handled/emulated, while checking for vectored interrupts, e.g. INTR and NMI, without hitting false positives would add a fair amount of complexity for almost no benefit (getting hit by lightning seems more likely than encountering this specific scenario). Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an exception via the VMCS and emulation_required is true. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 29 3月, 2018 6 次提交
-
-
由 Liran Alon 提交于
When vCPU runs L2 and there is a pending event that requires to exit from L2 to L1 and nested_run_pending=1, vcpu_enter_guest() will request an immediate-exit from L2 (See req_immediate_exit). Since now handling of req_immediate_exit also makes sure to set KVM_REQ_EVENT, there is no need to also set it on vmx_vcpu_run() when nested_run_pending=1. This optimizes cases where VMRESUME was executed by L1 to enter L2 and there is no pending events that require exit from L2 to L1. Previously, this would have set KVM_REQ_EVENT unnecessarly. Signed-off-by: NLiran Alon <liran.alon@oracle.com> Reviewed-by: NNikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Liran Alon 提交于
For exceptions & NMIs events, KVM code use the following coding convention: *) "pending" represents an event that should be injected to guest at some point but it's side-effects have not yet occurred. *) "injected" represents an event that it's side-effects have already occurred. However, interrupts don't conform to this coding convention. All current code flows mark interrupt.pending when it's side-effects have already taken place (For example, bit moved from LAPIC IRR to ISR). Therefore, it makes sense to just rename interrupt.pending to interrupt.injected. This change follows logic of previous commit 664f8e26 ("KVM: X86: Fix loss of exception which has not yet been injected") which changed exception to follow this coding convention as well. It is important to note that in case !lapic_in_kernel(vcpu), interrupt.pending usage was and still incorrect. In this case, interrrupt.pending can only be set using one of the following ioctls: KVM_INTERRUPT, KVM_SET_VCPU_EVENTS and KVM_SET_SREGS. Looking at how QEMU uses these ioctls, one can see that QEMU uses them either to re-set an "interrupt.pending" state it has received from KVM (via KVM_GET_VCPU_EVENTS interrupt.pending or via KVM_GET_SREGS interrupt_bitmap) or by dispatching a new interrupt from QEMU's emulated LAPIC which reset bit in IRR and set bit in ISR before sending ioctl to KVM. So it seems that indeed "interrupt.pending" in this case is also suppose to represent "interrupt.injected". However, kvm_cpu_has_interrupt() & kvm_cpu_has_injectable_intr() is misusing (now named) interrupt.injected in order to return if there is a pending interrupt. This leads to nVMX/nSVM not be able to distinguish if it should exit from L2 to L1 on EXTERNAL_INTERRUPT on pending interrupt or should re-inject an injected interrupt. Therefore, add a FIXME at these functions for handling this issue. This patch introduce no semantics change. Signed-off-by: NLiran Alon <liran.alon@oracle.com> Reviewed-by: NNikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Vitaly Kuznetsov 提交于
Enlightened VMCS is just a structure in memory, the main benefit besides avoiding somewhat slower VMREAD/VMWRITE is using clean field mask: we tell the underlying hypervisor which fields were modified since VMEXIT so there's no need to inspect them all. Tight CPUID loop test shows significant speedup: Before: 18890 cycles After: 8304 cycles Static key is being used to avoid performance penalty for non-Hyper-V deployments. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Babu Moger 提交于
This patch brings some of the code from vmx to x86.h header file. Now, we can share this code between vmx and svm. Modified couple functions to make it common. Signed-off-by: NBabu Moger <babu.moger@amd.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Babu Moger 提交于
Get rid of ple_window_actual_max, because its benefits are really minuscule and the logic is complicated. The overflows(and underflow) are controlled in __ple_window_grow and _ple_window_shrink respectively. Suggested-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NBabu Moger <babu.moger@amd.com> [Fixed potential wraparound and change the max to UINT_MAX. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Babu Moger 提交于
The vmx module parameters are supposed to be unsigned variants. Also fixed the checkpatch errors like the one below. WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. +module_param(ple_gap, uint, S_IRUGO); Signed-off-by: NBabu Moger <babu.moger@amd.com> [Expanded uint to unsigned int in code. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 28 3月, 2018 1 次提交
-
-
由 Andi Kleen 提交于
KVM and perf have a special backdoor mechanism to report the IP for interrupts re-executed after vm exit. This works for the NMIs that perf normally uses. However when perf is in timer mode it doesn't work because the timer interrupt doesn't get this special treatment. This is common when KVM is running nested in another hypervisor which may not implement the PMU, so only timer mode is available. Call the functions to set up the backdoor IP also for non NMI interrupts. I renamed the functions to set up the backdoor IP reporting to be more appropiate for their new use. The SVM change is only compile tested. v2: Moved the functions inline. For the normal interrupt case the before/after functions are now called from x86.c, not arch specific code. For the NMI case we still need to call it in the architecture specific code, because it's already needed in the low level *_run functions. Signed-off-by: NAndi Kleen <ak@linux.intel.com> [Removed unnecessary calls from arch handle_external_intr. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 24 3月, 2018 3 次提交
-
-
由 Sean Christopherson 提交于
Add struct kvm_vmx, which wraps struct kvm, and a helper to_kvm_vmx() that retrieves 'struct kvm_vmx *' from 'struct kvm *'. Move the VMX specific variables out of kvm_arch and into kvm_vmx. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add kvm_x86_ops->set_identity_map_addr and set ept_identity_map_addr in VMX specific code so that ept_identity_map_addr can be moved out of 'struct kvm_arch' in a future patch. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Define kvm_arch_[alloc|free]_vm in x86 as pass through functions to new kvm_x86_ops vm_alloc and vm_free, and move the current allocation logic as-is to SVM and VMX. Vendor specific alloc/free functions set the stage for SVM/VMX wrappers of 'struct kvm', which will allow us to move the growing number of SVM/VMX specific member variables out of 'struct kvm_arch'. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 3月, 2018 1 次提交
-
-
由 Paolo Bonzini 提交于
Commit 2bb8cafe ("KVM: vVMX: signal failure for nested VMEntry if emulation_required", 2018-03-12) introduces a new error path which does not set *entry_failure_code. Fix that to avoid a leak of L0 stack to L1. Reported-by: NRadim Krčmář <rkrcmar@redhat.com> Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 3月, 2018 15 次提交
-
-
由 Vitaly Kuznetsov 提交于
vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so the context is pretty well defined and as we're past 'swapgs' MSR_GS_BASE should contain kernel's GS base which we point to irq_stack_union. Add new kernelmode_gs_base() API, irq_stack_union needs to be exported as KVM can be build as module. Acked-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
vmx_save_host_state() is only called from kvm_arch_vcpu_ioctl_run() so the context is pretty well defined. Read MSR_{FS,KERNEL_GS}_BASE from current->thread after calling save_fsgs() which takes care of X86_BUG_NULL_SEG case now and will do RD[FG,GS]BASE when FSGSBASE extensions are exposed to userspace (currently they are not). Acked-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Allow to disable pause loop exit/pause filtering on a per VM basis. If some VMs have dedicated host CPUs, they won't be negatively affected due to needlessly intercepted PAUSE instructions. Thanks to Jan H. Schönherr's initial patch. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
If host CPUs are dedicated to a VM, we can avoid VM exits on HLT. This patch adds the per-VM capability to disable them. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Allowing a guest to execute MWAIT without interception enables a guest to put a (physical) CPU into a power saving state, where it takes longer to return from than what may be desired by the host. Don't give a guest that power over a host by default. (Especially, since nothing prevents a guest from using MWAIT even when it is not advertised via CPUID.) Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Liran Alon 提交于
If KVM enable_vmware_backdoor module parameter is set, the commit change VMX to now intercept #GP instead of being directly deliviered from CPU to guest. It is done to support access to VMware backdoor I/O ports even if TSS I/O permission denies it. In that case: 1. A #GP will be raised and intercepted. 2. #GP intercept handler will simulate I/O port access instruction. 3. I/O port access instruction simulation will allow access to VMware backdoor ports specifically even if TSS I/O permission bitmap denies it. Note that the above change introduce slight performance hit as now #GPs are not deliviered directly from CPU to guest but instead cause #VMExit and instruction emulation. However, this behavior is introduced only when enable_vmware_backdoor KVM module parameter is set. Signed-off-by: NLiran Alon <liran.alon@oracle.com> Reviewed-by: NNikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Add kvm_fast_pio() to consolidate duplicate code in VMX and SVM. Unexport kvm_fast_pio_in() and kvm_fast_pio_out(). Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Fast emulation of processor I/O for IN was disabled on x86 (both VMX and SVM) some years ago due to a buggy implementation. The addition of kvm_fast_pio_in(), used by SVM, re-introduced (functional!) fast emulation of IN. Piggyback SVM's work and use kvm_fast_pio_in() on VMX instead of performing full emulation of IN. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Fail a nested VMEntry with EXIT_REASON_INVALID_STATE if L2 guest state is invalid, i.e. vmcs12 contained invalid guest state, and unrestricted guest is disabled in L0 (and by extension disabled in L1). WARN_ON_ONCE in handle_invalid_guest_state() if we're attempting to emulate L2, i.e. nested_run_pending is true, to aid debug in the (hopefully unlikely) scenario that we somehow skip the nested VMEntry consistency check, e.g. due to a L0 bug. Note: KVM relies on hardware to detect the scenario where unrestricted guest is enabled in L0 but disabled in L1 and vmcs12 contains invalid guest state, i.e. checking emulation_required in prepare_vmcs02 is required only to handle the case were unrestricted guest is disabled in L0 since L0 never actually attempts VMLAUNCH/VMRESUME with vmcs02. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
CR3 load/store exiting are always off when unrestricted guest is enabled. WARN on the associated CR3 VMEXIT to detect code that would re-introduce CR3 load/store exiting for unrestricted guest. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Sean Christopherson 提交于
Now CR3 is not forced to a host-controlled value when paging is disabled in an unrestricted guest, CR3 load/store exiting can be left disabled (for an unrestricted guest). And because CR0.WP and CR4.PAE/PSE are also not force to host-controlled values, all of ept_update_paging_mode_cr0() is no longer needed, i.e. skip ept_update_paging_mode_cr0() for an unrestricted guest. Because MOV CR3 no longer exits when paging is disabled for an unrestricted guest, vmx_decache_cr3() must always read GUEST_CR3 from the VMCS for an unrestricted guest. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Sean Christopherson 提交于
CR4.PAE - Unrestricted guest can only be enabled when EPT is enabled, and vmx_set_cr4() clears hardware CR0.PAE based on the guest's CR4.PAE, i.e. CR4.PAE always follows the guest's value when unrestricted guest is enabled. CR4.PSE - Unrestricted guest no longer uses the identity mapped IA32 page tables since CR0.PG can be cleared in hardware, thus there is no need to set CR4.PSE when paging is disabled in the guest (and EPT is enabled). Define KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST (to X86_CR4_VMXE) and use it in lieu of KVM_*MODE_VM_CR4_ALWAYS_ON when unrestricted guest is enabled, which removes the forcing of CR4.PAE. Skip the manipulation of CR4.PAE/PSE for EPT when unrestricted guest is enabled, as CR4.PAE isn't forced and so doesn't need to be manually cleared, and CR4.PSE does not need to be set when paging is disabled since the identity mapped IA32 page tables are not used. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Sean Christopherson 提交于
Unrestricted guest can only be enabled when EPT is enabled, and when EPT is enabled, ept_update_paging_mode_cr0() will clear hardware CR0.WP based on the guest's CR0.WP, i.e. CR0.WP always follows the guest's value when unrestricted guest is enabled. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Sean Christopherson 提交于
An unrestricted guest can run with hardware CR0.PG==0, i.e. IA32 paging disabled, in which case there is no need to load the guest's CR3 with identity mapped IA32 page tables since hardware will effectively ignore CR3. If unrestricted guest is enabled, don't configure the identity mapped IA32 page table and always load the guest's desired CR3. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Sean Christopherson 提交于
An unrestricted guest can run with CR0.PG==0 and/or CR0.PE==0, e.g. it can run in Real Mode without requiring host emulation. The RM TSS is only used for emulating RM, i.e. it will never be used when unrestricted guest is enabled and so doesn't need to be configured. Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 08 3月, 2018 1 次提交
-
-
由 Krish Sadhukhan 提交于
According to Intel SDM 26.2.1.1, the following rules should be enforced on vmentry: * If the "NMI exiting" VM-execution control is 0, "Virtual NMIs" VM-execution control must be 0. * If the “virtual NMIs” VM-execution control is 0, the “NMI-window exiting” VM-execution control must be 0. This patch enforces these rules when entering an L2 guest. Signed-off-by: NKrish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: NLiran Alon <liran.alon@oracle.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 07 3月, 2018 2 次提交
-
-
由 Paolo Bonzini 提交于
Use the new MSR feature framework to tell userspace which VMX capabilities are available for nested hypervisors. Before, these were only accessible with the KVM_GET_MSR VCPU ioctl, after VCPUs had been created. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
Move the MSRs to a separate struct, so that we can introduce a global instance and return it from the /dev/kvm KVM_GET_MSRS ioctl. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 02 3月, 2018 2 次提交
-
-
由 Wanpeng Li 提交于
Linux (among the others) has checks to make sure that certain features aren't enabled on a certain family/model/stepping if the microcode version isn't greater than or equal to a known good version. By exposing the real microcode version, we're preventing buggy guests that don't check that they are running virtualized (i.e., they should trust the hypervisor) from disabling features that are effectively not buggy. Suggested-by: NFilippo Sironi <sironi@amazon.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Tom Lendacky 提交于
Provide a new KVM capability that allows bits within MSRs to be recognized as features. Two new ioctls are added to the /dev/kvm ioctl routine to retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops callback is used to determine support for the listed MSR-based features. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [Tweaked documentation. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 24 2月, 2018 2 次提交
-
-
由 Chao Gao 提交于
Although L2 is in halt state, it will be in the active state after VM entry if the VM entry is vectoring according to SDM 26.6.2 Activity State. Halting the vcpu here means the event won't be injected to L2 and this decision isn't reported to L1. Thus L0 drops an event that should be injected to L2. Cc: Liran Alon <liran.alon@oracle.com> Reviewed-by: NLiran Alon <liran.alon@oracle.com> Signed-off-by: NChao Gao <chao.gao@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
L1 might want to use SECONDARY_EXEC_DESC, so we must not clear the VMCS bit if UMIP is not being emulated. We must still set the bit when emulating UMIP as the feature can be passed to L2 where L0 will do the emulation and because L2 can change CR4 without a VM exit, we should clear the bit if UMIP is disabled. Fixes: 0367f205 ("KVM: vmx: add support for emulating UMIP") Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 23 2月, 2018 2 次提交
-
-
由 Paolo Bonzini 提交于
vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving branch hints to the compiler can actually make a substantial cycle difference by keeping the fast path contiguous in memory. With this optimization, the retpoline-guest/retpoline-host case is about 50 cycles faster. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Paolo Bonzini 提交于
Having a paravirt indirect call in the IBRS restore path is not a good idea, since we are trying to protect from speculative execution of bogus indirect branch targets. It is also slower, so use native_wrmsrl() on the vmentry path too. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NJim Mattson <jmattson@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Fixes: d28b387f Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 2月, 2018 2 次提交
-
-
由 KarimAllah Ahmed 提交于
We either clear the CPU_BASED_USE_MSR_BITMAPS and end up intercepting all MSR accesses or create a valid L02 MSR bitmap and use that. This decision has to be made every time we evaluate whether we are going to generate the L02 MSR bitmap. Before commit: d28b387f ("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL") ... this was probably OK since the decision was always identical. This is no longer the case now since the MSR bitmap might actually change once we decide to not intercept SPEC_CTRL and PRED_CMD. Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan.van.de.ven@intel.com Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: kvm@vger.kernel.org Cc: sironi@amazon.de Link: http://lkml.kernel.org/r/1518305967-31356-6-git-send-email-dwmw@amazon.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 KarimAllah Ahmed 提交于
These two variables should check whether SPEC_CTRL and PRED_CMD are supposed to be passed through to L2 guests or not. While msr_write_intercepted_l01 would return 'true' if it is not passed through. So just invert the result of msr_write_intercepted_l01 to implement the correct semantics. Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Reviewed-by: NJim Mattson <jmattson@google.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan.van.de.ven@intel.com Cc: dave.hansen@intel.com Cc: kvm@vger.kernel.org Cc: sironi@amazon.de Fixes: 086e7d4118cc ("KVM: VMX: Allow direct access to MSR_IA32_SPEC_CTRL") Link: http://lkml.kernel.org/r/1518305967-31356-5-git-send-email-dwmw@amazon.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 04 2月, 2018 1 次提交
-
-
由 KarimAllah Ahmed 提交于
[ Based on a patch from Ashok Raj <ashok.raj@intel.com> ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. [dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDarren Kenny <darren.kenny@oracle.com> Reviewed-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: NJim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de
-