- 11 Mar, 2014 1 commit
-
-
Committed by Jan Kiszka
Move the check for leaving L2 on pending and intercepted IRQs or NMIs from the *_allowed handler into a dedicated callback. Invoke this callback at the relevant points before KVM checks if IRQs/NMIs can be injected. The callback's task is to switch from L2 to L1 if needed and inject the proper vmexit events. The rework fixes L2 wakeups from HLT and provides the foundation for preemption timer emulation. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 Mar, 2014 1 commit
-
-
Committed by Paolo Bonzini
Commit e504c909 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13) highlighted a real problem, but the fix was subtly wrong. nested_read_cr0 is the CR0 as read by L2, but here we want to look at the CR0 value reflecting L1's setup. In other words, L2 might think that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually running it with TS=1, we should inject the fault into L1. The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, so use it. Fixes: e504c909 Reported-by: Kashyap Chamarty <kchamart@redhat.com> Reported-by: Stefan Bader <stefan.bader@canonical.com> Tested-by: Kashyap Chamarty <kchamart@redhat.com> Tested-by: Anthoine Bourgeois <bourgeois@bertin.fr> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
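The distinction drawn in this message can be sketched in a few lines of C. This is only an illustration, not the kernel code: `reflect_nm_to_l1` and its parameters are hypothetical names, and the only architectural facts assumed are that CR0.TS is bit 3 and that #NM is vector 7.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_TS    (1ULL << 3)  /* CR0.TS, task-switched flag */
#define NM_VECTOR 7            /* #NM, device-not-available */

/*
 * Hypothetical helper: decide whether an #NM taken while L2 runs should be
 * reflected to L1. vmcs12_guest_cr0 is the CR0 value L1 set up for L2,
 * not the value L2 sees when it reads CR0.
 */
static bool reflect_nm_to_l1(uint64_t vmcs12_guest_cr0,
                             uint32_t exception_bitmap)
{
    /* TS clear in L1's setup: L0 caused the fault (lazy FPU), so L0 handles it. */
    if (!(vmcs12_guest_cr0 & CR0_TS))
        return false;

    /* Otherwise L1 receives the fault only if it intercepts #NM. */
    return (exception_bitmap >> NM_VECTOR) & 1u;
}
```

Passing the CR0 value as read by L2 instead would make the first test succeed in cases where L1 really runs L2 with TS=1, which is the mistake the message describes.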
-
- 28 Feb, 2014 1 commit
-
-
Committed by Paolo Bonzini
Commit e504c909 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13) highlighted a real problem, but the fix was subtly wrong. nested_read_cr0 is the CR0 as read by L2, but here we want to look at the CR0 value reflecting L1's setup. In other words, L2 might think that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually running it with TS=1, we should inject the fault into L1. The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, so use it. Fixes: e504c909 Reported-by: Kashyap Chamarty <kchamart@redhat.com> Reported-by: Stefan Bader <stefan.bader@canonical.com> Tested-by: Kashyap Chamarty <kchamart@redhat.com> Tested-by: Anthoine Bourgeois <bourgeois@bertin.fr> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 26 Feb, 2014 1 commit
-
-
Committed by Liu, Jinsong
From 5d5a80cd172ea6fb51786369bcc23356b1e9e956 Mon Sep 17 00:00:00 2001 From: Liu Jinsong <jinsong.liu@intel.com> Date: Mon, 24 Feb 2014 18:11:55 +0800 Subject: [PATCH v5 2/3] KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save Add MSR_IA32_BNDCFGS to msrs_to_save, and the corresponding logic to kvm_get/set_msr(). Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 24 Feb, 2014 1 commit
-
-
Committed by Liu, Jinsong
From caddc009a6d2019034af8f2346b2fd37a81608d0 Mon Sep 17 00:00:00 2001 From: Liu Jinsong <jinsong.liu@intel.com> Date: Mon, 24 Feb 2014 18:11:11 +0800 Subject: [PATCH v5 1/3] KVM: x86: Intel MPX vmx and msr handle This patch handles the VMX and MSR parts of the Intel MPX feature. Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Liu Jinsong <jinsong.liu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 27 Jan, 2014 1 commit
-
-
Committed by Jan Kiszka
Check for invalid state transitions on guest-initiated updates of MSR_IA32_APICBASE. This addresses both enabling of the x2APIC when it is not supported and all invalid transitions as described in SDM section 10.12.5. It also checks that no reserved bit is set in APICBASE by the guest. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> [Use cpuid_maxphyaddr instead of guest_cpuid_get_phys_bits. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
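A minimal sketch of such a check follows. It is not the patch itself: `apicbase_write_invalid`, `has_x2apic`, and `maxphyaddr` are hypothetical names standing in for the guest's CPUID-derived properties, and the reserved-bit mask assumes the usual IA32_APIC_BASE layout (EXTD in bit 10, EN in bit 11, bits 0-7 and 9 reserved).

```c
#include <stdbool.h>
#include <stdint.h>

#define APICBASE_EXTD   (1ULL << 10)  /* x2APIC mode enable */
#define APICBASE_ENABLE (1ULL << 11)  /* APIC global enable */

/*
 * Hypothetical helper: returns true if a guest-initiated write of new_val
 * over old_val has to be rejected. Assumes maxphyaddr < 64.
 */
static bool apicbase_write_invalid(uint64_t old_val, uint64_t new_val,
                                   bool has_x2apic, unsigned maxphyaddr)
{
    const uint64_t mode = APICBASE_ENABLE | APICBASE_EXTD;
    uint64_t old_state = old_val & mode;
    uint64_t new_state = new_val & mode;
    /* Bits 0-7 and 9, plus everything at or above MAXPHYADDR. */
    uint64_t reserved = (~0ULL << maxphyaddr) | 0x2ffULL;

    if (!has_x2apic)
        reserved |= APICBASE_EXTD;    /* x2APIC not supported by the guest */

    if (new_val & reserved)
        return true;
    if (new_state == APICBASE_EXTD)   /* EXTD without EN is not a valid state */
        return true;
    if (new_state == APICBASE_ENABLE && old_state == mode)
        return true;                  /* x2APIC -> xAPIC */
    if (new_state == mode && old_state == 0)
        return true;                  /* disabled -> x2APIC */
    return false;
}
```

In this sketch, leaving x2APIC mode is only accepted by disabling the APIC entirely, and x2APIC can only be entered from xAPIC mode.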
-
- 17 Jan, 2014 8 commits
-
-
Committed by Jan Kiszka
Set the guest activity state in L1's VMCS according to the VCPU's mp_state. This ensures we report the correct state in case L2 executed HLT or if we put L2 into HLT state and it was now woken up by an event. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
When we suspend the guest in HLT state, the nested run is no longer pending - we emulated it completely. So only set nested_run_pending after checking the activity state. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
This simplifies the code and also stops issuing warnings about writing to unhandled MSRs when VMX is disabled or the Feature Control MSR is locked - we do handle them all according to the spec. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Already used by nested SVM for tracing nested vmexits: kvm_nested_vmexit marks exits from L2 to L0 while kvm_nested_vmexit_inject marks vmexits that are reflected to L1. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Instead of fixing up the vmcs12 after the nested vmexit, pass key parameters already when calling nested_vmx_vmexit. This will help with tracing those vmexits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave root and non-root mode, fully disabling VMX. The register state of the VCPU is undefined after this step, so userspace has to set it to a proper state afterward. This makes it possible to reboot a VM while it is running some hypervisor code. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
According to the SDM, only bits 0-3 of DR6 "may" be cleared by "certain" debug exceptions. So do update them on a #DB exception in KVM, but leave the rest alone, only setting BD and BS in addition to the bits already set in DR6. This also aligns us with kvm_vcpu_check_singlestep. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
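The update rule described here can be illustrated with a small helper. This is a sketch under assumptions, not the kernel code: `dr6_after_db` and `reported` are hypothetical names, and masking the reported bits to B0-B3, BD, and BS is a choice made for this example.

```c
#include <stdint.h>

#define DR6_TRAP_BITS 0xfULL        /* B0-B3, breakpoint condition bits */
#define DR6_BD        (1ULL << 13)  /* debug register access detected */
#define DR6_BS        (1ULL << 14)  /* single step */

/*
 * Hypothetical helper: merge the bits reported for a #DB exception into the
 * current DR6 value. Only B0-B3 are replaced; every other bit that was
 * already set stays set, and BD/BS can only be added.
 */
static uint64_t dr6_after_db(uint64_t dr6, uint64_t reported)
{
    dr6 &= ~DR6_TRAP_BITS;
    dr6 |= reported & (DR6_TRAP_BITS | DR6_BD | DR6_BS);
    return dr6;
}
```

For example, `dr6_after_db(0xffff0ff3, DR6_BS | 0x4)` drops the stale B0 and B1 bits, records B2 and BS, and leaves the remaining bits untouched.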
-
Committed by Jan Kiszka
In contrast to VMX, SVM does not automatically transfer DR6 into the VCPU's arch.dr6. So if we face a DR6 read, we must consult a new vendor hook to obtain the current value. And as SVM now picks the DR6 state from its VMCB, we also need a set callback in order to write updates of DR6 back. Fixes a regression of 020df079. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 09 Jan, 2014 2 commits
-
-
Committed by Marcelo Tosatti
After free_loaded_vmcs executes, the "loaded_vmcs" structure is kfreed, and now vmx->loaded_vmcs points to a kfreed area. A subsequent free_loaded_vmcs then attempts to manipulate vmx->loaded_vmcs. Switch the order to avoid the problem. https://bugzilla.redhat.com/show_bug.cgi?id=1047892 Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Committed by Zhihui Zhang
According to Table C-1 of Intel SDM 3C, a VM exit happens on an I/O instruction when the "use I/O bitmaps" VM-execution control was 0 _and_ the "unconditional I/O exiting" VM-execution control was 1. So we can't just check "unconditional I/O exiting" alone. This patch was improved following a suggestion from Jan Kiszka. Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Zhihui Zhang <zzhsuny@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
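The condition can be written out as a small predicate. This is a sketch, not the patched function: `io_insn_causes_exit` is a hypothetical name, `bitmap_hit` stands in for the result of looking up the port in the I/O bitmaps, and the bit positions are those of the primary processor-based VM-execution controls.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTRL_UNCOND_IO_EXITING (1u << 24)  /* "unconditional I/O exiting" */
#define CTRL_USE_IO_BITMAPS    (1u << 25)  /* "use I/O bitmaps" */

/*
 * Hypothetical helper: does an I/O instruction cause a VM exit?
 * bitmap_hit is assumed to be the outcome of the bitmap lookup for the port.
 */
static bool io_insn_causes_exit(uint32_t exec_ctrl, bool bitmap_hit)
{
    /* Without bitmaps, only the unconditional control decides. */
    if (!(exec_ctrl & CTRL_USE_IO_BITMAPS))
        return (exec_ctrl & CTRL_UNCOND_IO_EXITING) != 0;

    /* With bitmaps in use, the unconditional control is ignored. */
    return bitmap_hit;
}
```

Checking `CTRL_UNCOND_IO_EXITING` alone would wrongly report an exit when both controls are set and the bitmaps allow the port.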
-
- 02 Jan, 2014 1 commit
-
-
Committed by Jan Kiszka
Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway in case nested EPT wasn't in use. 2. This aligns VMX with SVM. But 3. is most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if one guest VCPU manipulates the page of another VCPU in L2, we may be fooled into skipping over nested_ept_uninit_mmu_context, leaving the mmu in nested state. That can crash the host later on if nested_ept_get_cr3 is invoked while L1 has already left vmxon and nested.current_vmcs12 has therefore become NULL. Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 21 Dec, 2013 1 commit
-
-
Committed by Jan Kiszka
If kvm_get_dr or kvm_set_dr reports that it raised a fault, we must not advance the instruction pointer. Otherwise the exception will hit the wrong instruction. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
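The ordering rule can be shown with a toy model. Everything here is hypothetical: `toy_vcpu`, `toy_set_dr`, and `toy_handle_dr_write` are illustrative stand-ins, and the fault condition (a register number outside 0-7) is invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_vcpu {
    uint64_t rip;          /* guest instruction pointer */
    bool fault_pending;    /* an exception was queued for the guest */
};

/* Stand-in for a setter that reports (nonzero) when it raised a fault. */
static int toy_set_dr(struct toy_vcpu *v, int dr, uint64_t val)
{
    (void)val;
    if (dr < 0 || dr > 7) {
        v->fault_pending = true;
        return 1;
    }
    return 0;
}

/* Emulate a debug register write of length insn_len bytes. */
static void toy_handle_dr_write(struct toy_vcpu *v, int dr, uint64_t val,
                                unsigned insn_len)
{
    /* On a fault, rip must still point at the faulting instruction. */
    if (toy_set_dr(v, dr, val))
        return;

    v->rip += insn_len;
}
```

If the skip happened unconditionally, the queued exception would be delivered with the instruction pointer already past the faulting instruction.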
-
- 18 Dec, 2013 1 commit
-
-
Committed by Jan Kiszka
It's a pathological case, but still a valid one: If L1 disables APIC virtualization and also allows L2 to directly write to the APIC page, we have to forcibly enable APIC virtualization while in L2 if the in-kernel APIC is in use. This allows running the direct interrupt test case in the vmx unit test without x2APIC. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 12 Dec, 2013 2 commits
-
-
Committed by Jan Kiszka
We can easily emulate the HLT activity state for L1: If it decides that L2 shall be halted on entry, just invoke the normal emulation of halt after switching to L2. We do not depend on specific host features to provide this, so we can expose the capability unconditionally. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Gleb Natapov
The VM_(ENTRY|EXIT)_CONTROLS vmcs fields are read/written on each guest entry, but most of the time this can be avoided since the values do not change. Keep a copy of the fields in memory to avoid unnecessary reads from the vmcs. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
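The caching idea can be sketched as a shadow copy that filters redundant accesses. This is an illustration only: `cached_ctrl` and its helpers are hypothetical names, and the `hw` member with its write counter merely simulates the VMCS field and the cost of touching it.

```c
#include <stdint.h>

struct cached_ctrl {
    uint32_t shadow;     /* in-memory copy, always equal to hw */
    uint32_t hw;         /* stand-in for the field inside the VMCS */
    unsigned hw_writes;  /* number of simulated VMCS writes */
};

static void ctrl_init(struct cached_ctrl *c, uint32_t val)
{
    c->hw = val;
    c->shadow = val;
    c->hw_writes = 1;
}

/* Reads never touch the simulated VMCS. */
static uint32_t ctrl_get(const struct cached_ctrl *c)
{
    return c->shadow;
}

/* Writes reach the simulated VMCS only when the value really changes. */
static void ctrl_set(struct cached_ctrl *c, uint32_t val)
{
    if (c->shadow == val)
        return;
    c->hw = val;
    c->hw_writes++;
    c->shadow = val;
}

static void ctrl_set_bits(struct cached_ctrl *c, uint32_t bits)
{
    ctrl_set(c, ctrl_get(c) | bits);
}

static void ctrl_clear_bits(struct cached_ctrl *c, uint32_t bits)
{
    ctrl_set(c, ctrl_get(c) & ~bits);
}
```

Setting a bit that is already set, or rewriting the current value, costs only a comparison against the shadow copy.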
-
- 14 Nov, 2013 1 commit
-
-
Committed by Anthoine Bourgeois
If a nested guest takes an NM fault but its CR0 doesn't contain the TS flag (because it was already cleared by the guest with L1's aid), then we have to activate the FPU ourselves in L0 and then continue to L2. If the TS flag is set, then we fall back on the previous behavior and forward the fault to L1 if it asked for it. Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 31 Oct, 2013 3 commits
-
-
Committed by Michael S. Tsirkin
mst can't be blamed for the lack of switch entries: the issue is actually with msrs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Alex Williamson
We currently use some ad-hoc arch variables tied to legacy KVM device assignment to manage emulation of instructions that depend on whether non-coherent DMA is present. Create an interface for this, adapting legacy KVM device assignment and adding VFIO via the KVM-VFIO device. For now we assume that non-coherent DMA is possible any time we have a VFIO group. Eventually an interface can be developed as part of the VFIO external user interface to query the coherency of a group. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Alex Williamson
Default to operating in coherent mode. This simplifies the logic when we switch to a model of registering and unregistering noncoherent I/O with KVM. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 28 Oct, 2013 3 commits
-
-
Committed by Jan Kiszka
If the host supports it, we can and should expose it to the guest as well, just like we already do with PIN_BASED_VIRTUAL_NMIS. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
__vmx_complete_interrupts stores uninjected NMIs in arch.nmi_injected, not arch.nmi_pending. So we actually need to check the former field in vmcs12_save_pending_event. This fixes the eventinj unit test when run in nested KVM. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
As long as the hardware provides us with 2MB EPT pages, we can also expose them to the guest because our shadow EPT code already supports this feature. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Oct, 2013 1 commit
-
-
Committed by Arthur Chunqi Li
This patch contains the following two changes: 1. Fix a bug in nested preemption timer support. If a vmexit L2->L0 happens for a reason not emulated by L1, the preemption timer value should be saved on such exits. 2. Add support for the "Save VMX-preemption timer value" VM-Exit control to nVMX. With this patch, nested VMX preemption timer features are fully supported. Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 10 Oct, 2013 1 commit
-
-
Committed by Gleb Natapov
72f85795 broke shadow on EPT. This patch reverts it and fixes PAE on nEPT (which the reverted commit fixed) in another way. Shadow on EPT is now broken because while L1 builds the shadow page table for L2 (which is PAE while L2 is in real mode) it never loads L2's GUEST_PDPTR[0-3]. They do not need to be loaded because without nested virtualization HW does this during guest entry if EPT is disabled, but in our case L0 emulates L2's vmentry while EPT is enabled, so we cannot rely on vmcs12->guest_pdptr[0-3] to contain up-to-date values and need to re-read PDPTEs from L2 memory. This is what kvm_set_cr3() is doing, but by clearing cache bits during L2 vmentry we drop the values that kvm_set_cr3() read from memory. So why does the same code not work for PAE on nEPT? kvm_set_cr3() reads pdptes into vcpu->arch.walk_mmu->pdptrs[]. walk_mmu points to vcpu->arch.nested_mmu while the nested guest is running, but ept_load_pdptrs() uses vcpu->arch.mmu, which contains incorrect values. Fix that by using walk_mmu in ept_(load|save)_pdptrs. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 Oct, 2013 1 commit
-
-
Committed by Paolo Bonzini
kvm_mmu initialization is mostly filling in function pointers; there is no way for it to fail. Clean up unused return values. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 30 Sep, 2013 4 commits
-
-
Committed by Gleb Natapov
If a #PF happens during delivery of an exception into L2 and L1 also does not have the page mapped in its shadow page table, then L0 needs to generate a vmexit to L2 with the original event in IDT_VECTORING_INFO, but the current code combines both exceptions and generates a #DF instead. Fix that by providing an nVMX-specific function to handle page faults during the page table walk that handles this case correctly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Gleb Natapov
All exceptions should be checked for intercept during delivery to L2, but we currently check only #PF. Drop nested_run_pending while we are at it since an exception cannot be injected during vmentry anyway. Signed-off-by: Gleb Natapov <gleb@redhat.com> [Renamed the nested_vmx_check_exception function. - Paolo] Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Gleb Natapov
If an exception causes a vmexit directly, it should not be reported in IDT_VECTORING_INFO during the exit. For that we need to be able to distinguish between an exception that is injected into the nested VM and one that is reinjected because its delivery failed. Fortunately we already have a mechanism to do so for nested SVM, so here we just use the correct function to requeue exceptions and make sure that a reinjected exception is not moved to IDT_VECTORING_INFO during vmexit emulation and not re-checked for interception during delivery. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Gleb Natapov
An EXIT_REASON_VMLAUNCH/EXIT_REASON_VMRESUME exit does not mean that the nested VM will actually run during the next entry. Move setting nested_run_pending closer to the vmentry emulation code and move its clearing close to vmexit to minimize the amount of code that will erroneously run with nested_run_pending set. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 25 Sep, 2013 5 commits
-
-
Committed by Gleb Natapov
Bit 12 is undefined in any of the following cases: - If the "NMI exiting" VM-execution control is 1 and the "virtual NMIs" VM-execution control is 0. - If the VM exit sets the valid bit in the IDT-vectoring information field. Signed-off-by: Gleb Natapov <gleb@redhat.com> [Add parentheses around & within && - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
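The two cases can be expressed as a predicate. This sketch rests on assumptions: the message does not name the field, and bit 12 is presumed here to be the one in the VM-exit interruption-information field. `bit12_is_defined` and the macro names are hypothetical; the bit positions follow the pin-based VM-execution controls and the IDT-vectoring information layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define PIN_NMI_EXITING  (1u << 3)   /* "NMI exiting" */
#define PIN_VIRTUAL_NMIS (1u << 5)   /* "virtual NMIs" */
#define VECTORING_VALID  (1u << 31)  /* valid bit, IDT-vectoring information */

/* Hypothetical helper: may bit 12 be interpreted for this VM exit? */
static bool bit12_is_defined(uint32_t pin_ctrl, uint32_t idt_vectoring_info)
{
    /* Case 1: NMI exiting without virtual NMIs. */
    if ((pin_ctrl & PIN_NMI_EXITING) && !(pin_ctrl & PIN_VIRTUAL_NMIS))
        return false;

    /* Case 2: the exit happened while delivering another event. */
    if (idt_vectoring_info & VECTORING_VALID)
        return false;

    return true;
}
```

The inner parentheses around each `&` match the bracketed note in the message about grouping `&` within `&&`.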
-
Committed by Jan Kiszka
Now that we provide EPT support, there is no reason to torture our guests by hiding the relieving unrestricted guest mode feature. We just need to relax the CR0 checks for always-on bits as PE and PG can now be switched off. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Implement and advertise VM_EXIT_SAVE_IA32_EFER. L0 traps EFER writes unconditionally, so we always find the current L2 value in the architectural state. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
Fiddling with CR3 for L2 is L1's job. It may set its own, different identity map or simply leave it alone if unrestricted guest mode is enabled. This also fixes reading back the current CR3 on L2 exits for reporting it to L1. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Jan Kiszka
kvm_set_cr0 performs checks on the state transition that may prevent loading L1's cr0. For now we rely on the hardware to catch invalid states loaded by L1 into its VMCS. Still, consistency checks on the host state part of the VMCS on guest entry will have to be improved later on. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-