- 05 11月, 2015 1 次提交
-
-
由 Kai Huang 提交于
I found PML was broken since below commit: commit feda805f Author: Xiao Guangrong <guangrong.xiao@linux.intel.com> Date: Wed Sep 9 14:05:55 2015 +0800 KVM: VMX: unify SECONDARY_VM_EXEC_CONTROL update Unify the update in vmx_cpuid_update() Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> [Rewrite to use vmcs_set_secondary_exec_control. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> The reason is in above commit vmx_cpuid_update calls vmx_secondary_exec_control, in which currently SECONDARY_EXEC_ENABLE_PML bit is cleared unconditionally (as PML is enabled in creating vcpu). Therefore if vcpu_cpuid_update is called after vcpu is created, PML will be disabled unexpectedly while log-dirty code still thinks PML is used. Fix this by clearing SECONDARY_EXEC_ENABLE_PML in vmx_secondary_exec_control only when PML is not supported or not enabled (!enable_pml). This is more reasonable as PML is currently either always enabled or disabled. With this explicit updating SECONDARY_EXEC_ENABLE_PML in vmx_enable{disable}_pml is not needed so also rename vmx_enable{disable}_pml to vmx_create{destroy}_pml_buffer. Fixes: feda805fSigned-off-by: NKai Huang <kai.huang@linux.intel.com> [While at it, change a wrong ASSERT to an "if". The condition can happen if creating the VCPU fails with ENOMEM. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 11月, 2015 10 次提交
-
-
由 Laszlo Ersek 提交于
Commit b18d5431 ("KVM: x86: fix CR0.CD virtualization") was technically correct, but it broke OVMF guests by slowing down various parts of the firmware. Commit fb279950 ("KVM: vmx: obey KVM_QUIRK_CD_NW_CLEARED") quirked the first function modified by b18d5431, vmx_get_mt_mask(), for OVMF's sake. This restored the speed of the OVMF code that runs before PlatformPei (including the memory intensive LZMA decompression in SEC). This patch extends the quirk to the second function modified by b18d5431, kvm_set_cr0(). It eliminates the intrusive slowdown that hits the EFI_MP_SERVICES_PROTOCOL implementation of edk2's UefiCpuPkg/CpuDxe -- which is built into OVMF --, when CpuDxe starts up all APs at once for initialization, in order to count them. We also carry over the kvm_arch_has_noncoherent_dma() sub-condition from the other half of the original commit b18d5431. Fixes: b18d5431 Cc: stable@vger.kernel.org Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: NXiao Guangrong <guangrong.xiao@linux.intel.com> Tested-by: NJanusz Mocek <januszmk6@gmail.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com># Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The SDM says that exiting system management mode from 64-bit mode is invalid, but that would be too good to be true. But actually, most of the code is already there to support exiting from compat mode (EFER.LME=1, EFER.LMA=0). Getting all the way from 64-bit mode to real mode only requires clearing CS.L and CR4.PCIDE. Cc: stable@vger.kernel.org Fixes: 660a5d51Tested-by: NLaszlo Ersek <lersek@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
The comment in code had it mostly right, but we enable paging for emulated real mode regardless of EPT. Without EPT (which implies emulated real mode), secondary VCPUs won't start unless we disable SM[AE]P when the guest doesn't use paging. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The function is not used outside device assignment, and kvm_arch_set_irq_inatomic has a different prototype. Move it here and make it static to avoid confusion. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The symbols are always defined. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
We do not want to do too much work in atomic context, in particular not walking all the VCPUs of the virtual machine. So we want to distinguish the architecture-specific injection function for irqfd from kvm_set_msi. Since it's still empty, reuse the newly added kvm_arch_set_irq and rename it to kvm_arch_set_irq_inatomic. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
BSP doesn't get INIT so its apic_arb_prio isn't zeroed after reboot. BSP won't get lowest priority interrupts until other VCPUs get enough interrupts to match their pre-reboot apic_arb_prio. That behavior doesn't fit into KVM's round-robin-like interpretation of lowest priority delivery ... userspace should KVM_SET_LAPIC on reset, so just zero apic_arb_prio there. Reported-by: NYuki Shibuya <shibuya.yk@ncos.nec.co.jp> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
GET_SMSTATE depends on real mode to ensure that smbase+offset is treated as a physical address, which has already caused a bug after shuffling the code. Enforce physical addressing. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reported-by: NLaszlo Ersek <lersek@redhat.com> Tested-by: NLaszlo Ersek <lersek@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
We want to read the physical memory when emulating RSM. X86EMUL_IO_NEEDED is returned on all errors for consistency with other helpers. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Tested-by: NLaszlo Ersek <lersek@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Saurabh Sengar 提交于
removing unused variables, found by coccinelle Signed-off-by: NSaurabh Sengar <saurabh.truth@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 19 10月, 2015 2 次提交
-
-
由 Takuya Yoshikawa 提交于
Commit fd136902 ("KVM: x86: MMU: Move mapping_level_dirty_bitmap() call in mapping_level()") forgot to initialize force_pt_level to false in FNAME(page_fault)() before calling mapping_level() like nonpaging_map() does. This can sometimes result in forcing page table level mapping unnecessarily. Fix this and move the first *force_pt_level check in mapping_level() before kvm_vcpu_gfn_to_memslot() call to make it a bit clearer that the variable must be initialized before mapping_level() gets called. This change can also avoid calling kvm_vcpu_gfn_to_memslot() when !check_hugepage_cache_consistency() check in tdp_page_fault() forces page table level mapping. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Not zeroing EFER means that a 32-bit firmware cannot enter paging mode without clearing EFER.LME first (which it should not know about). Yang Zhang from Intel confirmed that the manual is wrong and EFER is cleared to zero on INIT. Fixes: d28bc9dd Cc: stable@vger.kernel.org Cc: Yang Z Zhang <yang.z.zhang@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 10月, 2015 11 次提交
-
-
由 Marcelo Tosatti 提交于
As reported at https://bugs.launchpad.net/qemu/+bug/1494350, it is possible to have vcpu->arch.st.last_steal initialized from a thread other than vcpu thread, say the iothread, via KVM_SET_MSRS. Which can cause an overflow later (when subtracting from vcpu threads sched_info.run_delay). To avoid that, move steal time accumulation to vcpu entry time, before copying steal time data to guest. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Reviewed-by: NDavid Matlack <dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
Calling kvm_vcpu_gfn_to_memslot() twice in mapping_level() should be avoided since getting a slot by binary search may not be negligible, especially for virtual machines with many memory slots. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
Now that it has only one caller, and its name is not so helpful for readers, remove it. The new memslot_valid_for_gpte() function makes it possible to share the common code between gfn_to_memslot_dirty_bitmap() and mapping_level(). Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
This is necessary to eliminate an extra memory slot search later. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
As a bonus, an extra memory slot search can be eliminated when is_self_change_mapping is true. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
This will be passed to a function later. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Joerg Roedel 提交于
Currently we always write the next_rip of the shadow vmcb to the guests vmcb when we emulate a vmexit. This could confuse the guest when its cpuid indicated no support for the next_rip feature. Fix this by only propagating next_rip if the guest actually supports it. Cc: Bandan Das <bsd@redhat.com> Cc: Dirk Mueller <dmueller@suse.com> Tested-By: NDirk Mueller <dmueller@suse.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The loop is computing one of two constants, it can be simpler to write everything inline. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Expose VPID capability to L1. For nested guests, we don't do anything specific for single context invalidation. Hence, only advertise support for global context invalidation. The major benefit of nested VPID comes from having separate vpids when switching between L1 and L2, and also when L2's vCPUs not sched in/out on L1. Reviewed-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
VPID is used to tag address space and avoid a TLB flush. Currently L0 use the same VPID to run L1 and all its guests. KVM flushes VPID when switching between L1 and L2. This patch advertises VPID to the L1 hypervisor, then address space of L1 and L2 can be separately treated and avoid TLB flush when swithing between L1 and L2. For each nested vmentry, if vpid12 is changed, reuse shadow vpid w/ an invvpid. Performance: run lmbench on L2 w/ 3.5 kernel. Context switching - times in microseconds - smaller is better ------------------------------------------------------------------------- Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw --------- ------------- ------ ------ ------ ------ ------ ------- ------- kernel Linux 3.5.0-1 1.2200 1.3700 1.4500 4.7800 2.3300 5.60000 2.88000 nested VPID kernel Linux 3.5.0-1 1.2600 1.4300 1.5600 12.7 12.9 3.49000 7.46000 vanilla Reviewed-by: NJan Kiszka <jan.kiszka@siemens.com> Reviewed-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Add the INVVPID instruction emulation. Reviewed-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 10月, 2015 11 次提交
-
-
由 Wanpeng Li 提交于
Introduce __vmx_flush_tlb() to handle specific vpid. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Adjust allocate/free_vid so that they can be reused for the nested vpid. Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
On real hardware, edge-triggered interrupts don't set a bit in TMR, which means that IOAPIC isn't notified on EOI. Do the same here. Staying in guest/kernel mode after edge EOI is what we want for most devices. If some bugs could be nicely worked around with edge EOI notifications, we should invest in a better interface. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
KVM uses eoi_exit_bitmap to track vectors that need an action on EOI. The problem is that IOAPIC can be reconfigured while an interrupt with old configuration is pending and eoi_exit_bitmap only remembers the newest configuration; thus EOI from the pending interrupt is not recognized. (Reconfiguration is not a problem for level interrupts, because IOAPIC sends interrupt with the new configuration.) For an edge interrupt with ACK notifiers, like i8254 timer; things can happen in this order 1) IOAPIC inject a vector from i8254 2) guest reconfigures that vector's VCPU and therefore eoi_exit_bitmap on original VCPU gets cleared 3) guest's handler for the vector does EOI 4) KVM's EOI handler doesn't pass that vector to IOAPIC because it is not in that VCPU's eoi_exit_bitmap 5) i8254 stops working A simple solution is to set the IOAPIC vector in eoi_exit_bitmap if the vector is in PIR/IRR/ISR. This creates an unwanted situation if the vector is reused by a non-IOAPIC source, but I think it is so rare that we don't want to make the solution more sophisticated. The simple solution also doesn't work if we are reconfiguring the vector. (Shouldn't happen in the wild and I'd rather fix users of ACK notifiers instead of working around that.) The are no races because ioapic injection and reconfig are locked. Fixes: b053b2ae ("KVM: x86: Add EOI exit bitmap inference") [Before b053b2ae, this bug happened only with APICv.] Fixes: c7c9c56c ("x86, apicv: add virtual interrupt delivery support") Cc: <stable@vger.kernel.org> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
After moving PIR to IRR, the interrupt needs to be delivered manually. Reported-by: NPaolo Bonzini <pbonzini@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
In order to get into 64-bit protected mode, you need to enable paging while EFER.LMA=1. For this to work, CS.L must be 0. Currently, we load the segments before CR0 and CR4, which means that if RSM returns into 64-bit protected mode CS.L is already 1 and everything breaks. Luckily, CS.L=0 is always the case when executing RSM, because it is forbidden to execute RSM from 64-bit protected mode. Hence it is enough to load CR0 and CR4 first, and only then the segments. Fixes: 660a5d51 Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Unfortunately I only noticed this after pushing. Fixes: f0d648bd Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
An SMI to a halted VCPU must wake it up, hence a VCPU with a pending SMI must be considered runnable. Fixes: 64d60670 Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Split the huge conditional in two functions. Fixes: 64d60670 Cc: stable@vger.kernel.org Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Otherwise, two copies (one of them never populated and thus bogus) are allocated for the regular and SMM address spaces. This breaks SMM with EPT but without unrestricted guest support, because the SMM copy of the identity page map is all zeros. By moving the allocation to the caller we also remove the last vestiges of kernel-allocated memory regions (not accessible anymore in userspace since commit b74a07be, "KVM: Remove kernel-allocated memory regions", 2010-06-21); that is a nice bonus. Reported-by: NAlexandre DERUMIER <aderumier@odiso.com> Cc: stable@vger.kernel.org Fixes: 9da0e4d5Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The next patch will make x86_set_memory_region fill the userspace_addr. Since the struct is not used untouched anymore, it makes sense to build it in x86_set_memory_region directly; it also simplifies the callers. Reported-by: NAlexandre DERUMIER <aderumier@odiso.com> Cc: stable@vger.kernel.org Fixes: 9da0e4d5Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 01 10月, 2015 5 次提交
-
-
由 Feng Wu 提交于
This patch updates the Posted-Interrupts Descriptor when vCPU is blocked. pre-block: - Add the vCPU to the blocked per-CPU list - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR post-block: - Remove the vCPU from the per-CPU list Signed-off-by: NFeng Wu <feng.wu@intel.com> [Concentrate invocation of pre/post-block hooks to vcpu_block. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Feng Wu 提交于
This patch updates the Posted-Interrupts Descriptor when vCPU is preempted. sched out: - Set 'SN' to suppress furture non-urgent interrupts posted for the vCPU. sched in: - Clear 'SN' - Change NDST if vCPU is scheduled to a different CPU - Set 'NV' to POSTED_INTR_VECTOR Signed-off-by: NFeng Wu <feng.wu@intel.com> [Include asm/cpu.h to fix !CONFIG_SMP compilation. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Feng Wu 提交于
Select IRQ_BYPASS_MANAGER for x86 when CONFIG_KVM is set Signed-off-by: NFeng Wu <feng.wu@intel.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Feng Wu 提交于
This patch adds the routine to update IRTE for posted-interrupts when guest changes the interrupt configuration. Signed-off-by: NFeng Wu <feng.wu@intel.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> [Squashed in automatically generated patch from the build robot "KVM: x86: vcpu_to_pi_desc() can be static" - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Feng Wu 提交于
Make kvm_set_msi_irq() public, we can use this function outside. Signed-off-by: NFeng Wu <feng.wu@intel.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NAlex Williamson <alex.williamson@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-