- 29 7月, 2013 4 次提交
-
-
由 Gleb Natapov 提交于
During nested vmentry into vm86 mode a vcpu state is found to be incorrect because rflags does not have VM flag set since it is read from the cache and has L1's value instead of L2's. If emulate_invalid_guest_state=1 L0 KVM tries to emulate it, but emulation does not work for nVMX and it never should happen anyway. Fix that by using vmx_set_rflags() to set rflags during nested vmentry which takes care of updating register cache. Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This lets debugging work better during emulation of invalid guest state. This time the check is done after emulation, but before writeback of the flags; we need to check the flags *before* execution of the instruction, we cannot check singlestep_rip because the CS base may have already been modified. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Conflicts: arch/x86/kvm/x86.c
-
由 Paolo Bonzini 提交于
This lets debugging work better during emulation of invalid guest state. The check is done before emulating the instruction, and (in the case of guest debugging) reuses EMULATE_DO_MMIO to exit with KVM_EXIT_DEBUG. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
The next patch will reuse it for other userspace exits than MMIO, namely debug events. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 7月, 2013 2 次提交
-
-
由 Jan Kiszka 提交于
Both have no users anymore. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Jan Kiszka 提交于
If posted interrupts are enabled, we can no longer track if an IRQ was coalesced based on IRR. So drop this logic also from the classic software path and simplify apic_test_and_set_irr to apic_set_irr. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 20 7月, 2013 1 次提交
-
-
由 Andi Kleen 提交于
[KVM maintainers: The underlying support for this is in perf/core now. So please merge this patch into the KVM tree.] This is not arch perfmon, but older CPUs will just ignore it. This makes it possible to do at least some TSX measurements from a KVM guest v2: Various fixes to address review feedback v3: Ignore the bits when no CPUID. No #GP. Force raw events with TSX bits. v4: Use reserved bits for #GP v5: Remove obsolete argument Acked-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 7月, 2013 9 次提交
-
-
由 Arthur Chunqi Li 提交于
When L2 exits to L1, segment infomations of L1 are not set correctly. According to Intel SDM 27.5.2(Loading Host Segment and Descriptor Table Registers), segment base/limit/access right of L1 should be set to some designed value when L2 exits to L1. This patch fixes this. Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com> Reviewed-by: NGleb Natapov <gnatapov@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Nadav Har'El 提交于
Fix read/write to IA32_FEATURE_CONTROL MSR in nested environment. This patch simulate this MSR in nested_vmx and the default value is 0x0. BIOS should set it to 0x5 before VMXON. After setting the lock bit, write to it will cause #GP(0). Another QEMU patch is also needed to handle emulation of reset and migration. Reset to vCPU should clear this MSR and migration should reserve value of it. This patch is based on Nadav's previous commit. http://permalink.gmane.org/gmane.comp.emulators.kvm.devel/88478Signed-off-by: NNadav Har'El <nyh@math.technion.ac.il> Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Mathias Krause 提交于
Void pointers don't need no casting, drop it. Signed-off-by: NMathias Krause <minipli@googlemail.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Mathias Krause 提交于
Use a const pointer type instead of casting away the const qualifier from const arrays. Keep the pointer array on the stack, nonetheless. Making it static just increases the object size. Signed-off-by: NMathias Krause <minipli@googlemail.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Arthur Chunqi Li 提交于
Set rflags after successfully emulateing VMXON/VMXOFF in VMX. Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Arthur Chunqi Li 提交于
Move nested_vmx_succeed/nested_vmx_failInvalid/nested_vmx_failValid ahead of handle_vmon to eliminate double declaration in the same file Signed-off-by: NArthur Chunqi Li <yzt356@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
Now that kvm_arch_memslots_updated() catches every increment of the memslots->generation, checking if the mmio generation has reached its maximum value is enough. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Takuya Yoshikawa 提交于
This is called right after the memslots is updated, i.e. when the result of update_memslots() gets installed in install_new_memslots(). Since the memslots needs to be updated twice when we delete or move a memslot, kvm_arch_commit_memory_region() does not correspond to this exactly. In the following patch, x86 will use this new API to check if the mmio generation has reached its maximum value, in which case mmio sptes need to be flushed out. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Acked-by: NAlexander Graf <agraf@suse.de> Reviewed-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Currently, fast page fault incorrectly tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen). It then returns to guest to retry the fault since it sees the last spte is nonpresent. This causes an infinite loop. Since fast page fault only works for direct mmu, the issue exists when 1) tdp is enabled. It is only triggered only on AMD host since on Intel host the mmio page fault is recognized as ept-misconfig whose handler call fault-page path with error_code = 0 2) guest paging is disabled. Under this case, the issue is hardly discovered since paging disable is short-lived and the sptes will be invalid after memslot changed for 150 times Fix it by filtering out MMIO page faults in page_fault_can_be_fast. Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Tested-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 04 7月, 2013 1 次提交
-
-
由 Gleb Natapov 提交于
Some userspaces do not preserve unusable property. Since usable segment has to be present according to VMX spec we can use present property to amend userspace bug by making unusable segment always nonpresent. vmx_segment_access_rights() already marks nonpresent segment as unusable. Cc: stable@vger.kernel.org # 3.9+ Reported-by: NStefan Pietsch <stefan.pietsch@lsexperts.de> Tested-by: NStefan Pietsch <stefan.pietsch@lsexperts.de> Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 27 6月, 2013 10 次提交
-
-
由 Gleb Natapov 提交于
This reverts most of the f1ed0450. After the commit kvm_apic_set_irq() no longer returns accurate information about interrupt injection status if injection is done into disabled APIC. RTC interrupt coalescing tracking relies on the information to be accurate and cannot recover if it is not. Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Yoshihiro YUNOMAE 提交于
Add a tracepoint write_tsc_offset for tracing TSC offset change. We want to merge ftrace's trace data of guest OSs and the host OS using TSC for timestamp in chronological order. We need "TSC offset" values for each guest when merge those because the TSC value on a guest is always the host TSC plus guest's TSC offset. If we get the TSC offset values, we can calculate the host TSC value for each guest events from the TSC offset and the event TSC value. The host TSC values of the guest events are used when we want to merge trace data of guests and the host in chronological order. (Note: the trace_clock of both the host and the guest must be set x86-tsc in this case) This tracepoint also records vcpu_id which can be used to merge trace data for SMP guests. A merge tool will read TSC offset for each vcpu, then the tool converts guest TSC values to host TSC values for each vcpu. TSC offset is stored in the VMCS by vmx_write_tsc_offset() or vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots. The latter function is executed when kvm clock is updated. Only host can read TSC offset value from VMCS, so a host needs to output TSC offset value when TSC offset is changed. Since the TSC offset is not often changed, it could be overwritten by other frequent events while tracing. To avoid that, I recommend to use a special instance for getting this event: 1. set a instance before booting a guest # cd /sys/kernel/debug/tracing/instances # mkdir tsc_offset # cd tsc_offset # echo x86-tsc > trace_clock # echo 1 > events/kvm/kvm_write_tsc_offset/enable 2. boot a guest Signed-off-by: NYoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Takuya Yoshikawa 提交于
Without this information, users will just see unexpected performance problems and there is little chance we will get good reports from them: note that mmio generation is increased even when we just start, or stop, dirty logging for some memory slot, in which case users cannot expect all shadow pages to be zapped. printk_ratelimited() is used for this taking into account the problems that we can see the information many times when we start multiple VMs and guests can trigger this by reading ROM in a loop for example. Signed-off-by: NTakuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Drop kvm_mmu_zap_mmio_sptes and use kvm_mmu_invalidate_zap_all_pages instead to handle mmio generation number overflow Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Then it has the chance to trigger mmio generation number wrap-around Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> [Change from MMIO_MAX_GEN - 13 to MMIO_MAX_GEN - 150, because 13 is very close to the number of calls to KVM_SET_USER_MEMORY_REGION before the guest is started and there is any chance to create any spte. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
It is useful for debug mmio spte invalidation Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
This patch tries to introduce a very simple and scale way to invalidate all mmio sptes - it need not walk any shadow pages and hold mmu-lock KVM maintains a global mmio valid generation-number which is stored in kvm->memslots.generation and every mmio spte stores the current global generation-number into his available bits when it is created When KVM need zap all mmio sptes, it just simply increase the global generation-number. When guests do mmio access, KVM intercepts a MMIO #PF then it walks the shadow page table and get the mmio spte. If the generation-number on the spte does not equal the global generation-number, it will go to the normal #PF handler to update the mmio spte Since 19 bits are used to store generation-number on mmio spte, we zap all mmio sptes when the number is round Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Define some meaningful names instead of raw code Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Xiao Guangrong 提交于
Store the generation-number into bit3 ~ bit11 and bit52 ~ bit61, totally 19 bits can be used, it should be enough for nearly all most common cases In this patch, the generation-number is always 0, it will be changed in the later patch [Gleb: masking generation bits from spte in get_mmio_spte_gfn() and get_mmio_spte_access()] Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 26 6月, 2013 2 次提交
-
-
由 H. Peter Anvin 提交于
Rename X86_CR4_RDWRGSFS to X86_CR4_FSGSBASE to match the SDM. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Link: http://lkml.kernel.org/n/tip-buq1evi5dpykxx7ak6amaam0@git.kernel.org
-
由 H. Peter Anvin 提交于
Bit 1 in the x86 EFLAGS is always set. Name the macro something that actually tries to explain what it is all about, rather than being a tautology. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Gleb Natapov <gleb@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/n/tip-f10rx5vjjm6tfnt8o1wseb3v@git.kernel.org
-
- 21 6月, 2013 1 次提交
-
-
由 Xiao Guangrong 提交于
Let mmio spte only use bit62 and bit63 on upper 32 bits, then bit 52 ~ bit 61 can be used for other purposes Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NGleb Natapov <gleb@redhat.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 6月, 2013 1 次提交
-
-
由 Zhanghaoyu (A) 提交于
__kvm_set_xcr function does the CPL check when set xcr. __kvm_set_xcr is called in two flows, one is invoked by guest, call stack shown as below, handle_xsetbv(or xsetbv_interception) kvm_set_xcr __kvm_set_xcr the other one is invoked by host, for example during system reset: kvm_arch_vcpu_ioctl kvm_vcpu_ioctl_x86_set_xcrs __kvm_set_xcr The former does need the CPL check, but the latter does not. Cc: stable@vger.kernel.org Signed-off-by: NZhang Haoyu <haoyu.zhang@huawei.com> [Tweaks to commit message. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 6月, 2013 1 次提交
-
-
由 Marcelo Tosatti 提交于
Its possible that idivl overflows (due to large delta stored in usdiff, valid scenario). Create an exception handler to catch the overflow exception (division by zero is protected by vcpu->arch.virtual_tsc_khz check), and interpret it accordingly (delta is larger than USEC_PER_SEC). Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 05 6月, 2013 8 次提交
-
-
由 Gleb Natapov 提交于
Quote Gleb's mail: | why don't we check for sp->role.invalid in | kvm_mmu_prepare_zap_page before calling kvm_reload_remote_mmus()? and | Actually we can add check for is_obsolete_sp() there too since | kvm_mmu_invalidate_all_pages() already calls kvm_reload_remote_mmus() | after incrementing mmu_valid_gen. [ Xiao: add some comments and the check of is_obsolete_sp() ] Signed-off-by: NGleb Natapov <gleb@redhat.com> Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
As Marcelo pointed out that | "(retention of large number of pages while zapping) | can be fatal, it can lead to OOM and host crash" We introduce a list, kvm->arch.zapped_obsolete_pages, to link all the pages which are deleted from the mmu cache but not actually freed. When page reclaiming is needed, we always zap this kind of pages first. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
kvm_zap_obsolete_pages uses lock-break technique to zap pages, it will flush tlb every time when it does lock-break We can reload mmu on all vcpus after updating the generation number so that the obsolete pages are not used on any vcpus, after that we do not need to flush tlb when obsolete pages are zapped It will do kvm_mmu_prepare_zap_page many times and use one kvm_mmu_commit_zap_page to collapse tlb flush, the side-effects is that causes obsolete pages unlinked from active_list but leave on hash-list, so we add the comment around the hash list walker Note: kvm_mmu_commit_zap_page is still needed before free the pages since other vcpus may be doing locklessly shadow page walking Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Zap at lease 10 pages before releasing mmu-lock to reduce the overload caused by requiring lock After the patch, kvm_zap_obsolete_pages can forward progress anyway, so update the comments [ It improves the case 0.6% ~ 1% that do kernel building meanwhile read PCI ROM. ] Note: i am not sure that "10" is the best speculative value, i just guessed that '10' can make vcpu do not spend long time on kvm_zap_obsolete_pages and do not cause mmu-lock too hungry. Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
The obsolete page will be zapped soon, do not reuse it to reduce future page fault Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
It is good for debug and development Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Show sp->mmu_valid_gen Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Xiao Guangrong 提交于
Replace kvm_mmu_zap_all by kvm_mmu_invalidate_zap_all_pages Signed-off-by: NXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-