- 24 8月, 2017 1 次提交
-
-
由 Janakarajan Natarajan 提交于
Enable the Virtual GIF feature. This is done by setting bit 25 at position 60h in the vmcb. With this feature enabled, the processor uses bit 9 at position 60h as the virtual GIF when executing STGI/CLGI instructions. Since the execution of STGI by the L1 hypervisor does not cause a return to the outermost (L0) hypervisor, the enable_irq_window and enable_nmi_window are modified. The IRQ window will be opened even if GIF is not set, under the assumption that on resuming the L1 hypervisor the IRQ will be held pending until the processor executes the STGI instruction. For the NMI window, the STGI intercept is set. This will assist in opening the window only when GIF=1. Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 8月, 2017 2 次提交
-
-
由 Denys Vlasenko 提交于
With lightly tweaked defconfig: text data bss dec hex filename 11259661 5109408 2981888 19350957 12745ad vmlinux.before 11259661 5109408 884736 17253805 10745ad vmlinux.after Only compile-tested. Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Brijesh Singh 提交于
When a guest causes a page fault which requires emulation, the vcpu->arch.gpa_available flag is set to indicate that cr2 contains a valid GPA. Currently, emulator_read_write_onepage() makes use of gpa_available flag to avoid a guest page walk for a known MMIO regions. Lets not limit the gpa_available optimization to just MMIO region. The patch extends the check to avoid page walk whenever gpa_available flag is set. Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> [Fix EPT=0 according to Wanpeng Li's fix, plus ensure VMX also uses the new code. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> [Moved "ret < 0" to the else brach, as per David's review. - Radim] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 08 8月, 2017 2 次提交
-
-
由 Longpeng(Mike) 提交于
get_cpl requires vcpu_load, so we must cache the result (whether the vcpu was preempted when its cpl=0) in kvm_vcpu_arch. Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Longpeng(Mike) 提交于
If a vcpu exits due to request a user mode spinlock, then the spinlock-holder may be preempted in user mode or kernel mode. (Note that not all architectures trap spin loops in user mode, only AMD x86 and ARM/ARM64 currently do). But if a vcpu exits in kernel mode, then the holder must be preempted in kernel mode, so we should choose a vcpu in kernel mode as a more likely candidate for the lock holder. This introduces kvm_arch_vcpu_in_kernel() to decide whether the vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's new argument says the same of the spinning VCPU. Signed-off-by: NLongpeng(Mike) <longpeng2@huawei.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 8月, 2017 2 次提交
-
-
由 Radim Krčmář 提交于
Add guest_cpuid_clear() and use it instead of kvm_find_cpuid_entry(). Also replace some uses of kvm_find_cpuid_entry() with guest_cpuid_has(). Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
This patch turns guest_cpuid_has_XYZ(cpuid) into guest_cpuid_has(cpuid, X86_FEATURE_XYZ), which gets rid of many very similar helpers. When seeing a X86_FEATURE_*, we can know which cpuid it belongs to, but this information isn't in common code, so we recreate it for KVM. Add some BUILD_BUG_ONs to make sure that it runs nicely. Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 02 8月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
There are three issues in nested_vmx_check_exception: 1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported by Wanpeng Li; 2) it should rebuild the interruption info and exit qualification fields from scratch, as reported by Jim Mattson, because the values from the L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes a page fault, the EPT misconfig's exit qualification is incorrect). 3) CR2 and DR6 should not be written for exception intercept vmexits (CR2 only for AMD). This patch fixes the first two and adds a comment about the last, outlining the fix. Cc: Jim Mattson <jmattson@google.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 7月, 2017 3 次提交
-
-
由 Wanpeng Li 提交于
Add an nested_apf field to vcpu->arch.exception to identify an async page fault, and constructs the expected vm-exit information fields. Force a nested VM exit from nested_vmx_check_exception() if the injected #PF is async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
This patch adds the L1 guest async page fault #PF vmexit handler, such by L1 similar to ordinary async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Passed insn parameters to kvm_mmu_page_fault().] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
This patch removes all arguments except the first in kvm_x86_ops->queue_exception since they can extract the arguments from vcpu->arch.exception themselves. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 13 7月, 2017 4 次提交
-
-
由 Janakarajan Natarajan 提交于
Enable the Virtual VMLOAD VMSAVE feature. This is done by setting bit 1 at position B8h in the vmcb. The processor must have nested paging enabled, be in 64-bit mode and have support for the Virtual VMLOAD VMSAVE feature for the bit to be set in the vmcb. Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Janakarajan Natarajan 提交于
Rename the lbr_ctl variable to better reflect the purpose of the field - provide support for virtualization extensions. Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Janakarajan Natarajan 提交于
The lbr_ctl variable in the vmcb control area is used to enable or disable Last Branch Record (LBR) virtualization. However, this is to be done using only bit 0 of the variable. To correct this and to prepare for a new feature, change the current usage to work only on a particular bit. Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Ladi Prosek 提交于
kvm_skip_emulated_instruction handles the singlestep debug exception which is something we almost always want. This commit (specifically the change in rdmsr_interception) makes the debug.flat KVM unit test pass on AMD. Two call sites still call skip_emulated_instruction directly: * In svm_queue_exception where it's used only for moving the rip forward * In task_switch_interception which is analogous to handle_task_switch in VMX Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 30 6月, 2017 1 次提交
-
-
由 Josh Poimboeuf 提交于
In preparation for an objtool rewrite which will have broader checks, whitelist functions and files which cause problems because they do unusual things with the stack. These whitelists serve as a TODO list for which functions and files don't yet have undwarf unwinder coverage. Eventually most of the whitelists can be removed in favor of manual CFI hint annotations or objtool improvements. Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 6月, 2017 5 次提交
-
-
由 Ladi Prosek 提交于
enable_nmi_window is supposed to be a no-op if we know that we'll see a VM exit by the time the NMI window opens. This commit adds two more cases: * We intercept stgi so we don't need to singlestep on GIF=0. * We emulate nested vmexit so we don't need to singlestep when nested VM exit is required. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ladi Prosek 提交于
Singlestepping is enabled by setting the TF flag and care must be taken to not let the guest see (and reuse at an inconvenient time) the modified rflag value. One such case is event injection, as part of which flags are pushed on the stack and restored later on iret. This commit disables singlestepping when we're about to inject an event and forces an immediate exit for us to re-evaluate the NMI related state. Suggested-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ladi Prosek 提交于
These flags are used internally by SVM so it's cleaner to not leak them to callers of svm_get_rflags. This is similar to how the TF flag is handled on KVM_GUESTDBG_SINGLESTEP by kvm_get_rflags and kvm_set_rflags. Without this change, the flags may propagate from host VMCB to nested VMCB or vice versa while singlestepping over a nested VM enter/exit, and then get stuck in inappropriate places. Example: NMI singlestepping is enabled while running L1 guest. The instruction to step over is VMRUN and nested vmrun emulation stashes rflags to hsave->save.rflags. Then if singlestepping is disabled while still in L2, TF/RF will be cleared from the nested VMCB but the next nested VM exit will restore them from hsave->save.rflags and cause an unexpected DB exception. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ladi Prosek 提交于
Nested hypervisor should not see singlestep VM exits if singlestepping was enabled internally by KVM. Windows is particularly sensitive to this and known to bluescreen on unexpected VM exits. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ladi Prosek 提交于
Just moving the code to a new helper in preparation for following commits. Signed-off-by: NLadi Prosek <lprosek@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 01 6月, 2017 2 次提交
-
-
由 Dan Carpenter 提交于
I moved the || to the line before. Also I replaced some spaces with a tab on the "return 0;" line. It looks OK in the diff but originally that line was only indented 7 spaces. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Roman Pen 提交于
This is a fix for the problem [1], where VMCB.CPL was set to 0 and interrupt was taken on userspace stack. The root cause lies in the specific AMD CPU behaviour which manifests itself as unusable segment attributes on SYSRET. The corresponding work around for the kernel is the following: 61f01dd9 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue") In other turn virtualization side treated unusable segment incorrectly and restored CPL from SS attributes, which were zeroed out few lines above. In current patch it is assured only that P bit is cleared in VMCB.save state and segment attributes are not zeroed out if segment is not presented or is unusable, therefore CPL can be safely restored from DPL field. This is only one part of the fix, since QEMU side should be fixed accordingly not to zero out attributes on its side. Corresponding patch will follow. [1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com Signed-off-by: NRoman Pen <roman.penyaev@profitbricks.com> Signed-off-by: NMikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim KrÄmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 30 5月, 2017 1 次提交
-
-
由 Gioh Kim 提交于
Commit 19bca6ab ("KVM: SVM: Fix cross vendor migration issue with unusable bit") added checking type when setting unusable. So unusable can be set if present is 0 OR type is 0. According to the AMD processor manual, long mode ignores the type value in segment descriptor. And type can be 0 if it is read-only data segment. Therefore type value is not related to unusable flag. This patch is based on linux-next v4.12.0-rc3. Signed-off-by: NGioh Kim <gi-oh.kim@profitbricks.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 18 5月, 2017 1 次提交
-
-
由 Dan Carpenter 提交于
Smatch complains that we check cap the upper bound of "index" but don't check for negatives. It's a false positive because "index" is never negative. But it's also simple enough to make it unsigned which makes the code easier to audit. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 21 4月, 2017 1 次提交
-
-
由 Michael S. Tsirkin 提交于
Guests that are heavy on futexes end up IPI'ing each other a lot. That can lead to significant slowdowns and latency increase for those guests when running within KVM. If only a single guest is needed on a host, we have a lot of spare host CPU time we can throw at the problem. Modern CPUs implement a feature called "MWAIT" which allows guests to wake up sleeping remote CPUs without an IPI - thus without an exit - at the expense of never going out of guest context. The decision whether this is something sensible to use should be up to the VM admin, so to user space. We can however allow MWAIT execution on systems that support it properly hardware wise. This patch adds a CAP to user space and a KVM cpuid leaf to indicate availability of native MWAIT execution. With that enabled, the worst a guest can do is waste as many cycles as a "jmp ." would do, so it's not a privilege problem. We consciously do *not* expose the feature in our CPUID bitmap, as most people will want to benefit from sleeping vCPUs to allow for over commit. Reported-by: N"Gabriel L. Somlo" <gsomlo@gmail.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> [agraf: fix amd, change commit message] Signed-off-by: NAlexander Graf <agraf@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 07 4月, 2017 1 次提交
-
-
由 Borislav Petkov 提交于
MCG_CAP[63:9] bits are reserved on AMD. However, on an AMD guest, this MSR returns 0x100010a. More specifically, bit 24 is set, which is simply wrong. That bit is MCG_SER_P and is present only on Intel. Thus, clean up the reserved bits in order not to confuse guests. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 3月, 2017 1 次提交
-
-
由 Dmitry Vyukov 提交于
If avic is not enabled, avic_vm_init() does nothing and returns early. However, avic_vm_destroy() still tries to destroy what hasn't been created. The only bad consequence of this now is that avic_vm_destroy() uses svm_vm_data_hash_lock that hasn't been initialized (and is not meant to be used at all if avic is not enabled). Return early from avic_vm_destroy() if avic is not enabled. It has nothing to destroy. Signed-off-by: NDmitry Vyukov <dvyukov@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: kvm@vger.kernel.org Cc: syzkaller@googlegroups.com Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 16 3月, 2017 1 次提交
-
-
由 Thomas Garnier 提交于
This patch makes the GDT remapped pages read-only, to prevent accidental (or intentional) corruption of this key data structure. This change is done only on 64-bit, because 32-bit needs it to be writable for TSS switches. The native_load_tr_desc function was adapted to correctly handle a read-only GDT. The LTR instruction always writes to the GDT TSS entry. This generates a page fault if the GDT is read-only. This change checks if the current GDT is a remap and swap GDTs as needed. This function was tested by booting multiple machines and checking hibernation works properly. KVM SVM and VMX were adapted to use the writeable GDT. On VMX, the per-cpu variable was removed for functions to fetch the original GDT. Instead of reloading the previous GDT, VMX will reload the fixmap GDT as expected. For testing, VMs were started and restored on multiple configurations. Signed-off-by: NThomas Garnier <thgarnie@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@suse.de> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Luis R . Rodriguez <mcgrof@kernel.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: kasan-dev@googlegroups.com Cc: kernel-hardening@lists.openwall.com Cc: kvm@vger.kernel.org Cc: lguest@lists.ozlabs.org Cc: linux-doc@vger.kernel.org Cc: linux-efi@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-pm@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: zijun_hu <zijun_hu@htc.com> Link: http://lkml.kernel.org/r/20170314170508.100882-3-thgarnie@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 2月, 2017 1 次提交
-
-
由 Paolo Bonzini 提交于
The FPU is always active now when running KVM. Reviewed-by: NDavid Matlack <dmatlack@google.com> Reviewed-by: NBandan Das <bsd@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 2月, 2017 2 次提交
-
-
由 David Hildenbrand 提交于
The hashtable and guarding spinlock are global data structures, we can inititalize them statically. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Message-Id: <20170124212116.4568-1-david@redhat.com> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
Calls to apic_find_highest_irr are scanning IRR twice, once in vmx_sync_pir_from_irr and once in apic_search_irr. Change sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr; now that it does the computation, it can also do the RVI write. In order to avoid complications in svm.c, make the callback optional. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 09 1月, 2017 1 次提交
-
-
由 Tom Lendacky 提交于
When a guest causes a NPF which requires emulation, KVM sometimes walks the guest page tables to translate the GVA to a GPA. This is unnecessary most of the time on AMD hardware since the hardware provides the GPA in EXITINFO2. The only exception cases involve string operations involving rep or operations that use two memory locations. With rep, the GPA will only be the value of the initial NPF and with dual memory locations we won't know which memory address was translated into EXITINFO2. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 12月, 2016 2 次提交
-
-
由 Kyle Huey 提交于
kvm_skip_emulated_instruction calls both kvm_x86_ops->skip_emulated_instruction and kvm_vcpu_check_singlestep, skipping the emulated instruction and generating a trap if necessary. Replacing skip_emulated_instruction calls with kvm_skip_emulated_instruction is straightforward, except for: - ICEBP, which is already inside a trap, so avoid triggering another trap. - Instructions that can trigger exits to userspace, such as the IO insns, MOVs to CR8, and HALT. If kvm_skip_emulated_instruction does trigger a KVM_GUESTDBG_SINGLESTEP exit, and the handling code for IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will take precedence. The singlestep will be triggered again on the next instruction, which is the current behavior. - Task switch instructions which would require additional handling (e.g. the task switch bit) and are instead left alone. - Cases where VMLAUNCH/VMRESUME do not proceed to the next instruction, which do not trigger singlestep traps as mentioned previously. Signed-off-by: NKyle Huey <khuey@kylehuey.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Kyle Huey 提交于
Once skipping the emulated instruction can potentially trigger an exit to userspace (via KVM_GUESTDBG_SINGLESTEP) kvm_emulate_cpuid will need to propagate a return value. Signed-off-by: NKyle Huey <khuey@kylehuey.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 25 11月, 2016 2 次提交
-
-
由 Tom Lendacky 提交于
Update the I/O interception support to add the kvm_fast_pio_in function to speed up the in instruction similar to the out instruction. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Tom Lendacky 提交于
AMD hardware adds two additional bits to aid in nested page fault handling. Bit 32 - NPF occurred while translating the guest's final physical address Bit 33 - NPF occurred while translating the guest page tables The guest page tables fault indicator can be used as an aid for nested virtualization. Using V0 for the host, V1 for the first level guest and V2 for the second level guest, when both V1 and V2 are using nested paging there are currently a number of unnecessary instruction emulations. When V2 is launched shadow paging is used in V1 for the nested tables of V2. As a result, KVM marks these pages as RO in the host nested page tables. When V2 exits and we resume V1, these pages are still marked RO. Every nested walk for a guest page table is treated as a user-level write access and this causes a lot of NPFs because the V1 page tables are marked RO in the V0 nested tables. While executing V1, when these NPFs occur KVM sees a write to a read-only page, emulates the V1 instruction and unprotects the page (marking it RW). This patch looks for cases where we get a NPF due to a guest page table walk where the page was marked RO. It immediately unprotects the page and resumes the guest, leading to far fewer instruction emulations when nested virtualization is used. Signed-off-by: NTom Lendacky <thomas.lendacky@amd.com> Reviewed-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NBrijesh Singh <brijesh.singh@amd.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 03 11月, 2016 1 次提交
-
-
由 Paolo Bonzini 提交于
Since commit a545ab6a ("kvm: x86: add tsc_offset field to struct kvm_vcpu_arch", 2016-09-07) the offset between host and L1 TSC is cached and need not be fished out of the VMCS or VMCB. This means that we can implement adjust_tsc_offset_guest and read_l1_tsc entirely in generic code. The simplification is particularly significant for VMX code, where vmx->nested.vmcs01_tsc_offset was duplicating what is now in vcpu->arch.tsc_offset. Therefore the vmcs01_tsc_offset can be dropped completely. More importantly, this fixes KVM_GET_CLOCK/KVM_SET_CLOCK which, after commit 108b249c ("KVM: x86: introduce get_kvmclock_ns", 2016-09-01) called read_l1_tsc while the VMCS was not loaded. It thus returned bogus values on Intel CPUs. Fixes: 108b249cReported-by: NRoman Kagan <rkagan@virtuozzo.com> Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 20 9月, 2016 1 次提交
-
-
由 Colin Ian King 提交于
vm_data->avic_vm_id is a u32, so the check for a error return (less than zero) such as -EAGAIN from avic_get_next_vm_id currently has no effect whatsoever. Fix this by using a temporary int for the comparison and assign vm_data->avic_vm_id to this. I used an explicit u32 cast in the assignment to show why vm_data->avic_vm_id cannot be used in the assign/compare steps. Signed-off-by: NColin Ian King <colin.king@canonical.com> Acked-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 16 9月, 2016 1 次提交
-
-
由 Luiz Capitulino 提交于
The TSC offset can now be read directly from struct kvm_arch_vcpu. Signed-off-by: NLuiz Capitulino <lcapitulino@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-