- 30 March 2015, 3 commits
By Nadav Amit

POPA should assign values to the registers the same way ordinary register assignments do. In other words, a 32-bit register assignment should clear bits [63:32] of the register. Also split out the register-assignment code so that future changes can reuse it.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427719163-5429-3-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

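As an illustration of the split-out helper the message describes, here is a minimal sketch in the emulator's style (the name assign_register and the exact case structure are assumptions, not the verbatim patch):

    /* Sketch: write `val` into *reg with x86 assignment semantics.
     * 1- and 2-byte writes preserve the untouched upper bits; a 4-byte
     * write zero-extends, clearing bits [63:32]; 8 bytes replaces all. */
    static void assign_register(unsigned long *reg, u64 val, int bytes)
    {
            switch (bytes) {
            case 1:
                    *(u8 *)reg = (u8)val;
                    break;
            case 2:
                    *(u16 *)reg = (u16)val;
                    break;
            case 4:
                    *reg = (u32)val;    /* 32-bit writes zero-extend */
                    break;
            case 8:
                    *reg = val;
                    break;
            }
    }

POPA would then perform this assignment once per popped register, and the [63:32] clearing for 32-bit operand sizes falls out automatically.
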
By Nadav Amit

In legacy mode, CMOV emulation should still clear bits [63:32] even when the assignment is not performed. The previous fix 140bad89 ("KVM: x86: emulation of dword cmov on long-mode should clear [63:32]") was incomplete.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427719163-5429-2-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Petr Matousek

If data is read from the PIC with an invalid access size, the returned data stays uninitialized even though success is reported. Fix this by always initializing the data.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Message-Id: <20150311111609.GG8544@dhcp-25-225.brq.redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

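A minimal sketch of the shape such a fix takes (picdev_read, pr_pic_unimpl and the exact control flow are assumptions based on the i8259 emulation code, not the verbatim patch):

    static int picdev_read(struct kvm_pic *s, gpa_t addr, int len, void *val)
    {
            if (len != 1) {
                    memset(val, 0, len);    /* was left uninitialized before */
                    pr_pic_unimpl("non byte read\n");
                    return 0;               /* "success", now with defined data */
            }
            /* ... the normal path reads one register byte into *(u8 *)val ... */
            return 0;
    }
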
- 27 March 2015, 1 commit
By Jan Kiszka

If the guest CPU is supposed to support rdtscp, and the host has rdtscp enabled in the secondary execution controls, we can also expose this feature to L1. Just extend nested_vmx_exit_handled to properly route EXIT_REASON_RDTSCP.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

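The routing amounts to one new switch arm; a hedged sketch follows (nested_cpu_has2 and the EXIT_REASON_*/SECONDARY_EXEC_* names are real VMX identifiers, while the surrounding function body is abbreviated and assumed):

    static bool nested_vmx_exit_handled(struct kvm_vcpu *vcpu)
    {
            struct vmcs12 *vmcs12 = get_vmcs12(vcpu);

            switch (to_vmx(vcpu)->exit_reason) {
            /* ... existing exit reasons ... */
            case EXIT_REASON_RDTSCP:
                    /* Reflect the exit to L1 only if L1 enabled RDTSCP in
                     * its secondary processor-based execution controls. */
                    return nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDTSCP);
            default:
                    return true;
            }
    }
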
- 24 March 2015, 1 commit
By Radim Krčmář

The overhead of a function call is not appropriate for this function's size and frequency of execution.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

- 19 March 2015, 1 commit
By Bandan Das

I hit this path on an AMD box and thought someone was playing an April Fools' joke on me.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

- 18 March 2015, 2 commits
By Xiubo Li

This patch fixes the following sparse warnings.

In arch/x86/kvm/x86.c:
  warning: symbol 'emulator_read_write' was not declared. Should it be static?
  warning: symbol 'emulator_write_emulated' was not declared. Should it be static?
  warning: symbol 'emulator_get_dr' was not declared. Should it be static?
  warning: symbol 'emulator_set_dr' was not declared. Should it be static?

In arch/x86/kvm/pmu.c:
  warning: symbol 'fixed_pmc_events' was not declared. Should it be static?

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

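The fix is mechanical; for example, for one of the listed symbols (signature assumed from x86.c):

    /* Before: external linkage with no visible prototype, so sparse warns. */
    int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest);

    /* After: the function is only used within x86.c, so make it file-local. */
    static int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long *dest);
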
By Xiubo Li

This patch fixes the following sparse warning in arch/x86/kvm/x86.c:

  warning: Using plain integer as NULL pointer

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

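The warning arises from using a literal 0 in pointer context; the variable below is purely illustrative, not the line the patch touched:

    struct page *page = 0;          /* before: sparse flags the plain integer */
    struct page *page_ok = NULL;    /* after: spell NULL in pointer context */
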
- 14 March 2015, 2 commits
By Jan Kiszka

While in L2, leave all #UD exceptions to L2 and do not try to emulate the instruction. If L1 is interested in intercepting #UD, it reports that interest via its exception bitmap, and we never reach L0's handle_exception anyway.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

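A sketch of the resulting check at the top of L0's #UD handling (the identifiers come from the VMX handle_exception path; exact placement and the `er` declaration are assumed):

    if (is_invalid_opcode(intr_info)) {
            if (is_guest_mode(vcpu)) {
                    /* L1 did not trap #UD via its exception bitmap, so
                     * reflect it straight into L2 rather than emulating
                     * the instruction on L2's behalf. */
                    kvm_queue_exception(vcpu, UD_VECTOR);
                    return 1;
            }
            er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
            if (er != EMULATE_DONE)
                    kvm_queue_exception(vcpu, UD_VECTOR);
            return 1;
    }
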
By Jan Kiszka

For a very long time (since 2b3d2a20), the path handling a guest's vmmcall instruction on an Intel host only patched the instruction and no longer handled the hypercall itself. The reverse case, vmcall on AMD hosts, is fine. As both em_vmcall and em_vmmcall have to do the same thing anyway, fix the issue by consolidating them into a single handler.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

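A hedged sketch of what a consolidated handler can look like (em_hypercall is an assumed name; the mechanism shown, patch the opcode, then rewind RIP so the now-native instruction re-executes and traps as a proper hypercall, is an assumption consistent with the message):

    static int em_hypercall(struct x86_emulate_ctxt *ctxt)
    {
            /* Patch the opcode to the host's native variant
             * (vmcall on Intel, vmmcall on AMD). */
            int rc = ctxt->ops->fix_hypercall(ctxt);

            if (rc != X86EMUL_CONTINUE)
                    return rc;

            /* Rewind RIP so the patched instruction runs for real;
             * this is the step the Intel-host vmmcall path missed. */
            ctxt->_eip = ctxt->eip;
            ctxt->dst.type = OP_NONE;       /* disable writeback */
            return X86EMUL_CONTINUE;
    }
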
- 13 March 2015, 1 commit
By David Kaplan

Another patch in my war on the use of emulate_on_interception() as an SVM exit handler. These were pulled out of a larger patch at the suggestion of Radim Krcmar; see https://lkml.org/lkml/2015/2/25/559

Changes since v1:
* fixed a typo introduced after testing, retested

Signed-off-by: David Kaplan <david.kaplan@amd.com>
[separated out just the cr_interception part from the larger removal of INTERCEPT_CR0_WRITE, forward ported, tested]
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

- 11 March 2015, 2 commits
By David Kaplan

No need to re-decode WBINVD, since we already know what the instruction is from the intercept.

Signed-off-by: David Kaplan <David.Kaplan@amd.com>
[extracted from a larger unrelated patch, forward ported, tested, style cleanup]
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

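A sketch of the dedicated handler, wired to SVM_EXIT_WBINVD in the exit-handler table in place of emulate_on_interception (the exact shape is assumed; the point is that the intercept already identifies the instruction, so no decoder pass is needed):

    static int wbinvd_interception(struct vcpu_svm *svm)
    {
            kvm_emulate_wbinvd(&svm->vcpu);
            return 1;
    }
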
By Joel Schopp

Currently kvm_emulate() skips the instruction, but the kvm_emulate_* helpers sometimes don't. The end result is that callers end up doing the skip themselves. Let's make them consistent.

Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

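One consistent shape, sketched with WBINVD as the example (the _noskip split for callers that must not advance RIP is an assumption about how the consolidation can be kept safe):

    int kvm_emulate_wbinvd_noskip(struct kvm_vcpu *vcpu)
    {
            /* ... flush caches exactly as before, without touching RIP ... */
            return X86EMUL_CONTINUE;
    }

    int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu)
    {
            /* the helper now owns the skip, so callers no longer do it */
            kvm_x86_ops->skip_emulated_instruction(vcpu);
            return kvm_emulate_wbinvd_noskip(vcpu);
    }
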
- 10 March 2015, 3 commits
By Thomas Huth

kvm_kvfree() provides exactly the same functionality as the new common kvfree() function, so let's simply replace the kvm function with the common one.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

By Wincy Van

This patch fixes the bug discussed at https://www.mail-archive.com/kvm@vger.kernel.org/msg109813.html

It uses a new field named irr_delivered to record the delivery status of edge-triggered interrupts, and clears the delivered interrupts in kvm_get_ioapic. This has the same effect as commit 0bc830b0 ("KVM: ioapic: clear IRR for edge-triggered interrupts at delivery") while avoiding the bug that change caused for Windows guests.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

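The two touch points, sketched (irr_delivered and kvm_get_ioapic come from the message; the surrounding conditions, including the `edge` test, are illustrative assumptions):

    /* At delivery time, remember edge-triggered interrupts instead of
     * clearing their IRR bits outright: */
    if (edge)
            ioapic->irr_delivered |= 1 << irq;

    /* In kvm_get_ioapic(), hide already-delivered edge interrupts from
     * the state reported to userspace: */
    state->irr &= ~ioapic->irr_delivered;
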
By David Kaplan

KVM has nice wrappers for accessing register values; clean up a few places that should use them but currently do not.

Signed-off-by: David Kaplan <david.kaplan@amd.com>
[forward port and testing]
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

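The cleanup pattern, roughly (kvm_register_read is the existing accessor; `reg` stands in for whichever register index the call site uses):

    /* Before: direct access to the register file, bypassing the
     * dirty/available bookkeeping the wrappers maintain. */
    unsigned long val = svm->vcpu.arch.regs[reg];

    /* After: go through the accessor. */
    unsigned long val = kvm_register_read(&svm->vcpu, reg);
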
- 3 March 2015, 1 commit
By Radim Krčmář

In commit b4eef9b3, we started using hwapic_isr_update() != NULL instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because SVM had the callback defined, so the "apicv" path in apic_{set,clear}_isr() was taken; that path does not change apic->isr_count, because it should always be 1. The initial value of apic->isr_count, however, was based on kvm_apic_vid_enabled(vcpu->kvm), which is always 0 for SVM, so KVM could have injected interrupts when it shouldn't have.

Fix it by implicitly setting SVM's hwapic_isr_update to NULL, and make the initial isr_count depend on hwapic_isr_update() for good measure.

Fixes: b4eef9b3 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

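Sketched, the fix as described reduces to two pieces (both derived from the message; the exact code is assumed):

    /* arch/x86/kvm/svm.c: the .hwapic_isr_update initializer is dropped
     * from svm_x86_ops, leaving the function pointer implicitly NULL. */

    /* arch/x86/kvm/lapic.c: derive the initial count from the same
     * condition that apic_{set,clear}_isr() now keys off: */
    apic->isr_count = kvm_x86_ops->hwapic_isr_update ? 1 : 0;
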
- 24 February 2015, 2 commits
By Paolo Bonzini

This has been broken for a long time: it broke first in 2.6.35, was then almost fixed in 2.6.36, but this one-liner slipped through the cracks. The bug shows up as an infinite loop during Windows 7 (and newer) boot on 32-bit hosts without EPT.

Windows uses CMPXCHG8B to write to page tables, which causes a page fault if running without EPT; the emulator is then called from kvm_mmu_page_fault. The loop happens if the higher 4 bytes are not 0; the common case for this is that the NX bit (bit 63) is 1.

Fixes: 6550e1f1
Fixes: 16518d5a
Cc: stable@vger.kernel.org # 2.6.35+
Reported-by: Erik Rull <erik.rull@rdsoftware.de>
Tested-by: Erik Rull <erik.rull@rdsoftware.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Radim Krčmář

'apic' is not defined if !CONFIG_X86_64 && !CONFIG_X86_LOCAL_APIC. Posted interrupts make no sense without CONFIG_SMP, and CONFIG_X86_LOCAL_APIC will be set along with it.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

- 10 February 2015, 1 commit
By Radim Krčmář

<asm/apic.h> isn't included directly, and without CONFIG_SMP the option that automagically pulls it in can't be enabled.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

- 9 February 2015, 1 commit
By Paolo Bonzini

NoWrite instructions (e.g. cmp or test) never set the "write access" bit in the error code, even if one of their operands is treated as a destination.

Fixes: c205fb7d
Cc: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

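In effect the rule is the following (PFERR_WRITE_MASK and the NoWrite decode flag are real emulator identifiers; op_is_destination is a hypothetical stand-in for the surrounding logic):

    u32 error_code = 0;

    /* cmp and test decode with a "destination" operand, but the NoWrite
     * flag guarantees it is never written back, so the fault must be
     * reported as a read access. (op_is_destination: illustrative) */
    if (op_is_destination && !(ctxt->d & NoWrite))
            error_code |= PFERR_WRITE_MASK;
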
- 6 February 2015, 1 commit
By Paolo Bonzini

This patch introduces a new module parameter for the KVM module; when it is set, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block.

This parameter helps a lot for latency-bound workloads; in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid the guest sending pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in.

With this patch performance reaches 60-65% of bare metal and, more importantly, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU.

Of course there is some price to pay; every time an otherwise idle vCPU is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's cores for each idle guest vCPU.

The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids Linux scheduling the vCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a 4-vCPU Linux VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except for a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick.

The effect is also visible on Marcelo's recently introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, then, the effect is both a smaller average latency and a smaller variance.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

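A condensed sketch of the mechanism: the module parameter plus a poll loop at the top of kvm_vcpu_block (simplified, with details assumed):

    static unsigned int halt_poll_ns;       /* 0 = polling disabled */
    module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);

    void kvm_vcpu_block(struct kvm_vcpu *vcpu)
    {
            if (halt_poll_ns) {
                    ktime_t stop = ktime_add_ns(ktime_get(), halt_poll_ns);

                    do {
                            /* Did an interrupt arrive while we spun? */
                            if (kvm_vcpu_check_block(vcpu) < 0) {
                                    ++vcpu->stat.halt_successful_poll;
                                    return; /* skip the schedule-out/in round trip */
                            }
                    } while (single_task_running() &&
                             ktime_before(ktime_get(), stop));
            }

            /* ... fall back to the usual wait-queue sleep ... */
    }

Loading the module with e.g. halt_poll_ns=500000 reproduces the setup measured above.
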
- 4 February 2015, 8 commits
By Andy Lutomirski

Context switches and TLB flushes can change individual bits of CR4. CR4 reads take several cycles, so store a shadow copy of CR4 in a per-cpu variable.

To avoid wasting a cache line, I added the CR4 shadow to cpu_tlbstate, which is already touched in switch_mm. The heaviest users of the cr4 shadow will be switch_mm and __switch_to_xtra, and __switch_to_xtra is called shortly after switch_mm during context switch, so the cacheline is likely to be hot.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

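A sketch of the resulting accessor (the cpu_tlbstate home is stated in the message; the field and helper names are assumed):

    /* Read the per-cpu shadow instead of executing a slow mov-from-CR4. */
    static inline unsigned long cr4_read_shadow(void)
    {
            return this_cpu_read(cpu_tlbstate.cr4);
    }
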
By Andy Lutomirski

CR4 manipulation was split, seemingly at random, between direct writes (write_cr4) and a helper (set/clear_in_cr4). Unfortunately, the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code, which only a small subset of users actually wanted.

This patch replaces all cr4 access in functions that don't leave cr4 exactly the way they found it with the new helpers cr4_set_bits, cr4_clear_bits, and cr4_set_bits_and_update_boot.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

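One of the new helpers, sketched (the body, which keeps the per-cpu shadow from the previous patch coherent with the hardware register, is assumed):

    static inline void cr4_set_bits(unsigned long mask)
    {
            unsigned long cr4 = this_cpu_read(cpu_tlbstate.cr4);

            if ((cr4 | mask) != cr4) {      /* skip the expensive write if a no-op */
                    cr4 |= mask;
                    this_cpu_write(cpu_tlbstate.cr4, cr4);
                    write_cr4(cr4);
            }
    }
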
By Wincy Van

If a vCPU has an interrupt pending while in VMX non-root mode, injecting that interrupt normally requires a vmexit. With posted interrupt processing, the vmexit is not needed, and interrupts are fully taken care of by hardware. In nested VMX, this feature avoids many more vmexits than in non-nested VMX.

When L1 asks L0 to deliver L1's posted interrupt vector and the target vCPU is in non-root mode, we use a physical IPI to deliver POSTED_INTR_NV to the target vCPU. Using POSTED_INTR_NV avoids unexpected interrupts if a concurrent vmexit happens and L1's vector differs from L0's. The IPI triggers posted interrupt processing in the target physical CPU. In case the target vCPU was not in guest mode, the posted interrupt delivery is completed on the next entry to L2.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Wincy Van

With virtual interrupt delivery, the hardware lets KVM use a more efficient mechanism for interrupt injection. This is an important feature for nested VMX, because it reduces vmexits substantially, and vmexits are much more expensive with nested virtualization. This is especially important for throughput-bound scenarios.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Wincy Van

This feature reduces the cost of APIC register virtualization; it is also a prerequisite for virtual interrupt delivery and posted interrupt processing.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Wincy Van

To enable nested apicv support, we need per-vCPU VMX control MSRs:

1. If the in-kernel irqchip is enabled, we can enable nested posted interrupts and should set the posted-interrupt bit in nested_vmx_pinbased_ctls_high.
2. If the in-kernel irqchip is disabled, we cannot enable nested posted interrupts, and the posted-interrupt bit in nested_vmx_pinbased_ctls_high will be cleared.

Since different VMs may have different settings for the in-kernel irqchip, different nested control MSRs are needed.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Wincy Van

When L2 is using x2apic, we can use "virtualize x2apic mode" to gain higher performance, especially in the apicv case. This patch also introduces nested_vmx_check_apicv_controls for use by the nested apicv patches.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Wincy Van

Currently, if L1 enables MSR_BITMAP, we emulate this feature: all of L2's MSR accesses are intercepted by L0. Features like "virtualize x2apic mode" require the MSR bitmap to be enabled; otherwise the hardware will exit and, for example, not virtualize the x2apic MSRs.

In order to let L1 use these features, we need to build a merged bitmap that avoids a VMEXIT only if 1) L1 does not require one and 2) the bit is not required by the processor for APIC virtualization. For now the guests are still run with the MSR bitmap disabled, but this patch already introduces nested_vmx_merge_msr_bitmap for future use.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

- 3 February 2015, 2 commits
By Marcelo Tosatti

Revert 7c6a98df, given that testing PIR is no longer necessary.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Marcelo Tosatti

With APICv, the LAPIC timer interrupt is always delivered via IRR: apic_find_highest_irr syncs PIR to IRR.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

- 30 January 2015, 7 commits
By Paolo Bonzini

A function pointer was not NULLed, causing kvm_vcpu_reload_apic_access_page to go down the wrong path and oops on put_page(NULL). This did not happen on old processors, only when setting the module option explicitly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Radim Krčmář

We forgot to re-check the LAPIC after splitting the loop in commit 173beedc ("KVM: x86: Software disabled APIC should still deliver NMIs", 2014-11-02).

Fixes: 173beedc
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Radim Krčmář

We cannot hit the bug now, but future patches will expose this path.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Radim Krčmář

To make the code self-documenting.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Radim Krčmář

The majority of this patch turns

  result = 0; if (CODE) result = 1; return result;

into

  return CODE;

because we now return bool.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

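Concretely, using one of the lapic predicates as a hypothetical stand-in (not necessarily a function this patch touched):

    /* Before */
    static int kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 mda)
    {
            int result = 0;

            if (kvm_apic_id(apic) == mda)
                    result = 1;
            return result;
    }

    /* After: the predicate is the return value */
    static bool kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 mda)
    {
            return kvm_apic_id(apic) == mda;
    }
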
By Radim Krčmář

And don't export the internal ones while at it.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

By Kai Huang

This patch adds PML (Page Modification Logging) support to VMX. A new module parameter, 'enable_pml', allows the user to enable or disable it manually.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

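The knob itself is a small pair of lines in vmx.c (sketch; the default and permission bits are assumptions):

    static bool __read_mostly enable_pml = 1;
    module_param_named(pml, enable_pml, bool, S_IRUGO);

With that shape, booting with kvm-intel.pml=0 (or loading the module with pml=0) would turn the feature off.
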
- 29 January 2015, 1 commit
By Kai Huang

This patch adds new kvm_x86_ops dirty-logging hooks to enable/disable dirty logging for a particular memory slot, and to flush potentially logged dirty GPAs before reporting slot->dirty_bitmap to userspace. KVM x86 common code calls these hooks when they are available, so the PML logic can stay VMX-specific. SVM is not affected, as the hooks remain NULL there.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

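The hook set, sketched from the description (the hook names are assumptions aligned with the message, not the verbatim header diff):

    struct kvm_x86_ops {
            /* ... */

            /* Turn dirty logging on/off for a slot (PML buffer setup/teardown) */
            void (*slot_enable_log_dirty)(struct kvm *kvm,
                                          struct kvm_memory_slot *slot);
            void (*slot_disable_log_dirty)(struct kvm *kvm,
                                           struct kvm_memory_slot *slot);

            /* Flush logged-but-unreported dirty GPAs before dirty_bitmap
             * is handed to userspace */
            void (*flush_log_dirty)(struct kvm *kvm);

            /* ... */
    };
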