- 20 7月, 2008 18 次提交
-
-
由 Avi Kivity 提交于
Add emulation for the memory type range registers, needed by VMware esx 3.5, and by pci device assignment. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Sheng Yang 提交于
Signed-off-by: NSheng Yang <sheng.yang@intel.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Sheng Yang 提交于
[avi: fix ia64 build breakage] Signed-off-by: NSheng Yang <sheng.yang@intel.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Since we aren't modifying any register, there's no need to decache the register state. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Obsoleted by the vmx-specific per-cpu list. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
VMX hardware can cache the contents of a vcpu's vmcs. This cache needs to be flushed when migrating a vcpu to another cpu, or (which is the case that interests us here) when disabling hardware virtualization on a cpu. The current implementation of decaching iterates over the list of all vcpus, picks the ones that are potentially cached on the cpu that is being offlined, and flushes the cache. The problem is that it uses mutex_trylock() to gain exclusive access to the vcpu, which fires off a (benign) warning about using the mutex in an interrupt context. To avoid this, and to make things generally nicer, add a new per-cpu list of potentially cached vcus. This makes the decaching code much simpler. The list is vmx-specific since other hardware doesn't have this issue. [andrea: fix crash on suspend/resume] Signed-off-by: NAndrea Arcangeli <andrea@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
KVM turns off hardware virtualization extensions during reboot, in order to disassociate the memory used by the virtualization extensions from the processor, and in order to have the system in a consistent state. Unfortunately virtual machines may still be running while this goes on, and once virtualization extensions are turned off, any virtulization instruction will #UD on execution. Fix by adding an exception handler to virtualization instructions; if we get an exception during reboot, we simply spin waiting for the reset to complete. If it's a true exception, BUG() so we can have our stack trace. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
The KVM MMU tries to detect when a speculative pte update is not actually used by demand fault, by checking the accessed bit of the shadow pte. If the shadow pte has not been accessed, we deem that page table flooded and remove the shadow page table, allowing further pte updates to proceed without emulation. However, if the pte itself points at a page table and only used for write operations, the accessed bit will never be set since all access will happen through the emulator. This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels. The kernel points a kmap_atomic() pte at a page table, and then proceeds with read-modify-write operations to look at the dirty and accessed bits. We get a false flood trigger on the kmap ptes, which results in the mmu spending all its time setting up and tearing down shadows. Fix by setting the shadow accessed bit on emulated accesses. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Chris Lalancette 提交于
Attached is a patch that fixes a guest crash when booting older Linux kernels. The problem stems from the fact that we are currently emulating MSR_K7_EVNTSEL[0-3], but not emulating MSR_K7_PERFCTR[0-3]. Because of this, setup_k7_watchdog() in the Linux kernel receives a GPF when it attempts to write into MSR_K7_PERFCTR, which causes an OOPs. The patch fixes it by just "fake" emulating the appropriate MSRs, throwing away the data in the process. This causes the NMI watchdog to not actually work, but it's not such a big deal in a virtualized environment. When we get a write to one of these counters, we printk_ratelimit() a warning. I decided to print it out for all writes, even if the data is 0; it doesn't seem to make sense to me to special case when data == 0. Tested by myself on a RHEL-4 guest, and Joerg Roedel on a Windows XP 64-bit guest. Signed-off-by: NChris Lalancette <clalance@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Aurelien Jarno 提交于
The in-kernel PIT emulation ignores pending timers if operating under mode 3, which for example Hurd uses. This mode should output a square wave, high for (N+1)/2 counts and low for (N-1)/2 counts. As we only care about the resulting interrupts, the period is N, and mode 3 is the same as mode 2 with regard to interrupts. Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
To distinguish between real page faults and nested page faults they should be traced as different events. This is implemented by this patch. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
This patch adds the missing kvmtrace markers to the svm module of kvm. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
This patch adds some kvmtrace bits to the generic x86 code where it is instrumented from SVM. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
With an exit handler for INTR intercepts its possible to account them using kvmtrace. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
With an exit handler for NMI intercepts its possible to account them using kvmtrace. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
This patch moves the trace entry for APIC accesses from the VMX code to the generic lapic code. This way APIC accesses from SVM will also be traced. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Harvey Harrison 提交于
Noticed by sparse: arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static? arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static? arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static? arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static? arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static? arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static? arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static? arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static? arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static? Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 26 6月, 2008 2 次提交
-
-
由 Jens Axboe 提交于
It's not even passed on to smp_call_function() anymore, since that was removed. So kill it. Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Reviewed-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
由 Jens Axboe 提交于
It's never used and the comments refer to nonatomic and retry interchangably. So get rid of it. Acked-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 25 6月, 2008 1 次提交
-
-
由 Gerd Hoffmann 提交于
This patch updates the kvm host code to use the pvclock structs. It also makes the paravirt clock compatible with Xen. Signed-off-by: NGerd Hoffmann <kraxel@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 24 6月, 2008 6 次提交
-
-
由 Avi Kivity 提交于
Switching msrs can occur either synchronously as a result of calls to the msr management functions (usually in response to the guest touching virtualized msrs), or asynchronously when preempting a kvm thread that has guest state loaded. If we're unlucky enough to have the two at the same time, host msrs are corrupted and the machine goes kaput on the next syscall. Most easily triggered by Windows Server 2008, as it does a lot of msr switching during bootup. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
KVM has a heuristic to unshadow guest pagetables when userspace accesses them, on the assumption that most guests do not allow userspace to access pagetables directly. Unfortunately, in addition to unshadowing the pagetables, it also oopses. This never triggers on ordinary guests since sane OSes will clear the pagetables before assigning them to userspace, which will trigger the flood heuristic, unshadowing the pagetables before the first userspace access. One particular guest, though (Xenner) will run the kernel in userspace, triggering the oops. Since the heuristic is incorrect in this case, we can simply remove it. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed guests properly. It will instantiate two 2MB sptes pointing to the same physical 2MB page when a guest large pte update is trapped. Instead of duplicating code to handle this, disallow directory level updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes emulating one guest 4MB pte can be correctly created by the page fault handling path. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
rmap_next() does not work correctly after rmap_remove(), as it expects the rmap chains not to change during iteration. Fix (for now) by restarting iteration from the beginning. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
A guest vcpu instance can be scheduled to a different physical CPU between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable(). If that happens, the timer will only be migrated to the current pCPU on the next exit, meaning that guest LAPIC timer event can be delayed until a host interrupt is triggered. Fix it by cancelling guest entry if any vcpu request is pending. This has the side effect of nicely consolidating vcpu->requests checks. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 07 6月, 2008 6 次提交
-
-
由 Avi Kivity 提交于
The check is only looking at one of two possible empty ptes. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Shadows for large guests can take a long time to tear down, so reschedule occasionally to avoid softlockup warnings. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Eli Collins 提交于
Clear CR4.VMXE in hardware_disable. There's no reason to leave it set after doing a VMXOFF. VMware Workstation 6.5 checks CR4.VMXE as a proxy for whether the CPU is in VMX mode, so leaving VMXE set means we'll refuse to power on. With this change the user can power on after unloading the kvm-intel module. I tested on kvm-67 and kvm-69. Signed-off-by: NEli Collins <ecollins@vmware.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Migrate the PIT timer to the physical CPU which vcpu0 is scheduled on, similarly to what is done for the LAPIC timers, otherwise PIT interrupts will be delayed until an unrelated event causes an exit. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
The hypercall instructions on Intel and AMD are different. KVM allows the guest to choose one or the other (the default is Intel), and if the guest chooses incorrectly, KVM will patch it at runtime to select the correct instruction. This allows live migration between Intel and AMD machines. This patching occurs in the x86 emulator. The current code also executes the hypercall. Unfortunately, the tail end of the x86 emulator code also executes, overwriting the return value of the hypercall with the original contents of rax (which happens to be the hypercall number). Fix not by executing the hypercall in the emulator context; instead let the guest reissue the patched instruction and execute the hypercall via the normal path. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 23 5月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 5月, 2008 3 次提交
-
-
由 Marcelo Tosatti 提交于
Only use the APIC pending timers count to break out of HLT emulation if the timer vector is enabled. Certain configurations of Windows simply mask out the vector without disabling the timer. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Marcelo Tosatti 提交于
Otherwise hlt emulation fails if PIT is not injecting IRQ's. Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
A register destination encoded with a mod=3 encoding left dst.ptr NULL. Normally we don't trap writes to registers, but in the case of smsw, we do. Fix by pointing dst.ptr at the destination register. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 04 5月, 2008 3 次提交
-
-
由 Avi Kivity 提交于
nonpae guests can call rmap_write_protect twice per page (for page tables) or four times per page (for page directories), triggering a bogus warning. Remove the warning. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Andrea Arcangeli 提交于
This make sure not to schedule in atomic during fx_init. I also changed the name of fpu_init to fx_finit to avoid duplicating the name with fpu_init that is already used in the kernel, this makes grep simpler if nothing else. Signed-off-by: NAndrea Arcangeli <andrea@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Jan Kiszka 提交于
Clear pending exceptions when setting new register values. This avoids spurious exceptions after restoring a vcpu state or after reset-on-triple-fault. Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-