- 27 April 2008, 16 commits
-
-
Committed by Joerg Roedel
Let SVM detect whether the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
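A minimal sketch of the kind of capability probe this implies, assuming the architectural SVM feature-identification leaf (CPUID 0x8000000A, where EDX bit 0 reports nested paging); the helper name is illustrative, not KVM's.

```c
/* Hedged sketch: probe the SVM nested-paging capability bit via CPUID.
 * Leaf 0x8000000A is the SVM feature-identification leaf; EDX bit 0
 * reports Nested Paging support. */
#include <stdbool.h>
#include <cpuid.h>              /* GCC/Clang helper, assumed available */

static bool svm_has_nested_paging(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid() returns 0 if the requested leaf is not supported */
        if (!__get_cpuid(0x8000000A, &eax, &ebx, &ecx, &edx))
                return false;

        return edx & 1;         /* EDX bit 0: nested paging available */
}
```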
-
Committed by Joerg Roedel
Moving the SVM feature detection from the each_cpu code to the hardware setup code makes it run only once. As an additional advantage, the feature check is now available earlier in the module setup process. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
This patch makes the EFER register accessible on a 32-bit KVM host. This is necessary to boot 32-bit PAE guests under SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
To allow access to the EFER register in 32-bit KVM, the EFER-specific code has to be exported to the generic x86 code. This patch does this in a backwards-compatible manner. [avi: add check for EFER-less hosts] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
This patch aligns the bits the guest can set in the EFER register with the features of the host processor: it leaves EFER.NX disabled if the processor does not support it, and enables EFER.LME and EFER.LMA only for KVM on 64-bit hosts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
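A minimal sketch of that policy, using the architectural EFER bit positions; the helper and its parameters are illustrative assumptions, not the KVM code.

```c
/* Hedged sketch: compute the EFER bits a guest may be allowed to set,
 * based on what the host actually supports. */
#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE  (1ULL << 0)   /* syscall/sysret enable */
#define EFER_LME  (1ULL << 8)   /* long mode enable      */
#define EFER_LMA  (1ULL << 10)  /* long mode active      */
#define EFER_NX   (1ULL << 11)  /* no-execute enable     */

static uint64_t guest_settable_efer_bits(bool host_has_nx, bool host_is_64bit)
{
        uint64_t bits = EFER_SCE;

        if (host_has_nx)
                bits |= EFER_NX;                /* only expose NX if the host has it */
        if (host_is_64bit)
                bits |= EFER_LME | EFER_LMA;    /* long mode only on 64-bit hosts   */

        return bits;
}
```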
-
Committed by Joerg Roedel
This patch gives the SVM and VMX implementations the ability to add bits the guest can set in its EFER register. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Sheng Yang
To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add "cc" to asm constraints] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
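As an illustration of the bookkeeping such a per-vcpu tag needs, here is a minimal sketch of handing out 16-bit VPIDs from a bitmap, keeping VPID 0 reserved for the host; names and data layout are assumptions, not the actual KVM code.

```c
/* Hedged sketch: allocate distinct 16-bit VPID tags, with 0 reserved so
 * untagged (host) mappings are never confused with a guest's. */
#include <stdint.h>

#define NR_VPIDS 65536

static uint64_t vpid_bitmap[NR_VPIDS / 64] = { 1 };  /* bit 0 = reserved */

static int allocate_vpid(void)
{
        for (int vpid = 1; vpid < NR_VPIDS; vpid++) {
                uint64_t mask = 1ULL << (vpid % 64);

                if (!(vpid_bitmap[vpid / 64] & mask)) {
                        vpid_bitmap[vpid / 64] |= mask;
                        return vpid;
                }
        }
        return 0;  /* none free: run untagged and lose the TLB benefit */
}
```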
-
Committed by Avi Kivity
Currently an mmio guest pte is encoded in the shadow page table as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However, nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Add group decoding for opcodes 0x80-0x83 (immediate group 1). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Add group decoding support for opcodes 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decoding support for opcodes 0xf6 and 0xf7 (group 3). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
This adds group decode support for opcode 0x8f (group 1A). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad-hoc adaptation per opcode. Signed-off-by: Avi Kivity <avi@qumranet.com>
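For reference, a minimal sketch of pulling those fields out of the ModRM byte; the structure and function names are illustrative, not the emulator's.

```c
/* Hedged sketch: the ModRM byte following a "group" opcode carries the real
 * operation in its reg field (bits 5:3); the mod field (bits 7:6) can
 * additionally distinguish memory from register forms. */
#include <stdint.h>

struct modrm_fields {
        uint8_t mod;    /* bits 7:6 - 3 means a register operand           */
        uint8_t reg;    /* bits 5:3 - opcode extension within the group    */
        uint8_t rm;     /* bits 2:0 - base register / addressing form      */
};

static struct modrm_fields decode_modrm(uint8_t modrm)
{
        struct modrm_fields f = {
                .mod = (modrm >> 6) & 0x3,
                .reg = (modrm >> 3) & 0x7,
                .rm  = modrm & 0x7,
        };
        return f;
}
```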
-
Committed by Dong, Eddie
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Dong, Eddie
A partial guest pte write will leave shadow_trap_nonpresent_pte in the spte, which generates a vmexit at the next guest access through that pte. This patch improves this by reading the full guest pte in advance and thus being able to update the spte and eliminate the vmexit. This helps pae guests, which use two 32-bit writes to set a single 64-bit pte. [truncation fix by Eric] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
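A minimal sketch of the idea, assuming a mapped guest page and 8-byte-aligned 64-bit entries; the helper is illustrative, not the KVM function.

```c
/* Hedged sketch: when the guest updates only half of a 64-bit PTE (as PAE
 * guests do with two 32-bit writes), re-read the whole aligned entry so the
 * shadow entry can be refreshed instead of left as a trapping marker. */
#include <stdint.h>
#include <string.h>

static uint64_t read_full_guest_pte(const void *guest_page, unsigned long offset)
{
        uint64_t pte;

        offset &= ~7UL;         /* align down to the containing 64-bit entry */
        memcpy(&pte, (const uint8_t *)guest_page + offset, sizeof(pte));
        return pte;
}
```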
-
- 25 March 2008, 5 commits
-
-
Committed by Avi Kivity
While backporting 72dc67a6, a gfn_to_page() call was duplicated instead of moved (due to an unrelated patch not being present in mainline). This caused a page reference leak, resulting in a fairly massive memory leak. Fix by removing the extraneous gfn_to_page() call. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
init_rmode_tss was forgotten during the conversion from mmap_sem to slots_lock.
INFO: task qemu-system-x86:3748 blocked for more than 120 seconds.
Call Trace:
[<ffffffff8053d100>] __down_read+0x86/0x9e
[<ffffffff8053fb43>] do_page_fault+0x346/0x78e
[<ffffffff8053d235>] trace_hardirqs_on_thunk+0x35/0x3a
[<ffffffff8053dcad>] error_exit+0x0/0xa9
[<ffffffff8035a7a7>] copy_user_generic_string+0x17/0x40
[<ffffffff88099a8a>] :kvm:kvm_write_guest_page+0x3e/0x5f
[<ffffffff880b661a>] :kvm_intel:init_rmode_tss+0xa7/0xf9
[<ffffffff880b7d7e>] :kvm_intel:vmx_vcpu_reset+0x10/0x38a
[<ffffffff8809b9a5>] :kvm:kvm_arch_vcpu_setup+0x20/0x53
[<ffffffff8809a1e4>] :kvm:kvm_vm_ioctl+0xad/0x1cf
[<ffffffff80249dea>] __lock_acquire+0x4f7/0xc28
[<ffffffff8028fad9>] vfs_ioctl+0x21/0x6b
[<ffffffff8028fd75>] do_vfs_ioctl+0x252/0x26b
[<ffffffff8028fdca>] sys_ioctl+0x3c/0x5e
[<ffffffff8020b01b>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
Do not assume that a shadow mapping will always point to the same host frame number. Fixes a crash with madvise(MADV_DONTNEED). [avi: move after first printk(), add another printk()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
is_rmap_pte() doesn't take into account io ptes, which have the avail bit set. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
The vmx hardware state restore restores the tss selector and base address, but not its length. Usually this does not matter, since most of the tss contents are within the default length of 0x67. However, if a process is using ioperm() to grant itself I/O port permissions, an additional bitmap within the tss, but outside the default length, is consulted. The effect is that the process will receive a SIGSEGV instead of transparently accessing the port. Fix by restoring the tss length. Note that i386 had this working already. Closes bugzilla 10246. Signed-off-by: Avi Kivity <avi@qumranet.com>
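To make the failure mode concrete, here is a minimal sketch of how the I/O permission bitmap lookup depends on the TSS limit; the architectural constants follow the documented layout, while the helper and its parameters are illustrative.

```c
/* Hedged sketch: the I/O permission bitmap lives beyond the 0x67-byte
 * "default" TSS area, so a truncated TSS limit makes every bitmap lookup
 * fall outside the segment and the port access fault instead of succeed. */
#include <stdbool.h>
#include <stdint.h>

static bool io_port_allowed(const uint8_t *tss, uint32_t tss_limit,
                            uint16_t iobitmap_base, uint16_t port)
{
        uint32_t byte = iobitmap_base + port / 8;

        if (byte > tss_limit)
                return false;   /* bitmap byte outside the TSS: access denied */

        /* a clear bit in the bitmap means the port may be accessed */
        return !(tss[byte] & (1 << (port % 8)));
}
```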
-
- 04 March 2008, 7 commits
-
-
Committed by Avi Kivity
KVM tries to run as much as possible with the guest msrs loaded instead of host msrs, since switching msrs is very expensive. It also tries to minimize the number of msrs switched according to the guest mode; for example, MSR_LSTAR is needed only by long mode guests. This optimization is done by setup_msrs(). However, we must not change which msrs are switched while we are running with guest msr state:
- switch to guest msr state
- call setup_msrs(), removing some msrs from the list
- switch to host msr state, leaving a few guest msrs loaded
An easy way to trigger this is to kexec an x86_64 linux guest. Early during setup, the guest will switch EFER to not include SCE. KVM will stop saving MSR_LSTAR, and on the next msr switch it will leave the guest LSTAR loaded. The next host syscall will end up in a random location in the kernel. Fix by reloading the host msrs before changing the msr list. Signed-off-by: Avi Kivity <avi@qumranet.com>
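A minimal sketch of the ordering the fix establishes; the structures and the wrmsr/rebuild stand-ins below are assumptions for illustration, not the KVM implementation.

```c
/* Hedged sketch: restore host values for every MSR currently on the switch
 * list *before* rebuilding the list, so an entry dropped from the list can
 * never leave a stale guest value loaded in the host. */
#include <stddef.h>
#include <stdint.h>

struct msr_slot { uint32_t index; uint64_t host_value; uint64_t guest_value; };

static void write_msr(uint32_t index, uint64_t value)
{
        (void)index; (void)value;       /* stand-in for the real wrmsr */
}

static void rebuild_msr_list(struct msr_slot *list, size_t *count)
{
        (void)list; (void)count;        /* stand-in for setup_msrs() logic */
}

static void safe_setup_msrs(struct msr_slot *list, size_t *count)
{
        /* 1. put the host values back for everything currently switched */
        for (size_t i = 0; i < *count; i++)
                write_msr(list[i].index, list[i].host_value);

        /* 2. only now is it safe to drop or reorder entries */
        rebuild_msr_list(list, count);
}
```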
-
Committed by Avi Kivity
For improved concurrency, the guest walk is performed concurrently with other vcpus. This means that we need to revalidate the guest ptes once we have write-protected the guest page tables, at which point they can no longer be modified. The current code attempts to avoid this check if the shadow page table is not new, on the assumption that if it has existed before, the guest could not have modified the pte without the shadow lock. However the assumption is incorrect, as the racing vcpu could have modified the pte, then instantiated the shadow page, before our vcpu regains control:
  vcpu0                      vcpu1
  fault
  walk pte
                             modify pte
                             fault in same pagetable
                             instantiate shadow page
  lookup shadow page
  conclude it is old
  instantiate spte based on stale guest pte
We could do something clever with generation counters, but a test run by Marcelo suggests this is unnecessary and we can just do the revalidation unconditionally. The pte will be in the processor cache and the check can be quite fast. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
If the local apic initial count is zero, don't start an hrtimer with infinite frequency, locking up the host. Signed-off-by: Avi Kivity <avi@qumranet.com>
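A minimal sketch of the guard, with illustrative names; the divide factor and tick granularity parameters are placeholders, not KVM's.

```c
/* Hedged sketch: refuse to arm a periodic timer when the APIC initial count
 * is zero, since that would mean a zero period (infinite frequency). */
#include <stdbool.h>
#include <stdint.h>

static bool apic_timer_period_ns(uint32_t initial_count, uint32_t divide,
                                 uint32_t ns_per_tick, uint64_t *period_ns)
{
        if (initial_count == 0)
                return false;           /* leave the hrtimer unarmed */

        *period_ns = (uint64_t)initial_count * divide * ns_per_tick;
        return true;
}
```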
-
Committed by Marcelo Tosatti
The cr3 variable is now inside the vcpu->arch structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Marcelo Tosatti
alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called inside a non-preemptible region. Move it after put_cpu(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
During installation, Windows XP 64-bit wants to access the DEBUGCTL and the last branch record (LBR) MSRs. Not allowing this in KVM causes the installation to crash. This patch allows access to these MSRs and fixes the issue. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Izik Eidus
This patch replaces the mmap_sem lock for the memory slots with a new kvm private lock. It is needed because until now there were cases where kvm accessed user memory while holding the mmap semaphore. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
- 03 March 2008, 5 commits
-
-
Committed by Joerg Roedel
Injecting a #GP when accessing this MSR makes Windows crash when running some stress-test tools in KVM, so this patch emulates access to this MSR. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
One of the use cases for the supported cpuid list is to create a "greatest common denominator" of cpu capabilities in a server farm. As such, it is useful to be able to get the list without creating a virtual machine first. Since the code does not depend on the vm in any way, all that is needed is to move it to the device ioctl handler. The capability identifier is also changed so that binaries made against -rc1 will fail gracefully. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Paul Knowles
Whilst working on getting a VM to initialize into IA32e mode I found this issue: set_cr0 relies on comparing the old cr0 to the new one to work correctly. Move the assignment below the compare so it can work. Signed-off-by: Paul Knowles <paul@transitive.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
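A minimal sketch of the compare-then-assign ordering this restores; the field, bit mask, and helper names are illustrative.

```c
/* Hedged sketch: compare the old and new CR0 values *before* storing the
 * new one, otherwise the comparison only ever sees the value just written. */
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG (1ULL << 31)     /* paging enable bit */

struct vcpu_regs { uint64_t cr0; };

static bool set_cr0_paging_toggled(struct vcpu_regs *v, uint64_t new_cr0)
{
        bool toggled = (v->cr0 ^ new_cr0) & CR0_PG;  /* compare first ... */

        v->cr0 = new_cr0;                            /* ... then update   */
        return toggled;
}
```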
-
Committed by Joerg Roedel
Explicitly enable the NM intercept in svm_set_cr0 if we enable TS in the guest copy of CR0 for lazy FPU switching. This fixes guest SMP with Linux under SVM. Without this patch Linux deadlocks or panics right after trying to boot the other CPUs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Joerg Roedel
If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU state and causes Windows Vista 64-bit to crash very soon after boot. This patch fixes the bug. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
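A minimal sketch of keeping the guest's own TS value while still forcing TS in the copy the hardware runs with during lazy FPU switching; the structure and helper names are illustrative, not KVM's.

```c
/* Hedged sketch: remember the guest's CR0 (including its TS choice) and only
 * force TS in the value actually programmed while the guest FPU state is not
 * loaded, so the guest's TS bit is never lost. */
#include <stdbool.h>
#include <stdint.h>

#define CR0_TS (1ULL << 3)      /* task-switched flag */

struct vcpu_fpu {
        bool     fpu_active;    /* is the guest FPU state currently loaded? */
        uint64_t guest_cr0;     /* CR0 as the guest believes it to be       */
};

static uint64_t set_guest_cr0(struct vcpu_fpu *v, uint64_t new_cr0)
{
        v->guest_cr0 = new_cr0;         /* never lose the guest's TS bit      */

        if (!v->fpu_active)
                new_cr0 |= CR0_TS;      /* force TS while the FPU is lazily out */

        return new_cr0;                 /* value to program into hardware     */
}
```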
-
- 06 February 2008, 1 commit
-
-
Committed by Andrew Morton
arch/x86/kvm/x86.c: In function 'emulator_cmpxchg_emulated':
arch/x86/kvm/x86.c:1746: warning: passing argument 2 of 'vcpu->arch.mmu.gva_to_gpa' makes integer from pointer without a cast
arch/x86/kvm/x86.c:1746: warning: 'addr' is used uninitialized in this function
Is true. Local variable `addr' shadows incoming arg `addr'. Avi is on vacation for a while, so... Cc: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 04 February 2008, 1 commit
-
-
Committed by Anthony Liguori
This patch moves virtio under the virtualization menu and changes virtio devices to not claim to be only for lguest. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 31 January 2008, 5 commits
-
-
Committed by Avi Kivity
Migrating the apic timer in the critical section is not very nice, and is absolutely horrible with the real-time port. Move migration to the regular vcpu execution path, triggered by a new bitflag. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
When preparing to enter the guest, if an interrupt comes in while preemption is disabled but interrupts are still enabled, we miss a preemption point. Fix by explicitly checking whether we need to reschedule. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Avi Kivity
Otherwise we re-initialize the mmu caches, which will fail since the caches are already registered, which will cause us to deinitialize said caches. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Committed by Izik Eidus
Right now rmap_remove won't set the page as dirty if the shadow pte pointed to this page had write access and then became read-only. This patch fixes that by setting the page as dirty for spte changes from write to read-only access. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
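A minimal sketch of the rule described above; the spte bit and structures are illustrative simplifications, not the KVM types.

```c
/* Hedged sketch: if a shadow PTE being removed or downgraded was writable,
 * the guest may have dirtied the backing page through it, so record that
 * before the mapping disappears. */
#include <stdbool.h>
#include <stdint.h>

#define SPTE_WRITABLE (1ULL << 1)       /* x86 PTE R/W bit */

struct backing_page { bool dirty; };

static void note_spte_drop(uint64_t old_spte, struct backing_page *page)
{
        if (old_spte & SPTE_WRITABLE)
                page->dirty = true;     /* may have been written through */
}
```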
-
Committed by Sheng Yang
When executing a test program called "crashme", we found the KVM guest could not survive more than ten seconds before encountering a kernel panic. The basic concept of "crashme" is generating random assembly code and trying to execute it. After some fixes to the emulator's instruction validity checking, we found it hard to get the current emulator to handle invalid instructions correctly, because the #UD trap used for hypercall patching caused trouble. The problem is: if the opcode itself was OK, but the combination of opcode and modrm_reg was invalid, and one operand of the opcode was memory (SrcMem or DstMem), the emulator would fetch the memory operand first rather than checking validity, and might encounter an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem. In the patch, we simply check that if the invalid opcode wasn't vmcall/vmmcall, then return from emulate_instruction() and inject a #UD to the guest. With the patch, the guest had been running for more than 12 hours. Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
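A minimal sketch of that check, assuming the standard vmcall/vmmcall encodings (0f 01 c1 and 0f 01 d9); the helpers are illustrative, not the emulator's functions.

```c
/* Hedged sketch: if decode found the opcode/ModRM combination invalid and
 * the bytes are not the hypercall instruction being patched via #UD, bail
 * out and inject #UD instead of touching any memory operand. */
#include <stdbool.h>
#include <stdint.h>

static bool is_vmcall_or_vmmcall(const uint8_t *insn)
{
        /* 0f 01 c1 = vmcall (Intel), 0f 01 d9 = vmmcall (AMD) */
        return insn[0] == 0x0f && insn[1] == 0x01 &&
               (insn[2] == 0xc1 || insn[2] == 0xd9);
}

static bool should_inject_ud(bool decode_valid, const uint8_t *insn)
{
        return !decode_valid && !is_vmcall_or_vmmcall(insn);
}
```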
-