- 03 5月, 2007 12 次提交
-
-
由 Avi Kivity 提交于
This is redundant, as we also return -EINTR from the ioctl, but it allows us to examine the exit_reason field on resume without seeing old data. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Currently, userspace is told about the nature of the last exit from the guest using two fields, exit_type and exit_reason, where exit_type has just two enumerations (and no need for more). So fold exit_type into exit_reason, reducing the complexity of determining what really happened. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
This is useful for paravirtualized graphics devices, for example. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
We no longer emulate single instructions in userspace. Instead, we service mmio or pio requests. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Currently when passing the a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs, when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Instead of passing a 'struct kvm_run' back and forth between the kernel and userspace, allocate a page and allow the user to mmap() it. This reduces needless copying and makes the interface expandable by providing lots of free space. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
When auditing a 32-bit guest on a 64-bit host, sign extension of the page table directory pointer table index caused bogus addresses to be shown on audit errors. Fix by declaring the index unsigned. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Use the minor number (232) allocated to kvm by lanana. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Dor Laor 提交于
Instead of twiddling the rip registers directly, use the skip_emulated_instruction() function to do that for us. Signed-off-by: NDor Laor <dor.laor@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Dor Laor 提交于
The hypercall code mixes up the ->cache_regs() and ->decache_regs() callbacks, resulting in guest register corruption. Signed-off-by: NDor Laor <dor.laor@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 19 4月, 2007 1 次提交
-
-
由 Avi Kivity 提交于
Nonpae guest pdes are shadowed by two pae ptes, so we double the offset twice: once to account for the pte size difference, and once because we need to shadow pdes for a single guest pde. But when writing to the upper guest pde we also need to truncate the lower bits, otherwise the multiply shifts these bits into the pde index and causes an access to the wrong shadow pde. If we're at the end of the page (accessing the very last guest pde) we can even overflow into the next host page and oops. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 27 3月, 2007 2 次提交
-
-
由 Ingo Molnar 提交于
failed VM entry on VMX might still change %fs or %gs, thus make sure that KVM always reloads the segment selectors. This is crutial on both x86 and x86_64: x86 has __KERNEL_PDA in %fs on which things like 'current' depends and x86_64 has 0 there and needs MSR_GS_BASE to work. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Avi Kivity 提交于
Intel virtualization extensions do not support virtualizing real mode. So kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this virtualized vm86 mode does not support the so called "big real" mode, where the segment selector and base do not agree with each other according to the real mode rules (base == selector << 4). To work around this, kvm checks whether a selector/base pair violates the virtualized vm86 rules, and if so, forces it into conformance. On a transition back to protected mode, if we see that the guest did not touch a forced segment, we restore it back to the original protected mode value. This pile of hacks breaks down if the gdt has changed in real mode, as it can cause a segment selector to point to a system descriptor instead of a normal data segment. In fact, this happens with the Windows bootloader and the qemu acpi bios, where a protected mode memcpy routine issues an innocent 'pop %es' and traps on an attempt to load a system descriptor. "Fix" by checking if the to-be-restored selector points at a system segment, and if so, coercing it into a normal data segment. The long term solution, of course, is to abandon vm86 mode and use emulation for big real mode. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 18 3月, 2007 4 次提交
-
-
由 Avi Kivity 提交于
PAGE_MASK is an unsigned long, so using it to mask physical addresses on i386 (which are 64-bit wide) leads to truncation. This can result in page->private of unrelated memory pages being modified, with disasterous results. Fix by not using PAGE_MASK for physical addresses; instead calculate the correct value directly from PAGE_SIZE. Also fix a similar BUG_ON(). Acked-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
KVM shadow page tables are always in pae mode, regardless of the guest setting. This means that a guest pde (mapping 4MB of memory) is mapped to two shadow pdes (mapping 2MB each). When the guest writes to a pte or pde, we intercept the write and emulate it. We also remove any shadowed mappings corresponding to the write. Since the mmu did not account for the doubling in the number of pdes, it removed the wrong entry, resulting in a mismatch between shadow page tables and guest page tables, followed shortly by guest memory corruption. This patch fixes the problem by detecting the special case of writing to a non-pae pde and adjusting the address and number of shadow pdes zapped accordingly. Acked-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
The vmx code currently treats the guest's sysenter support msrs as 32-bit values, which breaks 32-bit compat mode userspace on 64-bit guests. Fix by using the native word width of the machine. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Otherwise, the core module thinks the arch module is loaded, and won't let you reload it after you've fixed the bug. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
- 04 3月, 2007 21 次提交
-
-
由 Andrew Morton 提交于
Use the standard magic.h for kvmfs. Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
A bogus 'return r' can cause an otherwise successful module load to fail. This both denies users the use of kvm, and it also denies them the use of their machine, as it leaves a filesystem registered with its callbacks pointing into now-freed module memory. Fix by returning a zero like a good module. Thanks to Richard Lucassen <mailinglists@lucassen.org> (?) for reporting the problem and for providing access to a machine which exhibited it. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Uri Lublin 提交于
Enabling dirty page logging is done using KVM_SET_MEMORY_REGION ioctl. If the memory region already exists, we need to remove write accesses, so writes will be caught, and dirty pages will be logged. Signed-off-by: NUri Lublin <uril@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Uri Lublin 提交于
To be called from kvm_vm_ioctl_set_memory_region() Signed-off-by: NUri Lublin <uril@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Uri Lublin 提交于
Since dirty_bitmap is an unsigned long array, the alignment and size need to take that into account. Signed-off-by: NUri Lublin <uril@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Uri Lublin 提交于
A few places where we modify guest memory fail to call mark_page_dirty(), causing live migration to fail. This adds the missing calls. Signed-off-by: NUri Lublin <uril@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Allocate a distinct inode for every vcpu in a VM. This has the following benefits: - the filp cachelines are no longer bounced when f_count is incremented on every ioctl() - the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely everwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
In preparation of some hacking. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
This reflects the changed scope, from device-wide to single vm (previously every device open created a virtual machine). Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
This avoids having filp->f_op and the corresponding inode->i_fop different, which is a little unorthodox. The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
The kvmfs inodes will represent virtual machines and vcpus, as necessary, reducing cacheline bouncing due to inodes and filps being shared. Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
This patch changes the SVM code to intercept SMIs and handle it outside the guest. Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Avi Kivity 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Ingo Molnar 提交于
Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Ingo Molnar 提交于
This adds a special MSR based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Markus Rechberger 提交于
Besides using an established api, this allows using kvm in older kernels. Signed-off-by: NMarkus Rechberger <markus.rechberger@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Ahmed S. Darwish 提交于
Signed-off-by: NAhmed S. Darwish <darwish.07@gmail.com> Signed-off-by: NDor Laor <dor.laor@qumranet.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-
由 Joerg Roedel 提交于
The whole thing is rotten, but this allows vmx to boot with the guest reboot fix. Signed-off-by: NMarkus Rechberger <markus.rechberger@amd.com> Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com> Signed-off-by: NAvi Kivity <avi@qumranet.com>
-