Commits · b4e63f560beb187cffdaf706e534a1e2f9effb66 · openanolis / cloud-kernel

03 May, 2007 10 commits

KVM: Allow userspace to process hypercalls which have no kernel handler · b4e63f56
Committed by Avi Kivity on Mar 04, 2007
```
This is useful for paravirtualized graphics devices, for example.
Signed-off-by: Avi Kivity <avi@qumranet.com>
```
KVM: Add method to check for backwards-compatible API extensions · 5d308f45
Committed by Avi Kivity on Mar 01, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```

KVM: Remove the 'emulated' field from the userspace interface · 106b552b

Committed by Avi Kivity on Mar 01, 2007

We no longer emulate single instructions in userspace.  Instead, we service
mmio or pio requests.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Handle cpuid in the kernel instead of punting to userspace · 06465c5a

Committed by Avi Kivity on Feb 28, 2007

KVM used to handle cpuid by letting userspace decide what values to
return to the guest.  We now handle cpuid completely in the kernel.  We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).

The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.
Signed-off-by: Avi Kivity <avi@qumranet.com>

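The "least common denominator" idea in this commit message can be illustrated by intersecting the CPUID feature words reported by each host in a migration pool. This is a minimal sketch, not KVM code: `common_features` and the sample bit patterns are hypothetical, and a real management tool would presumably repeat this for every CPUID leaf and register before handing the table to the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of KVM): given one CPUID feature word
 * from each host, keep only the bits that every host sets.
 * With n == 0 the result is all ones, the identity for bitwise AND.
 */
uint32_t common_features(const uint32_t *host_words, size_t n)
{
    uint32_t common = UINT32_MAX;

    for (size_t i = 0; i < n; i++)
        common &= host_words[i];
    return common;
}
```

A guest that only sees the intersected feature bits should not depend on a feature that is missing on the migration target.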

KVM: Do not communicate to userspace through cpu registers during PIO · 46fc1477

Committed by Avi Kivity on Feb 22, 2007

Currently when passing a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions).  This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.

So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use a shared page for kernel/user communication when running a vcpu · 9a2bb7f4

Committed by Avi Kivity on Feb 22, 2007

Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it.  This reduces
needless copying and makes the interface expandable by providing lots of
free space.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix bogus sign extension in mmu mapping audit · 1ea252af

Committed by Avi Kivity on Mar 08, 2007

When auditing a 32-bit guest on a 64-bit host, sign extension of the page
table directory pointer table index caused bogus addresses to be shown on
audit errors.

Fix by declaring the index unsigned.
Signed-off-by: Avi Kivity <avi@qumranet.com>

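The class of bug described here can be reproduced in a few lines. The sketch below is an illustration only, not the audit code itself: the function names are hypothetical, and it assumes the index is shifted left by 30 bits (each PAE page directory pointer table entry covers 1 GiB) before being widened to a 64-bit address.

```c
#include <stdint.h>

#define PDPT_SHIFT 30 /* assumption: each PAE PDPT entry maps 1 GiB */

/* Signed index: the 32-bit intermediate is sign-extended when widened. */
uint64_t va_from_signed_index(int idx)
{
    /*
     * Shift as uint32_t so the shift itself is well defined; the
     * conversion back to int32_t wraps on common two's-complement ABIs.
     */
    int32_t va32 = (int32_t)((uint32_t)idx << PDPT_SHIFT);

    return (uint64_t)(int64_t)va32;
}

/* Unsigned index: the 32-bit intermediate is zero-extended instead. */
uint64_t va_from_unsigned_index(unsigned int idx)
{
    uint32_t va32 = (uint32_t)idx << PDPT_SHIFT;

    return va32;
}
```

For indices 2 and 3 the signed variant sets the top 32 bits of the result, which matches the bogus addresses the commit message mentions; indices 0 and 1 are unaffected.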

KVM: Use own minor number · bbe4432e

Committed by Avi Kivity on Mar 04, 2007

Use the minor number (232) allocated to kvm by lanana.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use the generic skip_emulated_instruction() in hypercall code · 510043da

Committed by Dor Laor on Feb 19, 2007

Instead of twiddling the rip registers directly, use the
skip_emulated_instruction() function to do that for us.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix guest register corruption on paravirt hypercall · 9b22bf57

Committed by Dor Laor on Feb 19, 2007

The hypercall code mixes up the ->cache_regs() and ->decache_regs()
callbacks, resulting in guest register corruption.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

19 Apr, 2007 1 commit

KVM: Fix off-by-one when writing to a nonpae guest pde · 6b8d0f9b

Committed by Avi Kivity on Apr 18, 2007

Nonpae guest pdes are shadowed by two pae ptes, so we double the offset
twice: once to account for the pte size difference, and once because we
need two shadow pdes for a single guest pde.

But when writing to the upper guest pde we also need to truncate the
lower bits, otherwise the multiply shifts these bits into the pde index
and causes an access to the wrong shadow pde.  If we're at the end of the
page (accessing the very last guest pde) we can even overflow into the
next host page and oops.
Signed-off-by: Avi Kivity <avi@qumranet.com>

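The offset arithmetic described above can be sketched as follows. This is a simplified model under stated assumptions, not the kernel's code: the function names are hypothetical, `guest_offset` is taken to be the byte offset of the write within the guest page directory, guest pdes are 4 bytes, shadow pdes are 8 bytes, and wrapping across shadow pages is ignored.

```c
#define SHADOW_PTE_SIZE 8u /* PAE shadow entries are 8 bytes wide */

/* Doubles twice without truncating: sub-entry bits leak into the index. */
unsigned int first_shadow_pde_buggy(unsigned int guest_offset)
{
    unsigned int off = guest_offset;

    off <<= 1; /* 4-byte guest entries -> 8-byte shadow entries */
    off <<= 1; /* two shadow pdes per guest pde */
    return off / SHADOW_PTE_SIZE;
}

/* Truncates to an entry boundary before the second doubling. */
unsigned int first_shadow_pde_fixed(unsigned int guest_offset)
{
    unsigned int off = guest_offset;

    off <<= 1;                     /* 4-byte -> 8-byte entries */
    off &= ~(SHADOW_PTE_SIZE - 1); /* drop the sub-entry bits */
    off <<= 1;                     /* two shadow pdes per guest pde */
    return off / SHADOW_PTE_SIZE;
}
```

In this model, a write at byte offset 22 (the upper half of guest pde 5) maps to shadow pde 11 in the untruncated version and to shadow pde 10 in the truncated one. For the last guest pde (offset 4094), the untruncated version starts at index 2047, so zapping two entries from there would run past the end of the table.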

27 Mar, 2007 2 commits

KVM: always reload segment selectors · 6d9658df

Committed by Ingo Molnar on Mar 11, 2007

A failed VM entry on VMX might still change %fs or %gs, so make sure
that KVM always reloads the segment selectors. This is crucial on both
x86 and x86_64: x86 has __KERNEL_PDA in %fs on which things like
'current' depends and x86_64 has 0 there and needs MSR_GS_BASE to work.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

KVM: Prevent system selectors leaking into guest on real->protected mode transition on vmx · 6af11b9e

Committed by Avi Kivity on Mar 19, 2007

Intel virtualization extensions do not support virtualizing real mode. So
kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this
virtualized vm86 mode does not support the so called "big real" mode, where
the segment selector and base do not agree with each other according to the
real mode rules (base == selector << 4).

To work around this, kvm checks whether a selector/base pair violates the
virtualized vm86 rules, and if so, forces it into conformance. On a
transition back to protected mode, if we see that the guest did not touch
a forced segment, we restore it back to the original protected mode value.

This pile of hacks breaks down if the gdt has changed in real mode, as it
can cause a segment selector to point to a system descriptor instead of a
normal data segment. In fact, this happens with the Windows bootloader
and the qemu acpi bios, where a protected mode memcpy routine issues an
innocent 'pop %es' and traps on an attempt to load a system descriptor.

"Fix" by checking if the to-be-restored selector points at a system segment,
and if so, coercing it into a normal data segment. The long term solution,
of course, is to abandon vm86 mode and use emulation for big real mode.
Signed-off-by: Avi Kivity <avi@qumranet.com>

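The two checks this message relies on can be written down compactly. The sketch below is illustrative only and does not reproduce the vmx code: both function names are hypothetical. It uses the real mode rule quoted above (base == selector << 4) and the x86 descriptor layout, in which the S flag is bit 44 of the 8-byte descriptor and is clear for system descriptors.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the selector/base pair obeys the real mode (vm86) rule. */
bool vm86_conforms(uint16_t selector, uint32_t base)
{
    return base == ((uint32_t)selector << 4);
}

/*
 * True if an 8-byte GDT descriptor is a system descriptor.
 * The S flag (bit 44) is 1 for code/data segments and 0 for system ones.
 */
bool is_system_descriptor(uint64_t desc)
{
    return ((desc >> 44) & 1) == 0;
}
```

A "big real" mode segment, for example selector 0 with a base of 0x100000, fails the first check, which is the situation where kvm has to force the pair into conformance.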

18 Mar, 2007 4 commits

KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram · 27aba766

Committed by Avi Kivity on Mar 09, 2007

PAGE_MASK is an unsigned long, so using it to mask physical addresses on
i386 (which are 64-bit wide) leads to truncation.  This can result in
page->private of unrelated memory pages being modified, with disastrous
results.

Fix by not using PAGE_MASK for physical addresses; instead calculate
the correct value directly from PAGE_SIZE.  Also fix a similar BUG_ON().
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>

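The truncation is easy to reproduce outside the kernel. The following sketch models the i386 situation with fixed-width types; it is an illustration under assumptions, not the patch itself. `PAGE_MASK_32` stands in for the kernel's `PAGE_MASK` with a 32-bit `unsigned long`, and the two function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE ((uint32_t)1 << PAGE_SHIFT)
/* Models PAGE_MASK on i386, where unsigned long is 32 bits wide. */
#define PAGE_MASK_32 ((uint32_t)~(PAGE_SIZE - 1))

/* Buggy: the 32-bit mask is zero-extended, clearing address bits 32..63. */
uint64_t page_base_truncating(uint64_t paddr)
{
    return paddr & PAGE_MASK_32;
}

/* Fixed: build the mask at 64-bit width directly from PAGE_SIZE. */
uint64_t page_base_correct(uint64_t paddr)
{
    return paddr & ~((uint64_t)PAGE_SIZE - 1);
}
```

For a physical address above 4 GB such as 0x123456789, the truncating version yields 0x23456000, which is a different page, while the 64-bit mask yields 0x123456000.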

KVM: MMU: Fix guest writes to nonpae pde · ac1b714e

Committed by Avi Kivity on Mar 08, 2007

KVM shadow page tables are always in pae mode, regardless of the guest
setting.  This means that a guest pde (mapping 4MB of memory) is mapped
to two shadow pdes (mapping 2MB each).

When the guest writes to a pte or pde, we intercept the write and emulate it.
We also remove any shadowed mappings corresponding to the write.  Since the
mmu did not account for the doubling in the number of pdes, it removed the
wrong entry, resulting in a mismatch between shadow page tables and guest
page tables, followed shortly by guest memory corruption.

This patch fixes the problem by detecting the special case of writing to
a non-pae pde and adjusting the address and number of shadow pdes zapped
accordingly.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix guest sysenter on vmx · f5b42c33

Committed by Avi Kivity on Mar 06, 2007

The vmx code currently treats the guest's sysenter support msrs as 32-bit
values, which breaks 32-bit compat mode userspace on 64-bit guests.  Fix by
using the native word width of the machine.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Unset kvm_arch_ops if arch module loading failed · ca45aaae

Committed by Avi Kivity on Mar 01, 2007

Otherwise, the core module thinks the arch module is loaded, and won't
let you reload it after you've fixed the bug.
Signed-off-by: Avi Kivity <avi@qumranet.com>

04 Mar, 2007 23 commits

KVM: Move kvmfs magic number to <linux/magic.h> · e9cdb1e3

Committed by Andrew Morton on Mar 01, 2007

Use the standard magic.h for kvmfs.

Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix bogus failure in kvm.ko module initialization · 58e690e6

Committed by Avi Kivity on Feb 26, 2007

A bogus 'return r' can cause an otherwise successful module load to fail.
This both denies users the use of kvm, and it also denies them the use of
their machine, as it leaves a filesystem registered with its callbacks
pointing into now-freed module memory.

Fix by returning a zero like a good module.

Thanks to Richard Lucassen <mailinglists@lucassen.org> (?) for reporting
the problem and for providing access to a machine which exhibited it.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Remove write access permissions when dirty-page-logging is enabled · ff990d59

Committed by Uri Lublin on Feb 22, 2007

Enabling dirty page logging is done using KVM_SET_MEMORY_REGION ioctl.
If the memory region already exists, we need to remove write accesses,
so writes will be caught, and dirty pages will be logged.
Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

kvm: move do_remove_write_access() up · 02b27c1f

Committed by Uri Lublin on Feb 22, 2007

To be called from kvm_vm_ioctl_set_memory_region()
Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix dirty page log bitmap size/access calculation · cd1a4a98

Committed by Uri Lublin on Feb 22, 2007

Since dirty_bitmap is an unsigned long array, the alignment and size need
to take that into account.
Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

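To make the size calculation concrete, here is a minimal userspace sketch of sizing and indexing a bitmap that is stored as an array of `unsigned long`. It is not the kernel's implementation: `dirty_bitmap_bytes` and `set_dirty` are hypothetical names, and `BITS_PER_LONG` is defined locally rather than taken from kernel headers.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for one bit per page, rounded up to whole unsigned longs. */
size_t dirty_bitmap_bytes(size_t npages)
{
    size_t nlongs = (npages + BITS_PER_LONG - 1) / BITS_PER_LONG;

    return nlongs * sizeof(unsigned long);
}

/* Set the bit for page number pfn; the caller guarantees pfn < npages. */
void set_dirty(unsigned long *bitmap, size_t pfn)
{
    bitmap[pfn / BITS_PER_LONG] |= 1UL << (pfn % BITS_PER_LONG);
}
```

Rounding to bytes alone would under-allocate: a region of 9 pages needs 2 bytes of bits, but word-sized accesses touch a full `unsigned long`.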

KVM: Add missing calls to mark_page_dirty() · ab51a434

Committed by Uri Lublin on Feb 21, 2007

A few places where we modify guest memory fail to call mark_page_dirty(),
causing live migration to fail.  This adds the missing calls.
Signed-off-by: Uri Lublin <uril@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Per-vcpu inodes · bccf2150

Committed by Avi Kivity on Feb 21, 2007

Allocate a distinct inode for every vcpu in a VM.  This has the following
benefits:

 - the filp cachelines are no longer bounced when f_count is incremented on
   every ioctl()
 - the API and internal code are distinctly clearer; for example, on the
   KVM_GET_REGS ioctl, there is no need to copy the vcpu number from
   userspace and then copy the registers back; the vcpu identity is derived
   from the fd used to make the call

Right now the performance benefits are completely theoretical since (a) we
don't support more than one vcpu per VM and (b) virtualization hardware
inefficiencies completely overwhelm any cacheline bouncing effects.  But
both of these will change, and we need to prepare the API today.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Move kvm_vm_ioctl_create_vcpu() around · c5ea7660
Committed by Avi Kivity on Feb 20, 2007
```
In preparation of some hacking.
Signed-off-by: Avi Kivity <avi@qumranet.com>
```

KVM: Rename some kvm_dev_ioctl_*() functions to kvm_vm_ioctl_*() · 2c6f5df9

Committed by Avi Kivity on Feb 20, 2007

This reflects the changed scope, from device-wide to single vm (previously
every device open created a virtual machine).
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Create an inode per virtual machine · f17abe9a

Committed by Avi Kivity on Feb 21, 2007

This avoids having filp->f_op and the corresponding inode->i_fop different,
which is a little unorthodox.

The ioctl list is split into two: global kvm ioctls and per-vm ioctls.  A new
ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add internal filesystem for generating inodes · 37e29d90

Committed by Avi Kivity on Feb 20, 2007

The kvmfs inodes will represent virtual machines and vcpus, as necessary,
reducing cacheline bouncing due to inodes and filps being shared.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: More 0 -> NULL conversions · 19d1408d
Committed by Avi Kivity on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```

KVM: SVM: intercept SMI to handle it at host level · 0152527b

Committed by Joerg Roedel on Feb 19, 2007

This patch changes the SVM code to intercept SMIs and handle them
outside the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: svm: init cr0 with the wp bit set · cd205625
Committed by Avi Kivity on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```
KVM: Wire up hypercall handlers to a central arch-independent location · 270fd9b9
Committed by Avi Kivity on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```
KVM: Add hypercall host support for svm · 02e235bc
Committed by Avi Kivity on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```
KVM: Add host hypercall support for vmx · c21415e8
Committed by Ingo Molnar on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```

KVM: add MSR based hypercall API · 102d8325

Committed by Ingo Molnar on Feb 19, 2007

This adds a special MSR based hypercall API to KVM. This is to be
used by paravirtual kernels and virtual drivers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use page_private()/set_page_private() apis · 5972e953

Committed by Markus Rechberger on Feb 19, 2007

Besides using an established api, this allows using kvm in older kernels.
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use ARRAY_SIZE macro instead of manual calculation. · 9d8f549d

Committed by Ahmed S. Darwish on Feb 19, 2007

Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: vmx: hack set_cr0_no_modeswitch() to actually do modeswitch · de979caa

Committed by Joerg Roedel on Feb 19, 2007

The whole thing is rotten, but this allows vmx to boot with the guest reboot
fix.
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Cosmetics · d27d4aca
Committed by Avi Kivity on Feb 19, 2007
```
Signed-off-by: Avi Kivity <avi@qumranet.com>
```

KVM: Move virtualization deactivation from CPU_DEAD state to CPU_DOWN_PREPARE · 43934a38

Committed by Jeremy Katz on Feb 19, 2007

This gives it more chances of surviving suspend.
Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
