Commits · e8207547d2f7b2f557bdb73015c1f74c32474438 · openanolis / cloud-kernel

03 May, 2007 · 39 commits

KVM: Add physical memory aliasing feature · e8207547

Committed by Avi Kivity on Mar 30, 2007

With this, we can specify that accesses to one physical memory range will
be remapped to another. This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.
Signed-off-by: Avi Kivity <avi@qumranet.com>
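
The remapping described above can be sketched in a few lines. This is only an illustration, assuming an alias is a (base frame, page count, target frame) triple; the names `mem_alias_sketch` and `unalias_gfn_sketch` are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t; /* guest frame number: guest physical address >> 12 */

/* Hypothetical alias descriptor: npages frames starting at base_gfn
 * are redirected to the frames starting at target_gfn. */
struct mem_alias_sketch {
    gfn_t base_gfn;
    uint64_t npages;
    gfn_t target_gfn;
};

/* Return the frame that gfn refers to once aliases are resolved. */
static gfn_t unalias_gfn_sketch(const struct mem_alias_sketch *aliases,
                                size_t n, gfn_t gfn)
{
    for (size_t i = 0; i < n; ++i) {
        const struct mem_alias_sketch *a = &aliases[i];

        if (gfn >= a->base_gfn && gfn < a->base_gfn + a->npages)
            return a->target_gfn + (gfn - a->base_gfn);
    }
    return gfn; /* not covered by any alias: unchanged */
}
```

For a vga-style window, an alias with `base_gfn = 0xa0` (address 0xa0000) and `npages = 0x20` would redirect the 128 KiB window into whichever part of the framebuffer `target_gfn` selects.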

KVM: Simply gfn_to_page() · 954bbbc2

Committed by Avi Kivity on Mar 30, 2007

Mapping a guest page to a host page is a common operation.  Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).

This is clumsy, and also won't work well with memory aliases.  So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add mmu cache clear function · e0fa826f

Committed by Dor Laor on Mar 30, 2007

Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory.  Both the mmu cache and the processor tlb
are cleared.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: x86 emulator: fix bit string operations operand size · df513e2c

Committed by Avi Kivity on Mar 28, 2007

On x86, bit operations operate on a string of bits that can reside in
multiple words. For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1). This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process. It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).

The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry. The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.
Signed-off-by: Avi Kivity <avi@qumranet.com>
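
The arithmetic in this message can be checked with a small user-space sketch. It is not the emulator's code: `locate_bit` and `crosses_page` are hypothetical helpers, and only non-negative bit offsets are handled. Passing the instruction's operand size (4) keeps the access inside the page, while a forced 8-byte word (BITS_PER_LONG on a 64-bit host) spills over the boundary.

```c
#include <assert.h>
#include <stdint.h>

struct bit_operand {
    uint64_t addr;   /* address of the word actually accessed */
    unsigned bit;    /* bit index within that word */
    unsigned bytes;  /* access width in bytes */
};

/* Split a bit-string reference into (word address, bit within word),
 * using word_bytes as the word size. */
static struct bit_operand locate_bit(uint64_t base, uint64_t bit_offset,
                                     unsigned word_bytes)
{
    struct bit_operand r;
    unsigned bits = word_bytes * 8;

    assert(word_bytes == 2 || word_bytes == 4 || word_bytes == 8);
    r.addr = base + (bit_offset / bits) * word_bytes;
    r.bit = (unsigned)(bit_offset % bits);
    r.bytes = word_bytes;
    return r;
}

/* Nonzero if [addr, addr + bytes) straddles a 4 KiB page boundary. */
static int crosses_page(struct bit_operand op)
{
    return (op.addr >> 12) != ((op.addr + op.bytes - 1) >> 12);
}
```

With a page table entry at page offset 0xffc and bit offset 1, a 4-byte word ends at 0xfff and stays within the page; an 8-byte word ends at offset 0x003 of the next page.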

KVM: Remove debug message · afeb1f14

Committed by Avi Kivity on Mar 27, 2007

No longer interesting.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use list_move() · 36868f7b

Committed by Avi Kivity on Mar 26, 2007

Use list_move() where possible.  Noticed by Dor Laor.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Remove unused function · 55bf4028

Committed by Michal Piotrowski on Mar 25, 2007

Remove unused function

CC      drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: SVM: Ensure timestamp counter monotonicity · 0cc5064d

Committed by Avi Kivity on Mar 25, 2007

When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.

As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>
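
The adjustment described here can be modelled in user space. The sketch below is an assumption-laden illustration, not the SVM code: the struct and the `_sketch` functions are hypothetical, and host tsc readings are passed in as plain numbers instead of being read with rdtsc.

```c
#include <stdint.h>

struct vcpu_tsc_sketch {
    int cpu;             /* host cpu the vcpu last ran on, -1 if none */
    uint64_t host_tsc;   /* host tsc recorded when the cpu was dropped */
    uint64_t tsc_offset; /* guest tsc = host tsc + tsc_offset (mod 2^64) */
};

static uint64_t guest_tsc_sketch(const struct vcpu_tsc_sketch *v,
                                 uint64_t host_now)
{
    return host_now + v->tsc_offset;
}

/* Called when the vcpu stops running on its current host cpu. */
static void vcpu_put_sketch(struct vcpu_tsc_sketch *v, uint64_t host_now)
{
    v->host_tsc = host_now;
}

/* Called when the vcpu is about to run on host cpu `cpu`. */
static void vcpu_load_sketch(struct vcpu_tsc_sketch *v, int cpu,
                             uint64_t host_now)
{
    if (v->cpu != -1 && v->cpu != cpu) {
        /* Compensate for the skew between the two host counters so the
         * guest-visible value continues from where it stopped. */
        v->tsc_offset += v->host_tsc - host_now;
    }
    v->cpu = cpu;
}
```

Unsigned wraparound makes the same expression work whether the new cpu's counter is behind or ahead of the old one. In this simplified model the guest tsc resumes at exactly the value it had when the vcpu was descheduled, so it never goes backwards.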

KVM: MMU: Fix hugepage pdes mapping same physical address with different access · d28c6cfb

Committed by Avi Kivity on Mar 23, 2007

The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page. This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if two pdes map the same memory with different permissions, kvm
will happily use the cached shadow page. If an access through the more
permissive pde occurs after an access through the stricter pde, an endless
pagefault loop results and the guest makes no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
Signed-off-by: Avi Kivity <avi@qumranet.com>
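
A minimal sketch of what "access permissions as part of the lookup key" means, assuming a flat array as the cache. The struct, the flag values, and `find_shadow_page_sketch` are hypothetical; the real mmu uses its own page role encoding and hash table.

```c
#include <stddef.h>
#include <stdint.h>

#define ACC_WRITE_SKETCH 0x1u /* hypothetical permission bits */
#define ACC_USER_SKETCH  0x2u

struct shadow_page_sketch {
    uint64_t gfn;    /* guest frame the hugepage pde maps */
    unsigned access; /* permissions the shadow page was built for */
};

/* Look up a cached shadow page. Matching on gfn alone is the buggy
 * behaviour described above; the key here is (gfn, access). */
static struct shadow_page_sketch *
find_shadow_page_sketch(struct shadow_page_sketch *cache, size_t n,
                        uint64_t gfn, unsigned access)
{
    for (size_t i = 0; i < n; ++i) {
        if (cache[i].gfn == gfn && cache[i].access == access)
            return &cache[i];
    }
    return NULL; /* caller would build a new shadow page */
}
```

With this key, a read-only pde and a writable pde for the same physical address get separate shadow pages instead of sharing one.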

KVM: SVM: forbid guest to execute monitor/mwait · 916ce236

Committed by Joerg Roedel on Mar 21, 2007

This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Handle writes to MCG_STATUS msr · 0e5bf0d0

Committed by Sergey Kiselev on Mar 22, 2007

Some older (~2.6.7) kernels write the MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
The following patch adds a "nop" handler for this.
Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Remove unused and write-only variables · fcd34108

Committed by Avi Kivity on Mar 21, 2007

Trivial cleanup.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Don't allow the guest to turn off the cpu cache · 6da63cf9

Committed by Avi Kivity on Mar 21, 2007

The cpu cache is a host resource; the guest should not be able to turn
it off (even for itself).
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Hack real-mode segments on vmx from KVM_SET_SREGS · 038881c8

Committed by Avi Kivity on Mar 21, 2007

As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints.  We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 compatible values.

This fixes reboot on vmx.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Modify guest segments after potentially switching modes · 024aa1c0

Committed by Avi Kivity on Mar 21, 2007

The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers.  Since segment handling is modified by the mode on
Intel processors, update the segment registers after the mode switch has taken
place.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Remove set_cr0_no_modeswitch() arch op · f6528b03

Committed by Avi Kivity on Mar 20, 2007

set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
As we now cache the protected mode values on entry to real mode, this
isn't an issue anymore, and it interferes with reboot (which usually _is_
a modeswitch).
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Workaround vmx inability to virtualize the reset state · 8cb5b033

Committed by Avi Kivity on Mar 20, 2007

The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000,
which aren't compatible with vm86 mode, which is used for real mode
virtualization.

When we create a vcpu, we set cs.base to 0xf0000, but if we get there by
way of a reset, the values are inconsistent and vmx refuses to enter
guest mode.

Workaround by detecting the state and munging it appropriately.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Remove global pte tracking · aac01224

Committed by Avi Kivity on Mar 20, 2007

The initial, noncaching, version of the kvm mmu flushed all nonglobal
shadow page table translations (much like a native tlb flush).  The new
implementation flushes translations only when they change, rendering global
pte tracking superfluous.

This removes the unused tracking mechanism and storage space.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: MMU: Remove unnecessary check for pdptr access · ca5aac1f

Committed by Avi Kivity on Mar 20, 2007

We already special case the pdptr access, so no need to check it again.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Avoid guest virtual addresses in string pio userspace interface · 039576c0

Committed by Avi Kivity on Mar 20, 2007

The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions. This
interface cannot fully support guest smp, as the check needs to take into
account two pages at once in case an unaligned string transfer straddles a
page boundary.

Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there. The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Future-proof argument-less ioctls · f0fe5108

Committed by Avi Kivity on Mar 07, 2007

Some ioctls ignore their arguments. By requiring them to be zero now,
we allow a nonzero value to have some special meaning in the future.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Allow kernel to select size of mmap() buffer · 07c45a36

Committed by Avi Kivity on Mar 07, 2007

This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped. As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add guest mode signal mask · 1961d276

Committed by Avi Kivity on Mar 05, 2007

Allow a special signal mask to be used while executing in guest mode. This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive. Userspace still
receives -EINTR and can get the signal via sigwait().
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Initialize the apic_base msr on svm too · 6722c51c

Committed by Avi Kivity on Mar 05, 2007

Older userspace didn't care, but newer userspace (with the cpuid changes)
does.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add a special exit reason when exiting due to an interrupt · 1b19f3e6

Committed by Avi Kivity on Mar 04, 2007

This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fold kvm_run::exit_type into kvm_run::exit_reason · 8eb7d334

Committed by Avi Kivity on Mar 04, 2007

Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more). So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Allow userspace to process hypercalls which have no kernel handler · b4e63f56

Committed by Avi Kivity on Mar 04, 2007

This is useful for paravirtualized graphics devices, for example.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Add method to check for backwards-compatible API extensions · 5d308f45

Committed by Avi Kivity on Mar 01, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Renumber ioctls · 739872c5

Committed by Avi Kivity on Mar 01, 2007

The recent changes have left the ioctl numbers in complete disarray.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Remove minor wart from KVM_CREATE_VCPU ioctl · 2a4dac39

Committed by Avi Kivity on Mar 01, 2007

That ioctl does not transfer any data, so it should be an _IO rather than an
_IOW.
Signed-off-by: Avi Kivity <avi@qumranet.com>
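
The difference between the two macros shows up in the direction and size bits encoded in the ioctl number. The sketch below is Linux-only and uses `<linux/ioctl.h>`; the `DEMO_` names and the command number 0x41 are made up for illustration and are not the real KVM definitions.

```c
#include <linux/ioctl.h>

/* Hypothetical ioctl numbers for illustration only. */
#define DEMO_IOC_MAGIC       0xAE
#define DEMO_CREATE_VCPU_OLD _IOW(DEMO_IOC_MAGIC, 0x41, int) /* claims to pass an int */
#define DEMO_CREATE_VCPU_NEW _IO(DEMO_IOC_MAGIC, 0x41)       /* no data transfer */

/* Nonzero if the ioctl number declares a data transfer in either direction. */
static int ioctl_transfers_data(unsigned int cmd)
{
    return _IOC_DIR(cmd) != _IOC_NONE;
}
```

Both numbers share the same magic and command fields; only the encoded direction and size differ, which is why the change alters the ioctl number itself.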

KVM: Remove the 'emulated' field from the userspace interface · 106b552b

Committed by Avi Kivity on Mar 01, 2007

We no longer emulate single instructions in userspace.  Instead, we service
mmio or pio requests.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Handle cpuid in the kernel instead of punting to userspace · 06465c5a

Committed by Avi Kivity on Feb 28, 2007

KVM used to handle cpuid by letting userspace decide what values to
return to the guest.  We now handle cpuid completely in the kernel.  We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).

The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.
Signed-off-by: Avi Kivity <avi@qumranet.com>
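
The split described above (userspace fills a table once, the kernel answers each cpuid from it) can be sketched as a plain table lookup. Everything here is hypothetical: the struct, the function name, and the choice to return zeros for an unknown leaf are simplifications, not the behaviour of the actual patch.

```c
#include <stddef.h>
#include <stdint.h>

struct cpuid_entry_sketch {
    uint32_t function; /* value of eax on entry to cpuid */
    uint32_t eax, ebx, ecx, edx;
};

/* Answer a guest cpuid from a table prepared beforehand.
 * Returns 1 on an exact match; otherwise returns 0 and reports
 * all-zero registers (a simplification). */
static int emulate_cpuid_sketch(const struct cpuid_entry_sketch *table,
                                size_t n, uint32_t function,
                                struct cpuid_entry_sketch *out)
{
    out->function = function;
    out->eax = out->ebx = out->ecx = out->edx = 0;

    for (size_t i = 0; i < n; ++i) {
        if (table[i].function == function) {
            *out = table[i];
            return 1;
        }
    }
    return 0;
}
```

Management software that wants a least-common-denominator feature set would simply clear the relevant feature bits in the table before handing it over.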

KVM: Do not communicate to userspace through cpu registers during PIO · 46fc1477

Committed by Avi Kivity on Feb 22, 2007

Currently when passing a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions).  This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.

So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use a shared page for kernel/user communication when runing a vcpu · 9a2bb7f4

Committed by Avi Kivity on Feb 22, 2007

Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it.  This reduces
needless copying and makes the interface expandable by providing lots of
free space.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix bogus sign extension in mmu mapping audit · 1ea252af

Committed by Avi Kivity on Mar 08, 2007

When auditing a 32-bit guest on a 64-bit host, sign extension of the page
table directory pointer table index caused bogus addresses to be shown on
audit errors.

Fix by declaring the index unsigned.
Signed-off-by: Avi Kivity <avi@qumranet.com>
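
The effect is easy to reproduce outside the kernel. The sketch below is not the audit code: both helpers are hypothetical, and the signed variant models the bug by widening a 32-bit signed value to 64 bits. Each of the four table entries is assumed to cover 1 GiB, so entry `idx` starts at `idx << 30`.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy pattern: the 32-bit address is held in a signed type, so values
 * with bit 31 set are sign-extended when widened to 64 bits. The
 * conversion to int32_t is implementation-defined in C; common compilers
 * wrap it to a negative value. */
static uint64_t pdpte_base_signed(unsigned idx)
{
    assert(idx < 4);
    int32_t va32 = (int32_t)((uint32_t)idx << 30);
    return (uint64_t)(int64_t)va32;
}

/* Fixed pattern: keep the value unsigned so it is zero-extended. */
static uint64_t pdpte_base_unsigned(unsigned idx)
{
    assert(idx < 4);
    uint32_t va32 = (uint32_t)idx << 30;
    return (uint64_t)va32;
}
```

Indexes 0 and 1 give the same result either way; indexes 2 and 3 set bit 31, which is where the signed variant produces addresses with the upper 32 bits all ones.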

KVM: Export <linux/kvm.h> · ff426974

Committed by Avi Kivity on Mar 07, 2007

This allows users to actually build programs that use kvm without
the entire source tree.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use own minor number · bbe4432e

Committed by Avi Kivity on Mar 04, 2007

Use the minor number (232) allocated to kvm by lanana.
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Use the generic skip_emulated_instruction() in hypercall code · 510043da

Committed by Dor Laor on Feb 19, 2007

Instead of twiddling the rip registers directly, use the
skip_emulated_instruction() function to do that for us.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

KVM: Fix guest register corruption on paravirt hypercall · 9b22bf57

Committed by Dor Laor on Feb 19, 2007

The hypercall code mixes up the ->cache_regs() and ->decache_regs()
callbacks, resulting in guest register corruption.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

01 May, 2007 · 1 commit

libata: honour host controllers that want just one host · dc87c398

Committed by Linus Torvalds on Apr 30, 2007

The Marvell IDE interface on my machine would hit a BUG_ON() in
lib/iomem.c because it was calling ata_pci_init_one() specifying just a
single port on the host, but that would actually end up trying to
initialize two ports, the second one with bogus information.

This fixes "ata_pci_init_one()" so that it actually passes down the
n_ports variable that it got from the low-level driver to the host
allocation routine ("ata_host_alloc_pinfo()"), which results in the ATA
layer actually having the correct port number information.

And in order to make it all work, I also needed to fix a few places that
had incorrectly hard-coded the fact that a host always had exactly two
ports (both ata_pci_init_bmdma() and ata_request_legacy_irqs() would
just always iterate over both ports).
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
