- 03 May, 2007 18 commits
-
-
Submitted by Avi Kivity
It slows down Windows x64 horribly. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
Make the exit statistics per-vcpu instead of global. This gives a 3.5% boost when running one virtual machine per core on my two-socket dual-core (4 cores total) machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
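A minimal sketch of the data layout this change implies, with illustrative names (the real kvm structures differ): each vcpu owns its counters, so concurrent guests on different cores stop sharing a hot cache line.

```c
/* Illustrative only: per-vcpu exit counters instead of one global set. */
struct vcpu_stat {
	unsigned long io_exits;
	unsigned long irq_exits;
	unsigned long halt_exits;
	unsigned long mmio_exits;
};

struct vcpu {
	int cpu;
	struct vcpu_stat stat;	/* private to this vcpu: no cross-core cache-line bouncing */
};
```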
-
Submitted by Avi Kivity
Intel hosts only support syscall/sysret in long mode (and only if efer.sce is enabled), so only reload the related MSR_K6_STAR if the guest will actually be able to use it. This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup. Signed-off-by: Avi Kivity <avi@qumranet.com>
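A hedged sketch of the condition described above; the MSR number and EFER bit positions are architectural facts, while the structure and helper names are illustrative rather than the kvm implementation.

```c
#define MSR_K6_STAR 0xc0000081ULL	/* syscall/sysret target MSR */
#define EFER_SCE    (1ULL << 0)		/* syscall enable */
#define EFER_LMA    (1ULL << 10)	/* long mode active */

struct vcpu_msrs {
	unsigned long long efer;
	unsigned long long switch_msrs[8];	/* MSRs reloaded around guest entry/exit */
	int nr_switch_msrs;
};

/* On Intel hosts syscall/sysret only work in long mode with EFER.SCE set. */
static int guest_can_use_syscall(const struct vcpu_msrs *v)
{
	return (v->efer & EFER_SCE) && (v->efer & EFER_LMA);
}

static void setup_switch_msrs(struct vcpu_msrs *v)
{
	v->nr_switch_msrs = 0;
	if (guest_can_use_syscall(v))
		v->switch_msrs[v->nr_switch_msrs++] = MSR_K6_STAR;
	/* ...other MSRs elided... */
}
```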
-
Submitted by Avi Kivity
No meat in that file. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
Some msrs are only used by x86_64 instructions, and are therefore not needed when the guest is in legacy mode. By not bothering to switch them, we reduce vmexit latency by 2400 cycles (from about 8800) when running a 32-bit guest on a 64-bit host. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
The automatically switched msrs are never changed on the host (with the exception of MSR_KERNEL_GS_BASE), and thus there is no need to save them on every vm entry. This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%) on x86_64. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Eric Sesterhenn
The expression sp - 6 < sp, where sp is a u16, is undefined in C since 'sp - 6' is promoted to int, and signed overflow is undefined in C. gcc 4.2 actually warns about it. Replace with a simpler test. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Avi Kivity <avi@qumranet.com>
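A standalone illustration of the promotion pitfall (not the kvm code itself; the values are illustrative): because the subtraction happens in int, the original comparison never observes 16-bit wraparound, while a direct test of the operand does.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint16_t sp = 2;	/* subtracting 6 would wrap a 16-bit stack pointer */

	/* 'sp - 6' is computed in int, so this check cannot detect the wrap:
	 * it prints 1 for every value of sp. */
	printf("promoted check: %d\n", (sp - 6) < sp);

	/* A simpler, well-defined test for "subtracting 6 would wrap": */
	printf("explicit check: %d\n", sp < 6);
	return 0;
}
```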
-
Submitted by Avi Kivity
Mapping a guest page to a host page is a common operation. Currently, one first has to find the memory slot where the page belongs (gfn_to_memslot), then locate the page itself (gfn_to_page()). This is clumsy, and also won't work well with memory aliases. So simplify gfn_to_page() not to require memory slot translation first, and instead do it internally. Signed-off-by: Avi Kivity <avi@qumranet.com>
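A compact sketch of the simplified calling convention, using placeholder types rather than the real kvm structures: the slot lookup becomes an internal detail of gfn_to_page().

```c
#include <stddef.h>

struct page;	/* opaque host page handle for this sketch */

struct mem_slot {
	unsigned long base_gfn;
	unsigned long npages;
	struct page **pages;	/* host pages backing the slot */
};

struct vm {
	int nslots;
	struct mem_slot slots[8];
};

static struct mem_slot *gfn_to_memslot(struct vm *vm, unsigned long gfn)
{
	for (int i = 0; i < vm->nslots; i++) {
		struct mem_slot *s = &vm->slots[i];

		if (gfn >= s->base_gfn && gfn < s->base_gfn + s->npages)
			return s;
	}
	return NULL;
}

/* One call instead of two: callers no longer translate the slot themselves. */
static struct page *gfn_to_page(struct vm *vm, unsigned long gfn)
{
	struct mem_slot *slot = gfn_to_memslot(vm, gfn);

	return slot ? slot->pages[gfn - slot->base_gfn] : NULL;
}
```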
-
Submitted by Avi Kivity
No longer interesting. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
As usual, we need to mangle segment registers when emulating real mode, as vm86 has specific constraints. We special-case the reset segment base, and set the "access rights" (or descriptor flags) to vm86-compatible values. This fixes reboot on vmx. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers. As we now cache the protected mode values on entry to real mode, this isn't an issue anymore, and it interferes with reboot (which usually _is_ a modeswitch). Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000, which aren't compatible with vm86 mode, which is used for real mode virtualization. When we create a vcpu, we set cs.base to 0xf0000, but if we get there by way of a reset, the values are inconsistent and vmx refuses to enter guest mode. Work around this by detecting the state and munging it appropriately. Signed-off-by: Avi Kivity <avi@qumranet.com>
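A hedged sketch of the detect-and-munge workaround, with illustrative names: the architectural reset pair is coerced into the vm86-consistent form base == selector << 4.

```c
struct segment {
	unsigned short selector;
	unsigned long base;
};

/* Illustrative fixup: vm86 requires base == selector << 4, which the
 * architectural reset values (0xf000 / 0xffff0000) do not satisfy. */
static void fix_reset_cs_for_vm86(struct segment *cs)
{
	if (cs->selector == 0xf000 && cs->base == 0xffff0000)
		cs->base = (unsigned long)cs->selector << 4;	/* 0xf0000 */
}
```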
-
Submitted by Avi Kivity
The current string pio interface communicates using guest virtual addresses, relying on userspace to translate addresses and to check permissions. This interface cannot fully support guest smp, as the check needs to take into account two pages at once in case an unaligned string transfer straddles a page boundary. Change the interface not to communicate guest addresses at all; instead use a buffer page (mmaped by userspace) and do transfers there. The kernel manages the virtual to physical translation and can perform the checks atomically by taking the appropriate locks. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
This is redundant, as we also return -EINTR from the ioctl, but it allows us to examine the exit_reason field on resume without seeing old data. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
Currently, userspace is told about the nature of the last exit from the guest using two fields, exit_type and exit_reason, where exit_type has just two enumerations (and no need for more). So fold exit_type into exit_reason, reducing the complexity of determining what really happened. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: Avi Kivity <avi@qumranet.com>
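A sketch of the scheme in miniature, with illustrative structure and function names (not the kvm ABI): userspace fills a value table once, and the kernel answers CPUID exits from it.

```c
#include <stddef.h>

struct cpuid_entry {
	unsigned int function;
	unsigned int eax, ebx, ecx, edx;
};

struct vcpu_cpuid {
	int nent;				/* set up by userspace beforehand */
	struct cpuid_entry entries[64];
};

/* On a CPUID exit, answer from the table instead of going back to userspace. */
static const struct cpuid_entry *find_cpuid_entry(const struct vcpu_cpuid *c,
						  unsigned int function)
{
	for (int i = 0; i < c->nent; i++)
		if (c->entries[i].function == function)
			return &c->entries[i];
	return NULL;	/* unknown leaf: caller decides on a fallback */
}
```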
-
Submitted by Avi Kivity
Currently, when passing a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: Avi Kivity <avi@qumranet.com>
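A sketch of what an "abstract description of the PIO" could look like; the field names mirror the idea, not the actual kvm_run layout.

```c
/* Illustrative only: everything userspace needs, with no register names in sight. */
struct pio_request {
	unsigned char  in;		/* 1 = 'in' instruction, 0 = 'out' */
	unsigned short port;		/* I/O port number */
	unsigned char  size;		/* bytes per element: 1, 2 or 4 */
	unsigned int   count;		/* number of elements (string ops) */
	unsigned int   data_offset;	/* where the data lives in the shared buffer page */
};
```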
-
Submitted by Dor Laor
Instead of twiddling the rip register directly, use the skip_emulated_instruction() function to do that for us. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
- 27 March, 2007 2 commits
-
-
Submitted by Ingo Molnar
A failed VM entry on VMX might still change %fs or %gs, so make sure that KVM always reloads the segment selectors. This is crucial on both x86 and x86_64: x86 keeps __KERNEL_PDA in %fs, on which things like 'current' depend, while x86_64 has 0 there and needs MSR_GS_BASE to work. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Submitted by Avi Kivity
Intel virtualization extensions do not support virtualizing real mode. So kvm uses virtualized vm86 mode to run real mode code. Unfortunately, this virtualized vm86 mode does not support the so-called "big real" mode, where the segment selector and base do not agree with each other according to the real mode rules (base == selector << 4). To work around this, kvm checks whether a selector/base pair violates the virtualized vm86 rules, and if so, forces it into conformance. On a transition back to protected mode, if we see that the guest did not touch a forced segment, we restore it back to the original protected mode value. This pile of hacks breaks down if the gdt has changed in real mode, as it can cause a segment selector to point to a system descriptor instead of a normal data segment. In fact, this happens with the Windows bootloader and the qemu acpi bios, where a protected mode memcpy routine issues an innocent 'pop %es' and traps on an attempt to load a system descriptor. "Fix" by checking if the to-be-restored selector points at a system segment, and if so, coercing it into a normal data segment. The long term solution, of course, is to abandon vm86 mode and use emulation for big real mode. Signed-off-by: Avi Kivity <avi@qumranet.com>
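A hedged sketch of the conformance rules this relies on, with placeholder fields (the `s` bit stands in for the descriptor's code/data-vs-system type bit); the real vmx code is considerably more involved.

```c
struct rm_segment {
	unsigned short selector;
	unsigned long  base;
	unsigned char  s;	/* descriptor S bit: 1 = code/data, 0 = system */
};

/* "Big real mode": the base no longer equals selector << 4. */
static int violates_vm86_rules(const struct rm_segment *seg)
{
	return seg->base != ((unsigned long)seg->selector << 4);
}

static void force_vm86_conformance(struct rm_segment *seg)
{
	if (violates_vm86_rules(seg))
		seg->selector = (unsigned short)(seg->base >> 4);
}

/* On the way back to protected mode, never restore a system descriptor. */
static void sanitize_restored_segment(struct rm_segment *seg)
{
	if (!seg->s)
		seg->s = 1;	/* coerce into an ordinary data segment */
}
```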
-
- 18 March, 2007 1 commit
-
-
Submitted by Avi Kivity
The vmx code currently treats the guest's sysenter support msrs as 32-bit values, which breaks 32-bit compat mode userspace on 64-bit guests. Fix by using the native word width of the machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
-
- 04 March, 2007 7 commits
-
-
Submitted by Avi Kivity
Allocate a distinct inode for every vcpu in a VM. This has the following benefits:
- the filp cachelines are no longer bounced when f_count is incremented on every ioctl()
- the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call
Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely overwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: Avi Kivity <avi@qumranet.com>
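An illustrative userspace view of the fd-per-vcpu model (a sketch of the calling pattern, assuming a vm_fd obtained from KVM_CREATE_VM on /dev/kvm, not a complete client): the vcpu identity comes from the file descriptor, so KVM_GET_REGS carries no vcpu number.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int create_and_read_vcpu(int vm_fd)
{
	struct kvm_regs regs;
	int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);	/* returns a per-vcpu fd */

	if (vcpu_fd < 0)
		return -1;
	/* No vcpu number to copy from userspace: the fd itself names the vcpu. */
	return ioctl(vcpu_fd, KVM_GET_REGS, &regs);
}
```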
-
Submitted by Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Ingo Molnar
Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Ingo Molnar
This adds a special MSR-based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Ahmed S. Darwish
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Joerg Roedel
The whole thing is rotten, but this allows vmx to boot with the guest reboot fix. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
-
Submitted by Avi Kivity
Signed-off-by: Avi Kivity <avi@qumranet.com>
-
- 13 February, 2007 6 commits
-
-
Submitted by Jeremy Fitzhardinge
Convert the PDA code to use %fs rather than %gs as the segment for per-processor data. This is because some processors show a small but measurable performance gain for reloading a NULL segment selector (as %fs generally is in user-space) versus a non-NULL one (as %gs generally is). On modern processors the difference is very small, perhaps undetectable. Some old AMD "K6 3D+" processors are noticeably slower when %fs is used rather than %gs; I have no idea why this might be, but I think they're sufficiently rare that it doesn't matter much. This patch also fixes the math emulator, which had not been adjusted to match the changed struct pt_regs. [frederik.deweerdt@gmail.com: fixit with gdb] [mingo@elte.hu: Fix KVM too] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Ian Campbell <Ian.Campbell@XenSource.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Submitted by Avi Kivity
On hotplug, we execute the hardware extension enable sequence. On unplug, we decache any vcpus that last ran on the exiting cpu, and execute the hardware extension disable sequence. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Avi Kivity
Like the inline code it replaces, this function decaches the vmcs from the cpu it last executed on. In addition:
- vcpu_clear() works if the last cpu is also the cpu we're running on
- it is faster on larger smps by virtue of using smp_call_function_single()
Includes fix from Ingo Molnar. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Avi Kivity
Or 32-bit userspace will get confused. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Avi Kivity
Just like svm. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Submitted by Ingo Molnar
Forms like "0(%rsp)" generate an instruction with an unnecessary one-byte displacement under certain circumstances. Replace with the equivalent "(%rsp)". Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 February, 2007 1 commit
-
-
Submitted by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 02 February, 2007 1 commit
-
-
Submitted by Avi Kivity
Intel hosts without long mode, and with nx support disabled in the bios, have an efer that is readable but not writable. This causes a lockup on switch to guest mode (even though it should exit with reason 34 according to the documentation). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 23 January, 2007 2 commits
-
-
Submitted by Avi Kivity
The kvm mmio read path looks like:
1. guest read faults
2. kvm emulates read, calls emulator_read_emulated()
3. fails as a read requires userspace help
4. exit to userspace
5. userspace emulates read, kvm sets vcpu->mmio_read_completed
6. re-enter guest, fault again
7. kvm emulates read, calls emulator_read_emulated()
8. succeeds as vcpu->mmio_read_completed is set
9. instruction completes and guest is resumed
A problem surfaces if the userspace exit (step 5) also requests an interrupt injection. In that case, the guest does not re-execute the original instruction, but the interrupt handler. The next time an mmio read is executed (likely for a different address), step 3 will find vcpu->mmio_read_completed set and return the value read for the original instruction. The problem manifested itself in a few annoying ways:
- little squares appear randomly on the console when switching virtual terminals
- ne2000 fails under nfs read load
- rtl8139 complains about "pci errors" even though the device model is incapable of issuing them
Fix by skipping interrupt injection if an mmio read is pending. A better fix is to avoid re-entry into the guest and instead re-emulate immediately; however, that's a bit more complex. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
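A minimal sketch of the decision the fix makes, with illustrative field names (not the real vcpu structure): if an mmio read is still pending completion, defer the injection so the guest first re-executes the faulting instruction and consumes its mmio result.

```c
struct vcpu_irq_state {
	int mmio_read_completed;	/* userspace has supplied the mmio value */
	int interrupt_requested;	/* userspace asked for an injection */
};

static int can_inject_interrupt(const struct vcpu_irq_state *v)
{
	if (v->mmio_read_completed)
		return 0;		/* finish the interrupted mmio read first */
	return v->interrupt_requested;
}
```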
-
Submitted by Herbert Xu
Both "=r" and "=g" break my build on i386:
$ make
CC [M] drivers/kvm/vmx.o
{standard input}: Assembler messages:
{standard input}:3318: Error: bad register name `%sil'
make[1]: *** [drivers/kvm/vmx.o] Error 1
make: *** [_module_drivers/kvm] Error 2
The reason is that setbe requires an 8-bit register but "=r" does not constrain the target register to be one that has an 8-bit version on i386. According to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10153 the correct constraint is "=q". Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 12 January, 2007 1 commit
-
-
Submitted by Ingo Molnar
This adds the profile=kvm boot option, which enables KVM to profile VM exits. Use "readprofile -m ./System.map | sort -n" to see the resulting output:
[...]
18246 serial_out 148.3415
18945 native_flush_tlb 378.9000
23618 serial_in 212.7748
29279 __spin_unlock_irq 622.9574
43447 native_apic_write 2068.9048
52702 enable_8259A_irq 742.2817
54250 vgacon_scroll 89.3740
67394 ide_inb 6126.7273
79514 copy_page_range 98.1654
84868 do_wp_page 86.6000
140266 pit_read 783.6089
151436 ide_outb 25239.3333
152668 native_io_delay 21809.7143
174783 mask_and_ack_8259A 783.7803
362404 native_set_pte_at 36240.4000
1688747 total 0.5009
Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 06 January, 2007 1 commit
-
-
Submitted by Dor Laor
No need to test for rflags.if, as both the VT and SVM specs assure us that on an exit caused by the interrupt window opening, 'if' is set. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-