Commits · 383d0b050106abecb82f43101cac94fa423af5cd · openeuler / raspberrypi-kernel

28 Nov, 2014 10 commits

KVM: s390: handle pending local interrupts via bitmap · 383d0b05

Committed by Jens Freimann on Jul 29, 2014

This patch adapts handling of local interrupts to be more compliant with
the z/Architecture Principles of Operation and introduces a data structure
which allows more efficient handling of interrupts.

* Get rid of li->active flag, use bitmap instead
* Keep interrupts in a bitmap instead of a list
* Deliver interrupts in the order of their priority as defined in the
  PoP
* Use a second bitmap for sigp emergency requests, as a CPU can have
  one request pending from every other CPU in the system.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

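The bitmap approach described in this commit can be illustrated with a minimal sketch. It is not the kernel code: the enum names, their order, and the helper functions are hypothetical, and the real priority order is the one defined in the PoP. The sketch only shows the idea of one pending bit per interrupt type, delivery of the lowest set bit first, and a second bitmap that records which CPUs sent an emergency request.

```c
#include <assert.h>

/* Hypothetical interrupt types, highest priority first (bit 0 wins). */
enum irq_type {
    IRQ_MACHINE_CHECK,
    IRQ_PROGRAM,
    IRQ_SIGP_EMERGENCY,
    IRQ_EXTERNAL_CALL,
    IRQ_CLOCK_COMPARATOR,
    IRQ_CPU_TIMER,
    IRQ_TYPE_COUNT
};

#define MAX_CPUS 64 /* assumption: one 64-bit word covers all source CPUs */

struct local_irqs {
    unsigned long long pending;   /* one bit per irq_type */
    unsigned long long emerg_src; /* one bit per CPU that sent an emergency */
};

static void queue_irq(struct local_irqs *li, enum irq_type t)
{
    li->pending |= 1ULL << t;
}

static void queue_emergency(struct local_irqs *li, unsigned int src_cpu)
{
    assert(src_cpu < MAX_CPUS);
    li->emerg_src |= 1ULL << src_cpu;
    li->pending |= 1ULL << IRQ_SIGP_EMERGENCY;
}

/* Return the type delivered, or -1 if nothing is pending. */
static int deliver_next(struct local_irqs *li)
{
    int t;

    for (t = 0; t < IRQ_TYPE_COUNT; t++)
        if (li->pending & (1ULL << t))
            break;
    if (t == IRQ_TYPE_COUNT)
        return -1;

    if (t == IRQ_SIGP_EMERGENCY) {
        /* Deliver one source per call: clear the lowest set source bit. */
        li->emerg_src &= li->emerg_src - 1;
        if (li->emerg_src)
            return t; /* more sources remain, keep the type pending */
    }
    li->pending &= ~(1ULL << t);
    return t;
}
```

Queuing an interrupt twice is idempotent with this layout, which is why the emergency signal needs its own per-source bitmap to keep one request per sending CPU.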

KVM: s390: add bitmap for handling cpu-local interrupts · c0e6159d

Committed by Jens Freimann on Jul 29, 2013

Adds a bitmap to the vcpu structure which is used to keep track
of local pending interrupts. Also add enum with all interrupt
types sorted in order of priority (highest to lowest).
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: refactor interrupt delivery code · 0fb97abe

Committed by Jens Freimann on Jul 29, 2014

Move delivery code for cpu-local interrupts from the huge do_deliver_interrupt()
to smaller functions, each of which handles one type of interrupt.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: add defines for virtio and pfault interrupt code · 60f90a14

Committed by Jens Freimann on Nov 10, 2014

Get rid of open-coded values for virtio and pfault completion interrupts.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: external param not valid for cpu timer and ckc · af43eb2f

Committed by David Hildenbrand on Nov 07, 2014

The 32-bit external interrupt parameter is only valid for timing-alert and
service-signal interrupts.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: refactor interrupt injection code · 0146a7b0

Committed by Jens Freimann on Jul 28, 2014

In preparation for the rework of the local interrupt injection code,
factor out injection routines from kvm_s390_inject_vcpu().
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: S390: Create helper function get_guest_storage_key · 9fcf93b5

Committed by Jason J. Herne on Sep 23, 2014

Define get_guest_storage_key which can be used to get the value of a guest
storage key. This complements the functionality provided by the helper function
set_guest_storage_key. Both functions are needed for live migration of s390
guests that use storage keys.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: trigger the right CPU exit for floating interrupts · da00fcbd

Committed by Christian Borntraeger on Nov 21, 2014

When injecting a floating interrupt and no CPU is idle we
kick one CPU to do an external exit. In case of I/O we
should trigger an I/O exit instead. This does not matter
for Linux guests as external and I/O interrupts are
enabled/disabled at the same time, but play safe anyway.

The same holds true for machine checks. Since there is no
special exit, just reuse the generic stop exit. The injection
code inside the VCPU loop will recheck anyway and rearm the
proper exits (e.g. control registers) if necessary.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>

KVM: s390: Fix rewinding of the PSW pointing to an EXECUTE instruction · 04b41acd

Committed by Thomas Huth on Nov 12, 2014

A couple of our interception handlers rewind the PSW to the beginning
of the instruction to run the intercepted instruction again during the
next SIE entry. This normally works fine, but there is also the
possibility that the instruction did not get run directly but via an
EXECUTE instruction.
In this case, the PSW does not point to the instruction that caused the
interception, but to the EXECUTE instruction! So we've got to rewind the
PSW to the beginning of the EXECUTE instruction instead.
This is now accomplished with a new helper function kvm_s390_rewind_psw().
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

KVM: s390: Small fixes for the PFMF handler · a02689fe

Committed by Thomas Huth on Nov 10, 2014

This patch includes two small fixes for the PFMF handler: First, the
start address for PFMF has to be masked according to the current
addressing mode, which is now done with kvm_s390_logical_to_effective().
Second, the protection exceptions have a lower priority than the
specification exceptions, so the check for low-address protection
has to be moved after the last spot where we inject a specification
exception.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

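Both PFMF fixes can be pictured with a small sketch. This is an illustration under assumptions, not the handler itself: the enum names and both functions are hypothetical stand-ins. The first function masks an address according to the 24-bit, 31-bit, or 64-bit addressing mode; the second shows the check ordering, where a specification exception is reported before a protection exception.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three z/Architecture addressing modes. */
enum addr_mode { AMODE_24, AMODE_31, AMODE_64 };

/* Hypothetical program-exception kinds, only for this sketch. */
enum pgm_kind { PGM_NONE, PGM_SPECIFICATION, PGM_PROTECTION };

/* Mask a logical address down to the bits valid in the current mode. */
static uint64_t logical_to_effective(uint64_t addr, enum addr_mode mode)
{
    switch (mode) {
    case AMODE_24:
        return addr & 0x00ffffffULL;
    case AMODE_31:
        return addr & 0x7fffffffULL;
    case AMODE_64:
        return addr;
    }
    assert(!"unknown addressing mode");
    return addr;
}

/* Specification exceptions outrank protection exceptions, so test them first. */
static enum pgm_kind check_order(int spec_violation, int low_addr_protected)
{
    if (spec_violation)
        return PGM_SPECIFICATION;
    if (low_addr_protected)
        return PGM_PROTECTION;
    return PGM_NONE;
}
```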

24 Nov, 2014 2 commits

kvm: x86: mask out XSAVES · b65d6e17

Committed by Paolo Bonzini on Nov 21, 2014

This feature is not supported inside KVM guests yet, because we do not emulate
MSR_IA32_XSS.  Mask it out.

Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: move assigned-dev.c and iommu.c to arch/x86/ · c274e03a

Committed by Radim Krčmář on Nov 21, 2014

Now that ia64 is gone, we can hide deprecated device assignment in x86.

Notable changes:
 - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl()

The easy parts were removed from generic kvm code, remaining
 - kvm_iommu_(un)map_pages() would require new code to be moved
 - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

22 Nov, 2014 3 commits

kvm: remove IA64 ioctls · 6b397158

Committed by Radim Krčmář on Nov 20, 2014

KVM ia64 is no longer present so new applications shouldn't use them.
The main problem is that they most likely didn't work even before,
because of a conflict in the #defines:

  #define KVM_SET_GUEST_DEBUG       _IOW(KVMIO,  0x9b, struct kvm_guest_debug)
  #define KVM_IA64_VCPU_SET_STACK   _IOW(KVMIO,  0x9b, void *)

The argument to KVM_SET_GUEST_DEBUG is:

  struct kvm_guest_debug {
  	__u32 control;
  	__u32 pad;
  	struct kvm_guest_debug_arch arch;
  };

  struct kvm_guest_debug_arch {
  };

meaning that sizeof(struct kvm_guest_debug) == sizeof(void *) == 8
and KVM_SET_GUEST_DEBUG == KVM_IA64_VCPU_SET_STACK.

KVM_SET_GUEST_DEBUG is handled in virt/kvm/kvm_main.c before even calling
kvm_arch_vcpu_ioctl (which would have handled KVM_IA64_VCPU_SET_STACK),
so KVM_IA64_VCPU_SET_STACK would just return -EINVAL.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

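The collision described above follows from how a write-direction ioctl number is packed. The sketch below re-implements the generic Linux encoding (direction in bits 30-31, size in bits 16-29, type in bits 8-15, number in bits 0-7) in a hypothetical helper named iow() instead of using the kernel's _IOW macro, so it builds anywhere. The struct is a stand-in for struct kvm_guest_debug with the empty arch member left out, on the assumption that it contributes zero bytes.

```c
#include <assert.h>
#include <stdint.h>

#define KVMIO 0xAE /* ioctl type byte used by KVM */

/* Pack a write-direction ioctl number the way the generic layout does. */
static uint32_t iow(uint32_t type, uint32_t nr, uint32_t size)
{
    const uint32_t dir_write = 1;

    assert(size < (1u << 14)); /* the size field is 14 bits wide */
    return (dir_write << 30) | (size << 16) | (type << 8) | nr;
}

/* Stand-in for struct kvm_guest_debug when the arch part is empty. */
struct guest_debug_like {
    uint32_t control;
    uint32_t pad;
};
```

With an 8-byte pointer, iow(KVMIO, 0x9b, sizeof(struct guest_debug_like)) and iow(KVMIO, 0x9b, 8) produce the same number, so the two requests cannot be told apart by the dispatcher.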

kvm: remove CONFIG_X86 #ifdefs from files formerly shared with ia64 · 3bf58e9a

Committed by Radim Krcmar on Nov 21, 2014

Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/ · 6ef768fa

Committed by Paolo Bonzini on Nov 20, 2014

ia64 does not need them anymore.  Ack notifiers become x86-specific
too.
Suggested-by: Gleb Natapov <gleb@kernel.org>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Nov, 2014 11 commits

kvm: Documentation: remove ia64 · c32a4272

Committed by Tiejun Chen on Nov 20, 2014

kvm/ia64 is gone, clean up Documentation too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: ia64: remove · 003f7de6

Committed by Paolo Bonzini on Nov 19, 2014

KVM for ia64 has been marked as broken not just once, but twice even,
and the last patch from the maintainer is now roughly 5 years old.
Time for it to rest in peace.
Acked-by: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Remove FIXMEs in emulate.c · 86619e7b

Committed by Nicholas Krause on Nov 19, 2014

Remove FIXME comments about needing fault addresses to be returned.  These
are propagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: emulator: remove duplicated limit check · 997b0412

Committed by Paolo Bonzini on Nov 19, 2014

The check on the higher limit of the segment, and the check on the
maximum accessible size, are the same for both expand-up and
expand-down segments.  Only the computation of "lim" varies.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: emulator: remove code duplication in register_address{,_increment} · 01485a22

Committed by Paolo Bonzini on Nov 19, 2014

register_address has been a duplicate of address_mask ever since the
ancestor of __linearize was born in 90de84f5 (KVM: x86 emulator:
preserve an operand's segment identity, 2010-11-17).

However, we can put it to a better use by including the call to reg_read
in register_address.  Similarly, the call to reg_rmw can be moved to
register_address_increment.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Move __linearize masking of la into switch · 31ff6488

Committed by Nadav Amit on Nov 19, 2014

In __linearize, the check of whether the linear address needs masking occurs
immediately after a switch that evaluates the same condition.  Merge them.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Non-canonical access using SS should cause #SS · abc7d8a4

Committed by Nadav Amit on Nov 19, 2014

When SS is used with a non-canonical address, real hardware generates an #SS
exception.  The KVM emulator raises a #GP instead. Fix it to behave like a real
x86 CPU.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

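A minimal sketch of the rule this commit describes, assuming a configurable virtual-address width (48 bits on the CPUs of that era). The function names are hypothetical and this is not the emulator code; it only shows that a non-canonical access faults with vector 12 (#SS) when the stack segment is used and with vector 13 (#GP) otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SS_VECTOR 12 /* stack-segment fault */
#define GP_VECTOR 13 /* general-protection fault */

/* Canonical means bits 63..(va_bits - 1) are all zero or all one. */
static bool is_canonical(uint64_t la, unsigned int va_bits)
{
    assert(va_bits > 1 && va_bits < 64);
    uint64_t top = la >> (va_bits - 1);
    uint64_t all_ones = (1ULL << (64 - va_bits + 1)) - 1;

    return top == 0 || top == all_ones;
}

/* Return the exception vector to raise, or -1 if the address is fine. */
static int noncanonical_fault(uint64_t la, bool uses_ss, unsigned int va_bits)
{
    if (is_canonical(la, va_bits))
        return -1;
    return uses_ss ? SS_VECTOR : GP_VECTOR;
}
```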

KVM: x86: Perform limit checks when assigning EIP · d50eaa18

Committed by Nadav Amit on Nov 19, 2014

If a branch (e.g., jmp, ret) causes a limit violation because the target IP >
limit, the #GP exception occurs before the branch. In other words, the RIP
pushed on the stack should be that of the branch and not that of the target.

To do so, we can call __linearize with the new EIP, which also saves us the code
which performs the canonical address checks. In the case of assigning an EIP >=
2^32 (when switching cs.l), we are also safe, as __linearize will check that the
new EIP does not exceed the limit and would trigger #GP(0) otherwise.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Emulator performs privilege checks on __linearize · a7315d2f

Committed by Nadav Amit on Nov 19, 2014

When a segment is accessed, real hardware does not perform any privilege level
checks. In contrast, the KVM emulator does. This causes some discrepancies from
real hardware. For instance, reading from a readable code segment may fail due to
incorrect segment checks. In addition, it introduces unnecessary overhead.

To reference Intel SDM 5.5 ("Privilege Levels"): "Privilege levels are checked
when the segment selector of a segment descriptor is loaded into a segment
register." The SDM never mentions privilege level checks during memory access,
except for loading far pointers in section 5.10 ("Pointer Validation"). Those
are actually segment selector loads and are emulated in a similar way (i.e.,
regardless of the __linearize checks).

This behavior was also checked using sysexit. A data-segment whose DPL=0 was
loaded, and after sysexit (CPL=3) it is still accessible.

Therefore, all the privilege level checks in __linearize are removed.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Stack size is overridden by __linearize · 1c1c35ae

Committed by Nadav Amit on Nov 19, 2014

When performing segmented-read/write in the emulator for stack operations, it
ignores the stack size, and uses the ad_bytes as indication for the pointer
size. As a result, a wrong address may be accessed.

To fix this behavior, we can remove the masking of address in __linearize and
perform it beforehand.  It is already done for the operands (so currently it is
inefficiently done twice). It is missing in two cases:
1. When using rip_relative
2. On fetch_bit_operand that changes the address.

This patch masks the address on these two occasions, and removes the masking
from __linearize.

Note that it does not mask EIP during fetch. In protected/legacy mode code
fetch when RIP >= 2^32 should result in #GP and not wrap-around. Since we make
limit checks within __linearize, this is the expected behavior.

Partial revert of commit 518547b3 (KVM: x86: Emulator does not
calculate address correctly, 2014-09-30).
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Revert NoBigReal patch in the emulator · 7d882ffa

Committed by Nadav Amit on Nov 19, 2014

Commit 10e38fc7cab6 ("KVM: x86: Emulator flag for instruction that only support
16-bit addresses in real mode") introduced NoBigReal for instructions such as
MONITOR. Apparently, the Intel SDM description that led to this patch is
misleading. Since no instruction is using NoBigReal, it is safe to remove it,
we fully understand what the SDM means.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

18 Nov, 2014 2 commits

kvm: x86: vmx: remove MMIO_MAX_GEN · 842bb26a

Committed by Tiejun Chen on Nov 18, 2014

MMIO_MAX_GEN is the same as MMIO_GEN_MASK.  Use only one.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: vmx: cleanup handle_ept_violation · 81ed33e4

Committed by Tiejun Chen on Nov 18, 2014

Instead, just use PFERR_{FETCH, PRESENT, WRITE}_MASK
inside handle_ept_violation() for slightly better code.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

17 Nov, 2014 5 commits

KVM: x86: Fix lost interrupt on irr_pending race · f210f757

Committed by Nadav Amit on Nov 16, 2014

apic_find_highest_irr assumes irr_pending is set if any vector in APIC_IRR is
set.  If this assumption is broken and apicv is disabled, the injection of
interrupts may be deferred until another interrupt is delivered to the guest.
Ultimately, if no other interrupt should be injected to that vCPU, the pending
interrupt may be lost.

commit 56cc2406 ("KVM: nVMX: fix "acknowledge interrupt on exit" when APICv
is in use") changed the behavior of apic_clear_irr so irr_pending is cleared
after setting APIC_IRR vector. After this commit, if apic_set_irr and
apic_clear_irr run simultaneously, a race may occur, resulting in APIC_IRR
vector set, and irr_pending cleared. In the following example, assume a single
vector is set in IRR prior to calling apic_clear_irr:

apic_set_irr				apic_clear_irr
------------				--------------
apic->irr_pending = true;
					apic_clear_vector(...);
					vec = apic_search_irr(apic);
					// => vec == -1
apic_set_vector(...);
					apic->irr_pending = (vec != -1);
					// => apic->irr_pending == false

Nonetheless, it appears the race might even occur prior to this commit:

apic_set_irr				apic_clear_irr
------------				--------------
apic->irr_pending = true;
					apic->irr_pending = false;
					apic_clear_vector(...);
					if (apic_search_irr(apic) != -1)
						apic->irr_pending = true;
					// => apic->irr_pending == false
apic_set_vector(...);

Fix this issue by:
1. Restoring the previous behavior of apic_clear_irr: clear irr_pending, call
   apic_clear_vector, and then if APIC_IRR is non-zero, set irr_pending.
2. On apic_set_irr: first call apic_set_vector, then set irr_pending.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

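The two-step ordering in the fix can be sketched with a single-threaded model. Everything here is a simplified assumption: the struct and function names are hypothetical, plain stores replace the atomic bit operations a real implementation would need, and the sketch only documents the order of operations, not the concurrency itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256

/* Simplified model of the IRR bitmap plus its "something is pending" hint. */
struct apic_model {
    uint32_t irr[NR_VECTORS / 32];
    bool irr_pending;
};

/* Return the highest vector set in the IRR, or -1 if none is set. */
static int search_irr(const struct apic_model *a)
{
    for (int v = NR_VECTORS - 1; v >= 0; v--)
        if (a->irr[v / 32] & (1u << (v % 32)))
            return v;
    return -1;
}

/* Step 2 of the fix: set the vector first, then the hint. */
static void set_irr(struct apic_model *a, int vec)
{
    assert(vec >= 0 && vec < NR_VECTORS);
    a->irr[vec / 32] |= 1u << (vec % 32);
    a->irr_pending = true;
}

/* Step 1 of the fix: clear the hint, clear the vector, then re-check. */
static void clear_irr(struct apic_model *a, int vec)
{
    assert(vec >= 0 && vec < NR_VECTORS);
    a->irr_pending = false;
    a->irr[vec / 32] &= ~(1u << (vec % 32));
    if (search_irr(a) != -1)
        a->irr_pending = true;
}
```

With this order, a setter that runs in the middle of clear_irr either has its vector seen by the re-check or writes irr_pending = true after the clearer wrote false, so the hint is not left false while a vector is set.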

KVM: compute correct map even if all APICs are software disabled · a3e339e1

Committed by Paolo Bonzini on Nov 06, 2014

Logical destination mode can be used to send NMI IPIs even when all
APICs are software disabled, so if all APICs are software disabled we
should still look at the DFRs.

So the DFRs should all be the same, even if some or all APICs are
software disabled.  However, the SDM does not say this, so tweak
the logic as follows:

- if one APIC is enabled and has LDR != 0, use that one to build the map.
This picks the right DFR in case an OS is only setting it for the
software-enabled APICs, or in case an OS is using logical addressing
on some APICs while leaving the rest in reset state (using LDR was
suggested by Radim).

- if all APICs are disabled, pick a random one to build the map.
We use the last one with LDR != 0 for simplicity.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Software disabled APIC should still deliver NMIs · 173beedc

Committed by Nadav Amit on Nov 02, 2014

Currently, the APIC logical map does not consider VCPUs whose local-apic is
software-disabled.  However, NMIs, INIT, etc. should still be delivered to such
VCPUs. Therefore, the APIC mode should first be determined, and then the map,
considering all VCPUs, should be constructed.

To address this issue, first find the APIC mode, and only then construct the
logical map.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: simplify update_memslots invocation · 5cc15027

Committed by Paolo Bonzini on Nov 14, 2014

The update_memslots invocation is only needed in one case.  Make
the code clearer by moving it to __kvm_set_memory_region, and
removing the wrapper around insert_memslot.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: commonize allocation of the new memory slots · f2a81036

Committed by Paolo Bonzini on Nov 14, 2014

The two kmemdup invocations can be unified.  I find that the new
placement of the comment makes it easier to see what happens.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

14 Nov, 2014 3 commits

kvm: memslots: track id_to_index changes during the insertion sort · 8593176c

Committed by Paolo Bonzini on Nov 14, 2014

This completes the optimization from the previous patch, by
removing the KVM_MEM_SLOTS_NUM-iteration loop from insert_memslot.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: memslots: replace heap sort with an insertion sort pass · 063584d4

Committed by Igor Mammedov on Nov 13, 2014

memslots is a sorted array.  When a slot is changed, heapsort (lib/sort.c)
would take O(n log n) time to update it; an optimized insertion sort will
only cost O(n) on an array with just one item out of order.

Replace sort() with a custom sort that takes advantage of memslots usage
pattern and the known position of the changed slot.

Performance change of 128 memslots insertions with gradually increasing
size (the worst case):

      heap sort   custom sort
max:  249747      2500 cycles

with the custom sort algorithm taking ~98% less than the original
update time.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

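The single-pass idea can be sketched as follows. This is not the kernel's update_memslots: the struct, the function name resort_one, and the choice of descending npages as the sort key are assumptions made for illustration. Given an array that was sorted before exactly one element changed, the changed element is shifted toward its new position in O(n) by moving its neighbours over.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot record; the sort key (npages, descending) is an assumption. */
struct slot {
    int id;
    unsigned long npages;
};

/*
 * Restore descending order by npages after slots[i] alone was modified.
 * Returns the new index of the modified slot.
 */
static size_t resort_one(struct slot *slots, size_t n, size_t i)
{
    assert(i < n);
    struct slot moved = slots[i];

    /* The slot grew: shift smaller predecessors one place to the right. */
    while (i > 0 && slots[i - 1].npages < moved.npages) {
        slots[i] = slots[i - 1];
        i--;
    }
    /* The slot shrank: shift larger successors one place to the left. */
    while (i + 1 < n && slots[i + 1].npages > moved.npages) {
        slots[i] = slots[i + 1];
        i++;
    }
    slots[i] = moved;
    return i;
}
```

Only one of the two loops does any work for a given change, so the cost is bounded by the distance the slot has to travel.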

kvm: x86: increase user memory slots to 509 · 1d4e7e3c

Committed by Igor Mammedov on Nov 06, 2014

With the 3 private slots, this gives us 512 slots total.
The motivation is to support more memory hotplug slots in
addition to assigned devices, where 1 slot is used by each
hotplugged memory stick.
This allows up to 256 hotplug memory slots and leaves
253 slots for assigned devices and other devices that
use them.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

13 Nov, 2014 1 commit

kvm: svm: move WARN_ON in svm_adjust_tsc_offset · d913b904

Committed by Chris J Arges on Nov 12, 2014

When running the tsc_adjust kvm-unit-test on an AMD processor with the
IA32_TSC_ADJUST feature enabled, the WARN_ON in svm_adjust_tsc_offset can be
triggered. This WARN_ON checks for a negative adjustment in case __scale_tsc
is called; however it may trigger unnecessary warnings.

This patch moves the WARN_ON to trigger only if __scale_tsc will actually be
called from svm_adjust_tsc_offset. In addition make adj in kvm_set_msr_common
s64 since this can have signed values.
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

12 Nov, 2014 2 commits

x86, kvm, vmx: Don't set LOAD_IA32_EFER when host and guest match · 54b98bff

Committed by Andy Lutomirski on Nov 10, 2014

There's nothing to switch if the host and guest values are the same.
I am unable to find evidence that this makes any difference
whatsoever.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[I could see a difference on Nehalem.  From 5 runs:

 userspace exit, guest!=host   12200 11772 12130 12164 12327
 userspace exit, guest=host    11983 11780 11920 11919 12040
 lightweight exit, guest!=host  3214  3220  3238  3218  3337
 lightweight exit, guest=host   3178  3193  3193  3187  3220

 This passes the t-test with 99% confidence for userspace exit,
 98.5% confidence for lightweight exit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86, kvm, vmx: Always use LOAD_IA32_EFER if available · f6577a5f

Committed by Andy Lutomirski on Nov 07, 2014

At least on Sandy Bridge, letting the CPU switch IA32_EFER is much
faster than switching it manually.

I benchmarked this using the vmexit kvm-unit-test (single run, but
GOAL multiplied by 5 to do more iterations):

Test                             Before     After      Change
cpuid                            2000       1932       -3.40%
vmcall                           1914       1817       -5.07%
mov_from_cr8                     13         13         0.00%
mov_to_cr8                       19         19         0.00%
inl_from_pmtimer                 19164      10619      -44.59%
inl_from_qemu                    15662      10302      -34.22%
inl_from_kernel                  3916       3802       -2.91%
outl_to_kernel                   2230       2194       -1.61%
mov_dr                           172        176        2.33%
ipi                              (skipped)  (skipped)
ipi+halt                         (skipped)  (skipped)
ple-round-robin                  13         13         0.00%
wr_tsc_adjust_msr                1920       1845       -3.91%
rd_tsc_adjust_msr                1892       1814       -4.12%
mmio-no-eventfd:pci-mem          16394      11165      -31.90%
mmio-wildcard-eventfd:pci-mem    4607       4645       0.82%
mmio-datamatch-eventfd:pci-mem   4601       4610       0.20%
portio-no-eventfd:pci-io         11507      7942       -30.98%
portio-wildcard-eventfd:pci-io   2239       2225       -0.63%
portio-datamatch-eventfd:pci-io  2250       2234       -0.71%

I haven't explicitly computed the significance of these numbers,
but this isn't subtle.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[The results were reproducible on all of Nehalem, Sandy Bridge and
Ivy Bridge. The slowness of manual switching is because writing
to EFER with WRMSR triggers a TLB flush, even if the only bit you're
touching is SCE (so the page table format is not affected). Doing
the write as part of vmentry/vmexit, instead, does not flush the TLB,
probably because all processors that have EPT also have VPID. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

10 Nov, 2014 1 commit

KVM: x86: fix warning on 32-bit compilation · ac146235

Committed by Paolo Bonzini on Nov 10, 2014

PCIDs are only supported in 64-bit mode.  No need to clear bit 63
of CR3 unless the host is 64-bit.

Reported by Fengguang Wu's autobuilder.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>