Commits · db2336a80489e7c3c7728cefd9be58fac5ecfb39 · openeuler / raspberrypi-kernel

21 Apr, 2017 2 commits

KVM: x86: virtualize cpuid faulting · db2336a8

Authored by Kyle Huey on Mar 20, 2017

Hardware support for faulting on the cpuid instruction is not required to
emulate it, because cpuid triggers a VM exit anyway. KVM handles the relevant
MSRs (MSR_PLATFORM_INFO and MSR_MISC_FEATURES_ENABLE) and upon a
cpuid-induced VM exit checks the cpuid faulting state and the CPL.
kvm_require_cpl is even kind enough to inject the GP fault for us.

Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: David Matlack <dmatlack@google.com>
[Return "1" from kvm_emulate_cpuid, it's not void. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

db2336a8
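The decision described in this commit message can be sketched as a small user-space model. This is only an illustration: `struct vcpu_model`, `enum cpuid_result`, and `handle_cpuid_exit()` are hypothetical names, not KVM's types or functions.

```c
#include <stdbool.h>

/* Hypothetical, simplified vCPU state; not KVM's struct kvm_vcpu. */
struct vcpu_model {
    bool cpuid_fault_enabled; /* guest turned cpuid faulting on via its MSR */
    int cpl;                  /* current privilege level, 0..3 */
};

enum cpuid_result { CPUID_EMULATED, CPUID_INJECT_GP };

/* Decide what a cpuid-induced VM exit should do in this model. */
enum cpuid_result handle_cpuid_exit(const struct vcpu_model *v)
{
    /* With faulting enabled, cpuid is only permitted at CPL 0. */
    if (v->cpuid_fault_enabled && v->cpl > 0)
        return CPUID_INJECT_GP;
    return CPUID_EMULATED;
}
```

Because cpuid always exits to the hypervisor, the model needs no hardware assistance: the check is purely a software decision on the exit path.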

KVM: VMX: drop vmm_exclusive module parameter · fe0e80be

Authored by David Hildenbrand on Mar 10, 2017

vmm_exclusive=0 leads to KVM setting X86_CR4_VMXE always and calling
VMXON only when the vcpu is loaded. X86_CR4_VMXE is used as an
indication in cpu_emergency_vmxoff() (called on kdump) if VMXOFF has to be
called. This is obviously not the case if both are used independently.
Calling VMXOFF without a previous VMXON will result in an exception.

In addition, X86_CR4_VMXE is used as a means to test if VMX is already in
use by another VMM in hardware_enable(). So there can't really be
coexistence. If the other VMM is prepared for coexistence and does a
similar check, only one VMM can exist. If the other VMM is not prepared
and blindly sets/clears X86_CR4_VMXE, we will get inconsistencies with
X86_CR4_VMXE.

As we also had bug reports related to clearing of vmcs with vmm_exclusive=0
this seems to be pretty much untested. So let's drop it.

While at it, directly move setting/clearing X86_CR4_VMXE into
kvm_cpu_vmxon/off.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fe0e80be
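A toy model may help show why CR4.VMXE and the VMXON state have to move together. Everything here (`struct cpu_model`, `model_vmxon()`, `model_vmxoff()`, `model_emergency_vmxoff_faults()`) is a hypothetical sketch and does not reproduce the kernel's implementation.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state: the CR4.VMXE bit and whether VMXON was run. */
struct cpu_model {
    bool cr4_vmxe;
    bool vmx_on;
};

/* Coupled enable: set the CR4 bit and enter VMX operation together. */
void model_vmxon(struct cpu_model *c)
{
    c->cr4_vmxe = true;
    c->vmx_on = true;
}

/* Coupled disable: leave VMX operation and clear the CR4 bit together. */
void model_vmxoff(struct cpu_model *c)
{
    c->vmx_on = false;
    c->cr4_vmxe = false;
}

/*
 * Emergency path that uses CR4.VMXE as its only hint, as the commit
 * message describes. Returns true if VMXOFF would run without a prior
 * VMXON, which on real hardware raises an exception.
 */
bool model_emergency_vmxoff_faults(struct cpu_model *c)
{
    if (!c->cr4_vmxe)
        return false;          /* hint says VMX is unused: nothing to do */

    bool faults = !c->vmx_on;  /* VMXOFF without VMXON */
    c->vmx_on = false;
    c->cr4_vmxe = false;
    return faults;
}
```

As long as both fields only change through the coupled helpers, the emergency path in this model never faults; setting the CR4 bit on its own is what breaks the hint.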

14 Apr, 2017 1 commit

KVM: nVMX: fix AD condition when handling EPT violation · 33251870

Authored by Radim Krčmář on Apr 13, 2017

I introduced this bug when applying and simplifying Paolo's patch
as we agreed on the list.  The original was "x &= ~y; if (z) x |= y;".

Here is the story of a bad workflow:

  A maintainer was already testing with the intended change, but it was
  applied only to a testing repo on a different machine.  When the time
  to push tested patches to kvm/next came, he realized that this change
  was missing and quickly added it to the maintenance repo, didn't test
  again (because the change is trivial, right), and pushed the world to
  fire.

Fixes: ae1e2d10 ("kvm: nVMX: support EPT accessed/dirty bits")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

33251870
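To see how easily such a "trivial" simplification can go wrong, the quoted pattern can be compared with two conditional-clear variants. This is a generic sketch: the function names are hypothetical, and the message does not state which exact form was pushed, so the inverted variant is only one plausible slip.

```c
#include <stdbool.h>
#include <stdint.h>

/* The quoted original: make bit(s) y in x follow condition z. */
uint64_t clear_then_set(uint64_t x, uint64_t y, bool z)
{
    x &= ~y;
    if (z)
        x |= y;
    return x;
}

/*
 * Conditional clear. Matches clear_then_set() only when y is already
 * set in x on entry, because it never sets the bit itself.
 */
uint64_t conditional_clear(uint64_t x, uint64_t y, bool z)
{
    if (!z)
        x &= ~y;
    return x;
}

/* The same simplification with the condition inverted by mistake. */
uint64_t inverted_clear(uint64_t x, uint64_t y, bool z)
{
    if (z)
        x &= ~y;
    return x;
}
```

The first two agree whenever the bit starts out set, while the inverted form produces the opposite result for both values of `z`.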

13 Apr, 2017 26 commits

KVM: x86: Add MSR_AMD64_DC_CFG to the list of ignored MSRs · 405a353a

Authored by Ladi Prosek on Apr 06, 2017

Hyper-V writes 0x800000000000 to MSR_AMD64_DC_CFG when running on AMD CPUs
as recommended in erratum 383, analogous to our svm_init_erratum_383.

By ignoring the MSR, this patch enables running Hyper-V in L1 on AMD.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

405a353a

x86/kvm: virt_xxx memory barriers instead of mandatory barriers · 5a48a622

Authored by Wanpeng Li on Apr 11, 2017

virt_xxx memory barriers are implemented trivially using the low-level
__smp_xxx macros. Under a strong TSO memory model, __smp_xxx is equal to
a compiler barrier, whereas mandatory barriers unconditionally add
memory barriers. This patch replaces the rmb() in kvm_steal_clock() with
virt_rmb().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

5a48a622
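For readers outside the kernel, the role of a read barrier in a steal-time style reader can be approximated in user space with C11 atomics. This is a sketch under assumptions: the versioned retry loop, `struct steal_time_model`, and `read_steal()` are illustrative and are not copied from kvm_steal_clock(); the acquire fences merely stand in for virt_rmb().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared record: version is odd while a writer is updating. */
struct steal_time_model {
    _Atomic uint32_t version;
    _Atomic uint64_t steal;
};

/* Read a consistent steal value, retrying if a writer interfered. */
uint64_t read_steal(struct steal_time_model *src)
{
    uint32_t version;
    uint64_t steal;

    do {
        version = atomic_load_explicit(&src->version, memory_order_relaxed);
        /* Order the version load before the payload load. */
        atomic_thread_fence(memory_order_acquire);
        steal = atomic_load_explicit(&src->steal, memory_order_relaxed);
        /* Order the payload load before the version re-check. */
        atomic_thread_fence(memory_order_acquire);
    } while ((version & 1) ||
             version != atomic_load_explicit(&src->version,
                                             memory_order_relaxed));

    return steal;
}
```

On a strongly ordered target, fences like these typically cost no extra instructions, which is the same reasoning the commit message gives for preferring virt_rmb() over a mandatory rmb().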

KVM: x86: fix maintaining of kvm_clock stability on guest CPU hotplug · bd8fab39

Authored by Denis Plotnikov on Apr 07, 2017

VCPU TSC synchronization is performed in kvm_write_tsc() when the TSC
value being set is within 1 second of the expected value, as obtained by
extrapolating the TSC from already synchronized VCPUs.

This is naturally achieved on all VCPUs at VM start and resume;
however on VCPU hotplug it is not: the newly added VCPU is created
with TSC == 0 while others are well ahead.

To compensate for that, consider host-initiated kvm_write_tsc() with
TSC == 0 a special case requiring synchronization regardless of the
current TSC on other VCPUs.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

bd8fab39
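The rule described above can be written down as a small predicate. This is a minimal sketch assuming the one-second window is expressed in TSC ticks; `tsc_write_should_sync()` and its parameters are hypothetical and do not mirror the actual kvm_write_tsc() code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a TSC write should be treated as a synchronization
 * attempt. tsc_hz is the TSC frequency, so tsc_hz ticks equal one second.
 */
bool tsc_write_should_sync(uint64_t new_tsc, uint64_t expected_tsc,
                           uint64_t tsc_hz, bool host_initiated)
{
    uint64_t delta = new_tsc > expected_tsc ? new_tsc - expected_tsc
                                            : expected_tsc - new_tsc;

    /* Normal case: the written value is within one second of expected. */
    if (delta < tsc_hz)
        return true;

    /* Hotplug case: a host-initiated write of 0 always synchronizes. */
    return host_initiated && new_tsc == 0;
}
```

A hotplugged VCPU whose TSC is written as 0 by the host falls into the second branch even when the other VCPUs are far ahead.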

KVM: x86: remaster kvm_write_tsc code · c5e8ec8e

Authored by Denis Plotnikov on Apr 07, 2017

Reuse existing code instead of using inline asm.
Make the code more concise and clear in the TSC
synchronization part.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

c5e8ec8e

KVM: x86: use irqchip_kernel() to check for pic+ioapic · 900ab14c

Authored by David Hildenbrand on Apr 07, 2017

Although the current check is not wrong, irqchip_kernel() explicitly
includes the pic.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

900ab14c

KVM: x86: simplify pic_ioport_read() · b5e7cf52

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

b5e7cf52

KVM: x86: set data directly in picdev_read() · 84a5c79e

Authored by David Hildenbrand on Apr 07, 2017

Now it looks almost like picdev_write().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

84a5c79e

KVM: x86: drop picdev_in_range() · 9fecaa9e

Authored by David Hildenbrand on Apr 07, 2017

We already have the exact same checks a couple of lines below.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

9fecaa9e

KVM: x86: make kvm_pic_reset() static · dc24d1d2

Authored by David Hildenbrand on Apr 07, 2017

Not used outside of i8259.c, so let's make it static.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

dc24d1d2

KVM: x86: simplify pic_unlock() · e21d1758

Authored by David Hildenbrand on Apr 07, 2017

We can easily compact this code and get rid of one local variable.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

e21d1758

KVM: x86: drop goto label in kvm_set_routing_entry() · 43ae312c

Authored by David Hildenbrand on Apr 07, 2017

No need for the goto label + local variable "r".

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

43ae312c

KVM: x86: rename kvm_vcpu_request_scan_ioapic() · 993225ad

Authored by David Hildenbrand on Apr 07, 2017

Let's rename it into a proper arch specific callback.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

993225ad

KVM: x86: directly call kvm_make_scan_ioapic_request() in ioapic.c · ca8ab3f8

Authored by David Hildenbrand on Apr 07, 2017

We know there is an ioapic, so let's call it directly.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ca8ab3f8

KVM: x86: remove all-vcpu request from kvm_ioapic_init() · d62f270b

Authored by David Hildenbrand on Apr 07, 2017

kvm_ioapic_init() is guaranteed to be called without any created VCPUs,
so doing an all-vcpu request results in a NOP.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

d62f270b

KVM: x86: KVM_IRQCHIP_PIC_MASTER only has 8 pins · 445ee82d

Authored by David Hildenbrand on Apr 07, 2017

Currently, one could set pins 8-15, implicitly referring to
KVM_IRQCHIP_PIC_SLAVE.

Get rid of the two local variables max_pin and delta on the way.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

445ee82d
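The per-chip pin limit can be illustrated with a small validation helper. This is a hedged sketch: `enum irqchip_id`, `pin_is_valid()`, and the constants are hypothetical names, and the 24-pin IOAPIC limit is an assumption included only to contrast with the 8-pin PIC chips.

```c
#include <stdbool.h>

/* Hypothetical chip identifiers for a routing entry. */
enum irqchip_id { CHIP_PIC_MASTER, CHIP_PIC_SLAVE, CHIP_IOAPIC };

#define PIC_PINS_PER_CHIP 8   /* each 8259 chip has 8 input pins */
#define IOAPIC_PINS       24  /* assumed value, for illustration only */

/* Accept a pin only if it exists on the chip that was named. */
bool pin_is_valid(enum irqchip_id chip, unsigned int pin)
{
    switch (chip) {
    case CHIP_PIC_MASTER:
    case CHIP_PIC_SLAVE:
        /* Pins 8-15 must not be reachable through either PIC chip id. */
        return pin < PIC_PINS_PER_CHIP;
    case CHIP_IOAPIC:
        return pin < IOAPIC_PINS;
    }
    return false;
}
```

With a per-chip bound, a request for pin 8 on the master is rejected instead of silently aliasing the slave's first pin.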

KVM: x86: push usage of slots_lock down · 49f520b9

Authored by David Hildenbrand on Apr 07, 2017

Let's just move it to the place where it is actually needed.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

49f520b9

KVM: x86: don't take kvm->irq_lock when creating IRQCHIP · ba7454e1

Authored by David Hildenbrand on Apr 07, 2017

I don't see any reason for this lock anymore. It seems to have been used
to protect removal of kvm->arch.vpic / kvm->arch.vioapic when already
partially initialized. Access is now properly protected using
kvm->arch.irqchip_mode, so the lock shouldn't be necessary anymore.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ba7454e1

KVM: x86: convert kvm_(set|get)_ioapic() into void · 33392b49

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

33392b49

KVM: x86: remove duplicate checks for ioapic · 4c0b06d8

Authored by David Hildenbrand on Apr 07, 2017

When handling KVM_GET_IRQCHIP, we already check irqchip_kernel(), which
implies a fully initialized ioapic.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

4c0b06d8

KVM: x86: use ioapic_in_kernel() to check for ioapic existence · 0bceb15a

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

0bceb15a

KVM: x86: get rid of ioapic_irqchip() · 0191e92d

Authored by David Hildenbrand on Apr 07, 2017

Let's just use kvm->arch.vioapic directly.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

0191e92d

KVM: x86: get rid of pic_irqchip() · 90bca052

Authored by David Hildenbrand on Apr 07, 2017

It seemed like a nice idea to encapsulate access to kvm->arch.vpic. But
as the usage is already mixed, internal locks are taken outside of i8259.c
and grepping for "vpic" only is much easier, let's just get rid of
pic_irqchip().

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

90bca052

KVM: x86: check against irqchip_mode in ioapic_in_kernel() · f567080b

Authored by David Hildenbrand on Apr 07, 2017

KVM_IRQCHIP_KERNEL implies a fully initialized ioapic, while
kvm->arch.vioapic might temporarily be set but invalidated again if e.g.
setting of default routing fails when setting KVM_CREATE_IRQCHIP.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

f567080b

KVM: x86: check against irqchip_mode in pic_in_kernel() · 19d25a0e

Authored by David Hildenbrand on Apr 07, 2017

Let's avoid checking against kvm->arch.vpic. We have kvm->arch.irqchip_mode
for that now.

KVM_IRQCHIP_KERNEL implies a fully initialized pic, while kvm->arch.vpic
might temporarily be set but invalidated again if e.g. kvm_ioapic_init()
fails when setting KVM_CREATE_IRQCHIP. Although current users seem to be
fine, this avoids future bugs.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

19d25a0e

KVM: x86: check against irqchip_mode in kvm_set_routing_entry() · 8bf463f3

Authored by David Hildenbrand on Apr 07, 2017

Let's replace the checks for pic_in_kernel() and ioapic_in_kernel() by
checks against irqchip_mode.

Also make sure that creation of any route is only possible if we have
an lapic in kernel (irqchip_in_kernel()) or if we are currently
initializing the irqchip.

This is necessary to switch pic_in_kernel() and ioapic_in_kernel() to
irqchip_mode, too.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

8bf463f3
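The mode-based checks used throughout this series can be modelled with a single enum. This is a sketch under assumptions: the enum values and helper names below are hypothetical stand-ins, and the exact predicates in the kernel may differ.

```c
#include <stdbool.h>

/* Hypothetical mirror of the irqchip modes named in these commits. */
enum irqchip_mode_model {
    MODE_NONE,             /* no in-kernel irqchip */
    MODE_INIT_IN_PROGRESS, /* creation has started but not finished */
    MODE_KERNEL,           /* pic, ioapic and lapic fully in kernel */
    MODE_SPLIT             /* lapic in kernel, pic/ioapic in userspace */
};

/* An lapic is in the kernel once creation has completed in either mode. */
bool model_irqchip_in_kernel(enum irqchip_mode_model mode)
{
    return mode == MODE_KERNEL || mode == MODE_SPLIT;
}

/* pic and ioapic count as present only when fully initialized. */
bool model_pic_ioapic_in_kernel(enum irqchip_mode_model mode)
{
    return mode == MODE_KERNEL;
}

/* Routes may be added with a finished irqchip or while one is being set up. */
bool model_can_set_route(enum irqchip_mode_model mode)
{
    return model_irqchip_in_kernel(mode) || mode == MODE_INIT_IN_PROGRESS;
}
```

Checking the mode rather than a pointer such as kvm->arch.vpic means a half-constructed chip is never mistaken for a usable one.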

KVM: x86: new irqchip mode KVM_IRQCHIP_INIT_IN_PROGRESS · 637e3f86

Authored by David Hildenbrand on Apr 07, 2017

Let's add a new mode and set it while we create the irqchip via
KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP.

This mode will be used later to test if adding routes
(in kvm_set_routing_entry()) is already allowed.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

637e3f86

07 Apr, 2017 11 commits

kvm: nVMX: Disallow userspace-injected exceptions in guest mode · 28d06353

Authored by Jim Mattson on Apr 05, 2017

The userspace exception injection API and code path are entirely
unprepared for exceptions that might cause a VM-exit from L2 to L1, so
the best course of action may be to simply disallow this for now.

1. The API provides no mechanism for userspace to specify the new DR6
bits for a #DB exception or the new CR2 value for a #PF
exception. Presumably, userspace is expected to modify these registers
directly with KVM_SET_SREGS before the next KVM_RUN ioctl. However, in
the event that L1 intercepts the exception, these registers should not
be changed. Instead, the new values should be provided in the
exit_qualification field of vmcs12 (Intel SDM vol 3, section 27.1).

2. In the case of a userspace-injected #DB, inject_pending_event()
clears DR7.GD before calling vmx_queue_exception(). However, in the
event that L1 intercepts the exception, this is too early, because
DR7.GD should not be modified by a #DB that causes a VM-exit directly
(Intel SDM vol 3, section 27.1).

3. If the injected exception is a #PF, nested_vmx_check_exception()
doesn't properly check whether or not L1 is interested in the
associated error code (using the #PF error code mask and match fields
from vmcs12). It may either return 0 when it should call
nested_vmx_vmexit() or vice versa.

4. nested_vmx_check_exception() assumes that it is dealing with a
hardware-generated exception intercept from L2, with some of the
relevant details (the VM-exit interruption-information and the exit
qualification) live in vmcs02. For userspace-injected exceptions, this
is not the case.

5. prepare_vmcs12() assumes that when its exit_intr_info argument
specifies valid information with a valid error code, it can VMREAD
the VM-exit interruption error code from vmcs02. For
userspace-injected exceptions, this is not the case.

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

28d06353
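The policy chosen here, refusing the injection outright while the vCPU runs a nested guest, can be sketched as a simple guard. The structure and function names are hypothetical, and returning `-EINVAL` is an assumption about how such a refusal would be reported, not something the message states.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical vCPU state: true while L2 (a nested guest) is running. */
struct nested_vcpu_model {
    bool guest_mode;
    bool exception_pending;
    unsigned int exception_nr;
};

/* Hypothetical userspace request to inject an exception. */
struct exception_event_model {
    bool injected;
    unsigned int nr;
};

/* Accept the event only when no L2-to-L1 VM-exit could be involved. */
int model_set_exception(struct nested_vcpu_model *v,
                        const struct exception_event_model *e)
{
    if (e->injected && v->guest_mode)
        return -EINVAL; /* assumed error code for the refusal */

    v->exception_pending = e->injected;
    v->exception_nr = e->nr;
    return 0;
}
```

Rejecting the request up front sidesteps all five problems listed above, at the cost of not supporting the feature in guest mode for now.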

KVM: x86: fix user triggerable warning in kvm_apic_accept_events() · 28bf2888

Authored by David Hildenbrand on Mar 23, 2017

If we already entered/are about to enter SMM, don't allow switching to
INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events()
will report a warning.

Same applies if we are already in MP state INIT_RECEIVED and SMM is
requested to be turned on. Refuse to set the VCPU events in this case.

Fixes: cd7764fe ("KVM: x86: latch INITs while in system management mode")
Cc: stable@vger.kernel.org # 4.2+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

28bf2888
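The two refusals described in this message can be modelled as a pair of guarded setters. All names here are hypothetical, and the `-EINVAL` return value is an assumption used only to signal a rejected request.

```c
#include <errno.h>
#include <stdbool.h>

enum mp_state_model { MP_RUNNABLE, MP_INIT_RECEIVED, MP_SIPI_RECEIVED };

/* Hypothetical vCPU state relevant to the SMM / INIT interaction. */
struct smm_vcpu_model {
    bool in_smm;
    bool smi_pending;
    enum mp_state_model mp_state;
};

/* Refuse INIT/SIPI_RECEIVED while in SMM or about to enter it. */
int model_set_mp_state(struct smm_vcpu_model *v, enum mp_state_model next)
{
    bool smm = v->in_smm || v->smi_pending;

    if (smm && (next == MP_INIT_RECEIVED || next == MP_SIPI_RECEIVED))
        return -EINVAL;

    v->mp_state = next;
    return 0;
}

/* Refuse turning SMM on while the vCPU is already in INIT_RECEIVED. */
int model_request_smm(struct smm_vcpu_model *v)
{
    if (v->mp_state == MP_INIT_RECEIVED)
        return -EINVAL;

    v->in_smm = true;
    return 0;
}
```

Both guards keep the model out of the combined "SMM plus INIT received" state, which is the combination the message says triggers the warning.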

kvm: make KVM_COALESCED_MMIO_PAGE_OFFSET public · 4b4357e0

Authored by Paolo Bonzini on Mar 31, 2017

Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).

Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

4b4357e0

kvm: make KVM_CAP_COALESCED_MMIO architecture agnostic · 30422558

Authored by Paolo Bonzini on Mar 31, 2017

Remove code from architecture files that can be moved to virt/kvm, since there
is already common code for coalesced MMIO.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Removed a pointless 'break' after 'return'.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

30422558

KVM: nVMX: support RDRAND and RDSEED exiting · a5f46457

Authored by Paolo Bonzini on Mar 30, 2017

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

a5f46457

KVM: VMX: add missing exit reasons · 1f519992

Authored by Paolo Bonzini on Mar 30, 2017

In order to simplify adding exit reasons in the future,
the array of exit reason names is now also sorted by
exit reason code.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

1f519992

kvm: nVMX: support EPT accessed/dirty bits · ae1e2d10

Authored by Paolo Bonzini on Mar 30, 2017

Now use bit 6 of EPTP to optionally enable A/D bits for EPTP.  Another
thing to change is that, when EPT accessed and dirty bits are not in use,
VMX treats accesses to guest paging structures as data reads.  When they
are in use (bit 6 of EPTP is set), they are treated as writes and the
corresponding EPT dirty bit is set.  The MMU didn't know this detail,
so this patch adds it.

We also have to fix up the exit qualification.  It may be wrong because
KVM sets bit 6 but the guest might not.

L1 emulates EPT A/D bits using write permissions, so in principle it may
be possible for EPT A/D bits to be used by L1 even though not available
in hardware.  The problem is that guest page-table walks will be treated
as reads rather than writes, so they would not cause an EPT violation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Fixed typo in walk_addr_generic() comment and changed bit clear +
 conditional-set pattern in handle_ept_violation() to conditional-clear]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ae1e2d10
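The EPTP rule from this message (bit 6 enables A/D bits, and with them enabled guest page-table walks count as writes) can be captured in two small helpers. The names `EPTP_AD_ENABLE`, `ept_ad_enabled()`, and `guest_walk_access()` are hypothetical and chosen only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 6 of the EPT pointer enables accessed/dirty bits, per the message. */
#define EPTP_AD_ENABLE (1ULL << 6)

enum access_kind { ACCESS_READ, ACCESS_WRITE };

/* True if the given EPTP value asks for EPT accessed/dirty bits. */
bool ept_ad_enabled(uint64_t eptp)
{
    return (eptp & EPTP_AD_ENABLE) != 0;
}

/*
 * How an access to a guest paging structure is treated: a data read
 * without A/D bits, a write (setting the EPT dirty bit) with them.
 */
enum access_kind guest_walk_access(uint64_t eptp)
{
    return ept_ad_enabled(eptp) ? ACCESS_WRITE : ACCESS_READ;
}
```

Since KVM and the nested guest may disagree on bit 6, the classification has to be derived from the guest's own EPTP when fixing up the exit qualification.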

kvm: x86: MMU support for EPT accessed/dirty bits · 86407bcb

Authored by Paolo Bonzini on Mar 30, 2017

This prepares the MMU paging code for EPT accessed and dirty bits,
which can be enabled optionally at runtime.  Code that updates the
accessed and dirty bits will need a pointer to the struct kvm_mmu.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

86407bcb

KVM: VMX: remove bogus check for invalid EPT violation · 00477231

Authored by Paolo Bonzini on Mar 30, 2017

handle_ept_violation is checking for "guest-linear-address invalid" +
"not a paging-structure walk".  However, _all_ EPT violations without
a valid guest linear address are paging structure walks, because those
EPT violations happen when loading the guest PDPTEs.

Therefore, the check can never be true, and even if it were, KVM doesn't
care about the guest linear address; it only uses the guest *physical*
address VMCS field.  So, remove the check altogether.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

00477231

KVM: nVMX: we support 1GB EPT pages · 7db74265

Authored by Paolo Bonzini on Mar 08, 2017

Large pages at the PDPE level can be emulated by the MMU, so the bit
can be set unconditionally in the EPT capabilities MSR.  The same is
true of 2MB EPT pages, though all Intel processors with EPT in practice
support those.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7db74265

KVM: x86: drop legacy device assignment · ad6260da

Authored by Paolo Bonzini on Mar 27, 2017

Legacy device assignment has been deprecated since 4.2 (released
1.5 years ago).  VFIO is better and everyone should have switched to it.
If they haven't, this should convince them. :)

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ad6260da