Commits · 332518706195007f9fbafa69652aa5b3cf72df24 · openanolis / cloud-kernel

14 Apr, 2017 1 commit

KVM: nVMX: fix AD condition when handling EPT violation · 33251870

Authored by Radim Krčmář on Apr 13, 2017

I have introduced this bug when applying and simplifying Paolo's patch
as we agreed on the list.  The original was "x &= ~y; if (z) x |= y;".

Here is the story of a bad workflow:

  A maintainer was already testing with the intended change, but it was
  applied only to a testing repo on a different machine.  When the time
  to push tested patches to kvm/next came, he realized that this change
  was missing and quickly added it to the maintenance repo, didn't test
  again (because the change is trivial, right), and pushed the world to
  fire.

Fixes: ae1e2d10 ("kvm: nVMX: support EPT accessed/dirty bits")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

33251870
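
As a rough illustration of the pattern quoted in the message above, the sketch below contrasts the original clear-then-set form with a conditional clear. `FLAG_BIT` and the function names are hypothetical and not taken from the patch; the inverted variant shows only one plausible way such a simplification can go wrong and is not a copy of the actual diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, standing in for "y" in the commit message. */
#define FLAG_BIT (UINT64_C(1) << 1)

/* Original pattern: clear the bit, then set it again if cond holds. */
static uint64_t clear_then_set(uint64_t x, bool cond)
{
	x &= ~FLAG_BIT;
	if (cond)
		x |= FLAG_BIT;
	return x;
}

/* Intended simplification: clear the bit only when cond does NOT hold. */
static uint64_t conditional_clear(uint64_t x, bool cond)
{
	if (!cond)
		x &= ~FLAG_BIT;
	return x;
}

/* Same shape with the condition inverted: clears exactly when it should not. */
static uint64_t conditional_clear_inverted(uint64_t x, bool cond)
{
	if (cond)
		x &= ~FLAG_BIT;
	return x;
}
```

The conditional clear matches the original form only when the bit is already set whenever the condition holds, which is the assumption such a simplification relies on.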

13 Apr, 2017 25 commits

KVM: x86: Add MSR_AMD64_DC_CFG to the list of ignored MSRs · 405a353a

Authored by Ladi Prosek on Apr 06, 2017

Hyper-V writes 0x800000000000 to MSR_AMD64_DC_CFG when running on AMD CPUs
as recommended in erratum 383, analogous to our svm_init_erratum_383.

By ignoring the MSR, this patch enables running Hyper-V in L1 on AMD.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

405a353a

KVM: x86: fix maintaining of kvm_clock stability on guest CPU hotplug · bd8fab39

Authored by Denis Plotnikov on Apr 07, 2017

VCPU TSC synchronization is performed in kvm_write_tsc() when the TSC
value being set is within 1 second of the expected value, as obtained by
extrapolating the TSC of already synchronized VCPUs.

This is naturally achieved on all VCPUs at VM start and resume;
however on VCPU hotplug it is not: the newly added VCPU is created
with TSC == 0 while others are well ahead.

To compensate for that, consider host-initiated kvm_write_tsc() with
TSC == 0 a special case requiring synchronization regardless of the
current TSC on other VCPUs.
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

bd8fab39
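
The decision described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the kernel's code: the function name, its parameters, and the use of a plain kHz frequency to derive "one second" of ticks are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a TSC write should be treated as a synchronization
 * attempt. All names are hypothetical; tsc_khz is the guest TSC
 * frequency in kHz and expected is the extrapolated TSC value.
 */
static bool tsc_write_should_sync(uint64_t written, uint64_t expected,
				  uint64_t tsc_khz, bool host_initiated)
{
	/* Special case: a host-initiated write of 0, as seen on VCPU
	 * hotplug, synchronizes regardless of the other VCPUs. */
	if (host_initiated && written == 0)
		return true;

	/* Otherwise synchronize only within one second of the expected value. */
	uint64_t one_second = tsc_khz * 1000; /* ticks per second */
	uint64_t delta = written > expected ? written - expected
					    : expected - written;
	return delta < one_second;
}
```

With a 1 GHz guest TSC, for example, a hotplugged VCPU that writes 0 while the others are at 5,000,000,000 would still be synchronized, whereas the same write coming from the guest would not.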

KVM: x86: remaster kvm_write_tsc code · c5e8ec8e

Authored by Denis Plotnikov on Apr 07, 2017

Reuse existing code instead of using inline asm.
Make the code more concise and clear in the TSC
synchronization part.
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

c5e8ec8e

KVM: x86: use irqchip_kernel() to check for pic+ioapic · 900ab14c

Authored by David Hildenbrand on Apr 07, 2017

Although the current check is not wrong, this check explicitly includes
the pic.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

900ab14c

KVM: x86: simplify pic_ioport_read() · b5e7cf52

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

b5e7cf52

KVM: x86: set data directly in picdev_read() · 84a5c79e

Authored by David Hildenbrand on Apr 07, 2017

Now it looks almost like picdev_write().
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

84a5c79e

KVM: x86: drop picdev_in_range() · 9fecaa9e

Authored by David Hildenbrand on Apr 07, 2017

We already have the exact same checks a couple of lines below.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

9fecaa9e

KVM: x86: make kvm_pic_reset() static · dc24d1d2

Authored by David Hildenbrand on Apr 07, 2017

Not used outside of i8259.c, so let's make it static.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

dc24d1d2

KVM: x86: simplify pic_unlock() · e21d1758

Authored by David Hildenbrand on Apr 07, 2017

We can easily compact this code and get rid of one local variable.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

e21d1758

KVM: x86: drop goto label in kvm_set_routing_entry() · 43ae312c

Authored by David Hildenbrand on Apr 07, 2017

No need for the goto label + local variable "r".
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

43ae312c

KVM: x86: rename kvm_vcpu_request_scan_ioapic() · 993225ad

Authored by David Hildenbrand on Apr 07, 2017

Let's rename it into a proper arch specific callback.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

993225ad

KVM: x86: directly call kvm_make_scan_ioapic_request() in ioapic.c · ca8ab3f8

Authored by David Hildenbrand on Apr 07, 2017

We know there is an ioapic, so let's call it directly.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ca8ab3f8

KVM: x86: remove all-vcpu request from kvm_ioapic_init() · d62f270b

Authored by David Hildenbrand on Apr 07, 2017

kvm_ioapic_init() is guaranteed to be called without any created VCPUs,
so doing an all-vcpu request results in a NOP.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

d62f270b

KVM: x86: KVM_IRQCHIP_PIC_MASTER only has 8 pins · 445ee82d

Authored by David Hildenbrand on Apr 07, 2017

Currently, one could set pins 8-15, implicitly referring to
KVM_IRQCHIP_PIC_SLAVE.

Get rid of the two local variables max_pin and delta on the way.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

445ee82d
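
A minimal sketch of the range check this message implies, assuming a 16-pin PIC built from two 8-pin chips. The enum, the function name, and the `-1` error value are hypothetical and only illustrate the idea; they are not the kernel's routing code.

```c
#define PIC_NUM_PINS 16 /* two cascaded 8-pin chips; value assumed here */

enum pic_chip { PIC_MASTER, PIC_SLAVE }; /* hypothetical stand-ins */

/*
 * Map a (chip, pin) pair to a global PIC pin, or return -1 if the pin
 * is out of range for a single 8-pin chip.
 */
static int pic_global_pin(enum pic_chip chip, unsigned int pin)
{
	/* Pins 8-15 on the master must not silently refer to the slave. */
	if (pin >= PIC_NUM_PINS / 2)
		return -1;
	return chip == PIC_SLAVE ? (int)pin + PIC_NUM_PINS / 2 : (int)pin;
}
```

Under these assumptions, pin 8 on the master is rejected, while pin 0 on the slave maps to global pin 8.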

KVM: x86: push usage of slots_lock down · 49f520b9

Authored by David Hildenbrand on Apr 07, 2017

Let's just move it to the place where it is actually needed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

49f520b9

KVM: x86: don't take kvm->irq_lock when creating IRQCHIP · ba7454e1

Authored by David Hildenbrand on Apr 07, 2017

I don't see any reason for this lock any more. It seemed to be used to protect
removal of kvm->arch.vpic / kvm->arch.vioapic when already partially
initialized; now access is properly protected using kvm->arch.irqchip_mode,
so this shouldn't be necessary anymore.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ba7454e1

KVM: x86: convert kvm_(set|get)_ioapic() into void · 33392b49

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

33392b49

KVM: x86: remove duplicate checks for ioapic · 4c0b06d8

Authored by David Hildenbrand on Apr 07, 2017

When handling KVM_GET_IRQCHIP, we already check irqchip_kernel(), which
implies a fully initialized ioapic.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

4c0b06d8

KVM: x86: use ioapic_in_kernel() to check for ioapic existence · 0bceb15a

Authored by David Hildenbrand on Apr 07, 2017

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

0bceb15a

KVM: x86: get rid of ioapic_irqchip() · 0191e92d

Authored by David Hildenbrand on Apr 07, 2017

Let's just use kvm->arch.vioapic directly.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

0191e92d

KVM: x86: get rid of pic_irqchip() · 90bca052

Authored by David Hildenbrand on Apr 07, 2017

It seemed like a nice idea to encapsulate access to kvm->arch.vpic. But
as the usage is already mixed, internal locks are taken outside of i8259.c,
and grepping for "vpic" only is much easier, let's just get rid of
pic_irqchip().
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

90bca052

KVM: x86: check against irqchip_mode in ioapic_in_kernel() · f567080b

Authored by David Hildenbrand on Apr 07, 2017

KVM_IRQCHIP_KERNEL implies a fully initialized ioapic, while
kvm->arch.vioapic might temporarily be set but invalidated again if e.g.
setting of default routing fails when setting KVM_CREATE_IRQCHIP.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

f567080b

KVM: x86: check against irqchip_mode in pic_in_kernel() · 19d25a0e

Authored by David Hildenbrand on Apr 07, 2017

Let's avoid checking against kvm->arch.vpic. We have kvm->arch.irqchip_mode
for that now.

KVM_IRQCHIP_KERNEL implies a fully initialized pic, while kvm->arch.vpic
might temporarily be set but invalidated again if e.g. kvm_ioapic_init()
fails when setting KVM_CREATE_IRQCHIP. Although current users seem to be
fine, this avoids future bugs.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

19d25a0e

KVM: x86: check against irqchip_mode in kvm_set_routing_entry() · 8bf463f3

Authored by David Hildenbrand on Apr 07, 2017

Let's replace the checks for pic_in_kernel() and ioapic_in_kernel() by
checks against irqchip_mode.

Also make sure that creation of any route is only possible if we have
an lapic in kernel (irqchip_in_kernel()) or if we are currently
initializing the irqchip.

This is necessary to switch pic_in_kernel() and ioapic_in_kernel() to
irqchip_mode, too.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

8bf463f3

KVM: x86: new irqchip mode KVM_IRQCHIP_INIT_IN_PROGRESS · 637e3f86

Authored by David Hildenbrand on Apr 07, 2017

Let's add a new mode and set it while we create the irqchip via
KVM_CREATE_IRQCHIP and KVM_CAP_SPLIT_IRQCHIP.

This mode will be used later to test if adding routes
(in kvm_set_routing_entry()) is already allowed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

637e3f86
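
The irqchip_mode commits above can be summarized with a small state sketch. This is a simplified illustration under assumptions: the enum and both predicates are hypothetical names that mirror the modes mentioned in the messages, and the real kernel helpers may differ in naming and detail.

```c
#include <stdbool.h>

/* Simplified mirror of the modes named in the commit messages. */
enum irqchip_mode {
	IRQCHIP_NONE,
	IRQCHIP_INIT_IN_PROGRESS, /* set temporarily while the irqchip is created */
	IRQCHIP_KERNEL,           /* created via KVM_CREATE_IRQCHIP */
	IRQCHIP_SPLIT,            /* created via KVM_CAP_SPLIT_IRQCHIP */
};

/*
 * A fully initialized in-kernel pic and ioapic exist only in KERNEL mode,
 * so the mode is checked instead of the vpic/vioapic pointers, which may
 * be set temporarily while creation can still fail.
 */
static bool pic_and_ioapic_ready(enum irqchip_mode mode)
{
	return mode == IRQCHIP_KERNEL;
}

/*
 * Routes may be added once an irqchip exists, or while one is being
 * initialized so that default routing can be set up.
 */
static bool routing_allowed(enum irqchip_mode mode)
{
	return mode == IRQCHIP_KERNEL || mode == IRQCHIP_SPLIT ||
	       mode == IRQCHIP_INIT_IN_PROGRESS;
}
```

Keying both questions on a single mode field avoids acting on a half-constructed device whose pointer happens to be non-NULL.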

07 Apr, 2017 13 commits

kvm: nVMX: Disallow userspace-injected exceptions in guest mode · 28d06353

Authored by Jim Mattson on Apr 05, 2017

The userspace exception injection API and code path are entirely
unprepared for exceptions that might cause a VM-exit from L2 to L1, so
the best course of action may be to simply disallow this for now.

1. The API provides no mechanism for userspace to specify the new DR6
bits for a #DB exception or the new CR2 value for a #PF
exception. Presumably, userspace is expected to modify these registers
directly with KVM_SET_SREGS before the next KVM_RUN ioctl. However, in
the event that L1 intercepts the exception, these registers should not
be changed. Instead, the new values should be provided in the
exit_qualification field of vmcs12 (Intel SDM vol 3, section 27.1).

2. In the case of a userspace-injected #DB, inject_pending_event()
clears DR7.GD before calling vmx_queue_exception(). However, in the
event that L1 intercepts the exception, this is too early, because
DR7.GD should not be modified by a #DB that causes a VM-exit directly
(Intel SDM vol 3, section 27.1).

3. If the injected exception is a #PF, nested_vmx_check_exception()
doesn't properly check whether or not L1 is interested in the
associated error code (using the #PF error code mask and match fields
from vmcs12). It may either return 0 when it should call
nested_vmx_vmexit() or vice versa.

4. nested_vmx_check_exception() assumes that it is dealing with a
hardware-generated exception intercept from L2, with some of the
relevant details (the VM-exit interruption-information and the exit
qualification) live in vmcs02. For userspace-injected exceptions, this
is not the case.

5. prepare_vmcs12() assumes that when its exit_intr_info argument
specifies valid information with a valid error code that it can VMREAD
the VM-exit interruption error code from vmcs02. For
userspace-injected exceptions, this is not the case.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

28d06353
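
The resulting policy can be sketched as a simple validation step. The structs and the function below are hypothetical stand-ins, assuming that "guest mode" means the VCPU is currently running L2; they do not reproduce the kernel's ioctl handling.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the VCPU and event state. */
struct vcpu_state {
	bool guest_mode; /* true while the VCPU is running L2 */
};

struct injected_exception {
	bool injected;
	unsigned int nr; /* exception vector, e.g. 1 for #DB, 14 for #PF */
};

/* Refuse userspace-injected exceptions while in guest mode. */
static int check_userspace_exception(const struct vcpu_state *vcpu,
				     const struct injected_exception *ex)
{
	if (ex->injected && vcpu->guest_mode)
		return -EINVAL;
	return 0;
}
```

Rejecting the request up front sidesteps all five problems listed in the message, at the cost of not supporting this combination for now.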

KVM: x86: fix user triggerable warning in kvm_apic_accept_events() · 28bf2888

Authored by David Hildenbrand on Mar 23, 2017

If we already entered/are about to enter SMM, don't allow switching to
INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events()
will report a warning.

Same applies if we are already in MP state INIT_RECEIVED and SMM is
requested to be turned on. Refuse to set the VCPU events in this case.

Fixes: cd7764fe ("KVM: x86: latch INITs while in system management mode")
Cc: stable@vger.kernel.org # 4.2+
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

28bf2888

kvm: make KVM_CAP_COALESCED_MMIO architecture agnostic · 30422558

Authored by Paolo Bonzini on Mar 31, 2017

Remove code from architecture files that can be moved to virt/kvm, since there
is already common code for coalesced MMIO.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Removed a pointless 'break' after 'return'.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

30422558

KVM: nVMX: support RDRAND and RDSEED exiting · a5f46457

Authored by Paolo Bonzini on Mar 30, 2017

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

a5f46457

kvm: nVMX: support EPT accessed/dirty bits · ae1e2d10

Authored by Paolo Bonzini on Mar 30, 2017

Now use bit 6 of EPTP to optionally enable A/D bits for EPTP.  Another
thing to change is that, when EPT accessed and dirty bits are not in use,
VMX treats accesses to guest paging structures as data reads.  When they
are in use (bit 6 of EPTP is set), they are treated as writes and the
corresponding EPT dirty bit is set.  The MMU didn't know this detail,
so this patch adds it.

We also have to fix up the exit qualification.  It may be wrong because
KVM sets bit 6 but the guest might not.

L1 emulates EPT A/D bits using write permissions, so in principle it may
be possible for EPT A/D bits to be used by L1 even though not available
in hardware.  The problem is that guest page-table walks will be treated
as reads rather than writes, so they would not cause an EPT violation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Fixed typo in walk_addr_generic() comment and changed bit clear +
 conditional-set pattern in handle_ept_violation() to conditional-clear]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ae1e2d10

kvm: x86: MMU support for EPT accessed/dirty bits · 86407bcb

Authored by Paolo Bonzini on Mar 30, 2017

This prepares the MMU paging code for EPT accessed and dirty bits,
which can be enabled optionally at runtime.  Code that updates the
accessed and dirty bits will need a pointer to the struct kvm_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

86407bcb

KVM: VMX: remove bogus check for invalid EPT violation · 00477231

Authored by Paolo Bonzini on Mar 30, 2017

handle_ept_violation is checking for "guest-linear-address invalid" +
"not a paging-structure walk".  However, _all_ EPT violations without
a valid guest linear address are paging structure walks, because those
EPT violations happen when loading the guest PDPTEs.

Therefore, the check can never be true, and even if it were, KVM doesn't
care about the guest linear address; it only uses the guest *physical*
address VMCS field.  So, remove the check altogether.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

00477231

KVM: nVMX: we support 1GB EPT pages · 7db74265

Authored by Paolo Bonzini on Mar 08, 2017

Large pages at the PDPE level can be emulated by the MMU, so the bit
can be set unconditionally in the EPT capabilities MSR.  The same is
true of 2MB EPT pages, though all Intel processors with EPT in practice
support those.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7db74265

KVM: x86: drop legacy device assignment · ad6260da

Authored by Paolo Bonzini on Mar 27, 2017

Legacy device assignment has been deprecated since 4.2 (released
1.5 years ago).  VFIO is better and everyone should have switched to it.
If they haven't, this should convince them. :)
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ad6260da

KVM: VMX: require virtual NMI support · 2c82878b

Authored by Paolo Bonzini on Mar 27, 2017

Virtual NMIs are only missing in Prescott and Yonah chips. Both are obsolete
for virtualization usage---Yonah is 32-bit only even---so drop vNMI emulation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2c82878b

kvm/svm: Setup MCG_CAP on AMD properly · 74f16909

Authored by Borislav Petkov on Mar 26, 2017

MCG_CAP[63:9] bits are reserved on AMD. However, on an AMD guest, this
MSR returns 0x100010a. More specifically, bit 24 is set, which is simply
wrong. That bit is MCG_SER_P and is present only on Intel. Thus, clean
up the reserved bits in order not to confuse guests.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

74f16909
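
The arithmetic behind this message can be checked with a small sketch. The mask and function names are hypothetical; only the value 0x100010a and the statement that MCG_CAP[63:9] is reserved on AMD come from the message itself.

```c
#include <stdint.h>

/* Bits 8:0 are the only non-reserved MCG_CAP bits on AMD per the message. */
#define MCG_CAP_AMD_VALID_MASK UINT64_C(0x1ff) /* hypothetical name */

/* Clear the reserved bits, including bit 24 (MCG_SER_P). */
static uint64_t amd_sanitize_mcg_cap(uint64_t mcg_cap)
{
	return mcg_cap & MCG_CAP_AMD_VALID_MASK;
}
```

Applied to the reported value 0x100010a, which has bit 24 set, the mask leaves 0x10a.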

KVM: nVMX: single function for switching between vmcs · 1279a6b1

Authored by David Hildenbrand on Mar 20, 2017

Let's combine it in a single function vmx_switch_vmcs().
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

1279a6b1

kvm: vmx: Don't use INVVPID when EPT is enabled · f0b98c02

Authored by Jim Mattson on Mar 15, 2017

According to the Intel SDM, volume 3, section 28.3.2: Creating and
Using Cached Translation Information, "No linear mappings are used
while EPT is in use." INVEPT will invalidate both the guest-physical
mappings and the combined mappings in the TLBs and paging-structure
caches, so an INVVPID is superfluous.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

f0b98c02

28 Mar, 2017 1 commit

KVM: x86: cleanup the page tracking SRCU instance · 2beb6dad

Authored by Paolo Bonzini on Mar 27, 2017

SRCU uses a delayed work item.  Skip cleaning it up, and
the result is use-after-free in the work item callbacks.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 0eb05bf2
Reviewed-by: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2beb6dad

openanolis / cloud-kernel synced successfully about 1 year ago