Commits · 640bd6e5752274f7dbd2a0a6642fe2db85813bd9 · openeuler / Kernel

24 Aug, 2017 1 commit

KVM: SVM: Enable Virtual GIF feature · 640bd6e5

Committed by Janakarajan Natarajan on Aug 23, 2017

Enable the Virtual GIF feature. This is done by setting bit 25 at position
60h in the vmcb.

With this feature enabled, the processor uses bit 9 at position 60h as the
virtual GIF when executing STGI/CLGI instructions.
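
As a rough illustration of the bit arithmetic described above, the following Python sketch models the 32-bit field at VMCB offset 60h as a plain integer. The constant and function names are hypothetical and exist only for this example; this is a model of the idea, not kernel code.

```python
# Hypothetical model of the 32-bit field at VMCB offset 60h.
V_GIF_ENABLE_MASK = 1 << 25  # bit 25: Virtual GIF feature enabled
V_GIF_MASK = 1 << 9          # bit 9: the virtual GIF itself


def enable_vgif(field: int) -> int:
    """Turn the feature on without disturbing any other bit."""
    return field | V_GIF_ENABLE_MASK


def stgi(field: int) -> int:
    """Model of STGI with the feature enabled: set the virtual GIF."""
    return field | V_GIF_MASK


def clgi(field: int) -> int:
    """Model of CLGI with the feature enabled: clear the virtual GIF."""
    return field & ~V_GIF_MASK


def gif_is_set(field: int) -> bool:
    """Report whether the virtual GIF (bit 9) is currently set."""
    return bool(field & V_GIF_MASK)
```

Because STGI and CLGI only touch bit 9 of this field, the guest hypervisor can toggle its GIF without leaving guest mode, which is why the window handling described next has to change.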

Since the execution of STGI by the L1 hypervisor does not cause a return to
the outermost (L0) hypervisor, the enable_irq_window and enable_nmi_window
are modified.

The IRQ window will be opened even if GIF is not set, under the assumption
that on resuming the L1 hypervisor the IRQ will be held pending until the
processor executes the STGI instruction.

For the NMI window, the STGI intercept is set. This will assist in opening
the window only when GIF=1.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

18 Aug, 2017 2 commits

KVM: SVM: delete avic_vm_id_bitmap (2 megabyte static array) · 3f0d4db7

Committed by Denys Vlasenko on Aug 11, 2017

With lightly tweaked defconfig:

    text    data     bss      dec     hex filename
11259661 5109408 2981888 19350957 12745ad vmlinux.before
11259661 5109408  884736 17253805 10745ad vmlinux.after
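
The saving can be checked directly from the two rows above. A small Python sketch (the variable names are chosen for this example) confirms that each row is internally consistent and that the bss section shrinks by exactly 2 MiB:

```python
# Rows copied from the size output above.
before = {"text": 11259661, "data": 5109408, "bss": 2981888, "dec": 19350957}
after = {"text": 11259661, "data": 5109408, "bss": 884736, "dec": 17253805}


def total(row: dict) -> int:
    """The 'dec' column should equal text + data + bss."""
    return row["text"] + row["data"] + row["bss"]


bss_saved = before["bss"] - after["bss"]
print(bss_saved, bss_saved == 2 * 1024 * 1024)
```

Only the bss column changes, which matches the removal of a zero-initialized static array.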

Only compile-tested.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Avoid guest page table walk when gpa_available is set · 618232e2

Committed by Brijesh Singh on Aug 17, 2017

When a guest causes a page fault which requires emulation, the
vcpu->arch.gpa_available flag is set to indicate that cr2 contains a
valid GPA.

Currently, emulator_read_write_onepage() makes use of the gpa_available flag
to avoid a guest page walk for known MMIO regions. Let's not limit
the gpa_available optimization to MMIO regions only. The patch extends
the check to avoid a page walk whenever the gpa_available flag is set.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
[Fix EPT=0 according to Wanpeng Li's fix, plus ensure VMX also uses the
 new code. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Moved "ret < 0" to the else branch, as per David's review. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

08 Aug, 2017 2 commits

KVM: X86: implement the logic for spinlock optimization · de63ad4c

Committed by Longpeng(Mike) on Aug 08, 2017

get_cpl requires vcpu_load, so we must cache the result (whether the
vcpu was preempted when its cpl=0) in kvm_vcpu_arch.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: add spinlock optimization framework · 199b5763

Committed by Longpeng(Mike) on Aug 08, 2017

If a vcpu exits because it is spinning on a user-mode spinlock, then
the spinlock holder may have been preempted in user mode or kernel mode.
(Note that not all architectures trap spin loops in user mode;
only AMD x86 and ARM/ARM64 currently do.)

But if a vcpu exits in kernel mode, then the holder must be
preempted in kernel mode, so we should choose a vcpu in kernel mode
as a more likely candidate for the lock holder.

This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted.  kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.
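
The selection rule described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Vcpu` class, its fields, and `yield_candidates()` are hypothetical names for this example, and the real directed-yield loop considers more conditions than shown here.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    vcpu_id: int
    preempted: bool
    in_kernel: bool  # cached: was the vcpu in kernel mode when preempted?


def yield_candidates(vcpus, spinner, spinner_in_kernel):
    """Return ids of vcpus that are plausible lock holders for the spinner."""
    candidates = []
    for vcpu in vcpus:
        if vcpu is spinner or not vcpu.preempted:
            continue  # the spinner itself and running vcpus are not candidates
        if spinner_in_kernel and not vcpu.in_kernel:
            continue  # a kernel-mode spinner implies a kernel-mode holder
        candidates.append(vcpu.vcpu_id)
    return candidates
```

When the spinner was in user mode, no filtering on `in_kernel` is applied, since the holder may have been preempted in either mode.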
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

07 Aug, 2017 2 commits

KVM: x86: use general helpers for some cpuid manipulation · 1b4d56b8

Committed by Radim Krčmář on Aug 05, 2017

Add guest_cpuid_clear() and use it instead of kvm_find_cpuid_entry().
Also replace some uses of kvm_find_cpuid_entry() with guest_cpuid_has().
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: generalize guest_cpuid_has_ helpers · d6321d49

Committed by Radim Krčmář on Aug 05, 2017

This patch turns guest_cpuid_has_XYZ(cpuid) into guest_cpuid_has(cpuid,
X86_FEATURE_XYZ), which gets rid of many very similar helpers.

Given an X86_FEATURE_*, we can tell which cpuid leaf it belongs to, but
this information isn't in common code, so we recreate it for KVM.

Add some BUILD_BUG_ONs to make sure that it runs nicely.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

02 Aug, 2017 1 commit

KVM: nVMX: fixes to nested virt interrupt injection · b96fb439

Committed by Paolo Bonzini on Jul 27, 2017

There are three issues in nested_vmx_check_exception:

1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported
by Wanpeng Li;

2) it should rebuild the interruption info and exit qualification fields
from scratch, as reported by Jim Mattson, because the values from the
L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes
a page fault, the EPT misconfig's exit qualification is incorrect).

3) CR2 and DR6 should not be written for exception intercept vmexits
(CR2 only for AMD).

This patch fixes the first two and adds a comment about the last,
outlining the fix.
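
For the first issue, the PFEC_MASK/PFEC_MATCH rule can be sketched as below. This follows my understanding of the architectural rule: when the masked error code equals the match value, bit 14 of the exception bitmap is followed, otherwise its meaning is reversed. The function and parameter names are hypothetical.

```python
PF_VECTOR = 14  # page fault exception vector


def l1_intercepts_page_fault(exception_bitmap: int, pfec_mask: int,
                             pfec_match: int, error_code: int) -> bool:
    """Decide whether a #PF in L2 should cause a VM exit to L1."""
    bit14 = bool(exception_bitmap & (1 << PF_VECTOR))
    matches = (error_code & pfec_mask) == pfec_match
    # On a match the bitmap bit is followed; on a mismatch it is inverted.
    return bit14 if matches else not bit14
```

With mask and match both zero, every error code matches, so the decision reduces to bit 14 of the exception bitmap alone.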

Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

14 Jul, 2017 3 commits

KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf · adfe20fb

Committed by Wanpeng Li on Jul 13, 2017

Add a nested_apf field to vcpu->arch.exception to identify an async page
fault, and construct the expected VM-exit information fields. Force a
nested VM exit from nested_vmx_check_exception() if the injected #PF is an
async page fault.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: async_pf: Add L1 guest async_pf #PF vmexit handler · 1261bfa3

Committed by Wanpeng Li on Jul 13, 2017

This patch adds the L1 guest async page fault #PF vmexit handler, so that
such a #PF is handled by L1 similarly to an ordinary async page fault.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Passed insn parameters to kvm_mmu_page_fault().]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Simplify kvm_x86_ops->queue_exception parameter list · cfcd20e5

Committed by Wanpeng Li on Jul 13, 2017

This patch removes all arguments except the first from
kvm_x86_ops->queue_exception, since the implementations can extract the
remaining values from vcpu->arch.exception themselves.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

13 Jul, 2017 4 commits

KVM: SVM: Enable Virtual VMLOAD VMSAVE feature · 89c8a498

Committed by Janakarajan Natarajan on Jul 06, 2017

Enable the Virtual VMLOAD VMSAVE feature. This is done by setting bit 1
at position B8h in the vmcb.

The processor must have nested paging enabled, be in 64-bit mode and
have support for the Virtual VMLOAD VMSAVE feature for the bit to be set
in the vmcb.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: Rename lbr_ctl field in the vmcb control area · 0dc92119

Committed by Janakarajan Natarajan on Jul 06, 2017

Rename the lbr_ctl variable to better reflect the purpose of the field -
provide support for virtualization extensions.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: Prepare for new bit definition in lbr_ctl · 8a77e909

Committed by Janakarajan Natarajan on Jul 06, 2017

The lbr_ctl variable in the vmcb control area is used to enable or
disable Last Branch Record (LBR) virtualization. However, this is to be
done using only bit 0 of the variable. To correct this and to prepare
for a new feature, change the current usage to work only on a particular
bit.
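
The difference between writing the whole variable and operating on a single bit can be sketched as follows. The mask name and the "other feature" bit are assumptions made for this example; the field is modeled as a plain Python integer.

```python
LBR_CTL_ENABLE_MASK = 1 << 0  # assumed name: bit 0 controls LBR virtualization
OTHER_FEATURE_BIT = 1 << 1    # hypothetical neighbouring feature bit


def enable_lbrv_whole_variable(field: int) -> int:
    """Old style: overwrite the variable, clobbering every other bit."""
    return 1


def enable_lbrv(field: int) -> int:
    """New style: set only bit 0."""
    return field | LBR_CTL_ENABLE_MASK


def disable_lbrv(field: int) -> int:
    """New style: clear only bit 0."""
    return field & ~LBR_CTL_ENABLE_MASK
```

With the bit-wise form, a second feature bit in the same field survives enabling or disabling LBR virtualization, which is what makes room for a new bit definition.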
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: handle singlestep exception when skipping emulated instructions · b742c1e6

Committed by Ladi Prosek on Jun 22, 2017

kvm_skip_emulated_instruction handles the singlestep debug exception
which is something we almost always want. This commit (specifically
the change in rdmsr_interception) makes the debug.flat KVM unit test
pass on AMD.

Two call sites still call skip_emulated_instruction directly:

* In svm_queue_exception where it's used only for moving the rip forward

* In task_switch_interception which is analogous to handle_task_switch
  in VMX
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

30 Jun, 2017 1 commit

objtool, x86: Add several functions and files to the objtool whitelist · c207aee4

Committed by Josh Poimboeuf on Jun 28, 2017

In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.

These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage.  Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

27 Jun, 2017 5 commits

KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit · 1a5e1852

Committed by Ladi Prosek on Jun 21, 2017

enable_nmi_window is supposed to be a no-op if we know that we'll see
a VM exit by the time the NMI window opens. This commit adds two more
cases:

* We intercept stgi so we don't need to singlestep on GIF=0.

* We emulate nested vmexit so we don't need to singlestep when nested
  VM exit is required.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: don't NMI singlestep over event injection · a12713c2

Committed by Ladi Prosek on Jun 21, 2017

Singlestepping is enabled by setting the TF flag and care must be
taken to not let the guest see (and reuse at an inconvenient time)
the modified rflag value. One such case is event injection, as part
of which flags are pushed on the stack and restored later on iret.

This commit disables singlestepping when we're about to inject an
event and forces an immediate exit for us to re-evaluate the NMI
related state.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: hide TF/RF flags used by NMI singlestep · 9b611747

Committed by Ladi Prosek on Jun 21, 2017

These flags are used internally by SVM so it's cleaner to not leak
them to callers of svm_get_rflags. This is similar to how the TF
flag is handled on KVM_GUESTDBG_SINGLESTEP by kvm_get_rflags and
kvm_set_rflags.

Without this change, the flags may propagate from host VMCB to nested
VMCB or vice versa while singlestepping over a nested VM enter/exit,
and then get stuck in inappropriate places.

Example: NMI singlestepping is enabled while running L1 guest. The
instruction to step over is VMRUN and nested vmrun emulation stashes
rflags to hsave->save.rflags. Then if singlestepping is disabled
while still in L2, TF/RF will be cleared from the nested VMCB but the
next nested VM exit will restore them from hsave->save.rflags and
cause an unexpected DB exception.
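
A minimal sketch of the hiding logic, assuming the flags the guest had when singlestepping was enabled are remembered separately. TF and RF are bits 8 and 16 of RFLAGS; the function and parameter names are hypothetical, and this models the idea rather than reproducing the actual svm_get_rflags code.

```python
X86_EFLAGS_TF = 1 << 8   # trap flag
X86_EFLAGS_RF = 1 << 16  # resume flag


def visible_rflags(hw_rflags: int, nmi_singlestep: bool,
                   guest_rflags_at_enable: int) -> int:
    """Return rflags as callers should see them, hiding internally set TF/RF."""
    if not nmi_singlestep:
        return hw_rflags
    result = hw_rflags
    for flag in (X86_EFLAGS_TF, X86_EFLAGS_RF):
        # Hide a flag only if the guest itself did not have it set.
        if not (guest_rflags_at_enable & flag):
            result &= ~flag
    return result
```

Because callers such as nested VMRUN emulation only ever see the filtered value, the internally set flags cannot be stashed in a saved state and resurface later.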
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nSVM: do not forward NMI window singlestep VM exits to L1 · ab2f4d73

Committed by Ladi Prosek on Jun 21, 2017

Nested hypervisor should not see singlestep VM exits if singlestepping
was enabled internally by KVM. Windows is particularly sensitive to this
and known to bluescreen on unexpected VM exits.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: introduce disable_nmi_singlestep helper · 4aebd0e9

Committed by Ladi Prosek on Jun 21, 2017

Just moving the code to a new helper in preparation for following
commits.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

01 Jun, 2017 2 commits

KVM: Tidy the whitespace in nested_svm_check_permissions() · e9196ceb

Committed by Dan Carpenter on May 18, 2017

I moved the || to the line before.  Also I replaced some spaces with a
tab on the "return 0;" line.  It looks OK in the diff but originally
that line was only indented 7 spaces.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: do not zero out segment attributes if segment is unusable or not present · d9c1b543

Committed by Roman Pen on Jun 01, 2017

This is a fix for the problem [1], where VMCB.CPL was set to 0 and an interrupt
was taken on the userspace stack.  The root cause lies in a specific AMD CPU
behaviour which manifests itself as unusable segment attributes on SYSRET.
The corresponding workaround for the kernel is the following:

61f01dd9 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue")

The virtualization side, in turn, treated the unusable segment incorrectly and
restored CPL from the SS attributes, which had been zeroed out a few lines above.

The current patch only ensures that the P bit is cleared in the VMCB.save state;
segment attributes are no longer zeroed out if the segment is not present or is
unusable, so CPL can be safely restored from the DPL field.

This is only one part of the fix, since the QEMU side should be fixed accordingly
so that it does not zero out attributes on its side.  A corresponding patch will follow.

[1] Message id: CAJrWOzD6Xq==b-zYCDdFLgSRMPM-NkNuTSDFEtX=7MreT45i7Q@mail.gmail.com
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Mikhail Sennikovskii <mikhail.sennikovskii@profitbricks.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

30 May, 2017 1 commit

KVM: SVM: ignore type when setting segment registers · 8eae9570

Committed by Gioh Kim on May 30, 2017

Commit 19bca6ab ("KVM: SVM: Fix cross vendor migration issue with
unusable bit") added a check of the type when setting unusable.
So unusable can be set if present is 0 OR type is 0.
According to the AMD processor manual, long mode ignores the type value
in the segment descriptor. And type can be 0 for a read-only data segment.
Therefore the type value is not related to the unusable flag.
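
The change in the predicate can be modeled in a few lines of Python. Both function names are hypothetical; the sketch only contrasts the condition described for the earlier commit with the one this patch argues for.

```python
def unusable_before(present: bool, seg_type: int) -> bool:
    """Earlier condition: unusable if not present OR type is 0."""
    return (not present) or seg_type == 0


def unusable_after(present: bool, seg_type: int) -> bool:
    """Condition argued for here: the type value is ignored."""
    return not present
```

A present read-only data segment with type 0 is the case where the two differ: the earlier condition marks it unusable, the new one does not.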

This patch is based on linux-next v4.12.0-rc3.
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

18 May, 2017 1 commit

KVM: Silence underflow warning in avic_get_physical_id_entry() · d3e7dec0

Committed by Dan Carpenter on May 18, 2017

Smatch complains that we cap the upper bound of "index" but don't
check for negatives.  It's a false positive because "index" is never
negative.  But it's also simple enough to make it unsigned, which makes
the code easier to audit.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

21 Apr, 2017 1 commit

kvm: better MWAIT emulation for guests · 668fffa3

Committed by Michael S. Tsirkin on Apr 21, 2017

Guests that are heavy on futexes end up IPI'ing each other a lot. That
can lead to significant slowdowns and latency increase for those guests
when running within KVM.

If only a single guest is needed on a host, we have a lot of spare host
CPU time we can throw at the problem. Modern CPUs implement a feature
called "MWAIT" which allows guests to wake up sleeping remote CPUs without
an IPI - thus without an exit - at the expense of never going out of guest
context.

The decision whether this is something sensible to use should be up to the
VM admin, so to user space. We can however allow MWAIT execution on systems
that support it properly hardware wise.

This patch adds a CAP to user space and a KVM cpuid leaf to indicate
availability of native MWAIT execution. With that enabled, the worst a
guest can do is waste as many cycles as a "jmp ." would do, so it's not
a privilege problem.

We consciously do *not* expose the feature in our CPUID bitmap, as most
people will want to benefit from sleeping vCPUs to allow for over commit.
Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[agraf: fix amd, change commit message]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

07 Apr, 2017 1 commit

kvm/svm: Setup MCG_CAP on AMD properly · 74f16909

Committed by Borislav Petkov on Mar 26, 2017

MCG_CAP[63:9] bits are reserved on AMD. However, on an AMD guest, this
MSR returns 0x100010a. More specifically, bit 24 is set, which is simply
wrong. That bit is MCG_SER_P and is present only on Intel. Thus, clean
up the reserved bits in order not to confuse guests.
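
The arithmetic behind this can be checked with a short Python sketch. The reported value and the bit positions come from the message above; the constant and function names are chosen for this example.

```python
MCG_CAP_REPORTED = 0x100010A       # value the guest saw, per the message above
MCG_SER_P = 1 << 24                # bit 24, present only on Intel
AMD_VALID_BITS = (1 << 9) - 1      # bits 8:0; bits 63:9 are reserved on AMD


def clear_reserved_mcg_cap(value: int) -> int:
    """Keep only the bits that are defined on AMD."""
    return value & AMD_VALID_BITS


cleaned = clear_reserved_mcg_cap(MCG_CAP_REPORTED)
print(hex(cleaned))
```

After masking, bit 24 is gone and the low nine bits of the reported value are left unchanged.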
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Mar, 2017 1 commit

kvm: fix usage of uninit spinlock in avic_vm_destroy() · 3863dff0

Committed by Dmitry Vyukov on Jan 24, 2017

If avic is not enabled, avic_vm_init() does nothing and returns early.
However, avic_vm_destroy() still tries to destroy what hasn't been created.
The only bad consequence of this now is that avic_vm_destroy() uses
svm_vm_data_hash_lock that hasn't been initialized (and is not meant
to be used at all if avic is not enabled).

Return early from avic_vm_destroy() if avic is not enabled.
It has nothing to destroy.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: kvm@vger.kernel.org
Cc: syzkaller@googlegroups.com
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

16 Mar, 2017 1 commit

x86: Make the GDT remapping read-only on 64-bit · 45fc8757

Committed by Thomas Garnier on Mar 14, 2017

This patch makes the GDT remapped pages read-only, to prevent accidental
(or intentional) corruption of this key data structure.

This change is done only on 64-bit, because 32-bit needs it to be writable
for TSS switches.

The native_load_tr_desc function was adapted to correctly handle a
read-only GDT. The LTR instruction always writes to the GDT TSS entry.
This generates a page fault if the GDT is read-only. This change checks
if the current GDT is a remap and swaps GDTs as needed. This function was
tested by booting multiple machines and checking that hibernation works
properly.

KVM SVM and VMX were adapted to use the writeable GDT. On VMX, the
per-cpu variable was removed for functions to fetch the original GDT.
Instead of reloading the previous GDT, VMX will reload the fixmap GDT as
expected. For testing, VMs were started and restored on multiple
configurations.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Luis R . Rodriguez <mcgrof@kernel.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: kernel-hardening@lists.openwall.com
Cc: kvm@vger.kernel.org
Cc: lguest@lists.ozlabs.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-pm@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: zijun_hu <zijun_hu@htc.com>
Link: http://lkml.kernel.org/r/20170314170508.100882-3-thgarnie@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

17 Feb, 2017 1 commit

KVM: x86: remove code for lazy FPU handling · bd7e5b08

Committed by Paolo Bonzini on Feb 03, 2017

The FPU is always active now when running KVM.
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

15 Feb, 2017 2 commits

KVM: svm: inititalize hash table structures directly · 681bcea8

Committed by David Hildenbrand on Jan 24, 2017

The hashtable and the spinlock guarding it are global data structures,
so we can initialize them statically.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170124212116.4568-1-david@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: do not scan IRR twice on APICv vmentry · 76dfafd5

Committed by Paolo Bonzini on Dec 19, 2016

Calls to apic_find_highest_irr are scanning IRR twice, once
in vmx_sync_pir_from_irr and once in apic_search_irr. Change
sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr;
now that it does the computation, it can also do the RVI write.

In order to avoid complications in svm.c, make the callback optional.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

09 Jan, 2017 1 commit

kvm: svm: Use the hardware provided GPA instead of page walk · 0f89b207

Committed by Tom Lendacky on Dec 14, 2016

When a guest causes a NPF which requires emulation, KVM sometimes walks
the guest page tables to translate the GVA to a GPA. This is unnecessary
most of the time on AMD hardware since the hardware provides the GPA in
EXITINFO2.

The only exception cases involve string operations involving rep or
operations that use two memory locations. With rep, the GPA will only be
the value of the initial NPF and with dual memory locations we won't know
which memory address was translated into EXITINFO2.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

08 Dec, 2016 2 commits

KVM: x86: Add kvm_skip_emulated_instruction and use it. · 6affcbed

Committed by Kyle Huey on Nov 29, 2016

kvm_skip_emulated_instruction calls both
kvm_x86_ops->skip_emulated_instruction and kvm_vcpu_check_singlestep,
skipping the emulated instruction and generating a trap if necessary.

Replacing skip_emulated_instruction calls with
kvm_skip_emulated_instruction is straightforward, except for:

- ICEBP, which is already inside a trap, so avoid triggering another trap.
- Instructions that can trigger exits to userspace, such as the IO insns,
  MOVs to CR8, and HALT. If kvm_skip_emulated_instruction does trigger a
  KVM_GUESTDBG_SINGLESTEP exit, and the handling code for
  IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will
  take precedence. The singlestep will be triggered again on the next
  instruction, which is the current behavior.
- Task switch instructions which would require additional handling (e.g.
  the task switch bit) and are instead left alone.
- Cases where VMLAUNCH/VMRESUME do not proceed to the next instruction,
  which do not trigger singlestep traps as mentioned previously.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Add a return value to kvm_emulate_cpuid · 6a908b62

Committed by Kyle Huey on Nov 29, 2016

Once skipping the emulated instruction can potentially trigger an exit to
userspace (via KVM_GUESTDBG_SINGLESTEP) kvm_emulate_cpuid will need to
propagate a return value.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

25 Nov, 2016 2 commits

kvm: svm: Add kvm_fast_pio_in support · 8370c3d0

Committed by Tom Lendacky on Nov 23, 2016

Update the I/O interception support to add the kvm_fast_pio_in function
to speed up the in instruction similar to the out instruction.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

kvm: svm: Add support for additional SVM NPF error codes · 14727754

Committed by Tom Lendacky on Nov 23, 2016

AMD hardware adds two additional bits to aid in nested page fault handling.

Bit 32 - NPF occurred while translating the guest's final physical address
Bit 33 - NPF occurred while translating the guest page tables

The guest page tables fault indicator can be used as an aid for nested
virtualization. Using V0 for the host, V1 for the first level guest and
V2 for the second level guest, when both V1 and V2 are using nested paging
there are currently a number of unnecessary instruction emulations. When
V2 is launched shadow paging is used in V1 for the nested tables of V2. As
a result, KVM marks these pages as RO in the host nested page tables. When
V2 exits and we resume V1, these pages are still marked RO.

Every nested walk for a guest page table is treated as a user-level write
access and this causes a lot of NPFs because the V1 page tables are marked
RO in the V0 nested tables. While executing V1, when these NPFs occur KVM
sees a write to a read-only page, emulates the V1 instruction and unprotects
the page (marking it RW). This patch looks for cases where we get a NPF due
to a guest page table walk where the page was marked RO. It immediately
unprotects the page and resumes the guest, leading to far fewer instruction
emulations when nested virtualization is used.
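
The fast path described above can be sketched in Python as a check on the error code. Bits 32 and 33 are taken from the message; bits 0 and 1 are the usual present and write bits of an x86 page-fault error code. The constant names, the function, and the returned labels are assumptions for this illustration.

```python
PFERR_PRESENT = 1 << 0       # fault on a present page
PFERR_WRITE = 1 << 1         # fault caused by a write
PFERR_GUEST_FINAL = 1 << 32  # NPF while translating the final physical address
PFERR_GUEST_PAGE = 1 << 33   # NPF while translating the guest page tables

# Write to a present (read-only) page during a guest page table walk.
PFERR_NESTED_GUEST_PAGE = PFERR_GUEST_PAGE | PFERR_WRITE | PFERR_PRESENT


def classify_npf(error_code: int) -> str:
    """Pick the handling path for a nested page fault."""
    if (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE:
        return "unprotect-and-resume"  # skip instruction emulation
    return "normal-path"
```

Only faults that carry all three bits take the shortcut; a fault on the final address, or a read during the table walk, still goes through the normal handling.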
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

03 Nov, 2016 1 commit

KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK · ea26e4ec

Committed by Paolo Bonzini on Nov 01, 2016

Since commit a545ab6a ("kvm: x86: add tsc_offset field to struct
kvm_vcpu_arch", 2016-09-07) the offset between host and L1 TSC is
cached and need not be fished out of the VMCS or VMCB.  This means
that we can implement adjust_tsc_offset_guest and read_l1_tsc
entirely in generic code.  The simplification is particularly
significant for VMX code, where vmx->nested.vmcs01_tsc_offset
was duplicating what is now in vcpu->arch.tsc_offset.  Therefore
the vmcs01_tsc_offset can be dropped completely.

More importantly, this fixes KVM_GET_CLOCK/KVM_SET_CLOCK
which, after commit 108b249c ("KVM: x86: introduce get_kvmclock_ns",
2016-09-01) called read_l1_tsc while the VMCS was not loaded.
It thus returned bogus values on Intel CPUs.

Fixes: 108b249c
Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Sep, 2016 1 commit

kvm: svm: fix unsigned compare less than zero comparison · adad0d02

Committed by Colin Ian King on Sep 19, 2016

vm_data->avic_vm_id is a u32, so the check for an error
return (less than zero) such as -EAGAIN from
avic_get_next_vm_id currently has no effect whatsoever.
Fix this by using a temporary int for the comparison
and then assigning it to vm_data->avic_vm_id. I used an explicit
u32 cast in the assignment to show why vm_data->avic_vm_id
cannot be used in the assign/compare steps.
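
Python integers are unbounded, so the sketch below models the u32 store with an explicit mask to show why the original check could never fire. The helper names are hypothetical and the error value 11 for EAGAIN is an assumption for this example.

```python
EAGAIN = 11  # assumed errno value for this illustration


def as_u32(value: int) -> int:
    """Model storing a value into a 32-bit unsigned variable."""
    return value & 0xFFFFFFFF


def buggy_is_error(ret: int) -> bool:
    """Store into a u32 first, then compare: the check is always false."""
    vm_id = as_u32(ret)
    return vm_id < 0


def fixed_is_error(ret: int) -> bool:
    """Compare the signed temporary before converting to u32."""
    return ret < 0
```

Storing -EAGAIN into the unsigned variable turns it into a large positive number, which is why the comparison has to happen on the signed temporary.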
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

16 Sep, 2016 1 commit

kvm: x86: drop read_tsc_offset() · 3e3f5026

Committed by Luiz Capitulino on Sep 07, 2016

The TSC offset can now be read directly from struct kvm_vcpu_arch.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

openeuler / Kernel · synced successfully about 1 year ago