提交 · 78051e3b7e35722ad3f31dd611f1b34770bddab8 · openeuler / raspberrypi-kernel

11 12月, 2014 1 次提交

KVM: nVMX: Disable unrestricted mode if ept=0 · 78051e3b

由 Bandan Das 提交于 12月 06, 2014

If L0 has disabled EPT, don't advertise unrestricted
mode at all since it depends on EPT to run real mode code.

Fixes: 92fbc7b1
Cc: stable@vger.kernel.org
Reviewed-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

78051e3b

05 12月, 2014 4 次提交

kvm: vmx: add nested virtualization support for xsaves · 81dc01f7

由 Wanpeng Li 提交于 12月 04, 2014

Add nested virtualization support for xsaves.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

81dc01f7

kvm: vmx: add MSR logic for XSAVES · 20300099

由 Wanpeng Li 提交于 12月 02, 2014

Add logic to get/set the XSS model-specific register.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

20300099

kvm: x86: handle XSAVES vmcs and vmexit · f53cd63c

由 Wanpeng Li 提交于 12月 02, 2014

Initialize the XSS exit bitmap.  It is zero so there should be no XSAVES
or XRSTORS exits.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f53cd63c

kvm: x86: Add kvm_x86_ops hook that enables XSAVES for guest · 55412b2e

由 Wanpeng Li 提交于 12月 02, 2014

Expose the XSAVES feature to the guest if the kvm_x86_ops say it is
available.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

55412b2e

18 11月, 2014 1 次提交

kvm: x86: vmx: cleanup handle_ept_violation · 81ed33e4

由 Tiejun Chen 提交于 11月 18, 2014

Instead, just use PFERR_{FETCH, PRESENT, WRITE}_MASK
inside handle_ept_violation() for slightly better code.
Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

81ed33e4

12 11月, 2014 2 次提交

x86, kvm, vmx: Don't set LOAD_IA32_EFER when host and guest match · 54b98bff

由 Andy Lutomirski 提交于 11月 10, 2014

There's nothing to switch if the host and guest values are the same.
I am unable to find evidence that this makes any difference
whatsoever.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
[I could see a difference on Nehalem.  From 5 runs:

 userspace exit, guest!=host   12200 11772 12130 12164 12327
 userspace exit, guest=host    11983 11780 11920 11919 12040
 lightweight exit, guest!=host  3214  3220  3238  3218  3337
 lightweight exit, guest=host   3178  3193  3193  3187  3220

 This passes the t-test with 99% confidence for userspace exit,
 98.5% confidence for lightweight exit. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

54b98bff

x86, kvm, vmx: Always use LOAD_IA32_EFER if available · f6577a5f

由 Andy Lutomirski 提交于 11月 07, 2014

At least on Sandy Bridge, letting the CPU switch IA32_EFER is much
faster than switching it manually.

I benchmarked this using the vmexit kvm-unit-test (single run, but
GOAL multiplied by 5 to do more iterations):

Test Before After Change
cpuid 2000 1932 -3.40%
vmcall 1914 1817 -5.07%
mov_from_cr8 13 13 0.00%
mov_to_cr8 19 19 0.00%
inl_from_pmtimer 19164 10619 -44.59%
inl_from_qemu 15662 10302 -34.22%
inl_from_kernel 3916 3802 -2.91%
outl_to_kernel 2230 2194 -1.61%
mov_dr 172 176 2.33%
ipi (skipped) (skipped)
ipi+halt (skipped) (skipped)
ple-round-robin 13 13 0.00%
wr_tsc_adjust_msr 1920 1845 -3.91%
rd_tsc_adjust_msr 1892 1814 -4.12%
mmio-no-eventfd:pci-mem 16394 11165 -31.90%
mmio-wildcard-eventfd:pci-mem 4607 4645 0.82%
mmio-datamatch-eventfd:pci-mem 4601 4610 0.20%
portio-no-eventfd:pci-io 11507 7942 -30.98%
portio-wildcard-eventfd:pci-io 2239 2225 -0.63%
portio-datamatch-eventfd:pci-io 2250 2234 -0.71%

I haven't explicitly computed the significance of these numbers,
but this isn't subtle.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
[The results were reproducible on all of Nehalem, Sandy Bridge and
Ivy Bridge. The slowness of manual switching is because writing
to EFER with WRMSR triggers a TLB flush, even if the only bit you're
touching is SCE (so the page table format is not affected). Doing
the write as part of vmentry/vmexit, instead, does not flush the TLB,
probably because all processors that have EPT also have VPID. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f6577a5f

07 11月, 2014 6 次提交

KVM: x86: Breakpoints do not consider CS.base · 82b32774

由 Nadav Amit 提交于 11月 02, 2014

x86 debug registers hold a linear address. Therefore, breakpoints detection
should consider CS.base, and check whether instruction linear address equals
(CS.base + RIP). This patch introduces a function to evaluate RIP linear
address and uses it for breakpoints detection.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

82b32774

KVM: x86: Clear DR6[0:3] on #DB during handle_dr · 7305eb5d

由 Nadav Amit 提交于 11月 02, 2014

DR6[0:3] (previous breakpoint indications) are cleared when #DB is injected
during handle_exception, just as real hardware does.  Similarily, handle_dr
should clear DR6[0:3].
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7305eb5d

KVM: x86: reset RVI upon system reset · 4114c27d

由 Wei Wang 提交于 11月 05, 2014

A bug was reported as follows: when running Windows 7 32-bit guests on qemu-kvm,
sometimes the guests run into blue screen during reboot. The problem was that a
guest's RVI was not cleared when it rebooted. This patch has fixed the problem.
Signed-off-by: NWei Wang <wei.w.wang@intel.com>
Signed-off-by: NYang Zhang <yang.z.zhang@intel.com>
Tested-by: NRongrong Liu <rongrongx.liu@intel.com>, Da Chun <ngugc@qq.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4114c27d

kvm: x86: vmx: avoid returning bool to distinguish success from error · a2ae9df7

由 Paolo Bonzini 提交于 11月 04, 2014

Return a negative error code instead, and WARN() when we should be covering
the entire 2-bit space of vmcs_field_type's return value. For increased
robustness, add a BUILD_BUG_ON checking the range of vmcs_field_to_offset.
Suggested-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a2ae9df7

kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup() · 34a1cd60

由 Tiejun Chen 提交于 10月 28, 2014

Instead of vmx_init(), actually it would make reasonable sense to do
anything specific to vmx hardware setting in vmx_x86_ops->hardware_setup().
Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

34a1cd60

kvm: x86: vmx: move down hardware_setup() and hardware_unsetup() · f2c7648d

由 Tiejun Chen 提交于 10月 28, 2014

Just move this pair of functions down to make sure later we can
add something dependent on others.
Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f2c7648d

03 11月, 2014 4 次提交

KVM: vmx: Unavailable DR4/5 is checked before CPL · 16f8a6f9

由 Nadav Amit 提交于 10月 03, 2014

If DR4/5 is accessed when it is unavailable (since CR4.DE is set), then #UD
should be generated even if CPL>0. This is according to Intel SDM Table 6-2:
"Priority Among Simultaneous Exceptions and Interrupts".

Note, that this may happen on the first DR access, even if the host does not
sets debug breakpoints. Obviously, it occurs when the host debugs the guest.

This patch moves the DR4/5 checks from __kvm_set_dr/_kvm_get_dr to handle_dr.
The emulator already checks DR4/5 availability in check_dr_read. Nested
virutalization related calls to kvm_set_dr/kvm_get_dr would not like to inject
exceptions to the guest.

As for SVM, the patch follows the previous logic as much as possible. Anyhow,
it appears the DR interception code might be buggy - even if the DR access
may cause an exception, the instruction is skipped.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

16f8a6f9

KVM: x86: Clear DR7.LE during task-switch · 0e8a0996

由 Nadav Amit 提交于 10月 03, 2014

DR7.LE should be cleared during task-switch. This feature is poorly documented.
For reference, see:
http://pdos.csail.mit.edu/6.828/2005/readings/i386/s12_02.htm

SDM [17.2.4]:
  This feature is not supported in the P6 family processors, later IA-32
  processors, and Intel 64 processors.

AMD [2:13.1.1.4]:
  This bit is ignored by implementations of the AMD64 architecture.

Intel's formulation could mean that it isn't even zeroed, but current
hardware indeed does not behave like that.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0e8a0996

KVM: x86: DR7.GD should be cleared upon any #DB exception · 6bdf0662

由 Nadav Amit 提交于 9月 30, 2014

Intel SDM 17.2.4 (Debug Control Register (DR7)) says: "The processor clears the
GD flag upon entering to the debug exception handler." This sentence may be
misunderstood as if it happens only on #DB due to debug-register protection,
but it happens regardless to the cause of the #DB.

Fix the behavior to match both real hardware and Bochs.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6bdf0662

x86,kvm,vmx: Don't trap writes to CR4.TSD · 52ce3c21

由 Andy Lutomirski 提交于 10月 07, 2014

CR4.TSD is guest-owned; don't trap writes to it in VMX guests.  This
avoids a VM exit on context switches into or out of a PR_TSC_SIGSEGV
task.

I think that this fixes an unintentional side-effect of:
    4c38609a KVM: VMX: Make guest cr4 mask more conservative
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

52ce3c21

02 11月, 2014 2 次提交

KVM: vmx: defer load of APIC access page address during reset · a73896cb

由 Paolo Bonzini 提交于 11月 02, 2014

Most call paths to vmx_vcpu_reset do not hold the SRCU lock.  Defer loading
the APIC access page to the next vmentry.

This avoids the following lockdep splat:

[ INFO: suspicious RCU usage. ]
3.18.0-rc2-test2+ #70 Not tainted
-------------------------------
include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-x86/2371:
 #0:  (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]

stack backtrace:
CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
 0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
 ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
 ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
Call Trace:
 [<ffffffff816f514f>] dump_stack+0x4e/0x71
 [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
 [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
 [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
 [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
 [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
 [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
 [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
 [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
 [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
 [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
 [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
 [<ffffffff8122ee45>] ? __fget+0x5/0x250
 [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
 [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
 [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b
Reported-by: NTakashi Iwai <tiwai@suse.de>
Reported-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 38b99173Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a73896cb

KVM: nVMX: Disable preemption while reading from shadow VMCS · 282da870

由 Jan Kiszka 提交于 10月 08, 2014

In order to access the shadow VMCS, we need to load it. At this point,
vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If
we now get preempted by Linux, vmx_vcpu_put and, on return, the
vmx_vcpu_load will work against the wrong vmcs. That can cause
copy_shadow_to_vmcs12 to corrupt the vmcs12 state.

Fix the issue by disabling preemption during the copy operation.
copy_vmcs12_to_shadow is safe from this issue as it is executed by
vmx_vcpu_run when preemption is already disabled before vmentry.

This bug is exposed by running Jailhouse within KVM on CPUs with
shadow VMCS support. Jailhouse never expects an interrupt pending
vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12
is preempted, the active VMCS happens to have the virtual interrupt
pending flag set in the CPU-based execution controls.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

282da870

29 10月, 2014 1 次提交

KVM: nVMX: Disable preemption while reading from shadow VMCS · 41e7ed64

由 Jan Kiszka 提交于 10月 08, 2014

Fix the issue by disabling preemption during the copy operation.
copy_vmcs12_to_shadow is safe from this issue as it is executed by
vmx_vcpu_run when preemption is already disabled before vmentry.

41e7ed64

24 10月, 2014 4 次提交

kvm: x86: don't kill guest on unknown exit reason · 2bc19dc3

由 Michael S. Tsirkin 提交于 9月 18, 2014

KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was
triggered by a priveledged application.  Let's not kill the guest: WARN
and inject #UD instead.

Cc: stable@vger.kernel.org
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2bc19dc3

kvm: vmx: handle invvpid vm exit gracefully · a642fc30

由 Petr Matousek 提交于 9月 23, 2014

On systems with invvpid instruction support (corresponding bit in
IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid
causes vm exit, which is currently not handled and results in
propagation of unknown exit to userspace.

Fix this by installing an invvpid vm exit handler.

This is CVE-2014-3646.

Cc: stable@vger.kernel.org
Signed-off-by: NPetr Matousek <pmatouse@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a642fc30

KVM: x86: Prevent host from panicking on shared MSR writes. · 8b3c3104

由 Andy Honig 提交于 8月 27, 2014

The previous patch blocked invalid writes directly when the MSR
is written.  As a precaution, prevent future similar mistakes by
gracefulling handle GPs caused by writes to shared MSRs.

Cc: stable@vger.kernel.org
Signed-off-by: NAndrew Honig <ahonig@google.com>
[Remove parts obsoleted by Nadav's patch. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

8b3c3104

KVM: x86: Check non-canonical addresses upon WRMSR · 854e8bb1

由 Nadav Amit 提交于 9月 16, 2014

Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is
written to certain MSRs. The behavior is "almost" identical for AMD and Intel
(ignoring MSRs that are not implemented in either architecture since they would
anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if
non-canonical address is written on Intel but not on AMD (which ignores the top
32-bits).

Accordingly, this patch injects a #GP on the MSRs which behave identically on
Intel and AMD. To eliminate the differences between the architecutres, the
value which is written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is turned to
canonical value before writing instead of injecting a #GP.

Some references from Intel and AMD manuals:

According to Intel SDM description of WRMSR instruction #GP is expected on
WRMSR "If the source register contains a non-canonical address and ECX
specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE,
IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP."

According to AMD manual instruction manual:
LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the
LSTAR and CSTAR registers. If an RIP written by WRMSR is not in canonical
form, a general-protection exception (#GP) occurs."
IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the
base field must be in canonical form or a #GP fault will occur."
IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must
be in canonical form."

This patch fixes CVE-2014-3610.

Cc: stable@vger.kernel.org
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

854e8bb1

19 10月, 2014 1 次提交

x86,kvm,vmx: Preserve CR4 across VM entry · d974baa3

由 Andy Lutomirski 提交于 10月 08, 2014

CR4 isn't constant; at least the TSD and PCE bits can vary.

TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.

This adds a branch and a read from cr4 to each vm entry.  Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact.  A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Acked-by: NPaolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d974baa3

24 9月, 2014 7 次提交

kvm: x86: Unpin and remove kvm_arch->apic_access_page · c24ae0dc

由 Tang Chen 提交于 9月 24, 2014

In order to make the APIC access page migratable, stop pinning it in
memory.

And because the APIC access page is not pinned in memory, we can
remove kvm_arch->apic_access_page.  When we need to write its
physical address into vmcs, we use gfn_to_page() to get its page
struct, which is needed to call page_to_phys(); the page is then
immediately unpinned.
Suggested-by: NGleb Natapov <gleb@kernel.org>
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c24ae0dc

kvm: vmx: Implement set_apic_access_page_addr · 38b99173

由 Tang Chen 提交于 9月 24, 2014

Currently, the APIC access page is pinned by KVM for the entire life
of the guest. We want to make it migratable in order to make memory
hot-unplug available for machines that run KVM.

This patch prepares to handle this for the case where there is no nested
virtualization, or where the nested guest does not have an APIC page of
its own. All accesses to kvm->arch.apic_access_page are changed to go
through kvm_vcpu_reload_apic_access_page.

If the APIC access page is invalidated when the host is running, we update
the VMCS in the next guest entry.

If it is invalidated when the guest is running, the MMU notifier will force
an exit, after which we will handle everything as in the previous case.

If it is invalidated when a nested guest is running, the request will update
either the VMCS01 or the VMCS02. Updating the VMCS01 is done at the
next L2->L1 exit, while updating the VMCS02 is done in prepare_vmcs02.
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

38b99173

kvm: x86: fix two typos in comment · b4619660

由 Tiejun Chen 提交于 9月 22, 2014

s/drity/dirty and s/vmsc01/vmcs01
Signed-off-by: NTiejun Chen <tiejun.chen@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b4619660

KVM: vmx: Inject #GP on invalid PAT CR · 4566654b

由 Nadav Amit 提交于 9月 18, 2014

Guest which sets the PAT CR to invalid value should get a #GP.  Currently, if
vmx supports loading PAT CR during entry, then the value is not checked.  This
patch makes the required check in that case.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

4566654b

KVM: x86: directly use kvm_make_request again · 77c3913b

由 Liang Chen 提交于 9月 18, 2014

A one-line wrapper around kvm_make_request is not particularly
useful. Replace kvm_mmu_flush_tlb() with kvm_make_request().
Signed-off-by: NLiang Chen <liangchen.linux@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

77c3913b

KVM: nested VMX: disable perf cpuid reporting · bc613494

由 Marcelo Tosatti 提交于 9月 18, 2014

Initilization of L2 guest with -cpu host, on L1 guest with -cpu host
triggers:

(qemu) KVM: entry failed, hardware error 0x7
...
nested_vmx_run: VMCS MSR_{LOAD,STORE} unsupported

Nested VMX MSR load/store support is not sufficient to
allow perf for L2 guest.

Until properly fixed, trap CPUID and disable function 0xA.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bc613494

kvm: Make init_rmode_tss() return 0 on success. · 1f755a82

由 Paolo Bonzini 提交于 9月 16, 2014

In init_rmode_tss(), there two variables indicating the return
value, r and ret, and it return 0 on error, 1 on success. The function
is only called by vmx_set_tss_addr(), and ret is redundant.

This patch removes the redundant variable, by making init_rmode_tss()
return 0 on success, -errno on failure.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1f755a82

17 9月, 2014 2 次提交

kvm: Make init_rmode_identity_map() return 0 on success. · f51770ed

由 Tang Chen 提交于 9月 16, 2014

In init_rmode_identity_map(), there two variables indicating the return
value, r and ret, and it return 0 on error, 1 on success. The function
is only called by vmx_create_vcpu(), and ret is redundant.

This patch removes the redundant variable, and makes init_rmode_identity_map()
return 0 on success, -errno on failure.
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f51770ed

kvm: Remove ept_identity_pagetable from struct kvm_arch. · a255d479

由 Tang Chen 提交于 9月 16, 2014

kvm_arch->ept_identity_pagetable holds the ept identity pagetable page. But
it is never used to refer to the page at all.

In vcpu initialization, it indicates two things:
1. indicates if ept page is allocated
2. indicates if a memory slot for identity page is initialized

Actually, kvm_arch->ept_identity_pagetable_done is enough to tell if the ept
identity pagetable is initialized. So we can remove ept_identity_pagetable.

NOTE: In the original code, ept identity pagetable page is pinned in memroy.
As a result, it cannot be migrated/hot-removed. After this patch, since
kvm_arch->ept_identity_pagetable is removed, ept identity pagetable page
is no longer pinned in memory. And it can be migrated/hot-removed.
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NGleb Natapov <gleb@kernel.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a255d479

11 9月, 2014 1 次提交

kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address. · 73a6d941

由 Tang Chen 提交于 9月 11, 2014

We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the address of
apic access page. So use this macro.
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NGleb Natapov <gleb@kernel.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

73a6d941

29 8月, 2014 4 次提交

KVM: remove garbage arg to *hardware_{en,dis}able · 13a34e06

由 Radim Krčmář 提交于 8月 28, 2014

In the beggining was on_each_cpu(), which required an unused argument to
kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten.

Remove unnecessary arguments that stem from this.
Signed-off-by: NRadim KrÄmÃ¡Å™ <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

13a34e06

KVM: x86: fix some sparse warnings · 48d89b92

由 Paolo Bonzini 提交于 8月 26, 2014

Sparse reports the following easily fixed warnings:

arch/x86/kvm/vmx.c:8795:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:2138:5: sparse: symbol vmx_read_l1_tsc was not declared. Should it be static?
arch/x86/kvm/vmx.c:6151:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:8851:6: sparse: symbol vmx_sched_in was not declared. Should it be static?

arch/x86/kvm/svm.c:2162:5: sparse: symbol svm_read_l1_tsc was not declared. Should it be static?

Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

48d89b92

KVM: nVMX: nested TPR shadow/threshold emulation · a7c0b07d

由 Wanpeng Li 提交于 8月 21, 2014

This patch fix bug https://bugzilla.kernel.org/show_bug.cgi?id=61411

TPR shadow/threshold feature is important to speed up the Windows guest.
Besides, it is a must feature for certain VMM.

We map virtual APIC page address and TPR threshold from L1 VMCS. If
TPR_BELOW_THRESHOLD VM exit is triggered by L2 guest and L1 interested
in, we inject it into L1 VMM for handling.
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
[Add PAGE_ALIGNED check, do not write useless virtual APIC page address
 if TPR shadowing is disabled. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a7c0b07d

KVM: nVMX: introduce nested_get_vmcs12_pages · a2bcba50

由 Wanpeng Li 提交于 8月 21, 2014

Introduce function nested_get_vmcs12_pages() to check the valid
of nested apic access page and virtual apic page earlier.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a2bcba50