提交 · fc07e76ac7ffa3afd621a1c3858a503386a14281 · openanolis / cloud-kernel

01 10月, 2015 4 次提交

Revert "KVM: SVM: use NPT page attributes" · fc07e76a

由 Paolo Bonzini 提交于 10月 01, 2015

This reverts commit 3c2e7f7d.
Initializing the mapping from MTRR to PAT values was reported to
fail nondeterministically, and it also caused extremely slow boot
(due to caching getting disabled---bug 103321) with assigned devices.
Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: NSebastian Schuette <dracon@ewetel.net>
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fc07e76a

Revert "KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask" · bcf166a9

由 Paolo Bonzini 提交于 10月 01, 2015

This reverts commit 54928303.
It builds on the commit that is being reverted next.

Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bcf166a9

Revert "KVM: SVM: Sync g_pat with guest-written PAT value" · 625422f6

由 Paolo Bonzini 提交于 10月 01, 2015

This reverts commit e098223b,
which has a dependency on other commits being reverted.

Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

625422f6

Revert "KVM: x86: apply guest MTRR virtualization on host reserved pages" · 606decd6

由 Paolo Bonzini 提交于 10月 01, 2015

This reverts commit fd717f11.
It was reported to cause Machine Check Exceptions (bug 104091).

Reported-by: harn-solo@gmx.de
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

606decd6

25 9月, 2015 1 次提交

KVM: svm: do not call kvm_set_cr0 from init_vmcb · 79a8059d

由 Paolo Bonzini 提交于 9月 21, 2015

kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the
memslots array (SRCU protected).  Using a mini SRCU critical section
is ugly, and adding it to kvm_arch_vcpu_create doesn't work because
the VMX vcpu_create callback calls synchronize_srcu.

Fixes this lockdep splat:

===============================
[ INFO: suspicious RCU usage. ]
4.3.0-rc1+ #1 Not tainted
-------------------------------
include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage!

other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-i38/17000:
 #0:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm]

[...]
Call Trace:
 dump_stack+0x4e/0x84
 lockdep_rcu_suspicious+0xfd/0x130
 kvm_zap_gfn_range+0x188/0x1a0 [kvm]
 kvm_set_cr0+0xde/0x1e0 [kvm]
 init_vmcb+0x760/0xad0 [kvm_amd]
 svm_create_vcpu+0x197/0x250 [kvm_amd]
 kvm_arch_vcpu_create+0x47/0x70 [kvm]
 kvm_vm_ioctl+0x302/0x7e0 [kvm]
 ? __lock_is_held+0x51/0x70
 ? __fget+0x101/0x210
 do_vfs_ioctl+0x2f4/0x560
 ? __fget_light+0x29/0x90
 SyS_ioctl+0x4c/0x90
 entry_SYSCALL_64_fastpath+0x16/0x73
Reported-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

79a8059d

18 9月, 2015 1 次提交

kvm: svm: reset mmu on VCPU reset · ebae871a

由 Igor Mammedov 提交于 9月 18, 2015

When INIT/SIPI sequence is sent to VCPU which before that
was in use by OS, VMRUN might fail with:

 KVM: entry failed, hardware error 0xffffffff
 EAX=00000000 EBX=00000000 ECX=00000000 EDX=000006d3
 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
 EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =0000 00000000 0000ffff 00009300
 CS =9a00 0009a000 0000ffff 00009a00
 [...]
 CR0=60000010 CR2=b6f3e000 CR3=01942000 CR4=000007e0
 [...]
 EFER=0000000000000000

with corresponding SVM error:
 KVM: FAILED VMRUN WITH VMCB:
 [...]
 cpl:            0                efer:         0000000000001000
 cr0:            0000000080010010 cr2:          00007fd7fe85bf90
 cr3:            0000000187d0c000 cr4:          0000000000000020
 [...]

What happens is that VCPU state right after offlinig:
CR0: 0x80050033  EFER: 0xd01  CR4: 0x7e0
  -> long mode with CR3 pointing to longmode page tables

and when VCPU gets INIT/SIPI following transition happens
CR0: 0 -> 0x60000010 EFER: 0x0  CR4: 0x7e0
  -> paging disabled with stale CR3

However SVM under the hood puts VCPU in Paged Real Mode*
which effectively translates CR0 0x60000010 -> 80010010 after

   svm_vcpu_reset()
       -> init_vmcb()
           -> kvm_set_cr0()
               -> svm_set_cr0()

but from  kvm_set_cr0() perspective CR0: 0 -> 0x60000010
only caching bits are changed and
commit d81135a5
 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")'
regressed svm_vcpu_reset() which relied on MMU being reset.

As result VMRUN after svm_vcpu_reset() tries to run
VCPU in Paged Real Mode with stale MMU context (longmode page tables),
which causes some AMD CPUs** to bail out with VMEXIT_INVALID.

Fix issue by unconditionally resetting MMU context
at init_vmcb() time.

	* AMD64 Architecture Programmerâ€™s Manual,
	    Volume 2: System Programming, rev: 3.25
	      15.19 Paged Real Mode
	** Opteron 1216
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Fixes: d81135a5
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ebae871a

05 8月, 2015 1 次提交

KVM: MMU: introduce the framework to check zero bits on sptes · c258b62b

由 Xiao Guangrong 提交于 8月 05, 2015

We have abstracted the data struct and functions which are used to check
reserved bit on guest page tables, now we extend the logic to check
zero bits on shadow page tables

The zero bits on sptes include not only reserved bits on hardware but also
the bits that SPTEs willnever use.  For example, shadow pages will never
use GB pages unless the guest uses them too.
Signed-off-by: NXiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c258b62b

23 7月, 2015 3 次提交

KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask · 54928303

由 Paolo Bonzini 提交于 7月 10, 2015

We can disable CD unconditionally when there is no assigned device.
KVM now forces guest PAT to all-writeback in that case, so it makes
sense to also force CR0.CD=0.

When there are assigned devices, emulate cache-disabled operation
through the page tables.  This behavior is consistent with VMX
microcode, where CD/NW are not touched by vmentry/vmexit.  However,
keep this dependent on the quirk because OVMF enables the caches
too late.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

54928303

KVM: x86: rename quirk constants to KVM_X86_QUIRK_* · 0da029ed

由 Paolo Bonzini 提交于 7月 23, 2015

Make them clearly architecture-dependent; the capability is valid for
all architectures, but the argument is not.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0da029ed

KVM: x86: introduce kvm_check_has_quirk · 41dbc6bc

由 Paolo Bonzini 提交于 7月 23, 2015

The logic of the disabled_quirks field usually results in a double
negation.  Wrap it in a simple function that checks the bit and
negates it.

Based on a patch from Xiao Guangrong.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

41dbc6bc

10 7月, 2015 3 次提交

KVM: x86: apply guest MTRR virtualization on host reserved pages · fd717f11

由 Paolo Bonzini 提交于 7月 07, 2015

Currently guest MTRR is avoided if kvm_is_reserved_pfn returns true.
However, the guest could prefer a different page type than UC for
such pages. A good example is that pass-throughed VGA frame buffer is
not always UC as host expected.

This patch enables full use of virtual guest MTRRs.
Suggested-by: NXiao Guangrong <guangrong.xiao@linux.intel.com>
Tested-by: Joerg Roedel <jroedel@suse.de> (on AMD)
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

fd717f11

KVM: SVM: Sync g_pat with guest-written PAT value · e098223b

由 Jan Kiszka 提交于 4月 20, 2015

When hardware supports the g_pat VMCB field, we can use it for emulating
the PAT configuration that the guest configures by writing to the
corresponding MSR.
Signed-off-by: NJan Kiszka <jan.kiszka@siemens.com>
Tested-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

e098223b

KVM: SVM: use NPT page attributes · 3c2e7f7d

由 Paolo Bonzini 提交于 7月 07, 2015

Right now, NPT page attributes are not used, and the final page
attribute depends solely on gPAT (which however is not synced
correctly), the guest MTRRs and the guest page attributes.

However, we can do better by mimicking what is done for VMX.
In the absence of PCI passthrough, the guest PAT can be ignored
and the page attributes can be just WB.  If passthrough is being
used, instead, keep respecting the guest PAT, and emulate the guest
MTRRs through the PAT field of the nested page tables.

The only snag is that WP memory cannot be emulated correctly,
because Linux's default PAT setting only includes the other types.
Tested-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

3c2e7f7d

06 7月, 2015 1 次提交

x86/asm/tsc: Rename native_read_tsc() to rdtsc() · 4ea1636b

由 Andy Lutomirski 提交于 6月 25, 2015

Now that there is no paravirt TSC, the "native" is
inappropriate. The function does RDTSC, so give it the obvious
name: rdtsc().
Suggested-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/fd43e16281991f096c1e4d21574d9e1402c62d39.1434501121.git.luto@kernel.org
[ Ported it to v4.2-rc1. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4ea1636b

23 6月, 2015 1 次提交

KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch · 25462f7f

由 Wei Huang 提交于 6月 19, 2015

This patch defines a new function pointer struct (kvm_pmu_ops) to
support vPMU for both Intel and AMD. The functions pointers defined in
this new struct will be linked with Intel and AMD functions later. In the
meanwhile the struct that maps from event_sel bits to PERF_TYPE_HARDWARE
events is renamed and moved from Intel specific code to kvm_host.h as a
common struct.
Reviewed-by: NJoerg Roedel <jroedel@suse.de>
Tested-by: NJoerg Roedel <jroedel@suse.de>
Signed-off-by: NWei Huang <wei@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

25462f7f

19 6月, 2015 1 次提交

KVM: nSVM: Check for NRIPS support before updating control field · f104765b

由 Bandan Das 提交于 6月 11, 2015

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f104765b

05 6月, 2015 2 次提交

KVM: x86: advertise KVM_CAP_X86_SMM · 6d396b55

由 Paolo Bonzini 提交于 4月 01, 2015

... and we're done. :)

Because SMBASE is usually relocated above 1M on modern chipsets, and
SMM handlers might indeed rely on 4G segment limits, we only expose it
if KVM is able to run the guest in big real mode.  This includes any
of VMX+emulate_invalid_guest_state, VMX+unrestricted_guest, or SVM.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6d396b55

KVM: x86: use vcpu-specific functions to read/write/translate GFNs · 54bf36aa

由 Paolo Bonzini 提交于 4月 08, 2015

We need to hide SMRAM from guests not running in SMM.  Therefore,
all uses of kvm_read_guest* and kvm_write_guest* must be changed to
check whether the VCPU is in system management mode and use a
different set of memslots.  Switch from kvm_* to the newly-introduced
kvm_vcpu_*, which call into kvm_arch_vcpu_memslots_id.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

54bf36aa

04 6月, 2015 2 次提交

KVM: x86: stubs for SMM support · 64d60670

由 Paolo Bonzini 提交于 5月 07, 2015

This patch adds the interface between x86.c and the emulator: the
SMBASE register, a new emulator flag, the RSM instruction.  It also
adds a new request bit that will be used by the KVM_SMI ioctl.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

64d60670

KVM: x86: pass host_initiated to functions that read MSRs · 609e36d3

由 Paolo Bonzini 提交于 4月 08, 2015

SMBASE is only readable from SMM for the VCPU, but it must be always
accessible if userspace is accessing it.  Thus, all functions that
read MSRs are changed to accept a struct msr_data; the host_initiated
and index fields are pre-initialized, while the data field is filled
on return.
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

609e36d3

20 5月, 2015 1 次提交

Revert "KVM: x86: drop fpu_activate hook" · 0fdd74f7

由 Paolo Bonzini 提交于 5月 20, 2015

This reverts commit 4473b570.  We'll
use the hook again.

Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

0fdd74f7

14 5月, 2015 1 次提交

tracing: Rename ftrace_event.h to trace_events.h · af658dca

由 Steven Rostedt (Red Hat) 提交于 4月 29, 2015

The term "ftrace" is really the infrastructure of the function hooks,
and not the trace events. Rename ftrace_event.h to trace_events.h to
represent the trace_event infrastructure and decouple the term ftrace
from it.
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

af658dca

07 5月, 2015 3 次提交

KVM: x86: fix initial PAT value · 74545705

由 Radim Krčmář 提交于 4月 27, 2015

PAT should be 0007_0406_0007_0406h on RESET and not modified on INIT.
VMX used a wrong value (host's PAT) and while SVM used the right one,
it never got to arch.pat.

This is not an issue with QEMU as it will force the correct value.
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

74545705

KVM: x86: INIT and reset sequences are different · d28bc9dd

由 Nadav Amit 提交于 4月 13, 2015

x86 architecture defines differences between the reset and INIT sequences.
INIT does not initialize the FPU (including MMX, XMM, YMM, etc.), TSC, PMU,
MSRs (in general), MTRRs machine-check, APIC ID, APIC arbitration ID and BSP.

References (from Intel SDM):

"If the MP protocol has completed and a BSP is chosen, subsequent INITs (either
to a specific processor or system wide) do not cause the MP protocol to be
repeated." [8.4.2: MP Initialization Protocol Requirements and Restrictions]

[Table 9-1. IA-32 Processor States Following Power-up, Reset, or INIT]

"If the processor is reset by asserting the INIT# pin, the x87 FPU state is not
changed." [9.2: X87 FPU INITIALIZATION]

"The state of the local APIC following an INIT reset is the same as it is after
a power-up or hardware reset, except that the APIC ID and arbitration ID
registers are not affected." [10.4.7.3: Local APIC State After an INIT Reset
("Wait-for-SIPI" State)]
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Message-Id: <1428924848-28212-1-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d28bc9dd

KVM: x86: Support for disabling quirks · 90de4a18

由 Nadav Amit 提交于 4月 13, 2015

Introducing KVM_CAP_DISABLE_QUIRKS for disabling x86 quirks that were previous
created in order to overcome QEMU issues. Those issue were mostly result of
invalid VM BIOS.  Currently there are two quirks that can be disabled:

1. KVM_QUIRK_LINT0_REENABLED - LINT0 was enabled after boot
2. KVM_QUIRK_CD_NW_CLEARED - CD and NW are cleared after boot

These two issues are already resolved in recent releases of QEMU, and would
therefore be disabled by QEMU.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Message-Id: <1428879221-29996-1-git-send-email-namit@cs.technion.ac.il>
[Report capability from KVM_CHECK_EXTENSION too. - Paolo]
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

90de4a18

08 4月, 2015 1 次提交

KVM: x86: BSP in MSR_IA32_APICBASE is writable · 58d269d8

由 Nadav Amit 提交于 4月 02, 2015

After reset, the CPU can change the BSP, which will be used upon INIT.  Reset
should return the BSP which QEMU asked for, and therefore handled accordingly.

To quote: "If the MP protocol has completed and a BSP is chosen, subsequent
INITs (either to a specific processor or system wide) do not cause the MP
protocol to be repeated."
[Intel SDM 8.4.2: MP Initialization Protocol Requirements and Restrictions]
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Message-Id: <1427933438-12782-3-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

58d269d8

19 3月, 2015 1 次提交

KVM: SVM: Fix confusing message if no exit handlers are installed · faac2458

由 Bandan Das 提交于 3月 16, 2015

I hit this path on a AMD box and thought
someone was playing a April Fool's joke on me.
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

faac2458

18 3月, 2015 1 次提交

KVM: x86: For the symbols used locally only should be static type · 52eb5a6d

由 Xiubo Li 提交于 3月 13, 2015

This patch fix the following sparse warnings:

for arch/x86/kvm/x86.c:
warning: symbol 'emulator_read_write' was not declared. Should it be static?
warning: symbol 'emulator_write_emulated' was not declared. Should it be static?
warning: symbol 'emulator_get_dr' was not declared. Should it be static?
warning: symbol 'emulator_set_dr' was not declared. Should it be static?

for arch/x86/kvm/pmu.c:
warning: symbol 'fixed_pmc_events' was not declared. Should it be static?
Signed-off-by: NXiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

52eb5a6d

13 3月, 2015 1 次提交

x86: svm: use cr_interception for SVM_EXIT_CR0_SEL_WRITE · 5e57518d

由 David Kaplan 提交于 3月 06, 2015

Another patch in my war on emulate_on_interception() use as a svm exit handler.

These were pulled out of a larger patch at the suggestion of Radim Krcmar, see
https://lkml.org/lkml/2015/2/25/559

Changes since v1:
	* fixed typo introduced after test, retested
Signed-off-by: NDavid Kaplan <david.kaplan@amd.com>
[separated out just cr_interception part from larger removal of
INTERCEPT_CR0_WRITE, forward ported, tested]
Signed-off-by: NJoel Schopp <joel.schopp@amd.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

5e57518d

11 3月, 2015 2 次提交

kvm: svm: make wbinvd faster · dab429a7

由 David Kaplan 提交于 3月 02, 2015

No need to re-decode WBINVD since we know what it is from the intercept.
Signed-off-by: NDavid Kaplan <David.Kaplan@amd.com>
[extracted from larger unlrelated patch, forward ported, tested,style cleanup]
Signed-off-by: NJoel Schopp <joel.schopp@amd.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

dab429a7

kvm: x86: make kvm_emulate_* consistant · 5cb56059

由 Joel Schopp 提交于 3月 02, 2015

Currently kvm_emulate() skips the instruction but kvm_emulate_* sometimes
don't.  The end reult is the caller ends up doing the skip themselves.
Let's make them consistant.
Signed-off-by: NJoel Schopp <joel.schopp@amd.com>
Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

5cb56059

10 3月, 2015 1 次提交

KVM: SVM: use kvm_register_write()/read() · 668f198f

由 David Kaplan 提交于 2月 20, 2015

KVM has nice wrappers to access the register values, clean up a few places
that should use them but currently do not.
Signed-off-by: NDavid Kaplan <david.kaplan@amd.com>
[forward port and testing]
Signed-off-by: NJoel Schopp <joel.schopp@amd.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

668f198f

03 3月, 2015 1 次提交

KVM: SVM: fix interrupt injection (apic->isr_count always 0) · f563db4b

由 Radim Krčmář 提交于 2月 27, 2015

In commit b4eef9b3, we started to use hwapic_isr_update() != NULL
instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because
SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
change apic->isr_count, because it should always be 1. The initial
value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
which is always 0 for SVM, so KVM could have injected interrupts when it
shouldn't.

Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the
initial isr_count depend on hwapic_isr_update() for good measure.

Fixes: b4eef9b3 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv")
Reported-and-tested-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>

f563db4b

04 2月, 2015 1 次提交

x86: Store a per-cpu shadow copy of CR4 · 1e02ce4c

由 Andy Lutomirski 提交于 10月 24, 2014

Context switches and TLB flushes can change individual bits of CR4.
CR4 reads take several cycles, so store a shadow copy of CR4 in a
per-cpu variable.

To avoid wasting a cache line, I added the CR4 shadow to
cpu_tlbstate, which is already touched in switch_mm.  The heaviest
users of the cr4 shadow will be switch_mm and __switch_to_xtra, and
__switch_to_xtra is called shortly after switch_mm during context
switch, so the cacheline is likely to be hot.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

1e02ce4c

09 1月, 2015 1 次提交

KVM: x86: mmu: remove argument to kvm_init_shadow_mmu and kvm_init_shadow_ept_mmu · ad896af0

由 Paolo Bonzini 提交于 10月 02, 2013

The initialization function in mmu.c can always use walk_mmu, which
is known to be vcpu->arch.mmu.  Only init_kvm_nested_mmu is used to
initialize vcpu->arch.nested_mmu.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ad896af0

05 12月, 2014 1 次提交

kvm: x86: Add kvm_x86_ops hook that enables XSAVES for guest · 55412b2e

由 Wanpeng Li 提交于 12月 02, 2014

Expose the XSAVES feature to the guest if the kvm_x86_ops say it is
available.
Signed-off-by: NWanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

55412b2e

13 11月, 2014 1 次提交

kvm: svm: move WARN_ON in svm_adjust_tsc_offset · d913b904

由 Chris J Arges 提交于 11月 12, 2014

When running the tsc_adjust kvm-unit-test on an AMD processor with the
IA32_TSC_ADJUST feature enabled, the WARN_ON in svm_adjust_tsc_offset can be
triggered. This WARN_ON checks for a negative adjustment in case __scale_tsc
is called; however it may trigger unnecessary warnings.

This patch moves the WARN_ON to trigger only if __scale_tsc will actually be
called from svm_adjust_tsc_offset. In addition make adj in kvm_set_msr_common
s64 since this can have signed values.
Signed-off-by: NChris J Arges <chris.j.arges@canonical.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d913b904

03 11月, 2014 1 次提交

KVM: vmx: Unavailable DR4/5 is checked before CPL · 16f8a6f9

由 Nadav Amit 提交于 10月 03, 2014

If DR4/5 is accessed when it is unavailable (since CR4.DE is set), then #UD
should be generated even if CPL>0. This is according to Intel SDM Table 6-2:
"Priority Among Simultaneous Exceptions and Interrupts".

Note, that this may happen on the first DR access, even if the host does not
sets debug breakpoints. Obviously, it occurs when the host debugs the guest.

This patch moves the DR4/5 checks from __kvm_set_dr/_kvm_get_dr to handle_dr.
The emulator already checks DR4/5 availability in check_dr_read. Nested
virutalization related calls to kvm_set_dr/kvm_get_dr would not like to inject
exceptions to the guest.

As for SVM, the patch follows the previous logic as much as possible. Anyhow,
it appears the DR interception code might be buggy - even if the DR access
may cause an exception, the instruction is skipped.
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

16f8a6f9

24 10月, 2014 2 次提交

kvm: x86: don't kill guest on unknown exit reason · 2bc19dc3

由 Michael S. Tsirkin 提交于 9月 18, 2014

KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was
triggered by a priveledged application.  Let's not kill the guest: WARN
and inject #UD instead.

Cc: stable@vger.kernel.org
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2bc19dc3

KVM: x86: Check non-canonical addresses upon WRMSR · 854e8bb1

由 Nadav Amit 提交于 9月 16, 2014

Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is
written to certain MSRs. The behavior is "almost" identical for AMD and Intel
(ignoring MSRs that are not implemented in either architecture since they would
anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if
non-canonical address is written on Intel but not on AMD (which ignores the top
32-bits).

Accordingly, this patch injects a #GP on the MSRs which behave identically on
Intel and AMD. To eliminate the differences between the architecutres, the
value which is written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is turned to
canonical value before writing instead of injecting a #GP.

Some references from Intel and AMD manuals:

According to Intel SDM description of WRMSR instruction #GP is expected on
WRMSR "If the source register contains a non-canonical address and ECX
specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE,
IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP."

According to AMD manual instruction manual:
LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the
LSTAR and CSTAR registers. If an RIP written by WRMSR is not in canonical
form, a general-protection exception (#GP) occurs."
IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the
base field must be in canonical form or a #GP fault will occur."
IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must
be in canonical form."

This patch fixes CVE-2014-3610.

Cc: stable@vger.kernel.org
Signed-off-by: NNadav Amit <namit@cs.technion.ac.il>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

854e8bb1

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功