Commits · 7af40ad37b3f097f367cbe9c0198caccce6fd83b · openeuler / raspberrypi-kernel

17 Jan, 2014 · 9 commits

KVM: nVMX: Fix nested_run_pending on activity state HLT · 7af40ad3

Committed by Jan Kiszka on Jan 04, 2014

When we suspend the guest in HLT state, the nested run is no longer
pending - we emulated it completely. So only set nested_run_pending
after checking the activity state.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nVMX: Clean up handling of VMX-related MSRs · cae50139

Committed by Jan Kiszka on Jan 04, 2014

This simplifies the code and also stops issuing warnings about writes to
unhandled MSRs when VMX is disabled or the Feature Control MSR is
locked - we do handle them all according to the spec.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject · 542060ea

Committed by Jan Kiszka on Jan 04, 2014

Already used by nested SVM for tracing nested vmexit: kvm_nested_vmexit
marks exits from L2 to L0 while kvm_nested_vmexit_inject marks vmexits
that are reflected to L1.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit · 533558bc

Committed by Jan Kiszka on Jan 04, 2014

Instead of fixing up the vmcs12 after the nested vmexit, pass key
parameters already when calling nested_vmx_vmexit. This will help
tracing those vmexits.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nVMX: Leave VMX mode on clearing of feature control MSR · 42124925

Committed by Jan Kiszka on Jan 04, 2014

When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave
root and non-root mode, fully disabling VMX. The register state of the
VCPU is undefined after this step, so userspace has to set it to a
proper state afterward.

This makes it possible to reboot a VM while it is running some hypervisor code.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: VMX: Fix DR6 update on #DB exception · 8246bf52

Committed by Jan Kiszka on Jan 04, 2014

According to the SDM, only bits 0-3 of DR6 "may" be cleared by "certain"
debug exceptions. So do update them on #DB exception in KVM, but leave
the rest alone, only setting BD and BS in addition to already set bits
in DR6. This also aligns us with kvm_vcpu_check_singlestep.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
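
As a rough illustration of the DR6 rule described in the commit above (a sketch, not the actual patch), the following C snippet clears only bits 0-3 and then ORs in the bits reported for the new #DB. The helper name `dr6_after_db` and its `reported` parameter are hypothetical; the bit positions (B0-B3 at bits 0-3, BD at bit 13, BS at bit 14) follow the SDM layout of DR6.

```c
#include <stdint.h>

/* DR6 bit layout per the Intel SDM. */
#define DR6_TRAP_BITS 0x0000000fu   /* B0-B3: breakpoint condition bits */
#define DR6_BD        (1u << 13)    /* debug register access detected   */
#define DR6_BS        (1u << 14)    /* single step                      */

/*
 * Hypothetical helper: compute the DR6 value after a #DB.
 * old_dr6  - value held before the exception
 * reported - bits reported for this exception (B0-B3, BD, BS)
 */
static uint64_t dr6_after_db(uint64_t old_dr6, uint64_t reported)
{
    /* Only bits 0-3 are cleared; everything else is left alone. */
    uint64_t dr6 = old_dr6 & ~(uint64_t)DR6_TRAP_BITS;

    /* Add the new trap bits plus BD/BS on top of already set bits. */
    dr6 |= reported & (DR6_TRAP_BITS | DR6_BD | DR6_BS);
    return dr6;
}
```

For example, a previously set BS bit survives a later breakpoint hit: `dr6_after_db(0x4000, 0x1)` yields `0x4001`.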

KVM: SVM: Fix reading of DR6 · 73aaf249

Committed by Jan Kiszka on Jan 04, 2014

In contrast to VMX, SVM does not automatically transfer DR6 into the
VCPU's arch.dr6. So if we face a DR6 read, we must consult a new vendor
hook to obtain the current value. And as SVM now picks the DR6 state
from its VMCB, we also need a set callback in order to write updates of
DR6 back.

Fixes a regression of 020df079.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS · 9926c9fd

Committed by Jan Kiszka on Jan 04, 2014

Whenever we change arch.dr7, we also have to call kvm_update_dr7. In
case guest debugging is off, this will synchronize the new state into
hardware.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

add support for Hyper-V reference time counter · e984097b

Committed by Vadim Rozenfeld on Jan 16, 2014

Signed-off: Peter Lieven <pl@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>

After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.

v1 -> v2
1. mark TSC page dirty as suggested by
    Eric Northup <digitaleric@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
    as it was done by Peter Lieven <pl@kamp.de>
3. move check for TSC page enable from second patch
    to this one.

v3 -> v4
    Get rid of ref counter offset.

v4 -> v5
    replace __copy_to_user with kvm_write_guest
    when updating iTSC page.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

16 Jan, 2014 · 1 commit

KVM: remove useless write to vcpu->hv_clock.tsc_timestamp · aab6d7ce

Committed by Paolo Bonzini on Jan 15, 2014

After the previous patch from Marcelo, the comment before this write
became obsolete.  In fact, the write is unnecessary.  The calls to
kvm_write_tsc ultimately result in a master clock update as soon as
all TSCs agree and the master clock is re-enabled.  This master
clock update will rewrite tsc_timestamp.

So, together with the comment, delete the dead write too.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

15 Jan, 2014 · 3 commits

KVM: x86: fix tsc catchup issue with tsc scaling · f25e656d

Committed by Marcelo Tosatti on Jan 06, 2014

To fix a problem related to different resolution of TSC and system clock,
the offset in TSC units is approximated by

delta = vcpu->hv_clock.tsc_timestamp - vcpu->last_guest_tsc

where vcpu->hv_clock.tsc_timestamp is the guest TSC value at the last
kvm_guest_time_update call and vcpu->last_guest_tsc is the guest TSC
value at the last VM-exit.

Delta is then later scaled using mult,shift pair found in hv_clock
structure (which is correct against tsc_timestamp in that
structure).

However, if a frequency change is performed between these two points,
this delta is measured using different TSC frequencies, but scaled using
mult,shift pair for one frequency only.

The end result is an incorrect delta.

The bug which this code works around is not the only cause for
clock backwards events. The global accumulator is still
necessary, so remove the max_kernel_ns fix and rely on the
global accumulator for no clock backwards events.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
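
To illustrate why a delta measured across a frequency change scales incorrectly, here is a minimal sketch of a mult,shift conversion from TSC cycles to nanoseconds. The function name `scale_tsc_delta` and the fixed-point convention (a signed shift followed by multiplication with a 32-bit fraction) are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a TSC delta to nanoseconds using a
 * (mult, shift) pair, i.e. ns = ((delta << shift) * mult) >> 32.
 * A negative shift means a right shift.
 */
static uint64_t scale_tsc_delta(uint64_t delta, uint32_t mult, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /*
     * Compute (delta * mult) >> 32 from 32-bit halves; the low partial
     * product always fits in 64 bits.
     */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mult + ((lo * mult) >> 32);
}
```

Under this convention, mult = 0x80000000 with shift = 0 suits a 2 GHz TSC, so 1000 cycles scale to 500 ns. If part of those cycles were actually counted at 1 GHz, the true elapsed time is longer, which is the kind of mismatch the commit describes.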

KVM: x86: limit PIT timer frequency · 9ed96e87

Committed by Marcelo Tosatti on Jan 06, 2014

Limit PIT timer frequency similarly to the limit applied by
the LAPIC timer.

Cc: stable@kernel.org
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: handle invalid root_hpa everywhere · 37f6a4e2

Committed by Marcelo Tosatti on Jan 03, 2014

Rom Freiman <rom@stratoscale.com> notes other code paths vulnerable to
the bug fixed by 989c6b34.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

09 Jan, 2014 · 3 commits

KVM: VMX: fix use after free of vmx->loaded_vmcs · 26a865f4

Committed by Marcelo Tosatti on Jan 03, 2014

After free_loaded_vmcs executes, the "loaded_vmcs" structure
is kfreed, and now vmx->loaded_vmcs points to a kfreed area.
Subsequent free_loaded_vmcs then attempts to manipulate
vmx->loaded_vmcs.

Switch the order to avoid the problem.

https://bugzilla.redhat.com/show_bug.cgi?id=1047892
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Fix debug typo error in lapic · 96893977

Committed by Chen Fan on Jan 02, 2014

Fix the 'vcpi' typos when apic_debug is enabled.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: VMX: check use I/O bitmap first before unconditional I/O exit · 2f0a6397

Committed by Zhihui Zhang on Dec 30, 2013

According to Table C-1 of Intel SDM 3C, a VM exit happens on an I/O instruction when
"use I/O bitmaps" VM-execution control was 0 _and_ the "unconditional I/O exiting"
VM-execution control was 1. So we can't just check "unconditional I/O exiting" alone.
This patch was improved following a suggestion from Jan Kiszka.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Zhihui Zhang <zzhsuny@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
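
The condition quoted from the SDM in the commit above can be sketched as a small predicate. This is an illustration only, not the patch itself: the function name `io_exits_unconditionally` is hypothetical, while the two bit positions are those of the primary processor-based VM-execution controls (bit 24 for "unconditional I/O exiting", bit 25 for "use I/O bitmaps").

```c
#include <stdbool.h>
#include <stdint.h>

#define CPU_BASED_UNCOND_IO_EXITING (1u << 24)
#define CPU_BASED_USE_IO_BITMAPS    (1u << 25)

/*
 * Hypothetical helper: does every I/O instruction cause a VM exit
 * regardless of the port, given the execution controls?
 */
static bool io_exits_unconditionally(uint32_t exec_ctl)
{
    /* When I/O bitmaps are in use, the unconditional bit is ignored. */
    if (exec_ctl & CPU_BASED_USE_IO_BITMAPS)
        return false;

    return (exec_ctl & CPU_BASED_UNCOND_IO_EXITING) != 0;
}
```

Checking the bitmap control first is the point of the fix: with both bits set, the bitmaps decide which ports exit.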

21 Dec, 2013 · 2 commits

KVM: MMU: handle invalid root_hpa at __direct_map · 989c6b34

Committed by Marcelo Tosatti on Dec 19, 2013

It is possible for __direct_map to be called on invalid root_hpa
(-1), two examples:

1) try_async_pf -> can_do_async_pf
    -> vmx_interrupt_allowed -> nested_vmx_vmexit
2) vmx_handle_exit -> vmx_interrupt_allowed -> nested_vmx_vmexit

Then to load_vmcs12_host_state and kvm_mmu_reset_context.

Check for this possibility and let the fault exception be regenerated.

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=924916
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: VMX: Do not skip the instruction if handle_dr injects a fault · 4c4d563b

Committed by Jan Kiszka on Dec 18, 2013

If kvm_get_dr or kvm_set_dr reports that it raised a fault, we must not
advance the instruction pointer. Otherwise the exception will hit the
wrong instruction.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

18 Dec, 2013 · 1 commit

KVM: nVMX: Support direct APIC access from L2 · ca3f257a

Committed by Jan Kiszka on Dec 16, 2013

It's a pathological case, but still a valid one: If L1 disables APIC
virtualization and also allows L2 to directly write to the APIC page, we
have to forcibly enable APIC virtualization while in L2 if the in-kernel
APIC is in use.

This allows running the direct interrupt test case in the vmx unit test
without x2APIC.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

13 Dec, 2013 · 2 commits

KVM: x86: Add comment on vcpu_enter_guest()'s return value · 9357d939

Committed by Takuya Yoshikawa on Dec 13, 2013

Giving proper names to the 0 and 1 was once suggested. But since 0 is
returned to userspace, giving it another name can introduce extra
confusion. This patch just explains the meanings instead.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Use cond_resched() directly and remove useless kvm_resched() · c08ac06a

Committed by Takuya Yoshikawa on Dec 13, 2013

Since commit 15ad7146 ("KVM: Use the scheduler preemption notifiers
to make kvm preemptible"), the remaining stuff in this function is a
simple cond_resched() call with an extra need_resched() check which was
there to avoid dropping VCPUs unnecessarily. Now it is meaningless.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

12 Dec, 2013 · 2 commits

KVM: nVMX: Add support for activity state HLT · 6dfacadd

Committed by Jan Kiszka on Dec 04, 2013

We can easily emulate the HLT activity state for L1: If it decides that
L2 shall be halted on entry, just invoke the normal emulation of halt
after switching to L2. We do not depend on specific host features to
provide this, so we can expose the capability unconditionally.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: VMX: shadow VM_(ENTRY|EXIT)_CONTROLS vmcs field · 2961e876

Committed by Gleb Natapov on Nov 25, 2013

VM_(ENTRY|EXIT)_CONTROLS vmcs fields are read/written on each guest
entry, but most of the time this can be avoided since the values do not change.
Keep a copy of the fields in memory to avoid unnecessary reads from the vmcs.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

20 Nov, 2013 · 1 commit

kvm: mmu: delay mmu audit activation · 521ee0cf

Committed by Sasha Levin on Nov 19, 2013

We should not be using jump labels before they were initialized. Push back
the callback until after jump label initialization.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

14 Nov, 2013 · 1 commit

kvm, vmx: Fix lazy FPU on nested guest · e504c909

Committed by Anthoine Bourgeois on Nov 13, 2013

If a nested guest takes an NM fault but its CR0 doesn't contain the TS
flag (because it was already cleared by the guest with L1 aid) then we
have to activate the FPU ourselves in L0 and then continue to L2. If the TS flag
is set then we fall back on the previous behavior and forward the fault to
L1 if it asked for it.
Signed-off-by: Anthoine Bourgeois <bourgeois@bertin.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

07 Nov, 2013 · 1 commit

kvm, cpuid: Fix sparse warning · 1b2ca422

Committed by Borislav Petkov on Nov 06, 2013

We need to copy padding to kernel space first before looking at it.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

06 Nov, 2013 · 1 commit

kvm: optimize out smp_mb after srcu_read_unlock · 01b71917

Committed by Michael S. Tsirkin on Nov 04, 2013

I noticed that srcu_read_lock/unlock both have a memory barrier,
so just by moving srcu_read_unlock earlier we can get rid of
one call to smp_mb() using smp_mb__after_srcu_read_unlock instead.

Unsurprisingly, the gain is small but measurable using the unit test
microbenchmark:
before
        vmcall in the ballpark of 1410 cycles
after
        vmcall in the ballpark of 1360 cycles
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

05 Nov, 2013 · 3 commits

KVM: x86: trace cpuid emulation when called from emulator · a9d4e439

Committed by Gleb Natapov on Nov 04, 2013

Currently cpuid emulation is traced only when executed by intercept.
Move trace point so that emulator invocation is traced too.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

KVM: emulator: cleanup decode_register_operand() a bit · 6d4d85ec

Committed by Gleb Natapov on Nov 04, 2013

Make code shorter.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

KVM: emulator: check rex prefix inside decode_register() · aa9ac1a6

Committed by Gleb Natapov on Nov 04, 2013

All decode_register() callers check if the instruction has a rex prefix
to properly decode a one-byte operand. It makes sense to move the check
inside.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

03 Nov, 2013 · 1 commit

KVM: x86: fix emulation of "movzbl %bpl, %eax" · daf72722

Committed by Paolo Bonzini on Oct 31, 2013

When I was looking at RHEL5.9's failure to start with
unrestricted_guest=0/emulate_invalid_guest_state=1, I got it working with a
slightly older tree than kvm.git.  I now debugged the remaining failure:
commit 660696d1 (KVM: X86 emulator: fix
source operand decoding for 8bit mov[zs]x instructions, 2013-04-24)
introduced a similar mis-emulation to the one in commit 8acb4207 (KVM:
fix sil/dil/bpl/spl in the mod/rm fields, 2013-05-30).  The incorrect
decoding occurs in 8-bit movzx/movsx instructions whose 8-bit operand
is sil/dil/bpl/spl.

Needless to say, "movzbl %bpl, %eax" does occur in RHEL5.9's decompression
prolog, just a handful of instructions before finally giving control to
the decompressed vmlinux and getting out of the invalid guest state.

Because OpMem8 bypasses decode_modrm, the same handling of the REX prefix
must be applied to OpMem8.
Reported-by: Michele Baldessari <michele@redhat.com>
Cc: stable@vger.kernel.org
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
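
The REX dependency behind this fix can be shown with a short sketch. On x86-64, byte-register encodings 4-7 select AH/CH/DH/BH when no REX prefix is present, and SPL/BPL/SIL/DIL when one is. The type `byte_reg` and the function `decode_byte_reg` below are hypothetical names used only for illustration; they are not the emulator's own decode_register().

```c
#include <stdbool.h>

/* Which general-purpose register to use, and which byte of it. */
struct byte_reg {
    unsigned reg;    /* register index (0 = rax, 1 = rcx, ..., 5 = rbp) */
    unsigned shift;  /* 0 = low byte, 8 = second byte (ah/ch/dh/bh)     */
};

/* Hypothetical helper: decode an 8-bit register operand encoding. */
static struct byte_reg decode_byte_reg(unsigned encoding, bool has_rex)
{
    struct byte_reg r = { encoding, 0 };

    /* Without REX, encodings 4-7 are the high bytes of registers 0-3. */
    if (!has_rex && encoding >= 4 && encoding < 8) {
        r.reg = encoding & 3;
        r.shift = 8;
    }
    return r;
}
```

With this rule, encoding 5 decodes to the low byte of rbp (%bpl) when a REX prefix is present and to %ch otherwise, which is why every path that decodes an 8-bit operand has to apply the same check.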

01 Nov, 2013 · 1 commit

KVM: x86: emulate SAHF instruction · 98f73630

Committed by Paolo Bonzini on Oct 31, 2013

Yet another instruction that we fail to emulate, this time found
in Windows 2008R2 32-bit.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

31 Oct, 2013 · 8 commits

kvm/vmx: error message typo fix · 60266204

Committed by Michael S. Tsirkin on Oct 31, 2013

mst can't be blamed for lack of switch entries: the
issue is with msrs actually.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: fix KVM_SET_XCRS loop · c67a04cb

Committed by Paolo Bonzini on Oct 17, 2013

The loop was always using 0 as the index.  This means that
any rubbish after the first element of the array went undetected.
It seems reasonable to assume that no KVM userspace did that.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
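
The class of bug described above, a loop that advances its counter but always reads element 0, can be sketched as follows. The struct `xcr_entry` and the function `find_xcr0` are hypothetical stand-ins and do not reproduce the ioctl's real data layout or the patched function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry: which XCR and the value to set. */
struct xcr_entry {
    uint32_t xcr;
    uint64_t value;
};

/* Return the index of the first entry describing XCR0, or -1 if none. */
static int find_xcr0(const struct xcr_entry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Index with i; reading entries[0] here would ignore the rest. */
        if (entries[i].xcr == 0)
            return (int)i;
    }
    return -1;
}
```

Had the comparison read `entries[0].xcr`, an array whose first element is not XCR0 would never match, and anything after the first element would go unexamined.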

KVM: x86: fix KVM_SET_XCRS for CPUs that do not support XSAVE · 46c34cb0

Committed by Paolo Bonzini on Oct 17, 2013

The KVM_SET_XCRS ioctl must accept anything that KVM_GET_XCRS
could return.  XCR0's bit 0 is always 1 in real processors with
XSAVE, and KVM_GET_XCRS will always leave bit 0 set even if the
emulated processor does not have XSAVE.  So, KVM_SET_XCRS must
ignore that bit when checking for attempts to enable unsupported
save states.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: Create non-coherent DMA registeration · e0f0bbc5

Committed by Alex Williamson on Oct 30, 2013

We currently use some ad-hoc arch variables tied to legacy KVM device
assignment to manage emulation of instructions that depend on whether
non-coherent DMA is present. Create an interface for this, adapting
legacy KVM device assignment and adding VFIO via the KVM-VFIO device.
For now we assume that non-coherent DMA is possible any time we have a
VFIO group. Eventually an interface can be developed as part of the
VFIO external user interface to query the coherency of a group.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm/x86: Convert iommu_flags to iommu_noncoherent · d96eb2c6

Committed by Alex Williamson on Oct 30, 2013

Default to operating in coherent mode.  This simplifies the logic when
we switch to a model of registering and unregistering noncoherent I/O
with KVM.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: Add VFIO device · ec53500f

Committed by Alex Williamson on Oct 30, 2013

So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made. This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction. The user creates
the device and can add and remove VFIO groups to it via file
descriptors. When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: Emulate MOVBE · 84cffe49

Committed by Borislav Petkov on Oct 29, 2013

This basically came from the need to be able to boot 32-bit Atom SMP
guests on an AMD host, i.e. a host which doesn't support MOVBE. As a
matter of fact, qemu has since recently received MOVBE support but we
cannot share that with kvm emulation and thus we have to do this in the
host. We're waay faster in kvm anyway. :-)

So, we piggyback on the #UD path and emulate the MOVBE functionality.
With it, an 8-core SMP guest boots in under 6 seconds.

Also, requesting MOVBE emulation needs to happen explicitly to work,
i.e. qemu -cpu n270,+movbe...

Just FYI, a fairly straight-forward boot of a MOVBE-enabled 3.9-rc6+
kernel in kvm executes MOVBE ~60K times.
Signed-off-by: Andre Przywara <andre@andrep.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
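
For readers unfamiliar with MOVBE, the instruction moves data between a register and memory while reversing the byte order. The core transformation can be sketched in portable C as below; the function names `swap16`, `swap32` and `swap64` are illustrative only and are not the emulator's helpers, and operand fetch and write-back are left out.

```c
#include <stdint.h>

/* Reverse the byte order of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Reverse the byte order of a 64-bit value using two 32-bit swaps. */
static uint64_t swap64(uint64_t v)
{
    return ((uint64_t)swap32((uint32_t)v) << 32) |
           swap32((uint32_t)(v >> 32));
}
```

For instance, `swap32(0x11223344)` gives `0x44332211`, which is what a 32-bit MOVBE would deliver for that source operand.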

kvm, emulator: Add initial three-byte insns support · 0bc5eedb

Committed by Borislav Petkov on Oct 29, 2013

Add initial support for handling three-byte instructions in the
emulator.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>