Commits · 4504b5c9414c55da37f26b1faf49c09a2acbf255 · openeuler / raspberrypi-kernel

17 Nov, 2016 10 commits

kvm: x86: Add AVX512_4VNNIW and AVX512_4FMAPS support · 4504b5c9

Committed by Luwei Kang on Nov 07, 2016

Add support for two new AVX512 subfeatures to KVM guests.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
[Changed subject tags.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

4504b5c9
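
Both subfeatures are enumerated in CPUID leaf 7 (sub-leaf 0), register EDX: bit 2 for AVX512_4VNNIW and bit 3 for AVX512_4FMAPS. The following is a minimal sketch of testing and filtering those bits. The helper names are hypothetical and are not the functions touched by this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EDX bit positions for the two subfeatures. */
#define CPUID_7_EDX_AVX512_4VNNIW (1u << 2)
#define CPUID_7_EDX_AVX512_4FMAPS (1u << 3)

/* Hypothetical helper: does this leaf-7 EDX value advertise 4VNNIW? */
static bool has_avx512_4vnniw(uint32_t edx)
{
    return (edx & CPUID_7_EDX_AVX512_4VNNIW) != 0;
}

/* Hypothetical helper: does this leaf-7 EDX value advertise 4FMAPS? */
static bool has_avx512_4fmaps(uint32_t edx)
{
    return (edx & CPUID_7_EDX_AVX512_4FMAPS) != 0;
}

/*
 * Hypothetical filter: a bit is exposed to the guest only when the
 * host also supports it.
 */
static uint32_t guest_leaf7_edx(uint32_t requested, uint32_t host_supported)
{
    return requested & host_supported;
}
```

The filter is deliberately a plain AND: a guest should never see a feature bit that the host does not have.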

KVM: x86: emulate FXSAVE and FXRSTOR · 283c95d0

Committed by Radim Krčmář on Nov 09, 2016

Internal errors were reported on 16 bit fxsave and fxrstor with ipxe.
Old Intels don't have unrestricted_guest, so we have to emulate them.

The patch takes advantage of the hardware implementation.

AMD and Intel differ in saving and restoring other fields in the first 32
bytes.  A test wrote 0xff to the fxsave area, 0 to upper bits of MXCSR
in the fxsave area, executed fxrstor, rewrote the fxsave area to 0xee,
and executed fxsave:

  Intel (Nehalem):
    7f 1f 7f 7f ff 00 ff 07 ff ff ff ff ff ff 00 00
    ff ff ff ff ff ff 00 00 ff ff 00 00 ff ff 00 00
  Intel (Haswell -- deprecated FPU CS and FPU DS):
    7f 1f 7f 7f ff 00 ff 07 ff ff ff ff 00 00 00 00
    ff ff ff ff 00 00 00 00 ff ff 00 00 ff ff 00 00
  AMD (Opteron 2300-series):
    7f 1f 7f 7f ff 00 ee ee ee ee ee ee ee ee ee ee
    ee ee ee ee ee ee ee ee ff ff 00 00 ff ff 02 00

fxsave/fxrstor will only be emulated on early Intels, so KVM can't do
much to improve the situation.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

283c95d0

KVM: x86: add asm_safe wrapper · aabba3c6

Committed by Radim Krčmář on Nov 08, 2016

Move the existing exception handling for inline assembly into a macro
and switch its return values to X86EMUL type.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

aabba3c6

KVM: x86: save one bit in ctxt->d · 48520187

Committed by Radim Krčmář on Nov 08, 2016

Alignments are exclusive, so 5 modes can be expressed in 3 bits.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

48520187
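
Because only one alignment mode applies to an instruction at a time, the modes can be stored as a small enumerated field instead of one flag bit each. A minimal sketch of that encoding follows. The field position, macro names, and mode names are assumptions for illustration, not the emulator's actual definitions.

```c
#include <stdint.h>

/* Hypothetical layout: a 3-bit field starting at bit 41 of a 64-bit word. */
#define ALIGN_SHIFT 41
#define ALIGN_MASK  ((uint64_t)7 << ALIGN_SHIFT)

/* Five mutually exclusive modes fit in 3 bits (values 0..4). */
enum align_mode {
    ALIGN_NONE      = 0,
    ALIGN_ALIGNED   = 1,
    ALIGN_UNALIGNED = 2,
    ALIGN_AVX       = 3,
    ALIGN_ALIGNED16 = 4,
};

/* Store a mode in the field, leaving every other bit untouched. */
static uint64_t set_align_mode(uint64_t flags, enum align_mode mode)
{
    return (flags & ~ALIGN_MASK) | ((uint64_t)mode << ALIGN_SHIFT);
}

/* Extract the mode from the field. */
static enum align_mode get_align_mode(uint64_t flags)
{
    return (enum align_mode)((flags & ALIGN_MASK) >> ALIGN_SHIFT);
}
```

With one bit per mode, four non-default modes would need four bits. The enumerated field needs three and still leaves room for three more values.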

KVM: x86: add Align16 instruction flag · d3fe959f

Committed by Radim Krčmář on Nov 08, 2016

Needed for FXSAVE and FXRSTOR.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d3fe959f
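
FXSAVE and FXRSTOR require their memory operand to be aligned on a 16-byte boundary, which is the check such a flag would request. A minimal sketch of the alignment test, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if addr is a multiple of size.
 * size must be a power of two (16 for FXSAVE/FXRSTOR operands).
 */
static bool is_aligned(uint64_t addr, uint64_t size)
{
    return (addr & (size - 1)) == 0;
}
```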

kvm: x86: remove unused but set variable · 69515196

Committed by Jiang Biao on Nov 07, 2016

The local variable *gpa_offset* is set but not used afterwards,
which makes the compiler issue a warning with option
-Wunused-but-set-variable. Remove it to avoid the warning.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

69515196

kvm: x86: hyperv: make function static to avoid compiling warning · ecd8a8c2

Committed by Jiang Biao on Nov 07, 2016

synic_set_irq is only used in hyperv.c, and should be static to
avoid a compiler warning when building with the -Wmissing-prototypes option.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ecd8a8c2

kvm: x86: cpuid: remove the unnecessary variable · 1e13175b

Committed by Jiang Biao on Nov 07, 2016

The use of local variable *function* is not necessary here. Remove
it to avoid a compiler warning with the -Wunused-but-set-variable option.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1e13175b

kvm: x86: make a function in x86.c static to avoid compiling warning · ae6a2375

Committed by Jiang Biao on Nov 07, 2016

kvm_emulate_wbinvd_noskip is only used in x86.c, and should be
static to avoid a compiler warning when building with the
-Wmissing-prototypes option.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ae6a2375

kvm: x86: make function static to avoid compiling warning · 33365e7a

Committed by Jiang Biao on Nov 03, 2016

vmx_arm_hv_timer is only used in vmx.c, and should be static to
avoid a compiler warning when building with the -Wmissing-prototypes option.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

33365e7a

16 Nov, 2016 2 commits

x86/cpuid: Provide get_scattered_cpuid_leaf() · 47bdf337

Committed by He Chen on Nov 11, 2016

Sparsely populated CPUID leaves are collected in a software-provided leaf to
avoid bloat of the x86_capability array, but there is no way to rebuild the
real leaves (e.g. for KVM CPUID enumeration) other than rereading the CPUID
leaf from the CPU. While this is possible, it is problematic as it does not
take software-disabled features into account. If a feature is disabled on
the host it should not be exposed to a guest either.

Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered
cpuid table information and the active CPU features.

[ tglx: Rewrote changelog ]
Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

47bdf337
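
The idea of rebuilding a hardware leaf from a scattered-feature table can be sketched as below. This is an illustration under stated assumptions: the struct layout, the `enabled` array standing in for the active CPU feature set, and the function name `rebuild_leaf` are all hypothetical and do not reproduce the kernel's get_scattered_cpuid_leaf().

```c
#include <stddef.h>
#include <stdint.h>

enum cpuid_reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

/* One table row: where a software feature lives in the real CPUID layout. */
struct scattered_bit {
    unsigned feature;    /* index into the software feature set */
    enum cpuid_reg reg;  /* register holding the hardware bit */
    unsigned bit;        /* bit position inside that register */
    uint32_t level;      /* CPUID leaf (EAX input) */
    uint32_t sub_leaf;   /* CPUID sub-leaf (ECX input) */
};

/*
 * Rebuild one register of one leaf. A bit is set only if the table maps
 * a feature to it AND that feature is currently enabled, so features
 * disabled in software never reappear in the result.
 */
static uint32_t rebuild_leaf(const struct scattered_bit *tbl, size_t n,
                             const uint8_t *enabled,
                             uint32_t level, uint32_t sub_leaf,
                             enum cpuid_reg reg)
{
    uint32_t val = 0;

    for (size_t i = 0; i < n; i++) {
        const struct scattered_bit *sb = &tbl[i];

        if (sb->level == level && sb->sub_leaf == sub_leaf &&
            sb->reg == reg && enabled[sb->feature])
            val |= 1u << sb->bit;
    }
    return val;
}
```

Rereading the leaf from the CPU would skip the `enabled` check, which is exactly the problem the commit message describes.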

x86/cpuid: Cleanup cpuid_regs definitions · 47f10a36

Committed by He Chen on Nov 11, 2016

cpuid_regs is defined multiple times as structure and enum. Rename the enum
and move all of it to processor.h so we don't end up with more instances.

Rename the misnamed register enumeration from CR_* to the obvious CPUID_*.

[ tglx: Rewrote changelog ]
Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

47f10a36

12 Nov, 2016 2 commits

crypto: aesni: shut up -Wmaybe-uninitialized warning · beae2c9e

Committed by Arnd Bergmann on Nov 10, 2016

The rfc4106 encrypt/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:

  arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
  include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]

The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.

This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output.  A
follow-up patch for v4.10 rearranges the code to make the warning go
away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

beae2c9e

x86: apm: avoid uninitialized data · 3a6d8676

Committed by Arnd Bergmann on Nov 10, 2016

apm_bios_call() can fail, and return a status in its argument structure.
If that status however is zero during a call from
apm_get_power_status(), we end up using data that may have never been
set, as reported by "gcc -Wmaybe-uninitialized":

arch/x86/kernel/apm_32.c: In function ‘apm’:
arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here
arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here

This changes the function to return "APM_NO_ERROR" here, which makes the
code more robust to broken BIOS versions, and avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3a6d8676

04 Nov, 2016 3 commits

kvm/page_track: export symbols for external usage · 871b7ef2

Committed by Jike Song on Oct 25, 2016

Signed-off-by: Jike Song <jike.song@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

871b7ef2

kvm/page_track: call notifiers with kvm_page_track_notifier_node · d126363d

Committed by Jike Song on Oct 25, 2016

The user of page_track might need extra information, so pass
the kvm_page_track_notifier_node to callbacks.
Signed-off-by: Jike Song <jike.song@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d126363d
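
Passing the notifier node to the callback lets a user embed the node in a larger structure and recover that structure inside the callback. A minimal user-space sketch of the pattern follows. All type and function names here are hypothetical, and `container_of` is defined locally rather than taken from the kernel headers.

```c
#include <stddef.h>

struct track_node;
typedef void (*track_write_fn)(unsigned long gpa, struct track_node *node);

/* Hypothetical notifier node: just the callback pointer. */
struct track_node {
    track_write_fn on_write;
};

/* Hypothetical user that embeds the node next to its private state. */
struct gfx_tracker {
    int writes_seen;
    struct track_node node;
};

/* Local definition: recover the enclosing struct from a member pointer. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback: uses the node it was handed to reach the user's own state. */
static void gfx_on_write(unsigned long gpa, struct track_node *node)
{
    struct gfx_tracker *t = container_of(node, struct gfx_tracker, node);

    (void)gpa;
    t->writes_seen++;
}

/* Dispatcher: passes the node itself to the callback. */
static void notify_write(struct track_node *node, unsigned long gpa)
{
    node->on_write(gpa, node);
}
```

Without the node argument, the callback would need a global or a lookup table to find its per-user state.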

KVM: x86: add track_flush_slot page track notifier · ae7cd873

Committed by Xiaoguang Chen on Oct 09, 2016

When a memory slot is being moved or removed, users of page track
can be notified, so that they can drop write-protection for the pages
in that memory slot.

This notifier type is needed by KVMGT to sync up its shadow page
table when a memory slot is being moved or removed.

Register the notifier type track_flush_slot to receive memslot move
and remove events.
Reviewed-by: Xiao Guangrong <guangrong.xiao@intel.com>
Signed-off-by: Chen Xiaoguang <xiaoguang.chen@intel.com>
[Squashed commits to avoid bisection breakage and reworded the subject.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

ae7cd873

03 Nov, 2016 20 commits

kvm: x86: avoid atomic operations on APICv vmentry · ad361091

Committed by Paolo Bonzini on Sep 20, 2016

On some benchmarks (e.g. netperf with ioeventfd disabled), APICv
posted interrupts turn out to be slower than interrupt injection via
KVM_REQ_EVENT.

This patch optimizes the IRR update a bit, avoiding expensive atomic
operations in the common case where PI.ON=0 at vmentry or the PIR vector
is mostly zero.  This saves at least 20 cycles (1%) per vmexit, as
measured by kvm-unit-tests' inl_from_qemu test (20 runs):

              | enable_apicv=1  |  enable_apicv=0
              | mean     stdev  |  mean     stdev
    ----------|-----------------|------------------
    before    | 5826     32.65  |  5765     47.09
    after     | 5809     43.42  |  5777     77.02

Of course, any change in the right column is just placebo effect. :)
The savings are bigger if interrupts are frequent.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ad361091
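
The "mostly zero" optimization can be sketched with C11 atomics: read each word with a cheap plain load and only pay for the atomic exchange when the word is non-zero. This is an illustrative user-space sketch, not the KVM code. The function name, the return value, and the flat `irr` array are assumptions.

```c
#include <stdatomic.h>
#include <stdint.h>

#define PIR_WORDS 8  /* 256 interrupt vectors / 32 bits per word */

/*
 * Move pending bits from the posted-interrupt request words (pir) into
 * irr. Returns how many atomic exchanges were actually performed.
 */
static unsigned sync_pir_to_irr(_Atomic uint32_t *pir, uint32_t *irr)
{
    unsigned exchanges = 0;

    for (int i = 0; i < PIR_WORDS; i++) {
        /* Cheap relaxed load: skip words with nothing pending. */
        if (atomic_load_explicit(&pir[i], memory_order_relaxed) == 0)
            continue;

        /* Only now take the expensive read-modify-write. */
        irr[i] |= atomic_exchange(&pir[i], 0);
        exchanges++;
    }
    return exchanges;
}
```

When interrupts are rare, nearly every word takes the early `continue`, so the loop does eight loads and no locked operations.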

KVM: nVMX: support descriptor table exits · 1b07304c

Committed by Paolo Bonzini on Oct 25, 2016

These are never used by the host, but they can still be reflected to
the guest.
Tested-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1b07304c

KVM: x86: use ktime_get instead of seeking the hrtimer_clock_base · 5587859f

Committed by Paolo Bonzini on Oct 25, 2016

The base clock for the LAPIC timer is always CLOCK_MONOTONIC.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5587859f

KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support · 8003c9ae

Committed by Wanpeng Li on Oct 24, 2016

Most Windows guests still utilize APIC Timer periodic/oneshot mode
instead of tsc-deadline mode, and the APIC Timer periodic/oneshot
mode is still emulated by a high-overhead hrtimer on the host. This patch
converts the expected expire time of the periodic/oneshot mode to a
guest deadline tsc in order to leverage the VMX preemption timer logic
for APIC Timer tsc-deadline mode. After each preemption timer vmexit,
the preemption timer is restarted to emulate the automatic reload of the
LVTT current-count register from the initial-count register when the
count reaches 0. This patch saves ~5600 cycles for each APIC Timer
periodic mode operation virtualization.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Squashed with my fixes that were reviewed-by Paolo.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

8003c9ae
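
The conversion from an expiry interval to a TSC deadline is plain arithmetic: a TSC running at `tsc_khz` kHz advances `tsc_khz / 1e6` ticks per nanosecond. A minimal sketch under that assumption follows. The function names and the `tsc_khz` parameter are hypothetical, and overflow for very large intervals is not handled.

```c
#include <stdint.h>

/* Convert a nanosecond interval to TSC ticks at the given TSC frequency. */
static uint64_t ns_to_tsc(uint64_t ns, uint32_t tsc_khz)
{
    /* Sketch only: ns * tsc_khz can overflow for very large ns. */
    return ns * tsc_khz / 1000000u;
}

/* Deadline for a timer armed now that should fire after period_ns. */
static uint64_t period_to_deadline_tsc(uint64_t now_tsc, uint64_t period_ns,
                                       uint32_t tsc_khz)
{
    return now_tsc + ns_to_tsc(period_ns, tsc_khz);
}

/*
 * Periodic mode: after each expiry the next deadline is one period
 * after the previous one, which mimics the automatic counter reload.
 */
static uint64_t next_periodic_deadline(uint64_t prev_deadline,
                                       uint64_t period_ns, uint32_t tsc_khz)
{
    return prev_deadline + ns_to_tsc(period_ns, tsc_khz);
}
```

Computing the next deadline from the previous deadline, rather than from the current time, keeps a periodic timer from drifting by the exit-handling latency.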

KVM: LAPIC: rename start/cancel_hv_tscdeadline to start/cancel_hv_timer · 7e810a38

Committed by Wanpeng Li on Oct 24, 2016

Rename start/cancel_hv_tscdeadline to start/cancel_hv_timer since
they will handle both APIC Timer periodic/oneshot mode and tsc-deadline
mode.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

7e810a38

KVM: LAPIC: introduce kvm_get_lapic_target_expiration_tsc() · 498f8162

Committed by Wanpeng Li on Oct 24, 2016

Introduce kvm_get_lapic_target_expiration_tsc() to get the APIC Timer target
deadline tsc.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

498f8162

KVM: LAPIC: guarantee the timer is in tsc-deadline mode · a10388e1

Committed by Wanpeng Li on Oct 24, 2016

Check apic_lvtt_tscdeadline() mode directly instead of apic_lvtt_oneshot()
and apic_lvtt_period() to guarantee the timer is in tsc-deadline mode on
rdmsr of MSR_IA32_TSCDEADLINE.
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

a10388e1

KVM: LAPIC: extract start_sw_period() to handle periodic/oneshot mode · 7d7f7da2

Committed by Wanpeng Li on Oct 24, 2016

Extract start_sw_period() to handle periodic/oneshot mode; it will be
used by a later patch.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

7d7f7da2

kvm: x86: remove the misleading comment in vmx_handle_external_intr · 868a32f3

Committed by Longpeng(Mike) on Oct 13, 2016

Since Paolo has removed irq-enable-operation in vmx_handle_external_intr
(KVM: x86: use guest_exit_irqoff), the original comment about the IF bit
in rflags is incorrect and stale now, so remove it.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

868a32f3

KVM: x86: add track_flush_slot page track notifier · b5f5fdca

Committed by Xiaoguang Chen on Oct 09, 2016

When a memory slot is being moved or removed, users of page track
can be notified, so that they can drop write-protection for the pages
in that memory slot.

This notifier type is needed by KVMGT to sync up its shadow page
table when a memory slot is being moved or removed.

Register the notifier type track_flush_slot to receive memslot move
and remove events.
Reviewed-by: Xiao Guangrong <guangrong.xiao@intel.com>
Signed-off-by: Chen Xiaoguang <xiaoguang.chen@intel.com>
[Squashed commits to avoid bisection breakage and reworded the subject.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

b5f5fdca

KVM: VMX: refactor setup of global page-sized bitmaps · 23611332

Committed by Radim Krčmář on Sep 29, 2016

We've had 10 page-sized bitmaps that were being allocated and freed one
by one when we could just use a loop.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

23611332
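
The refactoring idea, allocating a fixed set of page-sized bitmaps in a loop with a single unwind path, can be sketched in user space as below. The names and the use of `calloc` are assumptions for illustration; the kernel code uses its own page allocator.

```c
#include <stdlib.h>

#define BITMAP_SIZE 4096  /* one page */
#define NR_BITMAPS  10

/*
 * Allocate `count` zeroed bitmaps. On failure, free the ones already
 * allocated and return -1, so the caller never sees a partial set.
 */
static int alloc_bitmaps(unsigned char *bitmaps[], int count)
{
    for (int i = 0; i < count; i++) {
        bitmaps[i] = calloc(1, BITMAP_SIZE);
        if (!bitmaps[i]) {
            while (i-- > 0) {
                free(bitmaps[i]);
                bitmaps[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Free every bitmap and clear the pointers. */
static void free_bitmaps(unsigned char *bitmaps[], int count)
{
    for (int i = 0; i < count; i++) {
        free(bitmaps[i]);
        bitmaps[i] = NULL;
    }
}
```

Keeping the bitmaps in an array replaces ten separate allocation and error-handling stanzas with one loop and one unwind.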

KVM: VMX: join functions that disable x2apic msr intercepts · 2e69f865

Committed by Radim Krčmář on Sep 29, 2016

vmx_disable_intercept_msr_read_x2apic() and
vmx_disable_intercept_msr_write_x2apic() differed only in the type.
Pass the type to a new function.

[Ordered and commented TPR intercept according to Paolo's suggestion.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

2e69f865
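
Merging the read and write variants into one function that takes a type argument might look like the sketch below. It relies on the VMX MSR bitmap layout for low MSRs (read bitmap at offset 0x000, write bitmap at offset 0x800, one bit per MSR, a set bit meaning "intercept"). The function name and the `MSR_TYPE_*` values are illustrative assumptions.

```c
#include <stdint.h>

#define MSR_TYPE_R 1
#define MSR_TYPE_W 2

#define MSR_BITMAP_SIZE  4096
#define READ_LOW_OFFSET  0x000  /* read bitmap for MSRs 0x0..0x1fff */
#define WRITE_LOW_OFFSET 0x800  /* write bitmap for MSRs 0x0..0x1fff */

/*
 * Clear the intercept bit(s) for a low MSR. `type` selects read,
 * write, or both (MSR_TYPE_R | MSR_TYPE_W).
 */
static void disable_intercept_low_msr(uint8_t *bitmap, uint32_t msr, int type)
{
    uint8_t mask = (uint8_t)(1u << (msr % 8));

    if (msr > 0x1fff)
        return;  /* high MSRs use a different region; not handled here */

    if (type & MSR_TYPE_R)
        bitmap[READ_LOW_OFFSET + msr / 8] &= (uint8_t)~mask;
    if (type & MSR_TYPE_W)
        bitmap[WRITE_LOW_OFFSET + msr / 8] &= (uint8_t)~mask;
}
```

For example, the x2APIC TPR register is MSR 0x808, so disabling its read intercept clears bit 0 of byte 0x101 in the read region.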

KVM: VMX: remove functions that enable msr intercepts · 40d8338d

Committed by Radim Krčmář on Sep 29, 2016

All intercepts are enabled at the beginning, so they can only be used if
we disabled an intercept that we wanted to have enabled.
This was done for TMCCT to simplify a loop that disables all x2APIC MSR
intercepts, but just keeping TMCCT enabled yields better results.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

40d8338d

kvm: nVMX: Update MSR load counts on a VMCS switch · 83bafef1

Committed by Jim Mattson on Oct 04, 2016

When L0 establishes (or removes) an MSR entry in the VM-entry or VM-exit
MSR load lists, the change should affect the dormant VMCS as well as the
current VMCS. Moreover, the vmcs02 MSR-load addresses should be
initialized.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

83bafef1

kvm: nVMX: Fetch VM_INSTRUCTION_ERROR from vmcs02 on vmx->fail · cf3215d9

Committed by Jim Mattson on Sep 06, 2016

When forwarding a hardware VM-entry failure to L1, fetch the
VM_INSTRUCTION_ERROR field from vmcs02 before loading vmcs01.

(Note that there is an implicit assumption that the VM-entry failure was
on the first VM-entry to vmcs02 after nested_vmx_run; otherwise, L1 is
going to be very confused.)
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Feiner <pfeiner@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

cf3215d9

KVM: X86: MMU: no mmu_notifier_seq++ in kvm_age_hva · 66d73e12

Committed by Peter Feiner on Sep 26, 2016

The MMU notifier sequence number keeps GPA->HPA mappings in sync when
GPA->HPA lookups are done outside of the MMU lock (e.g., in
tdp_page_fault). Since kvm_age_hva doesn't change GPA->HPA, it's
unnecessary to increment the sequence number.
Signed-off-by: Peter Feiner <pfeiner@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

66d73e12

KVM: VMX: Better name x2apic msr bitmaps · c63e4563

Committed by Wanpeng Li on Sep 23, 2016

Renames x2apic_apicv_inactive msr_bitmaps to x2apic and original
x2apic bitmaps to x2apic_apicv.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

c63e4563

kvm: x86: Check memopp before dereference (CVE-2016-8630) · d9092f52

Committed by Owen Hofmann on Oct 27, 2016

Commit 41061cdb ("KVM: emulate: do not initialize memopp") removes a
check for non-NULL under incorrect assumptions. An undefined instruction
with a ModR/M byte with Mod=0 and R/M=5 (e.g. 0xc7 0x15) will attempt
to dereference a null pointer here.

Fixes: 41061cdb
Message-Id: <1477592752-126650-2-git-send-email-osh@google.com>
Signed-off-by: Owen Hofmann <osh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d9092f52
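
The example byte 0x15 can be checked by splitting a ModR/M byte into its three fields: Mod is bits 7:6, Reg is bits 5:3, and R/M is bits 2:0. A minimal decoding sketch follows, with hypothetical struct and function names.

```c
#include <stdbool.h>
#include <stdint.h>

struct modrm {
    uint8_t mod;  /* bits 7:6 */
    uint8_t reg;  /* bits 5:3 */
    uint8_t rm;   /* bits 2:0 */
};

/* Split a ModR/M byte into its three fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m = {
        (uint8_t)(byte >> 6),
        (uint8_t)((byte >> 3) & 7),
        (uint8_t)(byte & 7),
    };
    return m;
}

/*
 * Mod=0 with R/M=5 is the displacement-only form (disp32, or
 * RIP-relative in 64-bit mode): a memory operand with no base register.
 */
static bool is_disp_only_form(struct modrm m)
{
    return m.mod == 0 && m.rm == 5;
}
```

For 0x15 (binary 00 010 101) this yields Mod=0, Reg=2, R/M=5, matching the case named in the commit message.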

kvm: nVMX: VMCLEAR an active shadow VMCS after last use · 355f4fb1

Committed by Jim Mattson on Oct 28, 2016

After a successful VM-entry with the "VMCS shadowing" VM-execution
control set, the shadow VMCS referenced by the VMCS link pointer field
in the current VMCS becomes active on the logical processor.

A VMCS that is made active on more than one logical processor may become
corrupted. Therefore, before an active VMCS can be migrated to another
logical processor, the first logical processor must execute a VMCLEAR
for the active VMCS. VMCLEAR both ensures that all VMCS data are written
to memory and makes the VMCS inactive.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-By: David Matlack <dmatlack@google.com>
Message-Id: <1477668579-22555-1-git-send-email-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

355f4fb1

KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK · ea26e4ec

Committed by Paolo Bonzini on Nov 01, 2016

Since commit a545ab6a ("kvm: x86: add tsc_offset field to struct
kvm_vcpu_arch", 2016-09-07) the offset between host and L1 TSC is
cached and need not be fished out of the VMCS or VMCB.  This means
that we can implement adjust_tsc_offset_guest and read_l1_tsc
entirely in generic code.  The simplification is particularly
significant for VMX code, where vmx->nested.vmcs01_tsc_offset
was duplicating what is now in vcpu->arch.tsc_offset.  Therefore
the vmcs01_tsc_offset can be dropped completely.

More importantly, this fixes KVM_GET_CLOCK/KVM_SET_CLOCK
which, after commit 108b249c ("KVM: x86: introduce get_kvmclock_ns",
2016-09-01) called read_l1_tsc while the VMCS was not loaded.
It thus returned bogus values on Intel CPUs.

Fixes: 108b249c
Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ea26e4ec
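
With the offset cached in software, reading the L1 TSC no longer needs any hardware state. A minimal sketch of the idea follows, assuming no TSC scaling (ratio of 1). The struct and function names are hypothetical and do not reproduce the KVM code.

```c
#include <stdint.h>

/* Hypothetical per-vCPU state: the cached host-to-L1 TSC offset. */
struct vcpu_tsc {
    uint64_t tsc_offset;  /* two's complement, so it may act as negative */
};

/* L1 TSC = host TSC + cached offset; no VMCS or VMCB access needed. */
static uint64_t read_l1_tsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
    return host_tsc + v->tsc_offset;
}

/* Adjusting the guest TSC only updates the cached value in this sketch. */
static void adjust_tsc_offset(struct vcpu_tsc *v, int64_t adjustment)
{
    v->tsc_offset += (uint64_t)adjustment;
}
```

Because the helper depends only on memory, it gives the same answer whether or not the vCPU's control structure is currently loaded.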

29 Oct, 2016 1 commit

x86/smpboot: Init apic mapping before usage · 1e90a13d

Committed by Thomas Gleixner on Oct 29, 2016

The recent changes, which forced the registration of the boot cpu on UP
systems, which do not have ACPI tables, have been fixed for systems w/o
local APIC, but left a wreckage for systems which have neither ACPI nor
mptables, but the CPU has an APIC, e.g. virtualbox.

The boot process crashes in prefill_possible_map() as it wants to register
the boot cpu, which needs to access the local apic, but the local APIC is
not yet mapped.

There is no reason why init_apic_mapping() can't be invoked before
prefill_possible_map(). So instead of playing another silly early mapping
game, as the ACPI/mptables code does, we just move init_apic_mapping()
before the call to prefill_possible_map().

In hindsight, I should have noticed that combination earlier.

Sorry for the churn (also in stable)!

Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")
Reported-and-debugged-by: Michal Necasek <michal.necasek@oracle.com>
Reported-and-tested-by: Wolfgang Bauer <wbauer@tmo.at>
Cc: prarit@redhat.com
Cc: ville.syrjala@linux.intel.com
Cc: michael.thayer@oracle.com
Cc: knut.osmundsen@oracle.com
Cc: frank.mehnert@oracle.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

1e90a13d

28 Oct, 2016 2 commits

KVM: x86: fix wbinvd_dirty_mask use-after-free · bd768e14

Committed by Ido Yariv on Oct 21, 2016

vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
corrupting memory. For example, the following call trace may set a bit
in an already freed cpu mask:
    kvm_arch_vcpu_load
    vcpu_load
    vmx_free_vcpu_nested
    vmx_free_vcpu
    kvm_arch_vcpu_free

Fix this by deferring freeing of wbinvd_dirty_mask.

Cc: stable@vger.kernel.org
Signed-off-by: Ido Yariv <ido@wizery.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

bd768e14

perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors · f92b7604

Committed by Imre Palik on Oct 21, 2016

perf doesn't seem to honour the number of fixed counters specified by CPUID
leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters.

So if some of the fixed counters are masked out by the hypervisor, it still
tries to check/set them.

This patch makes perf behave nicer when the kernel is running under a
hypervisor that doesn't expose all the counters.

This patch contains some ideas from Matt Wilson.
Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Kozyrev <alexander.kozyrev@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Wilson <msw@amazon.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

f92b7604
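
CPUID leaf 0xA reports the number of fixed-function counters in EDX bits 4:0. One way to honour that value under a hypervisor while keeping the traditional "at least 3" assumption on bare metal is sketched below. The function names and the `under_hypervisor` parameter are assumptions for illustration and are not taken from the patch itself.

```c
#include <stdint.h>

/* CPUID.0AH:EDX[4:0] is the number of fixed-function counters. */
static unsigned fixed_counters_from_edx(uint32_t edx)
{
    return edx & 0x1f;
}

/*
 * Bare metal: assume at least 3 fixed counters, as perf did before.
 * Under a hypervisor: trust the reported count, which may be lower.
 */
static unsigned usable_fixed_counters(uint32_t edx, int under_hypervisor)
{
    unsigned reported = fixed_counters_from_edx(edx);
    unsigned assumed = under_hypervisor ? 0 : 3;

    return reported > assumed ? reported : assumed;
}
```

With this rule, a guest whose hypervisor exposes only two fixed counters never tries to check or program a third one.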