- 15 Feb 2017, 2 commits
Committed by Paolo Bonzini

Pending interrupts might be in the PI descriptor when the LAPIC is restored from an external state; we do not want them to be injected.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

As in the SVM patch, the guest physical address is already passed by VMX to x86_emulate_instruction, so mark the GPA as available in vcpu->arch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 09 Feb 2017, 1 commit
Committed by Arnd Bergmann

The newly added hypercall doesn't work on x86-32:

    arch/x86/kvm/x86.c: In function 'kvm_pv_clock_pairing':
    arch/x86/kvm/x86.c:6163:6: error: implicit declaration of function 'kvm_get_walltime_and_clockread'; did you mean 'kvm_get_time_scale'? [-Werror=implicit-function-declaration]

This adds an #ifdef around it, matching the one around the related functions that are also only implemented on 64-bit systems.

Fixes: 55dd00a7 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
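A minimal sketch of the shape of such a guard in the hypercall dispatch, assuming the case label and helper named in the error above (illustrative; the exact hunk may differ):

    switch (nr) {
    #ifdef CONFIG_X86_64
    case KVM_HC_CLOCK_PAIRING:
        /* kvm_pv_clock_pairing() only exists when
         * kvm_get_walltime_and_clockread() does, i.e. on 64-bit. */
        ret = kvm_pv_clock_pairing(vcpu, a0, a1);
        break;
    #endif
    default:
        ret = -KVM_ENOSYS;
        break;
    }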
-
- 08 Feb 2017, 4 commits
Committed by Paolo Bonzini

Fix rebase breakage from commit 55dd00a7 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall", 2017-01-24), courtesy of the "I could have sworn I had pushed the right branch" department.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Marcelo Tosatti

Add a hypercall to retrieve the host realtime clock and the TSC value used to calculate that clock read. Used to implement clock synchronization between host and guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
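For reference, a sketch of the guest-visible ABI introduced by the hypercall, per the uapi header added in the patch (field comments are editorial):

    /* arch/x86/include/uapi/asm/kvm_para.h */
    struct kvm_clock_pairing {
        __s64 sec;    /* seconds of the host realtime clock */
        __s64 nsec;   /* nanoseconds of the host realtime clock */
        __u64 tsc;    /* guest TSC value matching that clock read */
        __u32 flags;
        __u32 pad[9];
    };

A guest invokes it roughly as kvm_hypercall2(KVM_HC_CLOCK_PAIRING, gpa_of_struct, KVM_CLOCK_PAIRING_WALLCLOCK), and KVM writes the result to the given guest physical address.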
-
Committed by David Hildenbrand

vmx_complete_nested_posted_interrupt() can't fail, so let's turn it into a void function.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by David Hildenbrand

kmap() can't fail, so it will always return a valid pointer. Let's just get rid of the unnecessary checks.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 27 Jan 2017, 5 commits
Committed by Junaid Shahid

Before fast page fault restores an access-track PTE back to a regular PTE, it now also verifies that the restored PTE would grant the necessary permissions for the faulting access to succeed. If not, it falls back to the slow page fault path.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
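A hedged sketch of the permission check (helper names follow the access-tracking series; treat this as indicative rather than the exact hunk):

    static bool is_access_allowed(u32 fault_err_code, u64 spte)
    {
        if (fault_err_code & PFERR_FETCH_MASK)
            return is_executable_pte(spte);

        if (fault_err_code & PFERR_WRITE_MASK)
            return is_writable_pte(spte);

        /* Fault was on read; a present PTE is sufficient. */
        return spte & PT_PRESENT_MASK;
    }

fast_page_fault() would call this on the candidate restored SPTE and bail out to the slow path when it returns false.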
-
Committed by Junaid Shahid

Redo the page table walk in fast_page_fault when retrying, so that we are working on the latest PTE even if the hierarchy changes.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

Reword the comment to hopefully make it clearer.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

Instead of the caller including the SPTE_SPECIAL_MASK in the masks supplied to kvm_mmu_set_mmio_spte_mask() and kvm_mmu_set_mask_ptes(), those functions now include the SPTE_SPECIAL_MASK themselves. Note that bit 63 is now reset in the default MMIO mask.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

Rename the EPT_VIOLATION_READ/WRITE/INSTR constants to EPT_VIOLATION_ACC_READ/WRITE/INSTR to make it clearer that they signify the type of the memory access, as opposed to the permissions granted by the PTE.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
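After the rename, the exit-qualification bits for EPT violations read roughly as follows (as defined in arch/x86/include/asm/vmx.h):

    #define EPT_VIOLATION_ACC_READ_BIT   0
    #define EPT_VIOLATION_ACC_WRITE_BIT  1
    #define EPT_VIOLATION_ACC_INSTR_BIT  2
    #define EPT_VIOLATION_ACC_READ       (1 << EPT_VIOLATION_ACC_READ_BIT)
    #define EPT_VIOLATION_ACC_WRITE      (1 << EPT_VIOLATION_ACC_WRITE_BIT)
    #define EPT_VIOLATION_ACC_INSTR      (1 << EPT_VIOLATION_ACC_INSTR_BIT)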
-
- 21 Jan 2017, 1 commit
Committed by Jim Mattson

This reverts commit bc613494. A CPUID instruction executed in VMX non-root mode always causes a VM-exit, regardless of the leaf being queried.

Fixes: bc613494 ("KVM: nested VMX: disable perf cpuid reporting")
Signed-off-by: Jim Mattson <jmattson@google.com>
[The issue solved by bc613494 has been resolved with ff651cb6 ("KVM: nVMX: Add nested msr load/restore algorithm").]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
- 18 Jan 2017, 1 commit
Committed by Piotr Luc

Vector population count instructions for dwords and qwords are to be used in future Intel Xeon and Xeon Phi processors. Bit 14 of CPUID[level:0x07, ECX] indicates that the new instructions are supported by a processor. The spec can be found in the Intel Software Developer Manual (SDM) or in the Instruction Set Extensions Programming Reference (ISE).

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
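A small userspace probe for the feature bit, assuming a compiler whose <cpuid.h> provides __get_cpuid_count (GCC 7+ or recent clang):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* AVX512_VPOPCNTDQ: CPUID.(EAX=07H, ECX=0):ECX[bit 14] */
        if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) &&
            (ecx & (1u << 14)))
            puts("vpopcntd/vpopcntq supported");
        else
            puts("not supported");
        return 0;
    }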
-
- 17 Jan 2017, 1 commit
Committed by Dmitry Vyukov

emulator_fix_hypercall() replaces the hypercall with a vmcall instruction, but it does not handle the GP exception properly when it writes the new instruction. It can return X86EMUL_PROPAGATE_FAULT without setting the exception information. This leads to incorrect emulation and triggers WARN_ON(ctxt->exception.vector > 0x1f) in x86_emulate_insn(), as discovered by the syzkaller fuzzer:

    WARNING: CPU: 2 PID: 18646 at arch/x86/kvm/emulate.c:5558
    Call Trace:
     warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
     x86_emulate_insn+0x16a5/0x4090 arch/x86/kvm/emulate.c:5572
     x86_emulate_instruction+0x403/0x1cc0 arch/x86/kvm/x86.c:5618
     emulate_instruction arch/x86/include/asm/kvm_host.h:1127 [inline]
     handle_exception+0x594/0xfd0 arch/x86/kvm/vmx.c:5762
     vmx_handle_exit+0x2b7/0x38b0 arch/x86/kvm/vmx.c:8625
     vcpu_enter_guest arch/x86/kvm/x86.c:6888 [inline]
     vcpu_run arch/x86/kvm/x86.c:6947 [inline]

Set the exception information when the write in emulator_fix_hypercall() fails.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: kvm@vger.kernel.org
Cc: syzkaller@googlegroups.com
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
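The fix boils down to handing the emulator's exception struct to the emulated write, so a faulting write records the vector; a sketch consistent with the description (treat as indicative):

    static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
    {
        struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
        char instruction[3];
        unsigned long rip = kvm_rip_read(vcpu);

        kvm_x86_ops->patch_hypercall(vcpu, instruction);

        /* Passing &ctxt->exception (instead of NULL) fills in the #GP
         * details when the write faults. */
        return emulator_write_emulated(ctxt, rip, instruction, 3,
                                       &ctxt->exception);
    }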
-
- 12 Jan 2017, 4 commits
Committed by Paolo Bonzini

This is CVE-2017-2583. On Intel this causes a failed vmentry because SS's type is neither 3 nor 7 (even though the manual says this check is only done for usable SS, and the dmesg splat says that SS is unusable!). On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.

The fix fabricates a data segment descriptor when SS is set to a null selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb. Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3; this in turn ensures CPL < 3, because RPL must be equal to CPL.

Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing the bug and deciphering the manuals.

Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com>
Fixes: 79d5b4c3
Cc: stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Wanpeng Li

Reported by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
    IP: _raw_spin_lock+0xc/0x30
    PGD 3e28eb067
    PUD 3f0ac6067
    PMD 0
    Oops: 0002 [#1] SMP
    CPU: 0 PID: 2431 Comm: test Tainted: G OE 4.10.0-rc1+ #3
    Call Trace:
     ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
     kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
     ? pick_next_task_fair+0xe1/0x4e0
     ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
     kvm_vcpu_ioctl+0x33a/0x600 [kvm]
     ? hrtimer_try_to_cancel+0x29/0x130
     ? do_nanosleep+0x97/0xf0
     do_vfs_ioctl+0xa1/0x5d0
     ? __hrtimer_init+0x90/0x90
     ? do_nanosleep+0x5b/0xf0
     SyS_ioctl+0x79/0x90
     do_syscall_64+0x6e/0x180
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0

The syzkaller folks reported a NULL pointer dereference caused by ENABLE_CAP succeeding even without an irqchip. The Hyper-V synthetic interrupt controller is activated, resulting in a wrong request to rescan the ioapic and a NULL pointer dereference. Reproducer:

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <linux/kvm.h>
    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #ifndef KVM_CAP_HYPERV_SYNIC
    #define KVM_CAP_HYPERV_SYNIC 123
    #endif

    void* thr(void* arg)
    {
        struct kvm_enable_cap cap;
        cap.flags = 0;
        cap.cap = KVM_CAP_HYPERV_SYNIC;
        ioctl((long)arg, KVM_ENABLE_CAP, &cap);
        return 0;
    }

    int main()
    {
        void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE,
                              MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        int kvmfd = open("/dev/kvm", 0);
        int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);

        struct kvm_userspace_memory_region memreg;
        memreg.slot = 0;
        memreg.flags = 0;
        memreg.guest_phys_addr = 0;
        memreg.memory_size = 0x1000;
        memreg.userspace_addr = (unsigned long)host_mem;
        ((char *)host_mem)[0] = 0xf4;   /* hlt */
        ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);

        int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);

        struct kvm_sregs sregs;
        ioctl(cpufd, KVM_GET_SREGS, &sregs);
        sregs.cr0 = 0;
        sregs.cr4 = 0;
        sregs.efer = 0;
        sregs.cs.selector = 0;
        sregs.cs.base = 0;
        ioctl(cpufd, KVM_SET_SREGS, &sregs);

        struct kvm_regs regs = { .rflags = 2 };
        ioctl(cpufd, KVM_SET_REGS, &regs);

        ioctl(vmfd, KVM_CREATE_IRQCHIP, 0);

        pthread_t th;
        pthread_create(&th, 0, thr, (void*)(long)cpufd);
        usleep(rand() % 10000);
        ioctl(cpufd, KVM_RUN, 0);
        pthread_join(th, 0);
        return 0;
    }

This patch fixes it by failing ENABLE_CAP when there is no irqchip.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 5c919412 (kvm/x86: Hyper-V synthetic interrupt controller)
Cc: stable@vger.kernel.org # 4.5+
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
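A sketch of the fix in kvm_vcpu_ioctl_enable_cap(), matching the described behavior (surrounding context indicative):

    case KVM_CAP_HYPERV_SYNIC:
        if (!irqchip_in_kernel(vcpu->kvm))
            return -EINVAL;
        return kvm_hv_activate_synic(vcpu);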
-
Committed by Steve Rutherford

Introduces segmented_write_std. Switches from emulated reads/writes to standard read/writes in fxsave, fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding kernel memory leak. Since commit 283c95d0 ("KVM: x86: emulate FXSAVE and FXRSTOR", 2016-11-09), which is luckily not yet in any final release, this would also be an exploitable kernel memory *write*!

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 96051572
Fixes: 283c95d0
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
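A sketch of the new helper: it mirrors segmented_write() but goes through the emulator's standard write op, which respects guest paging instead of treating the access as emulated MMIO (names per the description; treat as indicative):

    static int segmented_write_std(struct x86_emulate_ctxt *ctxt,
                                   struct segmented_address addr,
                                   void *data,
                                   unsigned int size)
    {
        int rc;
        ulong linear;

        /* Segment checks plus linear-address computation, write access. */
        rc = linearize(ctxt, addr, size, true, &linear);
        if (rc != X86EMUL_CONTINUE)
            return rc;

        return ctxt->ops->write_std(ctxt, linear, data, size,
                                    &ctxt->exception);
    }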
-
Committed by David Matlack

KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled). These are implemented with delayed_work structs, which can still be pending when the KVM module is unloaded. We've seen this cause kernel panics when the kvm_intel module is quickly reloaded. Use the new static_key_deferred_flush() API to flush pending updates on module unload.

Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
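A sketch of the flush on unload, assuming a kvm_lapic_exit() hook invoked from kvm_arch_exit() (names are indicative):

    void kvm_lapic_exit(void)
    {
        /* Make sure no deferred static-key update is still queued
         * when the module text goes away. */
        static_key_deferred_flush(&apic_hw_disabled);
        static_key_deferred_flush(&apic_sw_disabled);
    }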
-
- 09 Jan 2017, 21 commits
Committed by Jim Mattson

Checks on the operand to VMXON are performed after the check for legacy mode operation and the #GP checks, according to the pseudo-code in Intel's SDM.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Committed by Paolo Bonzini

On interrupt delivery the PPR can only grow (except for auto-EOI), so it is impossible for non-auto-EOI interrupt delivery to result in KVM_REQ_EVENT. We can therefore use __apic_update_ppr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
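For context, the value __apic_update_ppr computes is the architectural processor priority; a hedged sketch of the rule (variable names illustrative):

    /* PPR is the higher of TPR and the priority class of the
     * highest in-service vector (ISRV). */
    int isr  = apic_find_highest_isr(apic);
    u32 isrv = (isr != -1) ? isr : 0;
    u32 tpr  = kvm_lapic_get_reg(apic, APIC_TASKPRI);
    u32 ppr;

    if ((tpr & 0xf0) >= (isrv & 0xf0))
        ppr = tpr & 0xff;
    else
        ppr = isrv & 0xf0;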
-
Committed by Paolo Bonzini

On PPR update, we set KVM_REQ_EVENT unconditionally anytime PPR is lowered. But we can already take the IRR into account here.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

PPR needs to be updated on every IRR read, because we may have missed TPR writes that _increased_ PPR. However, these updates need not generate KVM_REQ_EVENT, because either KVM_REQ_EVENT has been set already in __apic_accept_irq, or we are going to process the interrupt right away.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

Since we're already in VCPU context, all we have to do here is recompute the PPR value. That will in turn generate a KVM_REQ_EVENT if necessary.

Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini

This statistic can be useful to estimate the cost of an IRQ injection scenario, by comparing it with irq_injections. For example, the stat shows that sti;hlt triggers more KVM_REQ_EVENT than sti;nop.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Tom Lendacky

When a guest causes an NPF which requires emulation, KVM sometimes walks the guest page tables to translate the GVA to a GPA. This is unnecessary most of the time on AMD hardware, since the hardware provides the GPA in EXITINFO2. The only exceptions are string operations that involve rep, and operations that use two memory locations. With rep, the GPA will only be the value of the initial NPF, and with dual memory locations we won't know which memory address was translated into EXITINFO2.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář

The LAPIC is in xAPIC mode after reset, which poses a problem for hotplug of VCPUs with a high APIC ID: the reset VCPU is waiting for INIT/SIPI, but there is no way to uniquely address it using xAPIC. From many possible options, we chose the one that also works on real hardware: accepting interrupts addressed to the LAPIC's x2APIC ID even in xAPIC mode. KVM intentionally differs from real hardware, because real hardware (Knights Landing) does just "x2apic_id & 0xff" to decide whether to accept the interrupt in xAPIC mode, and it can deliver one interrupt to more than one physical destination, e.g. 0x123 to 0x123 and 0x23.

Fixes: 682f732e ("KVM: x86: bump MAX_VCPUS to 288")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
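A sketch of the resulting physical-destination match, assuming helpers kvm_x2apic_id()/kvm_xapic_id() from this series (indicative only):

    static bool kvm_apic_match_physical_addr(struct kvm_lapic *apic, u32 mda)
    {
        if (kvm_apic_broadcast(apic, mda))
            return true;

        if (apic_x2apic_mode(apic))
            return mda == kvm_x2apic_id(apic);

        /* Hotplugged VCPUs in xAPIC mode with an APIC ID above 0xff can
         * only be addressed uniquely via their x2APIC ID. */
        if (kvm_x2apic_id(apic) > 0xff && mda == kvm_x2apic_id(apic))
            return true;

        return mda == kvm_xapic_id(apic);
    }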
-
Committed by Radim Krčmář

The slow path tried to prevent IPIs from x2APIC VCPUs from being delivered to xAPIC VCPUs, and vice versa. Make the slow path behave like the fast path, which never made that distinction.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář

There were three call sites:
- recalculate_apic_map and kvm_apic_match_physical_addr, where it would only complicate the implementation of x2APIC hotplug;
- apic_debug, where it was still somewhat preserved, but keeping the old function just for apic_debug was not worth it.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář

An interrupt to self can be sent without knowing the APIC ID.

Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

This change implements lockless access tracking for Intel CPUs without EPT A bits. This is achieved by marking the PTEs as not-present (but not completely clearing them) when clear_flush_young() is called after marking the pages as accessed. When an EPT violation is generated as a result of the VM accessing those pages, the PTEs are restored to their original values.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

MMIO SPTEs currently set both bits 62 and 63 to distinguish them as special PTEs. However, bit 63 is used as the SVE bit in Intel EPT PTEs. The SVE bit is ignored for misconfigured PTEs, but not necessarily for not-present PTEs. Since MMIO SPTEs use an EPT misconfiguration, using bit 63 for them is acceptable. However, the upcoming fast access tracking feature adds another type of special tracking PTE, which uses not-present PTEs and hence should not set bit 63. In order to use common bits to distinguish both types of special PTEs, we now use only bit 62 as the special bit.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
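The resulting marker, as defined in arch/x86/kvm/mmu.c:

    /* Bit 62 is the only "special SPTE" marker; bit 63 stays clear
     * because it is the EPT suppress-#VE (SVE) bit on Intel. */
    #define SPTE_SPECIAL_MASK (1ULL << 62)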
-
Committed by Junaid Shahid

mmu_spte_update() tracks changes in the accessed/dirty state of the SPTE being updated and calls kvm_set_pfn_accessed/dirty appropriately. However, in some cases (e.g. when aging the SPTE), this shouldn't be done. mmu_spte_update_no_track() is introduced for use in such cases.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

This simplifies mmu_spte_update() a little bit. The checks for clearing of the accessed and dirty bits are refactored into separate functions, which are used inside both mmu_spte_update() and mmu_spte_clear_track_bits(), as well as kvm_test_age_rmapp(). The new helper functions handle both the case when A/D bits are supported in hardware and the case when they are not.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

This change adds retries to the fast page fault path. Without the retries the code still works, but if a retry does end up being needed, it results in a second page fault for the same memory access, which causes much more overhead than simply retrying within the original fault. This will be especially useful with the upcoming fast access tracking change, as that will make it more likely for retries to be needed (e.g. due to read and write faults happening on different CPUs at the same time).

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

This change renames spte_is_locklessly_modifiable() to spte_can_locklessly_be_made_writable() to distinguish it from other forms of lockless modification. The full set of lockless modifications is covered by spte_has_volatile_bits().

Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Junaid Shahid

This change adds some symbolic constants for the VM exit qualifications related to EPT violations, and updates handle_ept_violation() to use these constants instead of hard-coded numbers.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by David Matlack

When using two-dimensional paging, the mmu_page_hash (which provides lookups for existing kvm_mmu_page structs) becomes imbalanced, with too many collisions in buckets 0 and 512. This has been seen to cause mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on VMs with a large amount of RAM mapped with 4K pages.

The current hash function uses the lower 10 bits of gfn to index into mmu_page_hash. When doing shadow paging, gfn is the address of the guest page table being shadowed. These tables are 4K-aligned, which makes the low bits of gfn a good hash. However, with two-dimensional paging, no guest page tables are being shadowed, so gfn is the base address that is mapped by the table. Thus page tables (level=1) have a 2MB-aligned gfn, page directories (level=2) have a 1GB-aligned gfn, etc. This means hashes will only differ in their 10th bit.

hash_64() provides a better hash. For example, on a VM with ~200G (99458 direct=1 kvm_mmu_page structs):

    hash           max_mmu_page_hash_collisions
    --------------------------------------------
    low 10 bits    49847
    hash_64        105
    perfect        97

While we're changing the hash, increase the table size by 4x to better support large VMs (this further reduces the number of collisions in the 200G VM to 29). Note that hash_64() does not provide a good distribution prior to commit ef703f49 ("Eliminate bad hash multipliers from hash_32() and hash_64()").

Signed-off-by: David Matlack <dmatlack@google.com>
Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
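The change itself is small; a sketch matching the description (the shift value reflects the 4x growth from 1024 to 4096 buckets):

    #include <linux/hash.h>

    #define KVM_MMU_HASH_SHIFT 12                      /* was 10 */
    #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)

    static unsigned kvm_page_table_hashfn(gfn_t gfn)
    {
        /* before: gfn & (KVM_NUM_MMU_PAGES - 1), i.e. low bits only */
        return hash_64(gfn, KVM_MMU_HASH_SHIFT);
    }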
-
Committed by David Matlack

Report the maximum number of mmu_page_hash collisions as a per-VM stat. This will make it easy to identify problems with the mmu_page_hash in the future.

Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Radim Krčmář

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-