提交 · addb63c18a0d52a9ce2611d039f981f7b6148d2b · openeuler / Kernel

22 6月, 2017 1 次提交

KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows · addb63c1

由 Heiko Carstens 提交于 6月 19, 2017

For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().

Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).

Both of the above can result in incorrect address translations. Only
for real space designations with a token origin of zero and addresses
below one megabyte the translation was correct.

Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.

Fixes: 3218f709 ("s390/mm: support real-space for gmap shadows")
Cc: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

addb63c1

01 6月, 2017 1 次提交

KVM: s390: fix ais handling vs cpu model · 1ba15b24

由 Christian Borntraeger 提交于 5月 31, 2017

If ais is disabled via cpumodel, we must act accordingly, even if
KVM_CAP_S390_AIS was enabled.
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NPierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: NYi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: NEric Farman <farman@linux.vnet.ibm.com>

1ba15b24

18 5月, 2017 3 次提交

Merge tag 'kvm-arm-for-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm · 55c315ee

由 Radim Krčmář 提交于 5月 18, 2017

KVM/ARM Fixes for v4.12-rc2.

Includes:
 - A fix for a build failure introduced in -rc1 when tracepoints are
   enabled on 32-bit ARM.
 - Disabling use of stack pointer protection in the hyp code which can
   cause panics.
 - A handful of VGIC fixes.
 - A fix to the init of the redistributors on GICv3 systems that
   prevented boot with kvmtool on GICv3 systems introduced in -rc1.
 - A number of race conditions fixed in our MMU handling code.
 - A fix for the guest being able to program the debug extensions for
   the host on the 32-bit side.

55c315ee

KVM: arm/arm64: Hold slots_lock when unregistering kvm io bus devices · fa472fa9

由 Christoffer Dall 提交于 5月 17, 2017

We were not holding the kvm->slots_lock as required when calling
kvm_io_bus_unregister_dev() as required.

This only affects the error path, but still, let's do our due
diligence.

Reported by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>
Reviewed-by: NEric Auger <eric.auger@redhat.com>

fa472fa9

KVM: arm/arm64: Fix bug when registering redist iodevs · 552c9f47

由 Christoffer Dall 提交于 5月 17, 2017

If userspace creates the VCPUs after initializing the VGIC, then we end
up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because
it is called prior to adding the VCPU into the vcpus array on the VM.

There is no tight coupling between the VCPU index and the area of the
redistributor region used for the VCPU, so we can simply ensure that all
creations of redistributors are serialized per VM, and increment an
offset when we successfully add a redistributor.

The vgic_register_redist_iodev() function can be called from two paths:
vgic_redister_all_redist_iodev() which is called via the kvm_vgic_addr()
device attribute handler. This patch already holds the kvm->lock mutex.

The other path is via kvm_vgic_vcpu_init, which is called through a
longer chain from kvm_vm_ioctl_create_vcpu(), which releases the
kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can
simply take this mutex again later for our purposes.

Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs")
Signed-off-by: NChristoffer Dall <cdall@linaro.org>
Tested-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>

552c9f47

17 5月, 2017 1 次提交

KVM: x86: lower default for halt_poll_ns · b401ee0b

由 Paolo Bonzini 提交于 4月 18, 2017

In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
increase heavily even in cases where the performance improvement was
small.  In particular, bandwidth divided by CPU usage was as much as
60% lower.

To some extent this is the expected effect of the patch, and the
additional CPU utilization is only visible when running the
benchmarks.  However, halving the threshold also halves the extra
CPU utilization (from +30-130% to +20-70%) and has no negative
effect on performance.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

b401ee0b

16 5月, 2017 3 次提交

kvm: arm/arm64: Fix use after free of stage2 page table · 0c428a6a

由 Suzuki K Poulose 提交于 5月 16, 2017

We yield the kvm->mmu_lock occassionaly while performing an operation
(e.g, unmap or permission changes) on a large area of stage2 mappings.
However this could possibly cause another thread to clear and free up
the stage2 page tables while we were waiting for regaining the lock and
thus the original thread could end up in accessing memory that was
freed. This patch fixes the problem by making sure that the stage2
pagetable is still valid after we regain the lock. The fact that
mmu_notifer->release() could be called twice (via __mmu_notifier_release
and mmu_notifier_unregsister) enhances the possibility of hitting
this race where there are two threads trying to unmap the entire guest
shadow pages.

While at it, cleanup the redudant checks around cond_resched_lock in
stage2_wp_range(), as cond_resched_lock already does the same checks.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: andreyknvl@google.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

0c428a6a

kvm: arm/arm64: Force reading uncached stage2 PGD · 2952a607

由 Suzuki K Poulose 提交于 5月 16, 2017

Make sure we don't use a cached value of the KVM stage2 PGD while
resetting the PGD.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

2952a607

KVM: nVMX: fix EPT permissions as reported in exit qualification · 0780516a

由 Paolo Bonzini 提交于 5月 11, 2017

This fixes the new ept_access_test_read_only and ept_access_test_read_write
testcases from vmx.flat.

The problem is that gpte_access moves bits around to switch from EPT
bit order (XWR) to ACC_*_MASK bit order (RWX).  This results in an
incorrect exit qualification.  To fix this, make pt_access and
pte_access operate on raw PTE values (only with NX flipped to mean
"can execute") and call gpte_access at the end of the walk.  This
lets us use pte_access to compute the exit qualification with XWR
bit order.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Reviewed-by: NXiao Guangrong <xiaoguangrong@tencent.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

0780516a

15 5月, 2017 13 次提交

KVM: VMX: Don't enable EPT A/D feature if EPT feature is disabled · fce6ac4c

由 Wanpeng Li 提交于 5月 11, 2017

We can observe eptad kvm_intel module parameter is still Y
even if ept is disabled which is weird. This patch will
not enable EPT A/D feature if EPT feature is disabled.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

fce6ac4c

KVM: x86: Fix load damaged SSEx MXCSR register · a575813b

由 Wanpeng Li 提交于 5月 11, 2017

Reported by syzkaller:

   BUG: unable to handle kernel paging request at ffffffffc07f6a2e
   IP: report_bug+0x94/0x120
   PGD 348e12067
   P4D 348e12067
   PUD 348e14067
   PMD 3cbd84067
   PTE 80000003f7e87161

   Oops: 0003 [#1] SMP
   CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G           OE   4.11.0+ #8
   task: ffff92fdfb525400 task.stack: ffffbda6c3d04000
   RIP: 0010:report_bug+0x94/0x120
   RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202
    do_trap+0x156/0x170
    do_error_trap+0xa3/0x170
    ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
    ? mark_held_locks+0x79/0xa0
    ? retint_kernel+0x10/0x10
    ? trace_hardirqs_off_thunk+0x1a/0x1c
    do_invalid_op+0x20/0x30
    invalid_op+0x1e/0x30
   RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
    ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm]
    kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm]
    kvm_vcpu_ioctl+0x384/0x780 [kvm]
    ? kvm_vcpu_ioctl+0x384/0x780 [kvm]
    ? sched_clock+0x13/0x20
    ? __do_page_fault+0x2a0/0x550
    do_vfs_ioctl+0xa4/0x700
    ? up_read+0x1f/0x40
    ? __do_page_fault+0x2a0/0x550
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x23/0xc2

SDM mentioned that "The MXCSR has several reserved bits, and attempting to write
a 1 to any of these bits will cause a general-protection exception(#GP) to be
generated". The syzkaller forks' testcase overrides xsave area w/ random values
and steps on the reserved bits of MXCSR register. The damaged MXCSR register
values of guest will be restored to SSEx MXCSR register before vmentry. This
patch fixes it by catching userspace override MXCSR register reserved bits w/
random values and bails out immediately.
Reported-by: NAndrey Konovalov <andreyknvl@google.com>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

a575813b

kvm: nVMX: off by one in vmx_write_pml_buffer() · 4769886b

由 Dan Carpenter 提交于 5月 10, 2017

There are PML_ENTITY_NUM elements in the pml_address[] array so the >
should be >= or we write beyond the end of the array when we do:

	pml_address[vmcs12->guest_pml_index--] = gpa;

Fixes: c5f983f6 ("nVMX: Implement emulated Page Modification Logging")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

4769886b

R
Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · 65acb891
由 Radim Krčmář 提交于 5月 15, 2017
```
- fix build failures with PR KVM configurations
- fix a host crash that can occur on POWER9 with radix guests
```
65acb891

KVM: arm: rename pm_fake handler to trap_raz_wi · 9b619a8f

由 Zhichao Huang 提交于 5月 11, 2017

pm_fake doesn't quite describe what the handler does (ignoring writes
and returning 0 for reads).

As we're about to use it (a lot) in a different context, rename it
with a (admitedly cryptic) name that make sense for all users.
Signed-off-by: NZhichao Huang <zhichao.huang@linaro.org>
Reviewed-by: NAlex Bennee <alex.bennee@linaro.org>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

9b619a8f

KVM: arm: plug potential guest hardware debug leakage · 661e6b02

由 Zhichao Huang 提交于 5月 11, 2017

Hardware debugging in guests is not intercepted currently, it means
that a malicious guest can bring down the entire machine by writing
to the debug registers.

This patch enable trapping of all debug registers, preventing the
guests to access the debug registers. This includes access to the
debug mode(DBGDSCR) in the guest world all the time which could
otherwise mess with the host state. Reads return 0 and writes are
ignored (RAZ_WI).

The result is the guest cannot detect any working hardware based debug
support. As debug exceptions are still routed to the guest normal
debug using software based breakpoints still works.

To support debugging using hardware registers we need to implement a
debug register aware world switch as well as special trapping for
registers that may affect the host state.

Cc: stable@vger.kernel.org
Signed-off-by: NZhichao Huang <zhichao.huang@linaro.org>
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

661e6b02

kvm: arm/arm64: Fix race in resetting stage2 PGD · 6c0d706b

由 Suzuki K Poulose 提交于 5月 03, 2017

In kvm_free_stage2_pgd() we check the stage2 PGD before holding
the lock and proceed to take the lock if it is valid. And we unmap
the page tables, followed by releasing the lock. We reset the PGD
only after dropping this lock, which could cause a race condition
where another thread waiting on or even holding the lock, could
potentially see that the PGD is still valid and proceed to perform
a stage2 operation and later encounter a NULL PGD.

[223090.242280] Unable to handle kernel NULL pointer dereference at
virtual address 00000040
[223090.262330] PC is at unmap_stage2_range+0x8c/0x428
[223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
[223090.262531] Call trace:
[223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
[223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
[223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
[223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
[223090.262543] [<ffff0000080a2478>]
kvm_mmu_notifier_invalidate_page+0x50/0x8c
[223090.262547] [<ffff0000082274f8>]
__mmu_notifier_invalidate_page+0x5c/0x84
[223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
[223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
[223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
[223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
[223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
[223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
[223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
[223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
[223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
[223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
[223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
[223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
[223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
[223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
[223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
[223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
[223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
[223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
[223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
[223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
[223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
[223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
[223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c

This patch moves the stage2 PGD manipulation under the lock.
Reported-by: NAlexander Graf <agraf@suse.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: NChristoffer Dall <cdall@linaro.org>
Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NChristoffer Dall <cdall@linaro.org>

6c0d706b

KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2 registers · 15d2bffd