提交 · 3e51d43516b99a5ede461381b4d031998f8dbdf3 · openanolis / cloud-kernel

08 9月, 2016 13 次提交

arm64: KVM: Move kvm_vcpu_get_condition out of emulate.c · 3e51d435

由 Marc Zyngier 提交于 9月 06, 2016

In order to make emulate.c more generic, move the arch-specific
manupulation bits out of emulate.c.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

3e51d435

arm64: KVM: VHE: reset PSTATE.PAN on entry to EL2 · cb96408d

由 Vladimir Murzin 提交于 9月 01, 2016

SCTLR_EL2.SPAN bit controls what happens with the PSTATE.PAN bit on an
exception. However, this bit has no effect on the PSTATE.PAN when
HCR_EL2.E2H or HCR_EL2.TGE is unset. Thus when VHE is used and
exception taken from a guest PSTATE.PAN bit left unchanged and we
continue with a value guest has set.

To address that always reset PSTATE.PAN on entry from EL1.

Fixes: 1f364c8c ("arm64: VHE: Add support for running Linux in EL2 mode")
Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

cb96408d

KVM: arm/arm64: Get rid of exported aliases to static functions · cf0ba18a

由 Christoffer Dall 提交于 9月 01, 2016

When rewriting the assembly code to C code, it was useful to have
exported aliases or static functions so that we could keep the existing
common C code unmodified and at the same time rewrite arm64 from
assembly to C code, and later do the arm part.

Now when both are done, we really don't need this level of indirection
anymore, and it's time to save a few lines and brain cells.
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

cf0ba18a

arm64/kvm: remove unused stub functions · 777c1557

由 Mark Rutland 提交于 8月 30, 2016

Now that 32-bit KVM no longer performs cache maintenance for page table
updates, we no longer need empty stubs for arm64. Remove them.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

777c1557

arm/kvm: excise redundant cache maintenance · dcadda14

由 Mark Rutland 提交于 8月 30, 2016

When modifying Stage-2 page tables, we perform cache maintenance to
account for non-coherent page table walks. However, this is unnecessary,
as page table walks are guaranteed to be coherent in the presence of the
virtualization extensions.

Per ARM DDI 0406C.c, section B1.7 ("The Virtualization Extensions"), the
virtualization extensions mandate the multiprocessing extensions.

Per ARM DDI 0406C.c, section B3.10.1 ("General TLB maintenance
requirements"), as described in the sub-section titled "TLB maintenance
operations and the memory order model", this maintenance is not required
in the presence of the multiprocessing extensions.

Hence, we need not perform this cache maintenance when modifying Stage-2
entries.

This patch removes the logic for performing the redundant maintenance.
To ensure visibility and ordering of updates, a dsb(ishst) that was
otherwise implicit in the maintenance is folded into kvm_set_pmd() and
kvm_set_pte().
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

dcadda14

arm64: KVM: Optimize __guest_enter/exit() to save a few instructions · 68381b2b

由 Shanker Donthineni 提交于 8月 30, 2016

We are doing an unnecessary stack push/pop operation when restoring
the guest registers x0-x18 in __guest_enter(). This patch saves the
two instructions by using x18 as a base register. No need to store
the vcpu context pointer in stack because it is redundant, the same
information is available in tpidr_el2. The function __guest_exit()
calling convention is slightly modified, caller only pushes the regs
x0-x1 to stack instead of regs x0-x3.
Signed-off-by: NShanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

68381b2b

KVM: nVMX: expose INS/OUTS information support · 9ac7e3e8

由 Jan Dakinevich 提交于 9月 04, 2016

Expose the feature to L1 hypervisor if host CPU supports it, since
certain hypervisors requires it for own purposes.

According to Intel SDM A.1, if CPU supports the feature,
VMX_INSTRUCTION_INFO field of VMCS will contain detailed information
about INS/OUTS instructions handling. This field is already copied to
VMCS12 for L1 hypervisor (see prepare_vmcs12 routine) independently
feature presence.
Signed-off-by: NJan Dakinevich <jan.dakinevich@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

9ac7e3e8

KVM: VMX: not use vmcs_config in setup_vmcs_config · 16cb0255

由 Paolo Bonzini 提交于 9月 05, 2016

setup_vmcs_config takes a pointer to the vmcs_config global.  The
indirection is somewhat pointless, but just keep things consistent
for now.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

16cb0255

KVM: x86: remove stale comments · 1a698235

由 Paolo Bonzini 提交于 8月 19, 2016

handle_external_intr does not enable interrupts anymore, vcpu_enter_guest
does it after calling guest_exit_irqoff.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1a698235

KVM: x86: ratelimit and decrease severity for guest-triggered printk · bbe41b95

由 Paolo Bonzini 提交于 8月 19, 2016

These are mostly related to nested VMX.  They needn't have
a loglevel as high as KERN_WARN, and mustn't be allowed to
pollute the host logs.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

bbe41b95

KVM: nVMX: pass valid guest linear-address to the L1 · 119a9c01

由 Jan Dakinevich 提交于 9月 04, 2016

If EPT support is exposed to L1 hypervisor, guest linear-address field
of VMCS should contain GVA of L2, the access to which caused EPT violation.
Signed-off-by: NJan Dakinevich <jan.dakinevich@gmail.com>
Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

119a9c01

KVM: nVMX: make emulated nested preemption timer pinned · f15a75ee

由 Wanpeng Li 提交于 8月 30, 2016

Commit 61abdbe0 ("kvm: x86: make lapic hrtimer pinned") pins the emulated
lapic timer. This patch does the same for the emulated nested preemption
timer to avoid vmexit an unrelated vCPU and the latency of kicking IPI to
another vCPU.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

f15a75ee

vmx: refine validity check for guest linear address · 72e0ae58

由 Liang Li 提交于 8月 18, 2016

The validity check for the guest line address is inefficient,
check the invalid value instead of enumerating the valid ones.
Signed-off-by: NLiang Li <liang.z.li@intel.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

72e0ae58

19 8月, 2016 4 次提交

KVM: x86: Expose more Intel AVX512 feature to guest · 8e3562f6

由 Luwei Kang 提交于 8月 02, 2016

Expose AVX512DQ, AVX512BW, AVX512VL feature to guest.
Its spec can be found at:
https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdfSigned-off-by: NLuwei Kang <luwei.kang@intel.com>
[Resolved a trivial conflict with removed F(PCOMMIT).]
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

8e3562f6

mmu: don't pass *kvm to spte_write_protect and spte_*_dirty · c4f138b4

由 Bandan Das 提交于 8月 02, 2016

That parameter isn't used in these functions,
it's probably a historical artifact.
Signed-off-by: NBandan Das <bsd@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c4f138b4

KVM: lapic: don't recalculate apic map table twice when enabling LAPIC · 187ca84b

由 Wanpeng Li 提交于 8月 03, 2016

APIC map table is recalculated during reset APIC ID to the initial value
when enabling LAPIC. This patch move the recalculate_apic_map() to the
next branch since we don't need to recalculate apic map twice in current
codes.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

187ca84b

MIPS: KVM: Check for pfn noslot case · ba913e4f

由 James Hogan 提交于 8月 19, 2016

When mapping a page into the guest we error check using is_error_pfn(),
however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an
error HVA for the page. This can only happen on MIPS right now due to
unusual memslot management (e.g. being moved / removed / resized), or
with an Enhanced Virtual Memory (EVA) configuration where the default
KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed
in a later patch). This case will be treated as a pfn of zero, mapping
the first page of physical memory into the guest.

It would appear the MIPS KVM port wasn't updated prior to being merged
(in v3.10) to take commit 81c52c56 ("KVM: do not treat noslot pfn as
a error pfn") into account (merged v3.8), which converted a bunch of
is_error_pfn() calls to is_error_noslot_pfn(). Switch to using
is_error_noslot_pfn() instead to catch this case properly.

Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.y-
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ba913e4f

18 8月, 2016 3 次提交

kvm: nVMX: fix nested tsc scaling · c95ba92a

由 Peter Feiner 提交于 8月 17, 2016

When the host supported TSC scaling, L2 would use a TSC multiplier of
0, which causes a VM entry failure. Now L2's TSC uses the same
multiplier as L1.
Signed-off-by: NPeter Feiner <pfeiner@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

c95ba92a

KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write · dccbfcf5

由 Radim Krčmář 提交于 8月 08, 2016

If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
write with vmcs02 as the current VMCS.
This will incorrectly apply modifications intended for vmcs01 to vmcs02
and L2 can use it to gain access to L0's x2APIC registers by disabling
virtualized x2APIC while using msr bitmap that assumes enabled.

Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
current VMCS.  An alternative solution would temporarily make vmcs01 the
current VMCS, but it requires more care.

Fixes: 8d14695f ("x86, apicv: add virtual x2apic support")
Reported-by: NJim Mattson <jmattson@google.com>
Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

dccbfcf5

KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC · d048c098

由 Radim Krčmář 提交于 8月 08, 2016

msr bitmap can be used to avoid a VM exit (interception) on guest MSR
accesses.  In some configurations of VMX controls, the guest can even
directly access host's x2APIC MSRs.  See SDM 29.5 VIRTUALIZING MSR-BASED
APIC ACCESSES.

L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
To do so, L1 would first trick KVM to disable all possible interceptions
by enabling APICv features and then would turn those features off;
nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
not intercept previously enabled MSRs even though they were not safe
with the new configuration.

Correctly re-enabling interceptions is not enough as a second bug would
still allow L1+L2 to access host's MSRs: msr bitmap was shared for all
VMCSs, so L1 could trigger a race to get the desired combination of msr
bitmap and VMX controls.

This fix allocates a msr bitmap for every L1 VCPU, allows only safe
x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would
have to intercept everything anyway.

Fixes: 3af18d9c ("KVM: nVMX: Prepare for using hardware MSR bitmap")
Reported-by: NJim Mattson <jmattson@google.com>
Suggested-by: NWincy Van <fanwenyi0529@gmail.com>
Reviewed-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

d048c098

17 8月, 2016 4 次提交

arm64: KVM: report configured SRE value to 32-bit world · f7f6f2d9

由 Vladimir Murzin 提交于 8月 10, 2016

After commit b34f2bcb ("arm64: KVM: Make ICC_SRE_EL1 access return the
configured SRE value") we report SRE value to 64-bit guest, but 32-bit
one still handled as RAZ/WI what leads to funny promise we do not keep:

"GICv3: GIC: unable to set SRE (disabled at EL2), panic ahead"

Instead, return the actual value of the ICC_SRE_EL1 register that the
guest should see.

 [ Tweaked commit message - Christoffer ]
Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

f7f6f2d9

arm64: KVM: remove misleading comment on pmu status · b63bebe2

由 Vladimir Murzin 提交于 8月 10, 2016

Comment about how PMU access is handled is not relavant since v4.6
where proper PMU support was added in.
Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com>
Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

b63bebe2

arm64: Document workaround for Cortex-A72 erratum #853709 · 674e7012

由 Marc Zyngier 提交于 8月 16, 2016

We already have a workaround for Cortex-A57 erratum #852523,
but Cortex-A72 r0p0 to r0p2 do suffer from the same issue
(known as erratum #853709).

Let's document the fact that we already handle this.
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

674e7012

KVM: arm/arm64: Change misleading use of is_error_pfn · 9ac71595

由 Christoffer Dall 提交于 8月 17, 2016

When converting a gfn to a pfn, we call gfn_to_pfn_prot, which returns
various kinds of error values. It turns out that is_error_pfn() only
returns true when the gfn was found in a memory slot and could somehow
not be used, but it does not return true if the gfn does not belong to
any memory slot.

Change use to is_error_noslot_pfn() which covers both cases.

Note: Since we already check for kvm_is_error_hva(hva) explicitly in the
caller of this function while holding the kvm->srcu lock protecting the
memory slots, this should never be a problem, but nevertheless this
change is warranted as it shows the intention of the code.
Reported-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>

9ac71595

13 8月, 2016 7 次提交

h8300: Add missing include file to asm/io.h · 2b05980d

由 Guenter Roeck 提交于 6月 08, 2016

h8300 builds fail with

arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’
arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’
arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’

and many related errors.

Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix")
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>

2b05980d

unicore32: mm: Add missing parameter to arch_vma_access_permitted · 783011b1

由 Guenter Roeck 提交于 3月 21, 2016

unicore32 fails to compile with the following errors.

mm/memory.c: In function ‘__handle_mm_fault’:
mm/memory.c:3381: error:
	too many arguments to function ‘arch_vma_access_permitted’
mm/gup.c: In function ‘check_vma_flags’:
mm/gup.c:456: error:
	too many arguments to function ‘arch_vma_access_permitted’
mm/gup.c: In function ‘vma_permits_fault’:
mm/gup.c:640: error:
	too many arguments to function ‘arch_vma_access_permitted’

Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches")
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Acked-by: NGuan Xuetao <gxt@mprc.pku.edu.cn>

783011b1

arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO · 53fb45d3

由 Masahiro Yamada 提交于 6月 08, 2016

When CONFIG_LOCALVERSION_AUTO is disabled, the version string is
just a tag name (or with a '+' appended if HEAD is not a tagged
commit).

During the development (and especially when git-bisecting), longer
version string would be helpful to identify the commit we are running.

This is a default y option, so drop the unset to enable it.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

53fb45d3

arm64: defconfig: add options for virtualization and containers · 2323439f

由 Riku Voipio 提交于 6月 15, 2016

Enable options commonly needed by popular virtualization
and container applications. Use modules when possible to
avoid too much overhead for users not interested.

- add namespace and cgroup options needed
- add seccomp - optional, but enhances Qemu etc
- bridge, nat, veth, macvtap and multicast for routing
  guests and containers
- btfrs and overlayfs modules for container COW backends
- while near it, make fuse a module instead of built-in.

Generated with make saveconfig and dropping unrelated spurious
change hunks while commiting. bloat-o-meter old-vmlinux vmlinux:

add/remove: 905/390 grow/shrink: 767/229 up/down: 183513/-94861 (88652)
....
Total: Before=10515408, After=10604060, chg +0.84%
Signed-off-by: NRiku Voipio <riku.voipio@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

2323439f

arm64: hibernate: handle allocation failures · dfbca61a

由 Mark Rutland 提交于 8月 11, 2016

In create_safe_exec_page(), we create a copy of the hibernate exit text,
along with some page tables to map this via TTBR0. We then install the
new tables in TTBR0.

In swsusp_arch_resume() we call create_safe_exec_page() before trying a
number of operations which may fail (e.g. copying the linear map page
tables). If these fail, we bail out of swsusp_arch_resume() and return
an error code, but leave TTBR0 as-is. Subsequently, the core hibernate
code will call free_basic_memory_bitmaps(), which will free all of the
memory allocations we made, including the page tables installed in
TTBR0.

Thus, we may have TTBR0 pointing at dangling freed memory for some
period of time. If the hibernate attempt was triggered by a user
requesting a hibernate test via the reboot syscall, we may return to
userspace with the clobbered TTBR0 value.

Avoid these issues by reorganising swsusp_arch_resume() such that we
have no failure paths after create_safe_exec_page(). We also add a check
that the zero page allocation succeeded, matching what we have for other
allocations.

Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NJames Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.7+
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

dfbca61a

arm64: hibernate: avoid potential TLB conflict · 0194e760

由 Mark Rutland 提交于 8月 11, 2016

In create_safe_exec_page we install a set of global mappings in TTBR0,
then subsequently invalidate TLBs. While TTBR0 points at the zero page,
and the TLBs should be free of stale global entries, we may have stale
ASID-tagged entries (e.g. from the EFI runtime services mappings) for
the same VAs. Per the ARM ARM these ASID-tagged entries may conflict
with newly-allocated global entries, and we must follow a
Break-Before-Make approach to avoid issues resulting from this.

This patch reworks create_safe_exec_page to invalidate TLBs while the
zero page is still in place, ensuring that there are no potential
conflicts when the new TTBR0 value is installed. As a single CPU is
online while this code executes, we do not need to perform broadcast TLB
maintenance, and can call local_flush_tlb_all(), which also subsumes
some barriers. The remaining assembly is converted to use write_sysreg()
and isb().

Other than this, we safely manipulate TTBRs in the hibernate dance. The
code we install as part of the new TTBR0 mapping (the hibernated
kernel's swsusp_arch_suspend_exit) installs a zero page into TTBR1,
invalidates TLBs, then installs its preferred value. Upon being restored
to the middle of swsusp_arch_suspend, the new image will call
__cpu_suspend_exit, which will call cpu_uninstall_idmap, installing the
zero page in TTBR0 and invalidating all TLB entries.

Fixes: 82869ac5 ("arm64: kernel: Add support for hibernate/suspend-to-disk")
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NJames Morse <james.morse@arm.com>
Tested-by: NJames Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.7+
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

0194e760

arm64: Handle el1 synchronous instruction aborts cleanly · 9adeb8e7

由 Laura Abbott 提交于 8月 09, 2016

Executing from a non-executable area gives an ugly message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0e08
lkdtm: attempting bad execution at ffff000008880700
Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
Hardware name: linux,dummy-virt (DT)
task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

The 'IABT (current EL)' indicates the error but it's a bit cryptic
without knowledge of the ARM ARM. There is also no indication of the
specific address which triggered the fault. The increase in kernel
page permissions makes hitting this case more likely as well.
Handling the case in the vectors gives a much more familiar looking
error message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0840
lkdtm: attempting bad execution at ffff000008880680
Unable to handle kernel paging request at virtual address ffff000008880680
pgd = ffff8000089b2000
[ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
Internal error: Oops: 8400000e [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
Hardware name: linux,dummy-virt (DT)
task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

9adeb8e7

12 8月, 2016 9 次提交

MIPS: KVM: Propagate kseg0/mapped tlb fault errors · 9b731bcf

由 James Hogan 提交于 8月 11, 2016

Propagate errors from kvm_mips_handle_kseg0_tlb_fault() and
kvm_mips_handle_mapped_seg_tlb_fault(), usually triggering an internal
error since they normally indicate the guest accessed bad physical
memory or the commpage in an unexpected way.

Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

9b731bcf

MIPS: KVM: Fix gfn range check in kseg0 tlb faults · 0741f52d

由 James Hogan 提交于 8月 11, 2016

Two consecutive gfns are loaded into host TLB, so ensure the range check
isn't off by one if guest_pmap_npages is odd.

Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

0741f52d

MIPS: KVM: Add missing gfn range check · 8985d503

由 James Hogan 提交于 8月 11, 2016

kvm_mips_handle_mapped_seg_tlb_fault() calculates the guest frame number
based on the guest TLB EntryLo values, however it is not range checked
to ensure it lies within the guest_pmap. If the physical memory the
guest refers to is out of range then dump the guest TLB and emit an
internal error.

Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

8985d503

MIPS: KVM: Fix mapped fault broken commpage handling · c604cffa

由 James Hogan 提交于 8月 11, 2016

kvm_mips_handle_mapped_seg_tlb_fault() appears to map the guest page at
virtual address 0 to PFN 0 if the guest has created its own mapping
there. The intention is unclear, but it may have been an attempt to
protect the zero page from being mapped to anything but the comm page in
code paths you wouldn't expect from genuine commpage accesses (guest
kernel mode cache instructions on that address, hitting trapping
instructions when executing from that address with a coincidental TLB
eviction during the KVM handling, and guest user mode accesses to that
address).

Fix this to check for mappings exactly at KVM_GUEST_COMMPAGE_ADDR (it
may not be at address 0 since commit 42aa12e7 ("MIPS: KVM: Move
commpage so 0x0 is unmapped")), and set the corresponding EntryLo to be
interpreted as 0 (invalid).

Fixes: 858dd5d4 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

c604cffa

KVM: Protect device ops->create and list_add with kvm->lock · a28ebea2

由 Christoffer Dall 提交于 8月 09, 2016

KVM devices were manipulating list data structures without any form of
synchronization, and some implementations of the create operations also
suffered from a lack of synchronization.

Now when we've split the xics create operation into create and init, we
can hold the kvm->lock mutex while calling the create operation and when
manipulating the devices list.

The error path in the generic code gets slightly ugly because we have to
take the mutex again and delete the device from the list, but holding
the mutex during anon_inode_getfd or releasing/locking the mutex in the
common non-error path seemed wrong.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

a28ebea2

KVM: PPC: Move xics_debugfs_init out of create · 023e9fdd

由 Christoffer Dall 提交于 8月 09, 2016

As we are about to hold the kvm->lock during the create operation on KVM
devices, we should move the call to xics_debugfs_init into its own
function, since holding a mutex over extended amounts of time might not
be a good idea.

Introduce an init operation on the kvm_device_ops struct which cannot
fail and call this, if configured, after the device has been created.
Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

023e9fdd

KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed · aca411a4

由 Julius Niedworok 提交于 8月 03, 2016

When triggering KVM_RUN without a user memory region being mapped
(KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could
happen, if the user memory region was not mapped initially or if it
was unmapped after the vcpu is initialized. The function
kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
check function always clears this bit. If gmap_mprotect_notify
returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
was not set anymore. So the next time kvm_s390_handle_requests is
called, the execution would fall trough the check for
KVM_REQ_MMU_RELOAD. The bit needs to be resetted, if
gmap_mprotect_notify returns an error code. Resetting the bit with
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: NJulius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

aca411a4

KVM: s390: set the prefix initially properly · 75a4615c

由 Julius Niedworok 提交于 8月 03, 2016

When KVM_RUN is triggered on a VCPU without an initial reset, a
validity intercept occurs.
Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
thus preventing the bug.
Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: NCornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: NJulius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

75a4615c

perf/x86/intel/uncore: Add enable_box for client MSR uncore · 95f3be79

由 Kan Liang 提交于 8月 11, 2016

There are bug reports about miscounting uncore counters on some
client machines like Sandybridge, Broadwell and Skylake. It is
very likely to be observed on idle systems.

This issue is caused by a hardware issue. PERF_GLOBAL_CTL could be
cleared after Package C7, and nothing will be count.
The related errata (HSD 158) could be found in:

  www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf

This patch tries to work around this issue by re-enabling PERF_GLOBAL_CTL
in ->enable_box(). The workaround does not cover all cases. It helps for new
events after returning from C7. But it cannot prevent C7, it will still
miscount if a counter is already active.

There is no drawback in leaving it enabled, so it does not need
disable_box() here.
Signed-off-by: NKan Liang <kan.liang@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1470925874-59943-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

95f3be79

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功