- 18 3月, 2021 1 次提交
-
-
由 Vitaly Kuznetsov 提交于
Create an infrastructure for tracking Hyper-V TSC page status, i.e. if it was updated from guest/host side or if we've failed to set it up (because e.g. guest wrote some garbage to HV_X64_MSR_REFERENCE_TSC) and there's no need to retry. Also, in a hypothetical situation when we are in 'always catchup' mode for TSC we can now avoid contending 'hv->hv_lock' on every guest enter by setting the state to HV_TSC_PAGE_BROKEN after compute_tsc_page_parameters() returns false. Check for HV_TSC_PAGE_SET state instead of '!hv->tsc_ref.tsc_sequence' in get_time_ref_counter() to properly handle the situation when we failed to write the updated TSC page values to the guest. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210316143736.964151-4-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 17 3月, 2021 6 次提交
-
-
由 Vitaly Kuznetsov 提交于
When KVM_REQ_MASTERCLOCK_UPDATE request is issued (e.g. after migration) we need to make sure no vCPU sees stale values in PV clock structures and thus all vCPUs are kicked with KVM_REQ_CLOCK_UPDATE. Hyper-V TSC page clocksource is global and kvm_guest_time_update() only updates in on vCPU0 but this is not entirely correct: nothing blocks some other vCPU from entering the guest before we finish the update on CPU0 and it can read stale values from the page. Invalidate TSC page in kvm_gen_update_masterclock() to switch all vCPUs to using MSR based clocksource (HV_X64_MSR_TIME_REF_COUNT). Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210316143736.964151-3-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Vitaly Kuznetsov 提交于
HV_X64_MSR_TSC_EMULATION_STATUS indicates whether TSC accesses are emulated after migration (to accommodate for a different host TSC frequency when TSC scaling is not supported; we don't implement this in KVM). Guest can use the same MSR to stop TSC access emulation by writing zero. Writing anything else is forbidden. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210316143736.964151-2-vkuznets@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
Store the address space ID in the TDP iterator so that it can be retrieved without having to bounce through the root shadow page. This streamlines the code and fixes a Sparse warning about not properly using rcu_dereference() when grabbing the ID from the root on the fly. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-5-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
In tdp_mmu_iter_cond_resched there is a call to tdp_iter_start which causes the iterator to continue its walk over the paging structure from the root. This is needed after a yield as paging structure could have been freed in the interim. The tdp_iter_start call is not very clear and something of a hack. It requires exposing tdp_iter fields not used elsewhere in tdp_mmu.c and the effect is not obvious from the function name. Factor a more aptly named function out of tdp_iter_start and call it from tdp_mmu_iter_cond_resched and tdp_iter_start. No functional change intended. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-4-bgardon@google.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
Fix a missing rcu_dereference in tdp_mmu_zap_spte_atomic. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-3-bgardon@google.com> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Ben Gardon 提交于
The pt passed into handle_removed_tdp_mmu_page does not need RCU protection, as it is not at any risk of being freed by another thread at that point. However, the implicit cast from tdp_sptep_t to u64 * dropped the __rcu annotation without a proper rcu_derefrence. Fix this by passing the pt as a tdp_ptep_t and then rcu_dereferencing it in the function. Suggested-by: NSean Christopherson <seanjc@google.com> Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210315233803.2706477-2-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 14 3月, 2021 2 次提交
-
-
由 Sergei Trofimovich 提交于
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly. The bug is in mismatch between get/set errors: static inline long syscall_get_error(struct task_struct *task, struct pt_regs *regs) { return regs->r10 == -1 ? regs->r8:0; } static inline long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) { return regs->r8; } static inline void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, int error, long val) { if (error) { /* error < 0, but ia64 uses > 0 return value */ regs->r8 = -error; regs->r10 = -1; } else { regs->r8 = val; regs->r10 = 0; } } Tested on v5.10 on rx3600 machine (ia64 9040 CPU). Link: https://lkml.kernel.org/r/20210221002554.333076-2-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org> Reported-by: NDmitry V. Levin <ldv@altlinux.org> Reviewed-by: NDmitry V. Levin <ldv@altlinux.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Sergei Trofimovich 提交于
In https://bugs.gentoo.org/769614 Dmitry noticed that `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called via glibc's syscall() wrapper. ia64 has two ways to call syscalls from userspace: via `break` and via `eps` instructions. The difference is in stack layout: 1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8} 2. `break` uses userspace stack frame: may be locals (glibc provides one), in{0..7} == out{0..8}. Both work fine in syscall handling cde itself. But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to re-extract syscall arguments but it does not account for locals. The change always skips locals registers. It should not change `eps` path as kernel's handler already enforces locals=0 and fixes `break`. Tested on v5.10 on rx3600 machine (ia64 9040 CPU). Link: https://lkml.kernel.org/r/20210221002554.333076-1-slyfox@gentoo.org Link: https://bugs.gentoo.org/769614Signed-off-by: NSergei Trofimovich <slyfox@gentoo.org> Reported-by: NDmitry V. Levin <ldv@altlinux.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 3月, 2021 4 次提交
-
-
由 Wanpeng Li 提交于
Advancing the timer expiration should only be necessary on guest initiated writes. When we cancel the timer and clear .pending during state restore, clear expired_tscdeadline as well. Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1614818118-965-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Sean Christopherson 提交于
If mmu_lock is held for write, don't bother setting !PRESENT SPTEs to REMOVED_SPTE when recursively zapping SPTEs as part of shadow page removal. The concurrent write protections provided by REMOVED_SPTE are not needed, there are no backing page side effects to record, and MMIO SPTEs can be left as is since they are protected by the memslot generation, not by ensuring that the MMIO SPTE is unreachable (which is racy with respect to lockless walks regardless of zapping behavior). Skipping !PRESENT drastically reduces the number of updates needed to tear down sparsely populated MMUs, e.g. when tearing down a 6gb VM that didn't touch much memory, 6929/7168 (~96.6%) of SPTEs were '0' and could be skipped. Avoiding the write itself is likely close to a wash, but avoiding __handle_changed_spte() is a clear-cut win as that involves saving and restoring all non-volatile GPRs (it's a subtly big function), as well as several conditional branches before bailing out. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210310003029.1250571-1-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
# lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 88 On-line CPU(s) list: 0-63 Off-line CPU(s) list: 64-87 # cat /proc/cmdline BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US .UTF-8 no-kvmclock-vsyscall # echo 1 > /sys/devices/system/cpu/cpu76/online -bash: echo: write error: Cannot allocate memory The per-cpu vsyscall pvclock data pointer assigns either an element of the static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory hvclock_mem (vCPU > 64), the dynamically memory will not be allocated if kvmclock vsyscall is disabled, this can result in cpu hotpluged fails in kvmclock_setup_percpu() which returns -ENOMEM. It's broken for no-vsyscall and sometimes you end up with vsyscall disabled if the host does something strange. This patch fixes it by allocating this dynamically memory unconditionally even if vsyscall is disabled. Fixes: 6a1cac56 ("x86/kvm: Use __bss_decrypted attribute in shared variables") Reported-by: NZelin Deng <zelin.deng@linux.alibaba.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: stable@vger.kernel.org#v4.19-rc5+ Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Message-Id: <1614130683-24137-1-git-send-email-wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Muhammad Usama Anjum 提交于
This patch adds the annotation to fix the following sparse errors: arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map * arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map * arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter * arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map * arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map * arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * Signed-off-by: NMuhammad Usama Anjum <musamaanjum@gmail.com> Message-Id: <20210305191123.GA497469@LEGION> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 12 3月, 2021 4 次提交
-
-
由 Marc Zyngier 提交于
When registering a memslot, we check the size and location of that memslot against the IPA size to ensure that we can provide guest access to the whole of the memory. Unfortunately, this check rejects memslot that end-up at the exact limit of the addressing capability for a given IPA size. For example, it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit IPA space. Fix it by relaxing the check to accept a memslot reaching the limit of the IPA space. Fixes: c3058d5d ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE") Reviewed-by: NEric Auger <eric.auger@redhat.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: NAndrew Jones <drjones@redhat.com> Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org
-
由 Marc Zyngier 提交于
KVM/arm64 has forever used a 40bit default IPA space, partially due to its 32bit heritage (where the only choice is 40bit). However, there are implementations in the wild that have a *cough* much smaller *cough* IPA space, which leads to a misprogramming of VTCR_EL2, and a guest that is stuck on its first memory access if userspace dares to ask for the default IPA setting (which most VMMs do). Instead, blundly reject the creation of such VM, as we can't satisfy the requirements from userspace (with a one-off warning). Also clarify the boot warning, and document that the VM creation will fail when an unsupported IPA size is provided. Although this is an ABI change, it doesn't really change much for userspace: - the guest couldn't run before this change, but no error was returned. At least userspace knows what is happening. - a memory slot that was accepted because it did fit the default IPA space now doesn't even get a chance to be registered. The other thing that is left doing is to convince userspace to actually use the IPA space setting instead of relying on the antiquated default. Fixes: 233a7cb2 ("kvm: arm64: Allow tuning the physical address size for VM") Signed-off-by: NMarc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: NAndrew Jones <drjones@redhat.com> Reviewed-by: NEric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org
-
由 Peter Zijlstra 提交于
Commit ab234a26 ("x86/pv: Rework arch_local_irq_restore() to not use popf") replaced "push %reg; popf" with something like: "test $0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry that commit ea24213d ("objtool: Add UACCESS validation") relies on. The result is: drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: PUSHF stack exhausted Meanwhile, commit c9c324dc ("objtool: Support stack layout changes in alternatives") makes that we can actually use stack-ops in alternatives, which means we can revert 1ff865e3 ("x86,smap: Fix smap_{save,restore}() alternatives"). That in turn means we can limit the PUSHF/POPF handling of ea24213d to those instructions that are in alternatives. Fixes: ab234a26 ("x86/pv: Rework arch_local_irq_restore() to not use popf") Reported-by: NBorislav Petkov <bp@alien8.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/YEY4rIbQYa5fnnEp@hirez.programming.kicks-ass.net
-
由 Christophe Leroy 提交于
unrecoverable_exception() is called from interrupt handlers or after an interrupt handler has failed. Make it a standard function to avoid doubling the actions performed on interrupt entry (e.g.: user time accounting). Fixes: 3a96570f ("powerpc: convert interrupt handlers to use wrappers") Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ae96c59fa2cb7f24a8929c58cfa2c909cb8ff1f1.1615291471.git.christophe.leroy@csgroup.eu
-
- 11 3月, 2021 6 次提交
-
-
由 Ard Biesheuvel 提交于
These routines lost all existing users during the latest merge window so we can remove them. This avoids the need to fix them in the context of fixing a regression related to the ID map on 52-bit VA kernels. Signed-off-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-3-ardb@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>
-
由 Ard Biesheuvel 提交于
52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anwyay, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one. This behavior was changed by commit 1401bef7 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well. Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable. Fixes: 90ec95cd ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: NMark Salter <msalter@redhat.com> Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.comSigned-off-by: NArd Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>
-
由 Masahiro Yamada 提交于
As Documentation/kbuild/llvm.rst notes, LLVM=1 switches the default of tools, but you can still override CC, LD, etc. individually. This LLVM=1 check is unneeded because each tool is already checked separately. "make CC=clang LD=ld.lld NM=llvm-nm AR=llvm-ar LLVM_IAS=1 menuconfig" should be able to enable Clang LTO. Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org> Reviewed-by: NNathan Chancellor <nathan@kernel.org>
-
由 Sami Tolvanen 提交于
While LTO with KASAN is normally not useful, hardware tag-based KASAN can be used also in production kernels with ARM64_MTE. Therefore, allow KASAN_HW_TAGS to be selected together with HAS_LTO_CLANG. Reported-by: NAlistair Delva <adelva@google.com> Signed-off-by: NSami Tolvanen <samitolvanen@google.com> Reviewed-by: NKees Cook <keescook@chromium.org> Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
-
由 Jan Beulich 提交于
It's not helpful if every driver has to cook its own. Generalize xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which shouldn't have expanded to zero to begin with). Use the constants in p2m.c and gntdev.c right away, and update field types where necessary so they would match with the constants' types (albeit without touching struct ioctl_gntdev_grant_ref's ref field, as that's part of the public interface of the kernel and would require introducing a dependency on Xen's grant_table.h public header). Signed-off-by: NJan Beulich <jbeulich@suse.com> Reviewed-by: NJuergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.comSigned-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
由 Jan Beulich 提交于
They're only used internally, and the layering violation they contain (x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly advise against any (uncontrolled) use from a module. The functions also never had users except the ones from drivers/xen/grant-table.c forever since their introduction in 3.15. Signed-off-by: NJan Beulich <jbeulich@suse.com> Reviewed-by: NStefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.comSigned-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
-
- 10 3月, 2021 11 次提交
-
-
由 Sean Christopherson 提交于
Initialize x86_pmu.guest_get_msrs to return 0/NULL to handle the "nop" case. Patching in perf_guest_get_msrs_nop() during setup does not work if there is no PMU, as setup bails before updating the static calls, leaving x86_pmu.guest_get_msrs NULL and thus a complete nop. Ultimately, this causes VMX abort on VM-Exit due to KVM putting random garbage from the stack into the MSR load list. Add a comment in KVM to note that nr_msrs is valid if and only if the return value is non-NULL. Fixes: abd562df ("x86/perf: Use static_call for x86_pmu.guest_get_msrs") Reported-by: NDmitry Vyukov <dvyukov@google.com> Reported-by: syzbot+cce9ef2dd25246f815ee@syzkaller.appspotmail.com Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309171019.1125243-1-seanjc@google.com
-
由 Rob Herring 提交于
Commit 0fdf1bb7 ("arm64: perf: Avoid PMXEV* indirection") changed armv8pmu_read_evcntr() to return a u32 instead of u64. The result is silent truncation of the event counter when using 64-bit counters. Given the offending commit appears to have passed thru several folks, it seems likely this was a bad rebase after v8.5 PMU 64-bit counters landed. Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: <stable@vger.kernel.org> Fixes: 0fdf1bb7 ("arm64: perf: Avoid PMXEV* indirection") Signed-off-by: NRob Herring <robh@kernel.org> Acked-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NAlexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.orgSigned-off-by: NWill Deacon <will@kernel.org>
-
由 James Morse 提交于
As per ARM ARM DDI 0487G.a, when FEAT_LPA2 is implemented, ID_AA64MMFR0_EL1 might contain a range of values to describe supported translation granules (4K and 16K pages sizes in particular) instead of just enabled or disabled values. This changes __enable_mmu() function to handle complete acceptable range of values (depending on whether the field is signed or unsigned) now represented with ID_AA64MMFR0_TGRAN_SUPPORTED_[MIN..MAX] pair. While here, also fix similar situations in EFI stub and KVM as well. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NJames Morse <james.morse@arm.com> Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1615355590-21102-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NWill Deacon <will@kernel.org>
-
由 Catalin Marinas 提交于
In a system supporting MTE, the linear map must allow reading/writing allocation tags by setting the memory type as Normal Tagged. Currently, this is only handled for memory present at boot. Hotplugged memory uses Normal non-Tagged memory. Introduce pgprot_mhp() for hotplugged memory and use it in add_memory_resource(). The arm64 code maps pgprot_mhp() to pgprot_tagged(). Note that ZONE_DEVICE memory should not be mapped as Tagged and therefore setting the memory type in arch_add_memory() is not feasible. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Fixes: 0178dc76 ("arm64: mte: Use Normal Tagged attributes for the linear map") Reported-by: NPatrick Daly <pdaly@codeaurora.org> Tested-by: NPatrick Daly <pdaly@codeaurora.org> Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org Cc: <stable@vger.kernel.org> # 5.10.x Cc: Will Deacon <will@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NVincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.comSigned-off-by: NWill Deacon <will@kernel.org>
-
由 Corentin Labbe 提交于
After my patch there is CONFIG_ATA defined twice. Remove the duplicate one. Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot test with NFS. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Fixes: a57cdeb3 ("sparc: sparc64_defconfig: add necessary configs for qemu") Signed-off-by: NCorentin Labbe <clabbe@baylibre.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rob Gardner 提交于
is_no_fault_exception() has two bugs which were discovered via random opcode testing with stress-ng. Both are caused by improper filtering of opcodes. The first bug can be triggered by a floating point store with a no-fault ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040. The code first tests op3[5] (0x1000000), which denotes a floating point instruction, and then tests op3[2] (0x200000), which denotes a store instruction. But these bits are not mutually exclusive, and the above mentioned opcode has both bits set. The intent is to filter out stores, so the test for stores must be done first in order to have any effect. The second bug can be triggered by a floating point load with one of the invalid ASI values 0x8e or 0x8f, which pass this check in is_no_fault_exception(): if ((asi & 0xf2) == ASI_PNF) An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38", opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal in handle_ldf_stq(), and is_no_fault_exception() must not allow these invalid asi values to make it that far. In both of these cases, handle_ldf_stq() reacts by calling sun4v_data_access_exception() or spitfire_data_access_exception(), which call is_no_fault_exception() and results in an infinite recursion. Signed-off-by: NRob Gardner <rob.gardner@oracle.com> Tested-by: NAnatoly Pugachev <matorola@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christophe Leroy 提交于
Add stub instances of enable_kernel_vsx() and disable_kernel_vsx() when CONFIG_VSX is not set, to avoid following build failure. CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o In file included from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29, from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37, from drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 'dcn_bw_apply_registry_override': ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? [-Werror=implicit-function-declaration] 64 | enable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in expansion of macro 'DC_FP_START' 640 | DC_FP_START(); | ^~~~~~~~~~~ ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? [-Werror=implicit-function-declaration] 75 | disable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in expansion of macro 'DC_FP_END' 676 | DC_FP_END(); | ^~~~~~~~~ cc1: some warnings being treated as errors make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] Error 1 This works because the caller is checking if VSX is available using cpu_has_feature(): #define DC_FP_START() { \ if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ preempt_disable(); \ enable_kernel_vsx(); \ } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ preempt_disable(); \ enable_kernel_altivec(); \ } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ preempt_disable(); \ enable_kernel_fp(); \ } \ When CONFIG_VSX is not selected, cpu_has_feature(CPU_FTR_VSX_COMP) constant folds to 'false' so the call to enable_kernel_vsx() is discarded and the build succeeds. Fixes: 16a9dea1 ("amdgpu: Enable initial DCN support on POWER") Cc: stable@vger.kernel.org # v5.6+ Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org> Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> [mpe: Incorporate some discussion comments into the change log] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8d7d285a027e9d21f5ff7f850fa71a2655b0c4af.1615279170.git.christophe.leroy@csgroup.eu
-
由 Daniel Axtens 提交于
Nick's patch cleaning up the SRR specifiers in exception-64s.S missed a single instance of EXC_HV_OR_STD. Clean that up. Caught by clang's integrated assembler. Fixes: 3f7fbd97 ("powerpc/64s/exception: Clean up SRR specifiers") Signed-off-by: NDaniel Axtens <dja@axtens.net> Acked-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210225031006.1204774-2-dja@axtens.net
-
由 Nicholas Piggin 提交于
This bit operation was inverted and set the low bit rather than cleared it, breaking the ability to ptrace non-volatile GPRs after exec. Fix. Only affects 64e and 32-bit. Fixes: feb9df34 ("powerpc/64s: Always has full regs, so remove remnant checks") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210308085530.3191843-1-npiggin@gmail.com
-
由 Michael Ellerman 提交于
In ppc_function_entry() we look for a specific set of instructions by masking the instructions and comparing with a known value. Currently those known values are just literal hex values, and we recently discovered one of them was wrong. Instead construct the values using the existing constants we have for defining various fields of instructions. Suggested-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Acked-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20210309071544.515303-1-mpe@ellerman.id.au
-
由 Marc Zyngier 提交于
It recently became apparent that the ARMv8 architecture has interesting rules regarding attributes being used when fetching instructions if the MMU is off at Stage-1. In this situation, the CPU is allowed to fetch from the PoC and allocate into the I-cache (unless the memory is mapped with the XN attribute at Stage-2). If we transpose this to vcpus sharing a single physical CPU, it is possible for a vcpu running with its MMU off to influence another vcpu running with its MMU on, as the latter is expected to fetch from the PoU (and self-patching code doesn't flush below that level). In order to solve this, reuse the vcpu-private TLB invalidation code to apply the same policy to the I-cache, nuking it every time the vcpu runs on a physical CPU that ran another vcpu of the same VM in the past. This involve renaming __kvm_tlb_flush_local_vmid() to __kvm_flush_cpu_context(), and inserting a local i-cache invalidation there. Cc: stable@vger.kernel.org Signed-off-by: NMarc Zyngier <maz@kernel.org> Acked-by: NWill Deacon <will@kernel.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org
-
- 09 3月, 2021 6 次提交
-
-
由 Andrey Konovalov 提交于
When CONFIG_DEBUG_VIRTUAL is enabled, the default page_to_virt() macro implementation from include/linux/mm.h is used. That definition doesn't account for KASAN tags, which leads to no tags on page_alloc allocations. Provide an arm64-specific definition for page_to_virt() when CONFIG_DEBUG_VIRTUAL is enabled that takes care of KASAN tags. Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Cc: <stable@vger.kernel.org> Signed-off-by: NAndrey Konovalov <andreyknvl@google.com> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/4b55b35202706223d3118230701c6a59749d9b72.1615219501.git.andreyknvl@google.comSigned-off-by: NWill Deacon <will@kernel.org>
-
由 Joerg Roedel 提交于
The #VC handler must run in atomic context and cannot sleep. This is a problem when it tries to fetch instruction bytes from user-space via copy_from_user(). Introduce a insn_fetch_from_user_inatomic() helper which uses __copy_from_user_inatomic() to safely copy the instruction bytes to kernel memory in the #VC handler. Fixes: 5e3427a7 ("x86/sev-es: Handle instruction fetches from user-space") Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-6-joro@8bytes.org
-
由 Joerg Roedel 提交于
Call irqentry_nmi_enter()/irqentry_nmi_exit() in the #VC handler to correctly track the IRQ state during its execution. Fixes: 0786138c ("x86/sev-es: Add a Runtime #VC Exception Handler") Reported-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-5-joro@8bytes.org
-
由 Joerg Roedel 提交于
The code in the NMI handler to adjust the #VC handler IST stack is needed in case an NMI hits when the #VC handler is still using its IST stack. But the check for this condition also needs to look if the regs->sp value is trusted, meaning it was not set by user-space. Extend the check to not use regs->sp when the NMI interrupted user-space code or the SYSCALL gap. Fixes: 315562c9 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Reported-by: NAndy Lutomirski <luto@kernel.org> Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-3-joro@8bytes.org
-
由 Naveen N. Rao 提交于
'lis r2,N' is 'addis r2,0,N' and the instruction encoding in the macro LIS_R2 is incorrect (it currently maps to 'addis r0,r2,N'). Fix the same. Fixes: c71b7eff ("powerpc: Add ABIv2 support to ppc_function_entry") Cc: stable@vger.kernel.org # v3.16+ Reported-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: NSegher Boessenkool <segher@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210304020411.16796-1-naveen.n.rao@linux.vnet.ibm.com
-
由 Thomas Bogendoerfer 提交于
BMIPS is one of the few platforms that do change the exception base. After commit 2dcb3964 ("memblock: do not start bottom-up allocations with kernel_end") we started seeing BMIPS boards fail to boot with the built-in FDT being corrupted. Before the cited commit, early allocations would be in the [kernel_end, RAM_END] range, but after commit they would be within [RAM_START + PAGE_SIZE, RAM_END]. The custom exception base handler that is installed by bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the memory region allocated by unflatten_and_copy_device_tree() thus corrupting the FDT used by the kernel. To fix this, we need to perform an early reservation of the custom exception space. Additional we reserve the first 4k (1k for R3k) for either normal exception vector space (legacy CPUs) or special vectors like cache exceptions. Huge thanks to Serge for analysing and proposing a solution to this issue. Fixes: 2dcb3964 ("memblock: do not start bottom-up allocations with kernel_end") Reported-by: NKamal Dasu <kdasu.kdev@gmail.com> Debugged-by: NSerge Semin <Sergey.Semin@baikalelectronics.ru> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Tested-by: NFlorian Fainelli <f.fainelli@gmail.com> Reviewed-by: NSerge Semin <fancer.lancer@gmail.com> Signed-off-by: NThomas Bogendoerfer <tsbogend@alpha.franken.de>
-