- 27 12月, 2019 40 次提交
-
-
由 Marc Zyngier 提交于
[ Upstream commit 03fdfb26 ] At the moment, the way we reset system registers is mildly insane: We write junk to them, call the reset functions, and then check that we have something else in them. The "fun" thing is that this can happen while the guest is running (PSCI, for example). If anything in KVM has to evaluate the state of a system register while junk is in there, bad thing may happen. Let's stop doing that. Instead, we track that we have called a reset function for that register, and assume that the reset function has done something. This requires fixing a couple of sysreg refinition in the trap table. In the end, the very need of this reset check is pretty dubious, as it doesn't check everything (a lot of the sysregs leave outside of the sys_regs[] array). It may well be axed in the near future. Tested-by: NZenghui Yu <yuzenghui@huawei.com> Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
commit b6143d10d23ebb4a77af311e8b8b7f019d0163e6 upstream. The initial support for dynamic ftrace trampolines in modules made use of an indirect branch which loaded its target from the beginning of a special section (e71a4e1b ("arm64: ftrace: add support for far branches to dynamic ftrace")). Since no instructions were being patched, no cache maintenance was needed. However, later in be0f272b ("arm64: ftrace: emit ftrace-mod.o contents through code") this code was reworked to output the trampoline instructions directly into the PLT entry but, unfortunately, the necessary cache maintenance was overlooked. Add a call to __flush_icache_range() after writing the new trampoline instructions but before patching in the branch to the trampoline. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: <stable@vger.kernel.org> Fixes: be0f272b ("arm64: ftrace: emit ftrace-mod.o contents through code") Signed-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Anders Roxell 提交于
commit 3d584a3c upstream. When fall-through warnings was enabled by default, commit d93512ef0f0e ("Makefile: Globally enable fall-through warning"), the following warnings was starting to show up: In file included from ../arch/arm64/include/asm/kvm_emulate.h:19, from ../arch/arm64/kvm/regmap.c:13: ../arch/arm64/kvm/regmap.c: In function ‘vcpu_write_spsr32’: ../arch/arm64/include/asm/kvm_hyp.h:31:3: warning: this statement may fall through [-Wimplicit-fallthrough=] asm volatile(ALTERNATIVE(__msr_s(r##nvh, "%x0"), \ ^~~ ../arch/arm64/include/asm/kvm_hyp.h:46:31: note: in expansion of macro ‘write_sysreg_elx’ #define write_sysreg_el1(v,r) write_sysreg_elx(v, r, _EL1, _EL12) ^~~~~~~~~~~~~~~~ ../arch/arm64/kvm/regmap.c:180:3: note: in expansion of macro ‘write_sysreg_el1’ write_sysreg_el1(v, SYS_SPSR); ^~~~~~~~~~~~~~~~ ../arch/arm64/kvm/regmap.c:181:2: note: here case KVM_SPSR_ABT: ^~~~ In file included from ../arch/arm64/include/asm/cputype.h:132, from ../arch/arm64/include/asm/cache.h:8, from ../include/linux/cache.h:6, from ../include/linux/printk.h:9, from ../include/linux/kernel.h:15, from ../include/asm-generic/bug.h:18, from ../arch/arm64/include/asm/bug.h:26, from ../include/linux/bug.h:5, from ../include/linux/mmdebug.h:5, from ../include/linux/mm.h:9, from ../arch/arm64/kvm/regmap.c:11: ../arch/arm64/include/asm/sysreg.h:837:2: warning: this statement may fall through [-Wimplicit-fallthrough=] asm volatile("msr " __stringify(r) ", %x0" \ ^~~ ../arch/arm64/kvm/regmap.c:182:3: note: in expansion of macro ‘write_sysreg’ write_sysreg(v, spsr_abt); ^~~~~~~~~~~~ ../arch/arm64/kvm/regmap.c:183:2: note: here case KVM_SPSR_UND: ^~~~ Rework to add a 'break;' in the swich-case since it didn't have that, leading to an interresting set of bugs. Cc: stable@vger.kernel.org # v4.17+ Fixes: a8928195 ("KVM: arm64: Prepare to handle deferred save/restore of 32-bit registers") Signed-off-by: NAnders Roxell <anders.roxell@linaro.org> [maz: reworked commit message, fixed stable range] Signed-off-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Qian Cai 提交于
[ Upstream commit 7d4e2dcf ] GCC throws a warning, arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page': arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used [-Wunused-but-set-variable] pud_t pud; ^~~ because pud_table() is a macro and compiled away. Fix it by making it a static inline function and for pud_sect() as well. Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Masami Hiramatsu 提交于
[ Upstream commit ee07b93e ] Prohibit probing on return_address() and subroutines which is called from return_address(), since the it is invoked from trace_hardirqs_off() which is also kprobe blacklisted. Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NWill Deacon <will@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Qian Cai 提交于
[ Upstream commit f1d48362 ] GCC throws out this warning on arm64. drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry': drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si' set but not used [-Wunused-but-set-variable] Fix it by making free_screen_info() a static inline function. Acked-by: NWill Deacon <will@kernel.org> Signed-off-by: NQian Cai <cai@lca.pw> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yang Yingliang 提交于
hulk inclusion category: feature bugzilla: 13227 CVE: NA --------------------------- enable CONFIG_NUMA_AWARE_SPINLOCKS in hulk_defconfig and storage_ci_defconfig Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hanjun Guo 提交于
hulk inclusion category: feature bugzilla: 13227 CVE: NA ------------------------------------------------- Set numa-aware qspinlock default off and enable it by passing using_numa_aware_qspinlock in the boot cmdline. Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NWei Li <liwei391@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hanjun Guo 提交于
hulk inclusion category: feature bugzilla: 13227 CVE: NA ------------------------------------------------- Enabling CNA is controlled via a new configuration option (NUMA_AWARE_SPINLOCKS). Add it for arm64 support. Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NWei Li <liwei391@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v5.3-rc2 commit cbdf8a189a66001c36007bf0f5c975d0376c5c3a category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- On a CPU that doesn't support SSBS, PSTATE[12] is RES0. In a system where only some of the CPUs implement SSBS, we end-up losing track of the SSBS bit across task migration. To address this issue, let's force the SSBS bit on context switch. Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3") Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> [will: inverted logic and added comments] Signed-off-by: NWill Deacon <will@kernel.org> Conflicts: arch/arm64/kernel/process.c [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Mark Rutland 提交于
mainline inclusion from mainline-v5.0-rc8 commit f54dada8274643e3ff4436df0ea124aeedc43cae category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- In valid_user_regs() we treat SSBS as a RES0 bit, and consequently it is unexpectedly cleared when we restore a sigframe or fiddle with GPRs via ptrace. This patch fixes valid_user_regs() to account for this, updating the function to refer to the latest ARM ARM (ARM DDI 0487D.a). For AArch32 tasks, SSBS appears in bit 23 of SPSR_EL1, matching its position in the AArch32-native PSR format, and we don't need to translate it as we have to for DIT. There are no other bit assignments that we need to account for today. As the recent documentation describes the DIT bit, we can drop our comment regarding DIT. While removing SSBS from the RES0 masks, existing inconsistent whitespace is corrected. Fixes: d71be2b6c0e19180 ("arm64: cpufeature: Detect SSBS and advertise to userspace") Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit b8925ee2 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- The cpu errata and feature enable callbacks are only called via their respective arm64_cpu_capabilities structure and therefore shouldn't exist in the global namespace. Move the PAN, RAS and cache maintenance emulation enable callbacks into the same files as their corresponding arm64_cpu_capabilities structures, making them static in the process. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/kernel/cpu_errata.c arch/arm64/kernel/cpufeature.c [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit 7c36447a category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- When running without VHE, it is necessary to set SCTLR_EL2.DSSBS if SSBD has been forcefully disabled on the kernel command-line. Acked-by: NChristoffer Dall <christoffer.dall@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit 8f04e8e6 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD state without needing to call into firmware. This patch hooks into the existing SSBD infrastructure so that SSBS is used on CPUs that support it, but it's all made horribly complicated by the very real possibility of big/little systems that don't uniformly provide the new capability. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/kernel/process.c arch/arm64/kernel/ssbd.c arch/arm64/kernel/cpufeature.c [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit 2d1b2a91d56b19636b740ea70c8399d1df249f20 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- Now that we're all merged nicely into mainline, there's no need to check to see if PR_SPEC_STORE_BYPASS is defined. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit d71be2b6c0e19180b5f80a6d42039cc074a693a2 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass Safe (SSBS) which can be used as a mitigation against Spectre variant 4. Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS directly, so that userspace can toggle the SSBS control without trapping to the kernel. This patch probes for the existence of SSBS and advertise the new instructions to userspace if they exist. Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/kernel/cpufeature.c arch/arm64/include/asm/cpucaps.h [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit ca7f686a category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- I was passing through and figuered I'd fix this up: featuer -> feature Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Nathan Chancellor 提交于
mainline inclusion from mainline-5.0 commit 0738c8b5 category: bugfix bugzilla: 11024 CVE: NA ------------------------------------------------- After commit cc9f8349 ("arm64: crypto: add NEON accelerated XOR implementation"), Clang builds for arm64 started failing with the following error message. arch/arm64/lib/xor-neon.c:58:28: error: incompatible pointer types assigning to 'const unsigned long *' from 'uint64_t *' (aka 'unsigned long long *') [-Werror,-Wincompatible-pointer-types] v3 = veorq_u64(vld1q_u64(dp1 + 6), vld1q_u64(dp2 + 6)); ^~~~~~~~ /usr/lib/llvm-9/lib/clang/9.0.0/include/arm_neon.h:7538:47: note: expanded from macro 'vld1q_u64' __ret = (uint64x2_t) __builtin_neon_vld1q_v(__p0, 51); \ ^~~~ There has been quite a bit of debate and triage that has gone into figuring out what the proper fix is, viewable at the link below, which is still ongoing. Ard suggested disabling this warning with Clang with a pragma so no neon code will have this type of error. While this is not at all an ideal solution, this build error is the only thing preventing KernelCI from having successful arm64 defconfig and allmodconfig builds on linux-next. Getting continuous integration running is more important so new warnings/errors or boot failures can be caught and fixed quickly. Link: https://github.com/ClangBuiltLinux/linux/issues/283Suggested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> (cherry picked from commit 0738c8b5) Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jackie Liu 提交于
mainline inclusion from mainline-5.0-rc1 commit: cc9f8349 category: feature feature: NEON accelerated XOR bugzilla: 11024 CVE: NA -------------------------------------------------- This is a NEON acceleration method that can improve performance by approximately 20%. I got the following data from the centos 7.5 on Huawei's HISI1616 chip: [ 93.837726] xor: measuring software checksum speed [ 93.874039] 8regs : 7123.200 MB/sec [ 93.914038] 32regs : 7180.300 MB/sec [ 93.954043] arm64_neon: 9856.000 MB/sec [ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec) I believe this code can bring some optimization for all arm64 platform. thanks for Ard Biesheuvel's suggestions. Signed-off-by: NJackie Liu <liuyun01@kylinos.cn> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jackie Liu 提交于
mainline inclusion from mainline-5.0-rc1 commit: 21e28547 category: feature feature: NEON accelerated XOR bugzilla: 11024 CVE: NA -------------------------------------------------- In a way similar to ARM commit 09096f6a ("ARM: 7822/1: add workaround for ambiguous C99 stdint.h types"), this patch redefines the macros that are used in stdint.h so its definitions of uint64_t and int64_t are compatible with those of the kernel. This patch comes from: https://patchwork.kernel.org/patch/3540001/ Wrote by: Ard Biesheuvel <ard.biesheuvel@linaro.org> We mark this file as a private file and don't have to override asm/types.h Signed-off-by: NJackie Liu <liuyun01@kylinos.cn> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-5.0 commit: efdb25efc7645b326cd5eb82be5feeabe167c24e category: perf bugzilla: 20886 CVE: NA lib/crc32test result: [root@localhost build]# rmmod crc32test && insmod lib/crc32test.ko && dmesg | grep cycles [83170.153209] CPU7: use cycles 26243990 [83183.122137] CPU7: use cycles 26151290 [83309.691628] CPU7: use cycles 26122830 [83312.415559] CPU7: use cycles 26232600 [83313.191479] CPU8: use cycles 26082350 rmmod crc32test && insmod lib/crc32test.ko && dmesg | grep cycles [ 1023.539931] CPU25: use cycles 12256730 [ 1024.850360] CPU24: use cycles 12249680 [ 1025.463622] CPU25: use cycles 12253330 [ 1025.862925] CPU25: use cycles 12269720 [ 1026.376038] CPU26: use cycles 12222480 Based on 13702: arm64/lib: improve CRC32 performance for deep pipelines crypto: arm64/crc32 - remove PMULL based CRC32 driver arm64/lib: add accelerated crc32 routines arm64: cpufeature: add feature for CRC32 instructions lib/crc32: make core crc32() routines weak so they can be overridden ---------------------------------------------- Improve the performance of the crc32() asm routines by getting rid of most of the branches and small sized loads on the common path. Instead, use a branchless code path involving overlapping 16 byte loads to process the first (length % 32) bytes, and process the remainder using a loop that processes 32 bytes at a time. Tested using the following test program: #include <stdlib.h> extern void crc32_le(unsigned short, char const*, int); int main(void) { static const char buf[4096]; srand(20181126); for (int i = 0; i < 100 * 1000 * 1000; i++) crc32_le(0, buf, rand() % 1024); return 0; } On Cortex-A53 and Cortex-A57, the performance regresses but only very slightly. On Cortex-A72 however, the performance improves from $ time ./crc32 real 0m10.149s user 0m10.149s sys 0m0.000s to $ time ./crc32 real 0m7.915s user 0m7.915s sys 0m0.000s Cc: Rui Sun <sunrui26@huawei.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 598b7d41 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Now that the scalar fallbacks have been moved out of this driver into the core crc32()/crc32c() routines, we are left with a CRC32 crypto API driver for arm64 that is based only on 64x64 polynomial multiplication, which is an optional instruction in the ARMv8 architecture, and is less and less likely to be available on cores that do not also implement the CRC32 instructions, given that those are mandatory in the architecture as of ARMv8.1. Since the scalar instructions do not require the special handling that SIMD instructions do, and since they turn out to be considerably faster on some cores (Cortex-A53) as well, there is really no point in keeping this code around so let's just remove it. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 7481cddf29ede204b475facc40e6f65459939881 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Unlike crc32c(), which is wired up to the crypto API internally so the optimal driver is selected based on the platform's capabilities, crc32_le() is implemented as a library function using a slice-by-8 table based C implementation. Even though few of the call sites may be bottlenecks, calling a time variant implementation with a non-negligible D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates support for the CRC32 instructions that were optional in ARMv8.0, but are already widely available, even on the Cortex-A53 based Raspberry Pi. So implement routines that use these instructions if available, and fall back to the existing generic routines otherwise. The selection is based on alternatives patching. Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE, this just codifies the status quo. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 86d0dd34eafffbc76a81aba6ae2d71927d3835a8 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Add a CRC32 feature bit and wire it up to the CPU id register so we will be able to use alternatives patching for CRC32 operations. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/include/asm/cpucaps.h arch/arm64/kernel/cpufeature.c Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hongbo Yao 提交于
hulk inclusion category: feature bugzilla: 13228 CVE: NA --------------------------- enable CONFIG_KTASK in hulk_defconfig and storage_ci_defconfig Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: bugfix bugzilla: 20799 CVE: NA --------------------------- The following patch didn't consider the situation when we boot the system using device tree but 'CONFIG_ACPI' is enabled. In this situation, we also need to set all the CPU as present CPU. Fixes: 280637f70ab5 ("arm64: mark all the GICC nodes in MADT as possible cpu") Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Anshuman Khandual 提交于
mainline inclusion from mainline-v4.20-rc5 commit 77cfe950 category: bugfix bugzilla: 5611 CVE: NA The dummy node ID is marked into all memory ranges on the system. So the dummy node really extends the entire memblock.memory. Hence report correct extent information for the dummy node using memblock range helper functions instead of the range [0LLU, PFN_PHYS(max_pfn) - 1)]. Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms") Acked-by: NPunit Agrawal <punit.agrawal@arm.com> Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v4.20-rc5 commit 8c11ea5c category: bugfix bugzilla: 5654 CVE: NA The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF to BPF call offset can be too far away from core kernel in that relative encoding into imm is not sufficient and could potentially be truncated, see also fd045f6c ("arm64: add support for module PLTs") which adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc(). Therefore, use the recently added bpf_jit_get_func_addr() helper for properly fetching the address through prog->aux->func[off]->bpf_func instead. This also has the benefit to optimize normal helper calls since their address can use the optimized emission. Tested on Cavium ThunderX CN8890. Fixes: db496944 ("bpf: arm64: add JIT support for multi-function programs") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 liuyanshi 提交于
driver inclusion category: feature bugzilla: NA CVE: NA enable hisi pcie debug driver for hiarmtool. Signed-off-by: Nliuyanshi <liuyanshi@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: feature bugzilla: 20208 CVE: NA --------------------------- To support CPU hotplug, we need to implement 'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are called in 'acpi_processor_hotadd_init()/acpi_processor_remove()' when the CPU is hot added into or hot removed from the system. Note: This patch only support core hotplug and does not support socket hotplug because we don't support live configuration of GIC. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: feature bugzilla: 20208 CVE: NA --------------------------- We set 'cpu_possible_mask' based on the enabled GICC node in MADT. If the GICC node is disabled, we will skip initializing the kernel data structure for that CPU. To support CPU hotplug, we need to initialize some CPU related data structure in advance. This patch mark all the GICC nodes as possible CPU and only these enabled GICC nodes as present CPU. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yang Yingliang 提交于
hulk inclusion category: bugfix bugzilla: 4979 CVE: NA ------------------------------------------------- Disable CONFIG_ISCSI_IBFT by default.
-
由 Jeremy Linton 提交于
mainline inclusion from mainline-5.3-rc1 commit d24a0c7099b3 category: feature bugzilla: 16072 CVE: NA --------------------------- ACPI 6.3 adds additional fields to the MADT GICC structure to describe SPE PPI's. We pick these out of the cached reference to the madt_gicc structure similarly to the core PMU code. We then create a platform device referring to the IRQ and let the user/module loader decide whether to load the SPE driver. Tested-by: NHanjun Gou <gouhanjun@huawei.com> Reviewed-by: NSudeep Holla <sudeep.holla@arm.com> Reviewed-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hongbo Yao 提交于
hulk inclusion category: feature bugzilla: 16072 CVE: NA --------------------------- This reverts commit 556b16f5ad7e910c3784bb02b33c2af6ca9c9a4b. In Linux 5.3.0, SPE ACPI enablement has been upstreamed. SPE patches in hulk-4.19 are the old version, and they need to be reverted to the mainline version. Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-5.2 commit: 01d57485fcdb9f9101a10a18e32d5f8b023cab86 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Since commit 3d65b6bbc01e ("arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE"), we resort to per-ASID invalidation when attempting to perform more than PTRS_PER_PTE invalidation instructions in a single call to __flush_tlb_range(). Whilst this is beneficial, the mmu_gather code does not ensure that the end address of the range is rounded-up to the stride when freeing intermediate page tables in pXX_free_tlb(), which defeats our range checking. Align the bounds passed into __flush_tlb_range(). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: NHanjun Guo <guohanjun@huawei.com> Tested-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.21 commit: 3d65b6bbc01ecece8142e62a8a5f1d48ba41a240 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- In order to reduce the possibility of soft lock-ups, we bound the maximum number of TLBI operations performed by a single call to flush_tlb_range() to an arbitrary constant of 1024. Whilst this does the job of avoiding lock-ups, we can actually be a bit smarter by defining this as PTRS_PER_PTE. Due to the structure of our page tables, using PTRS_PER_PTE means that an outer loop calling flush_tlb_range() for entire table entries will end up performing just a single TLBI operation for each entry. As an example, mremap()ing a 1GB range mapped using 4k pages now requires only 512 TLBI operations when moving the page tables as opposed to 262144 operations (512*512) when using the current threshold of 1024. Cc: Joel Fernandes <joel@joelfernandes.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alex Van Brunt 提交于
mainline inclusion from mainline-4.21 commit: 3403e56b category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- When transitioning a PTE from young to old as part of page aging, we can avoid waiting for the TLB invalidation to complete and therefore drop the subsequent DSB instruction. Whilst this opens up a race with page reclaim, where a PTE in active use via a stale, young TLB entry does not update the underlying descriptor, the worst thing that happens is that the page is reclaimed and then immediately faulted back in. Given that we have a DSB in our context-switch path, the window for a spurious reclaim is fairly limited and eliding the barrier claims to boost NVMe/SSD accesses by over 10% on some platforms. A similar optimisation was made for x86 in commit b13b1d2d ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB"). Signed-off-by: NAlex Van Brunt <avanbrunt@nvidia.com> Signed-off-by: NAshish Mhetre <amhetre@nvidia.com> [will: rewrote patch] Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: 7f08872774eb971693ba79eeb2d4db364c9f5bfb category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Peter Z asked me to justify the barrier usage in asm/tlbflush.h, but actually that whole block comment needs to be rewritten. Reported-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: ace8cb754539077ed75f3f15b77b2b51b5b7a431 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- By selecting HAVE_RCU_TABLE_INVALIDATE, we can rely on tlb_flush() being called if we fail to batch table pages for freeing. This in turn allows us to postpone walk-cache invalidation until tlb_finish_mmu(), which avoids lots of unnecessary DSBs and means we can shoot down the ASID if the range is large enough. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: f270ab88fdf205be1a7a46ccb61f4a343be543a2 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Now that the core mmu_gather code keeps track of both the levels of page table cleared and also whether or not these entries correspond to intermediate entries, we can use this in our tlb_flush() callback to reduce the number of invalidations we issue as well as their scope. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-