- 27 12月, 2019 40 次提交
-
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit 8f04e8e6 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- On CPUs with support for PSTATE.SSBS, the kernel can toggle the SSBD state without needing to call into firmware. This patch hooks into the existing SSBD infrastructure so that SSBS is used on CPUs that support it, but it's all made horribly complicated by the very real possibility of big/little systems that don't uniformly provide the new capability. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/kernel/process.c arch/arm64/kernel/ssbd.c arch/arm64/kernel/cpufeature.c [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit 2d1b2a91d56b19636b740ea70c8399d1df249f20 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- Now that we're all merged nicely into mainline, there's no need to check to see if PR_SPEC_STORE_BYPASS is defined. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit d71be2b6c0e19180b5f80a6d42039cc074a693a2 category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- Armv8.5 introduces a new PSTATE bit known as Speculative Store Bypass Safe (SSBS) which can be used as a mitigation against Spectre variant 4. Additionally, a CPU may provide instructions to manipulate PSTATE.SSBS directly, so that userspace can toggle the SSBS control without trapping to the kernel. This patch probes for the existence of SSBS and advertise the new instructions to userspace if they exist. Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/kernel/cpufeature.c arch/arm64/include/asm/cpucaps.h [yyl: adjust context] Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-v4.20-rc1 commit ca7f686a category: feature bugzilla: 20806 CVE: NA ------------------------------------------------- I was passing through and figuered I'd fix this up: featuer -> feature Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Nathan Chancellor 提交于
mainline inclusion from mainline-5.0 commit 0738c8b5 category: bugfix bugzilla: 11024 CVE: NA ------------------------------------------------- After commit cc9f8349 ("arm64: crypto: add NEON accelerated XOR implementation"), Clang builds for arm64 started failing with the following error message. arch/arm64/lib/xor-neon.c:58:28: error: incompatible pointer types assigning to 'const unsigned long *' from 'uint64_t *' (aka 'unsigned long long *') [-Werror,-Wincompatible-pointer-types] v3 = veorq_u64(vld1q_u64(dp1 + 6), vld1q_u64(dp2 + 6)); ^~~~~~~~ /usr/lib/llvm-9/lib/clang/9.0.0/include/arm_neon.h:7538:47: note: expanded from macro 'vld1q_u64' __ret = (uint64x2_t) __builtin_neon_vld1q_v(__p0, 51); \ ^~~~ There has been quite a bit of debate and triage that has gone into figuring out what the proper fix is, viewable at the link below, which is still ongoing. Ard suggested disabling this warning with Clang with a pragma so no neon code will have this type of error. While this is not at all an ideal solution, this build error is the only thing preventing KernelCI from having successful arm64 defconfig and allmodconfig builds on linux-next. Getting continuous integration running is more important so new warnings/errors or boot failures can be caught and fixed quickly. Link: https://github.com/ClangBuiltLinux/linux/issues/283Suggested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> (cherry picked from commit 0738c8b5) Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jackie Liu 提交于
mainline inclusion from mainline-5.0-rc1 commit: cc9f8349 category: feature feature: NEON accelerated XOR bugzilla: 11024 CVE: NA -------------------------------------------------- This is a NEON acceleration method that can improve performance by approximately 20%. I got the following data from the centos 7.5 on Huawei's HISI1616 chip: [ 93.837726] xor: measuring software checksum speed [ 93.874039] 8regs : 7123.200 MB/sec [ 93.914038] 32regs : 7180.300 MB/sec [ 93.954043] arm64_neon: 9856.000 MB/sec [ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec) I believe this code can bring some optimization for all arm64 platform. thanks for Ard Biesheuvel's suggestions. Signed-off-by: NJackie Liu <liuyun01@kylinos.cn> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Jackie Liu 提交于
mainline inclusion from mainline-5.0-rc1 commit: 21e28547 category: feature feature: NEON accelerated XOR bugzilla: 11024 CVE: NA -------------------------------------------------- In a way similar to ARM commit 09096f6a ("ARM: 7822/1: add workaround for ambiguous C99 stdint.h types"), this patch redefines the macros that are used in stdint.h so its definitions of uint64_t and int64_t are compatible with those of the kernel. This patch comes from: https://patchwork.kernel.org/patch/3540001/ Wrote by: Ard Biesheuvel <ard.biesheuvel@linaro.org> We mark this file as a private file and don't have to override asm/types.h Signed-off-by: NJackie Liu <liuyun01@kylinos.cn> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-5.0 commit: efdb25efc7645b326cd5eb82be5feeabe167c24e category: perf bugzilla: 20886 CVE: NA lib/crc32test result: [root@localhost build]# rmmod crc32test && insmod lib/crc32test.ko && dmesg | grep cycles [83170.153209] CPU7: use cycles 26243990 [83183.122137] CPU7: use cycles 26151290 [83309.691628] CPU7: use cycles 26122830 [83312.415559] CPU7: use cycles 26232600 [83313.191479] CPU8: use cycles 26082350 rmmod crc32test && insmod lib/crc32test.ko && dmesg | grep cycles [ 1023.539931] CPU25: use cycles 12256730 [ 1024.850360] CPU24: use cycles 12249680 [ 1025.463622] CPU25: use cycles 12253330 [ 1025.862925] CPU25: use cycles 12269720 [ 1026.376038] CPU26: use cycles 12222480 Based on 13702: arm64/lib: improve CRC32 performance for deep pipelines crypto: arm64/crc32 - remove PMULL based CRC32 driver arm64/lib: add accelerated crc32 routines arm64: cpufeature: add feature for CRC32 instructions lib/crc32: make core crc32() routines weak so they can be overridden ---------------------------------------------- Improve the performance of the crc32() asm routines by getting rid of most of the branches and small sized loads on the common path. Instead, use a branchless code path involving overlapping 16 byte loads to process the first (length % 32) bytes, and process the remainder using a loop that processes 32 bytes at a time. Tested using the following test program: #include <stdlib.h> extern void crc32_le(unsigned short, char const*, int); int main(void) { static const char buf[4096]; srand(20181126); for (int i = 0; i < 100 * 1000 * 1000; i++) crc32_le(0, buf, rand() % 1024); return 0; } On Cortex-A53 and Cortex-A57, the performance regresses but only very slightly. On Cortex-A72 however, the performance improves from $ time ./crc32 real 0m10.149s user 0m10.149s sys 0m0.000s to $ time ./crc32 real 0m7.915s user 0m7.915s sys 0m0.000s Cc: Rui Sun <sunrui26@huawei.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 598b7d41 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Now that the scalar fallbacks have been moved out of this driver into the core crc32()/crc32c() routines, we are left with a CRC32 crypto API driver for arm64 that is based only on 64x64 polynomial multiplication, which is an optional instruction in the ARMv8 architecture, and is less and less likely to be available on cores that do not also implement the CRC32 instructions, given that those are mandatory in the architecture as of ARMv8.1. Since the scalar instructions do not require the special handling that SIMD instructions do, and since they turn out to be considerably faster on some cores (Cortex-A53) as well, there is really no point in keeping this code around so let's just remove it. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 7481cddf29ede204b475facc40e6f65459939881 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Unlike crc32c(), which is wired up to the crypto API internally so the optimal driver is selected based on the platform's capabilities, crc32_le() is implemented as a library function using a slice-by-8 table based C implementation. Even though few of the call sites may be bottlenecks, calling a time variant implementation with a non-negligible D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates support for the CRC32 instructions that were optional in ARMv8.0, but are already widely available, even on the Cortex-A53 based Raspberry Pi. So implement routines that use these instructions if available, and fall back to the existing generic routines otherwise. The selection is based on alternatives patching. Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE, this just codifies the status quo. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Ard Biesheuvel 提交于
mainline inclusion from mainline-4.20-rc1 commit: 86d0dd34eafffbc76a81aba6ae2d71927d3835a8 category: feature feature: accelerated crc32 routines bugzilla: 13702 CVE: NA -------------------------------------------------- Add a CRC32 feature bit and wire it up to the CPU id register so we will be able to use alternatives patching for CRC32 operations. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/include/asm/cpucaps.h arch/arm64/kernel/cpufeature.c Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NXie XiuQi <xiexiuqi@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hongbo Yao 提交于
hulk inclusion category: feature bugzilla: 13228 CVE: NA --------------------------- enable CONFIG_KTASK in hulk_defconfig and storage_ci_defconfig Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Daniel Jordan 提交于
hulk inclusion category: feature bugzilla: 13228 CVE: NA --------------------------- Currently, mmap_sem must be held as writer to modify the locked_vm field in mm_struct. This creates a bottleneck when multithreading VFIO page pinning because each thread holds the mmap_sem as reader for the majority of the pinning time but also takes mmap_sem as writer regularly, for short times, when modifying locked_vm. The problem gets worse when other workloads compete for CPU with ktask threads doing page pinning because the other workloads force ktask threads that hold mmap_sem as writer off the CPU, blocking ktask threads trying to get mmap_sem as reader for an excessively long time (the mmap_sem reader wait time grows linearly with the thread count). Requiring mmap_sem for locked_vm also abuses mmap_sem by making it protect data that could be synchronized separately. So, decouple locked_vm from mmap_sem by making locked_vm an atomic_long_t. locked_vm's old type was unsigned long and changing it to a signed type makes it lose half its capacity, but that's only a concern for 32-bit systems and LONG_MAX * PAGE_SIZE is 8T on x86 in that case, so there's headroom. Now that mmap_sem is not taken as writer here, ktask threads holding mmap_sem as reader can run more often. Performance results appear later in the series. On powerpc, this was cross-compiled-tested only. [XXX Can send separately.] Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NXie XiuQi <xiexiuqi@huawei.com> Tested-by: NHongbo Yao <yaohongbo@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: bugfix bugzilla: 20799 CVE: NA --------------------------- The following patch didn't consider the situation when we boot the system using device tree but 'CONFIG_ACPI' is enabled. In this situation, we also need to set all the CPU as present CPU. Fixes: 280637f70ab5 ("arm64: mark all the GICC nodes in MADT as possible cpu") Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Anshuman Khandual 提交于
mainline inclusion from mainline-v4.20-rc5 commit 77cfe950 category: bugfix bugzilla: 5611 CVE: NA The dummy node ID is marked into all memory ranges on the system. So the dummy node really extends the entire memblock.memory. Hence report correct extent information for the dummy node using memblock range helper functions instead of the range [0LLU, PFN_PHYS(max_pfn) - 1)]. Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms") Acked-by: NPunit Agrawal <punit.agrawal@arm.com> Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v4.20-rc5 commit 8c11ea5c category: bugfix bugzilla: 5654 CVE: NA The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF to BPF call offset can be too far away from core kernel in that relative encoding into imm is not sufficient and could potentially be truncated, see also fd045f6c ("arm64: add support for module PLTs") which adds spill-over space for module_alloc() and therefore bpf_jit_binary_alloc(). Therefore, use the recently added bpf_jit_get_func_addr() helper for properly fetching the address through prog->aux->func[off]->bpf_func instead. This also has the benefit to optimize normal helper calls since their address can use the optimized emission. Tested on Cavium ThunderX CN8890. Fixes: db496944 ("bpf: arm64: add JIT support for multi-function programs") Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Daniel Borkmann 提交于
mainline inclusion from mainline-v4.20-rc5 commit e2c95a61 category: bugfix bugzilla: 5654 CVE: NA Make fetching of the BPF call address from ppc64 JIT generic. ppc64 was using a slightly different variant rather than through the insns' imm field encoding as the target address would not fit into that space. Therefore, the target subprog number was encoded into the insns' offset and fetched through fp->aux->func[off]->bpf_func instead. Given there are other JITs with this issue and the mechanism of fetching the address is JIT-generic, move it into the core as a helper instead. On the JIT side, we get information on whether the retrieved address is a fixed one, that is, not changing through JIT passes, or a dynamic one. For the former, JITs can optimize their imm emission because this doesn't change jump offsets throughout JIT process. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Reviewed-by: NSandipan Das <sandipan@linux.ibm.com> Tested-by: NSandipan Das <sandipan@linux.ibm.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NXuefeng Wang <wxf.wang@hisilicon.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 liuyanshi 提交于
driver inclusion category: feature bugzilla: NA CVE: NA enable hisi pcie debug driver for hiarmtool. Signed-off-by: Nliuyanshi <liuyanshi@huawei.com> Reviewed-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: feature bugzilla: 20208 CVE: NA --------------------------- To support CPU hotplug, we need to implement 'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are called in 'acpi_processor_hotadd_init()/acpi_processor_remove()' when the CPU is hot added into or hot removed from the system. Note: This patch only support core hotplug and does not support socket hotplug because we don't support live configuration of GIC. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: feature bugzilla: 20208 CVE: NA --------------------------- We set 'cpu_possible_mask' based on the enabled GICC node in MADT. If the GICC node is disabled, we will skip initializing the kernel data structure for that CPU. To support CPU hotplug, we need to initialize some CPU related data structure in advance. This patch mark all the GICC nodes as possible CPU and only these enabled GICC nodes as present CPU. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Wanpeng Li 提交于
commit 17e433b54393a6269acbcb792da97791fe1592d8 upstream. After commit d73eb57b80b (KVM: Boost vCPUs that are delivering interrupts), a five years old bug is exposed. Running ebizzy benchmark in three 80 vCPUs VMs on one 80 pCPUs Skylake server, a lot of rcu_sched stall warning splatting in the VMs after stress testing: INFO: rcu_sched detected stalls on CPUs/tasks: { 4 41 57 62 77} (detected by 15, t=60004 jiffies, g=899, c=898, q=15073) Call Trace: flush_tlb_mm_range+0x68/0x140 tlb_flush_mmu.part.75+0x37/0xe0 tlb_finish_mmu+0x55/0x60 zap_page_range+0x142/0x190 SyS_madvise+0x3cd/0x9c0 system_call_fastpath+0x1c/0x21 swait_active() sustains to be true before finish_swait() is called in kvm_vcpu_block(), voluntarily preempted vCPUs are taken into account by kvm_vcpu_on_spin() loop greatly increases the probability condition kvm_arch_vcpu_runnable(vcpu) is checked and can be true, when APICv is enabled the yield-candidate vCPU's VMCS RVI field leaks(by vmx_sync_pir_to_irr()) into spinning-on-a-taken-lock vCPU's current VMCS. This patch fixes it by checking conservatively a subset of events. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Marc Zyngier <Marc.Zyngier@arm.com> Cc: stable@vger.kernel.org Fixes: 98f4a146 (KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop) Signed-off-by: NWanpeng Li <wanpengli@tencent.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Nick Desaulniers 提交于
commit 4ce97317f41d38584fb93578e922fcd19e535f5b upstream. Implementing memcpy and memset in terms of __builtin_memcpy and __builtin_memset is problematic. GCC at -O2 will replace calls to the builtins with calls to memcpy and memset (but will generate an inline implementation at -Os). Clang will replace the builtins with these calls regardless of optimization level. $ llvm-objdump -dr arch/x86/purgatory/string.o | tail 0000000000000339 memcpy: 339: 48 b8 00 00 00 00 00 00 00 00 movabsq $0, %rax 000000000000033b: R_X86_64_64 memcpy 343: ff e0 jmpq *%rax 0000000000000345 memset: 345: 48 b8 00 00 00 00 00 00 00 00 movabsq $0, %rax 0000000000000347: R_X86_64_64 memset 34f: ff e0 Such code results in infinite recursion at runtime. This is observed when doing kexec. Instead, reuse an implementation from arch/x86/boot/compressed/string.c. This requires to implement a stub function for warn(). Also, Clang may lower memcmp's that compare against 0 to bcmp's, so add a small definition, too. See also: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp") Fixes: 8fc5b4d4 ("purgatory: core purgatory functionality") Reported-by: NVaibhav Rustagi <vaibhavrustagi@google.com> Debugged-by: NVaibhav Rustagi <vaibhavrustagi@google.com> Debugged-by: NManoj Gupta <manojgupta@google.com> Suggested-by: NAlistair Delva <adelva@google.com> Signed-off-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NVaibhav Rustagi <vaibhavrustagi@google.com> Cc: stable@vger.kernel.org Link: https://bugs.chromium.org/p/chromium/issues/detail?id=984056 Link: https://lkml.kernel.org/r/20190807221539.94583-1-ndesaulniers@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Halil Pasic 提交于
[ Upstream commit 1a2dcff8 ] On s390 ZONE_DMA is up to 2G, i.e. ARCH_ZONE_DMA_BITS should be 31 bits. The current value is 24 and makes __dma_direct_alloc_pages() take a wrong turn first (but __dma_direct_alloc_pages() recovers then). Let's correct ARCH_ZONE_DMA_BITS value and avoid wrong turns. Signed-off-by: NHalil Pasic <pasic@linux.ibm.com> Reported-by: NPetr Tesarik <ptesarik@suse.cz> Fixes: c61e9637 ("dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32") Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Arnd Bergmann 提交于
[ Upstream commit 3a9d2569 ] The mdio-bus-mux has no #address-cells/#size-cells property, which causes a few dtc warnings: arch/arm/boot/dts/bcm47094-linksys-panamera.dts:129.4-18: Warning (reg_format): /mdio-bus-mux/mdio@200:reg: property has invalid length (4 bytes) (#address-cells == 2, #size-cells == 1) arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (pci_device_bus_num): Failed prerequisite 'reg_format' arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (i2c_bus_reg): Failed prerequisite 'reg_format' arch/arm/boot/dts/bcm47094-linksys-panamera.dtb: Warning (spi_bus_reg): Failed prerequisite 'reg_format' arch/arm/boot/dts/bcm47094-linksys-panamera.dts:128.22-132.5: Warning (avoid_default_addr_size): /mdio-bus-mux/mdio@200: Relying on default #address-cells value arch/arm/boot/dts/bcm47094-linksys-panamera.dts:128.22-132.5: Warning (avoid_default_addr_size): /mdio-bus-mux/mdio@200: Relying on default #size-cells value Add the normal cell numbers. Link: https://lore.kernel.org/r/20190722145618.1155492-1-arnd@arndb.de Fixes: 2bebdfcd ("ARM: dts: BCM5301X: Add support for Linksys EA9500") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Arnd Bergmann 提交于
[ Upstream commit d64b212e ] When building a multiplatform kernel that includes armv4 support, the default target CPU does not support the blx instruction, which leads to a build failure: arch/arm/mach-davinci/sleep.S: Assembler messages: arch/arm/mach-davinci/sleep.S:56: Error: selected processor does not support `blx ip' in ARM mode Add a .arch statement in the sources to make this file build. Link: https://lore.kernel.org/r/20190722145211.1154785-1-arnd@arndb.deAcked-by: NSekhar Nori <nsekhar@ti.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NOlof Johansson <olof@lixom.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Nick Desaulniers 提交于
commit b059f801a937d164e03b33c1848bb3dca67c0b04 upstream. KBUILD_CFLAGS is very carefully built up in the top level Makefile, particularly when cross compiling or using different build tools. Resetting KBUILD_CFLAGS via := assignment is an antipattern. The comment above the reset mentions that -pg is problematic. Other Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of the important KBUILD_CFLAGS then manually having to re-add them. Seems also that __stack_chk_fail references are generated when using CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG. Fixes: 8fc5b4d4 ("purgatory: core purgatory functionality") Reported-by: NVaibhav Rustagi <vaibhavrustagi@google.com> Suggested-by: NPeter Zijlstra <peterz@infradead.org> Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Tested-by: NVaibhav Rustagi <vaibhavrustagi@google.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Joerg Roedel 提交于
commit 8e998fc24de47c55b47a887f6c95ab91acd4a720 upstream. With huge-page ioremap areas the unmappings also need to be synced between all page-tables. Otherwise it can cause data corruption when a region is unmapped and later re-used. Make the vmalloc_sync_one() function ready to sync unmappings and make sure vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD is found. Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Joerg Roedel 提交于
commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0 upstream. Do not require a struct page for the mapped memory location because it might not exist. This can happen when an ioremapped region is mapped with 2MB pages. Fixes: 5d72b4fb ('x86, mm: support huge I/O mapping capability I/F') Signed-off-by: NJoerg Roedel <jroedel@suse.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NDave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Yang Yingliang 提交于
hulk inclusion category: bugfix bugzilla: 4979 CVE: NA ------------------------------------------------- Disable CONFIG_ISCSI_IBFT by default.
-
由 Jeremy Linton 提交于
mainline inclusion from mainline-5.3-rc1 commit d24a0c7099b3 category: feature bugzilla: 16072 CVE: NA --------------------------- ACPI 6.3 adds additional fields to the MADT GICC structure to describe SPE PPI's. We pick these out of the cached reference to the madt_gicc structure similarly to the core PMU code. We then create a platform device referring to the IRQ and let the user/module loader decide whether to load the SPE driver. Tested-by: NHanjun Gou <gouhanjun@huawei.com> Reviewed-by: NSudeep Holla <sudeep.holla@arm.com> Reviewed-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NJeremy Linton <jeremy.linton@arm.com> Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Hongbo Yao 提交于
hulk inclusion category: feature bugzilla: 16072 CVE: NA --------------------------- This reverts commit 556b16f5ad7e910c3784bb02b33c2af6ca9c9a4b. In Linux 5.3.0, SPE ACPI enablement has been upstreamed. SPE patches in hulk-4.19 are the old version, and they need to be reverted to the mainline version. Signed-off-by: NHongbo Yao <yaohongbo@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-5.2 commit: 01d57485fcdb9f9101a10a18e32d5f8b023cab86 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Since commit 3d65b6bbc01e ("arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE"), we resort to per-ASID invalidation when attempting to perform more than PTRS_PER_PTE invalidation instructions in a single call to __flush_tlb_range(). Whilst this is beneficial, the mmu_gather code does not ensure that the end address of the range is rounded-up to the stride when freeing intermediate page tables in pXX_free_tlb(), which defeats our range checking. Align the bounds passed into __flush_tlb_range(). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: NHanjun Guo <guohanjun@huawei.com> Tested-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.21 commit: 3d65b6bbc01ecece8142e62a8a5f1d48ba41a240 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- In order to reduce the possibility of soft lock-ups, we bound the maximum number of TLBI operations performed by a single call to flush_tlb_range() to an arbitrary constant of 1024. Whilst this does the job of avoiding lock-ups, we can actually be a bit smarter by defining this as PTRS_PER_PTE. Due to the structure of our page tables, using PTRS_PER_PTE means that an outer loop calling flush_tlb_range() for entire table entries will end up performing just a single TLBI operation for each entry. As an example, mremap()ing a 1GB range mapped using 4k pages now requires only 512 TLBI operations when moving the page tables as opposed to 262144 operations (512*512) when using the current threshold of 1024. Cc: Joel Fernandes <joel@joelfernandes.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Alex Van Brunt 提交于
mainline inclusion from mainline-4.21 commit: 3403e56b category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- When transitioning a PTE from young to old as part of page aging, we can avoid waiting for the TLB invalidation to complete and therefore drop the subsequent DSB instruction. Whilst this opens up a race with page reclaim, where a PTE in active use via a stale, young TLB entry does not update the underlying descriptor, the worst thing that happens is that the page is reclaimed and then immediately faulted back in. Given that we have a DSB in our context-switch path, the window for a spurious reclaim is fairly limited and eliding the barrier claims to boost NVMe/SSD accesses by over 10% on some platforms. A similar optimisation was made for x86 in commit b13b1d2d ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB"). Signed-off-by: NAlex Van Brunt <avanbrunt@nvidia.com> Signed-off-by: NAshish Mhetre <amhetre@nvidia.com> [will: rewrote patch] Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: 7f08872774eb971693ba79eeb2d4db364c9f5bfb category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Peter Z asked me to justify the barrier usage in asm/tlbflush.h, but actually that whole block comment needs to be rewritten. Reported-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: ace8cb754539077ed75f3f15b77b2b51b5b7a431 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- By selecting HAVE_RCU_TABLE_INVALIDATE, we can rely on tlb_flush() being called if we fail to batch table pages for freeing. This in turn allows us to postpone walk-cache invalidation until tlb_finish_mmu(), which avoids lots of unnecessary DSBs and means we can shoot down the ASID if the range is large enough. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: f270ab88fdf205be1a7a46ccb61f4a343be543a2 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Now that the core mmu_gather code keeps track of both the levels of page table cleared and also whether or not these entries correspond to intermediate entries, we can use this in our tlb_flush() callback to reduce the number of invalidations we issue as well as their scope. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: 07212cd47efef1df971e2289a89c087a5f6f6a2b category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- If there's one thing the RCU-based table freeing doesn't need, it's more ifdeffery. Remove the redundant !CONFIG_HAVE_RCU_TABLE_FREE code, since this option is unconditionally selected in our Kconfig. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: 67a902ac598dca056366a7342f401aa6f605072f category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- When we are unmapping intermediate page-table entries or huge pages, we don't need to issue a TLBI instruction for every PAGE_SIZE chunk in the VA range being unmapped. Allow the invalidation stride to be passed to __flush_tlb_range(), and adjust our "just nuke the ASID" heuristic to take this into account. Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-
由 Will Deacon 提交于
mainline inclusion from mainline-4.20-rc1 commit: d8289d3a5854a2a0ae144bff106a78738fe63050 category: feature feature: Reduce synchronous TLB invalidation on ARM64 bugzilla: NA CVE: NA -------------------------------------------------- Add a comment to explain why we can't get away with last-level invalidation in flush_tlb_range() Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NHanjun Guo <guohanjun@huawei.com> Reviewed-by: NXuefeng Wang <wxf.wang@hisilicon.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
-