- 28 10月, 2022 4 次提交
-
-
由 Andrew Jones 提交于
Commit 78e5a339 ("cpumask: fix checking valid cpu range") has started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1 are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's start and next seq operations implement a pattern like n = cpumask_next(n - 1, mask); show(n); while (1) { ++n; n = cpumask_next(n - 1, mask); if (n >= nr_cpu_ids) break; show(n); } which will issue the warning when reading /proc/cpuinfo. Ensure no warning is generated by validating the cpu index before calling cpumask_next(). [*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled. Signed-off-by: NAndrew Jones <ajones@ventanamicro.com> Reviewed-by: NAnup Patel <anup@brainfault.org> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Tested-by: NConor Dooley <conor.dooley@microchip.com> Acked-by: NYury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20221014155845.1986223-2-ajones@ventanamicro.com/ Fixes: 78e5a339 ("cpumask: fix checking valid cpu range") Cc: stable@vger.kernel.org Signed-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Conor Dooley 提交于
It is not sufficient to check if a toolchain supports a particular extension without checking if the linker supports that extension too. For example, Clang 15 supports Zihintpause but GNU bintutils 2.35.2 does not, leading build errors like so: riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zihintpause' Add a TOOLCHAIN_HAS_ZIHINTPAUSE which checks if each of the compiler, assembler and linker support the extension. Replace the ifdef in the vdso with one depending on this new symbol. Fixes: 8eb060e1 ("arch/riscv: add Zihintpause support") Signed-off-by: NConor Dooley <conor.dooley@microchip.com> Reviewed-by: NHeiko Stuebner <heiko@sntech.de> Reviewed-by: NNathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221006173520.1785507-3-conor@kernel.orgSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Conor Dooley 提交于
It is not sufficient to check if a toolchain supports a particular extension without checking if the linker supports that extension too. For example, Clang 15 supports Zicbom but GNU bintutils 2.35.2 does not, leading build errors like so: riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicbom1p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zicbom' Convert CC_HAS_ZICBOM to TOOLCHAIN_HAS_ZICBOM & check if the linker also supports Zicbom. Reported-by: NKevin Hilman <khilman@baylibre.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1714 Link: https://storage.kernelci.org/next/master/next-20220920/riscv/defconfig+CONFIG_EFI=n/clang-16/logs/kernel.log Fixes: 1631ba12 ("riscv: Add support for non-coherent devices using zicbom extension") Signed-off-by: NConor Dooley <conor.dooley@microchip.com> Reviewed-by: NHeiko Stuebner <heiko@sntech.de> Reviewed-by: NNathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221006173520.1785507-2-conor@kernel.org [Palmer: Check for ld-2.38, not 2.39, as 2.38 no longer errors.] Signed-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
由 Qinglin Pan 提交于
Hi Atish, It seems that the panic is due to the missing memcpy during kasan_init. Could you please check whether this patch is helpful? When doing kasan_populate, the new allocated base_pud/base_p4d should contain kasan_early_shadow_{pud, p4d}'s content. Add the missing memcpy to avoid page fault when read/write kasan shadow region. Tested on: - qemu with sv57 and CONFIG_KASAN on. - qemu with sv48 and CONFIG_KASAN on. Signed-off-by: NQinglin Pan <panqinglin2020@iscas.ac.cn> Tested-by: NAtish Patra <atishp@rivosinc.com> Fixes: 8fbdccd2 ("riscv: mm: Support kasan for sv57") Link: https://lore.kernel.org/r/20221009083050.3814850-1-panqinglin2020@iscas.ac.cnSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
- 26 10月, 2022 7 次提交
-
-
由 Nicholas Piggin 提交于
Commit a4cb3651 ("powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context") fixed the problem of pending irqs being cleared when clearing the HARD_DIS bit, but then it didn't clear the bit at all. This change clears HARD_DIS without affecting other bits in the mask. When an interrupt hits in a soft-masked section that has MSR[EE]=1, it can hard disable and set PACA_IRQS_HARD_DIS, which must be cleared when returning to the EE=1 caller (unless it was set due to a MUST_HARD_MASK interrupt becoming pending). Failure to clear this leaves the returned-to context running with MSR[EE]=1 and PACA_IRQS_HARD_DIS, which confuses irq assertions and could be dangerous for code that might test the flag. This was observed in a hash MMU kernel where a kernel hash fault hits in a local_irqs_disabled region that has EE=1. The hash fault also runs with EE=1, then as it returns, a decrementer hits in the restart section and the irq restart code hard-masks which sets the PACA_IRQ_HARD_DIS flag, which is not clear when the original context is returned to. Reported-by: NSachin Sant <sachinp@linux.ibm.com> Fixes: a4cb3651 ("powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context") Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Tested-by: NSachin Sant <sachinp@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221022052207.471328-1-npiggin@gmail.com
-
由 Thomas Richter 提交于
Commit 838d9bb6 ("perf: Use sample_flags for raw_data") changed the way the raw data of an event is collected. Adjust the PMU pai_ext to the new scheme. Fixes: 838d9bb6 ("perf: Use sample_flags for raw_data") Signed-off-by: NThomas Richter <tmricht@linux.ibm.com> Acked-by: NSumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
-
由 Peter Oberparleiter 提交于
This patch enhances the kernel image adding a trailer as required for secure boot by future firmware versions. Cc: <stable@vger.kernel.org> # 5.2+ Signed-off-by: NPeter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: NSven Schnelle <svens@linux.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
-
由 Heiko Carstens 提交于
For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entry. Cc: <stable@vger.kernel.org> Fixes: f058599e ("s390/pci: Fix s390_mmio_read/write with MIO") Reviewed-by: NNiklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: NHeiko Carstens <hca@linux.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
-
由 Heiko Carstens 提交于
For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entry. Cc: <stable@vger.kernel.org> Reviewed-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NHeiko Carstens <hca@linux.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
-
由 Heiko Carstens 提交于
For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entries. Cc: <stable@vger.kernel.org> Reviewed-by: NVasily Gorbik <gor@linux.ibm.com> Signed-off-by: NHeiko Carstens <hca@linux.ibm.com> Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>
-
由 Jisheng Zhang 提交于
Samuel reported that the static branch usage in cpu_relax() breaks building with CONFIG_CC_OPTIMIZE_FOR_SIZE: In file included from <command-line>: ./arch/riscv/include/asm/jump_label.h: In function 'cpu_relax': ././include/linux/compiler_types.h:285:33: warning: 'asm' operand 0 probably does not match constraints 285 | #define asm_volatile_goto(x...) asm goto(x) | ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 | asm_volatile_goto( | ^~~~~~~~~~~~~~~~~ ././include/linux/compiler_types.h:285:33: error: impossible constraint in 'asm' 285 | #define asm_volatile_goto(x...) asm goto(x) | ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 | asm_volatile_goto( | ^~~~~~~~~~~~~~~~~ make[1]: *** [scripts/Makefile.build:249: arch/riscv/kernel/vdso/vgettimeofday.o] Error 1 make: *** [arch/riscv/Makefile:128: vdso_prepare] Error 2 Maybe "-Os" prevents GCC from detecting that the key/branch arguments can be treated as constants and used as immediate operands. Inspired by x86's commit 864b4355("x86/jump_label: Mark arguments as const to satisfy asm constraints"), and as pointed out by Steven: "The "i" constraint needs to be a constant.", let's do similar modifications to riscv. Tested by CC_OPTIMIZE_FOR_SIZE + gcc and CC_OPTIMIZE_FOR_SIZE + clang. Link: https://lore.kernel.org/linux-riscv/20220922060958.44203-1-samuel@sholland.org/ Link: https://lore.kernel.org/all/20210212094059.5f8d05e8@gandalf.local.home/ Fixes: 8eb060e1 ("arch/riscv: add Zihintpause support") Reported-by: NSamuel Holland <samuel@sholland.org> Signed-off-by: NJisheng Zhang <jszhang@kernel.org> Reviewed-by: NAndrew Jones <ajones@ventanamicro.com> Tested-by: NHeiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20221008145437.491-1-jszhang@kernel.orgSigned-off-by: NPalmer Dabbelt <palmer@rivosinc.com>
-
- 25 10月, 2022 1 次提交
-
-
由 Steven Rostedt (Google) 提交于
Adding on the kernel command line "ftrace=function" triggered: CPA detected W^X violation: 8000000000000063 -> 0000000000000063 range: 0xffffffffc0013000 - 0xffffffffc0013fff PFN 10031b WARNING: CPU: 0 PID: 0 at arch/x86/mm/pat/set_memory.c:609 verify_rwx+0x61/0x6d Call Trace: __change_page_attr_set_clr+0x146/0x8a6 change_page_attr_set_clr+0x135/0x268 change_page_attr_clear.constprop.0+0x16/0x1c set_memory_x+0x2c/0x32 arch_ftrace_update_trampoline+0x218/0x2db ftrace_update_trampoline+0x16/0xa1 __register_ftrace_function+0x93/0xb2 ftrace_startup+0x21/0xf0 register_ftrace_function_nolock+0x26/0x40 register_ftrace_function+0x4e/0x143 function_trace_init+0x7d/0xc3 tracer_init+0x23/0x2c tracing_set_tracer+0x1d5/0x206 register_tracer+0x1c0/0x1e4 init_function_trace+0x90/0x96 early_trace_init+0x25c/0x352 start_kernel+0x424/0x6e4 x86_64_start_reservations+0x24/0x2a x86_64_start_kernel+0x8c/0x95 secondary_startup_64_no_verify+0xe0/0xeb This is because at boot up, kernel text is writable, and there's no reason to do tricks to updated it. But the verifier does not distinguish updates at boot up and at run time, and causes a warning at time of boot. Add a check for system_state == SYSTEM_BOOTING and allow it if that is the case. [ These SYSTEM_BOOTING special cases are all pretty horrid, but the x86 text_poke() code does some odd things at bootup, forcing this for now - Linus ] Link: https://lore.kernel.org/r/20221024112730.180916b3@gandalf.local.home Fixes: 652c5bf3 ("x86/mm: Refuse W^X violations") Signed-off-by: NSteven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 10月, 2022 3 次提交
-
-
由 Alexander Graf 提交于
The KVM_X86_SET_MSR_FILTER ioctls contains a pointer in the passed in struct which means it has a different struct size depending on whether it gets called from 32bit or 64bit code. This patch introduces compat code that converts from the 32bit struct to its 64bit counterpart which then gets used going forward internally. With this applied, 32bit QEMU can successfully set MSR bitmaps when running on 64bit kernels. Reported-by: NAndrew Randrianasulu <randrianasulu@gmail.com> Fixes: 1a155254 ("KVM: x86: Introduce MSR filtering") Signed-off-by: NAlexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-4-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alexander Graf 提交于
In the next patch we want to introduce a second caller to set_msr_filter() which constructs its own filter list on the stack. Refactor the original function so it takes it as argument instead of reading it through copy_from_user(). Signed-off-by: NAlexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-3-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Chang S. Bae 提交于
When an extended state component is not present in fpstate, but in init state, the function copies from init_fpstate via copy_feature(). But, dynamic states are not present in init_fpstate because of all-zeros init states. Then retrieving them from init_fpstate will explode like this: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:memcpy_erms+0x6/0x10 ? __copy_xstate_to_uabi_buf+0x381/0x870 fpu_copy_guest_fpstate_to_uabi+0x28/0x80 kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm] ? __this_cpu_preempt_check+0x13/0x20 ? vmx_vcpu_put+0x2e/0x260 [kvm_intel] kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? __fget_light+0xd4/0x130 __x64_sys_ioctl+0xe3/0x910 ? debug_smp_processor_id+0x17/0x20 ? fpregs_assert_state_consistent+0x27/0x50 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Adjust the 'mask' to zero out the userspace buffer for the features that are not available both from fpstate and from init_fpstate. The dynamic features depend on the compacted XSAVE format. Ensure it is enabled before reading XCOMP_BV in init_fpstate. Fixes: 2308ee57 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: NYuan Yao <yuan.yao@intel.com> Suggested-by: NDave Hansen <dave.hansen@intel.com> Signed-off-by: NChang S. Bae <chang.seok.bae@intel.com> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Tested-by: NYuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/ Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
-
- 21 10月, 2022 6 次提交
-
-
由 Chen Zhongjin 提交于
When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2ab ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a4 ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2ab ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: NChen Zhongjin <chenzhongjin@huawei.com> [jpoimboe: rewrite commit log] Signed-off-by: NJosh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
-
由 Nathan Huckleberry 提交于
crypto_tfm::__crt_ctx is not guaranteed to be 16-byte aligned on x86-64. This causes crashes due to movaps instructions in clmul_polyval_update. Add logic to align polyval_tfm_ctx to 16 bytes. Cc: <stable@vger.kernel.org> Fixes: 34f7f6c3 ("crypto: x86/polyval - Add PCLMULQDQ accelerated implementation of POLYVAL") Reported-by: NBruno Goncalves <bgoncalv@redhat.com> Signed-off-by: NNathan Huckleberry <nhuck@google.com> Reviewed-by: NEric Biggers <ebiggers@google.com> Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Charlotte Tan 提交于
arch_rmrr_sanity_check() warns if the RMRR is not covered by an ACPI Reserved region, but it seems like it should accept an NVS region as well. The ACPI spec https://uefi.org/specs/ACPI/6.5/15_System_Address_Map_Interfaces.html uses similar wording for "Reserved" and "NVS" region types; for NVS regions it says "This range of addresses is in use or reserved by the system and must not be used by the operating system." There is an old comment on this mailing list that also suggests NVS regions should pass the arch_rmrr_sanity_check() test: The warnings come from arch_rmrr_sanity_check() since it checks whether the region is E820_TYPE_RESERVED. However, if the purpose of the check is to detect RMRR has regions that may be used by OS as free memory, isn't E820_TYPE_NVS safe, too? This patch overlaps with another proposed patch that would add the region type to the log since sometimes the bug reporter sees this log on the console but doesn't know to include the kernel log: https://lore.kernel.org/lkml/20220611204859.234975-3-atomlin@redhat.com/ Here's an example of the "Firmware Bug" apparent false positive (wrapped for line length): DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR [0x000000006f760000-0x000000006f762fff], contact BIOS vendor for fixes DMAR: [Firmware Bug]: Your BIOS is broken; bad RMRR [0x000000006f760000-0x000000006f762fff] This is the snippet from the e820 table: BIOS-e820: [mem 0x0000000068bff000-0x000000006ebfefff] reserved BIOS-e820: [mem 0x000000006ebff000-0x000000006f9fefff] ACPI NVS BIOS-e820: [mem 0x000000006f9ff000-0x000000006fffefff] ACPI data Fixes: f036c7fa ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved") Cc: Will Mortensen <will@extrahop.com> Link: https://lore.kernel.org/linux-iommu/64a5843d-850d-e58c-4fc2-0a0eeeb656dc@nec.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=216443Signed-off-by: NCharlotte Tan <charlotte@extrahop.com> Reviewed-by: NAaron Tomlin <atomlin@redhat.com> Link: https://lore.kernel.org/r/20220929044449.32515-1-charlotte@extrahop.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com> Signed-off-by: NJoerg Roedel <jroedel@suse.de>
-
由 Anup Patel 提交于
The kvm_riscv_vcpu_timer_pending() checks per-VCPU next_cycles and per-VCPU software injected VS timer interrupt. This function returns incorrect value when Sstc is available because the per-VCPU next_cycles are only updated by kvm_riscv_vcpu_timer_save() called from kvm_arch_vcpu_put(). As a result, when Sstc is available the VCPU does not block properly upon WFI traps. To fix the above issue, we introduce kvm_riscv_vcpu_timer_sync() which will update per-VCPU next_cycles upon every VM exit instead of kvm_riscv_vcpu_timer_save(). Fixes: 8f5cb44b ("RISC-V: KVM: Support sstc extension") Signed-off-by: NAnup Patel <apatel@ventanamicro.com> Reviewed-by: NAtish Patra <atishp@rivosinc.com> Signed-off-by: NAnup Patel <anup@brainfault.org>
-
由 Andrew Jones 提交于
riscv_cbom_block_size and riscv_init_cbom_blocksize() should always be available and riscv_init_cbom_blocksize() should always be invoked, even when compiling without RISCV_ISA_ZICBOM enabled. This is because disabling RISCV_ISA_ZICBOM means "don't use zicbom instructions in the kernel" not "pretend there isn't zicbom, even when there is". When zicbom is available, whether the kernel enables its use with RISCV_ISA_ZICBOM or not, KVM will offer it to guests. Ensure we can build KVM and that the block size is initialized even when compiling without RISCV_ISA_ZICBOM. Fixes: 8f7e001e ("RISC-V: Clean up the Zicbom block size probing") Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NAndrew Jones <ajones@ventanamicro.com> Signed-off-by: NAnup Patel <apatel@ventanamicro.com> Reviewed-by: NConor Dooley <conor.dooley@microchip.com> Reviewed-by: NHeiko Stuebner <heiko@sntech.de> Tested-by: NHeiko Stuebner <heiko@sntech.de> Signed-off-by: NAnup Patel <anup@brainfault.org>
-
由 Jiri Olsa 提交于
The patchable_function_entry(5) might output 5 single nop instructions (depends on toolchain), which will clash with bpf_arch_text_poke check for 5 bytes nop instruction. Adding early init call for dispatcher that checks and change the patchable entry into expected 5 nop instruction if needed. There's no need to take text_mutex, because we are using it in early init call which is called at pre-smp time. Fixes: ceea991a ("bpf: Move bpf_dispatcher function out of ftrace locations") Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20221018075934.574415-1-jolsa@kernel.orgSigned-off-by: NAlexei Starovoitov <ast@kernel.org>
-
- 20 10月, 2022 3 次提交
-
-
由 Maxim Levitsky 提交于
clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap() except that the latter also sets a bit in 'cpu_caps_cleared' which later clears the same cap in secondary cpus, which is likely what is meant here. Fixes: 47125db2 ("perf/x86/intel/lbr: Support Architectural LBR") Signed-off-by: NMaxim Levitsky <mlevitsk@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NKan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20220718141123.136106-2-mlevitsk@redhat.com
-
由 Peter Zijlstra 提交于
Different function signatures means they needs to be different functions; otherwise CFI gets upset. As triggered by the ftrace boot tests: [] CFI failure at ftrace_return_to_handler+0xac/0x16c (target: ftrace_stub+0x0/0x14; expected type: 0x0a5d5347) Fixes: 3c516f89 ("x86: Add support for CONFIG_CFI_CLANG") Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NMark Rutland <mark.rutland@arm.com> Tested-by: NMark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/Y06dg4e1xF6JTdQq@hirez.programming.kicks-ass.net
-
由 Peter Zijlstra 提交于
Remove the weird jumps to RET and simply use RET. This then promotes ftrace_stub() to a real function; which becomes important for kcfi. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111148.719080593@infradead.orgSigned-off-by: NPeter Zijlstra <peterz@infradead.org>
-
- 19 10月, 2022 1 次提交
-
-
由 Babu Moger 提交于
AMD systems support zero CBM (capacity bit mask) for cache allocation. That is reflected in rdt_init_res_defs_amd() by: r->cache.arch_has_empty_bitmaps = true; However given the unified code in cbm_validate(), checking for: val == 0 && !arch_has_empty_bitmaps is not enough because of another check in cbm_validate(): if ((zero_bit - first_bit) < r->cache.min_cbm_bits) The default value of r->cache.min_cbm_bits = 1. Leading to: $ cd /sys/fs/resctrl $ mkdir foo $ cd foo $ echo L3:0=0 > schemata -bash: echo: write error: Invalid argument $ cat /sys/fs/resctrl/info/last_cmd_status Need at least 1 bits in the mask Initialize the min_cbm_bits to 0 for AMD. Also, remove the default setting of min_cbm_bits and initialize it separately. After the fix: $ cd /sys/fs/resctrl $ mkdir foo $ cd foo $ echo L3:0=0 > schemata $ cat /sys/fs/resctrl/info/last_cmd_status ok Fixes: 316e7f90 ("x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps") Co-developed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NBabu Moger <babu.moger@amd.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NJames Morse <james.morse@arm.com> Reviewed-by: NReinette Chatre <reinette.chatre@intel.com> Reviewed-by: NFenghua Yu <fenghua.yu@intel.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com
-
- 18 10月, 2022 15 次提交
-
-
由 Nicholas Piggin 提交于
NMI interrupts should exit with EXCEPTION_RESTORE_REGS not with interrupt_return_srr, which is what the perf NMI handler currently does. This breaks if a PMI hits after interrupt_exit_user_prepare_main() has switched the context tracking to user mode, then the CT_WARN_ON() in interrupt_exit_kernel_prepare() fires because it returns to kernel with context set to user. This could possibly be solved by soft-disabling PMIs in the exit path, but that reduces our ability to profile that code. The warning could be removed, but it's potentially useful. All other NMIs and soft-NMIs return using EXCEPTION_RESTORE_REGS, so this makes perf interrupts consistent with that and seems like the best fix. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> [mpe: Squash in fixups from Nick] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221006140413.126443-3-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
NMI PMIs really should not return using the normal interrupt_return function. If such a PMI hits in code returning to user with the context switched to user mode, this warning can fire. This was enough to cause crashes when reproducing on 64s, because another perf interrupt would hit while reporting bug, and that would cause another bug, and so on until smashing the stack. Work around that particular crash for now by just disabling that context warning for PMIs. This is a hack and not a complete fix, there could be other such problems lurking in corners. But it does fix the known crash. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221014030729.2077151-3-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
The context tracking code in PR-KVM and BookE implementations is not complete, and can cause host crashes if context tracking is enabled. Make these implementations depend on !CONTEXT_TRACKING_USER. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221014030729.2077151-2-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
schedule must not be explicitly called while KUAP is unlocked, because the AMR register will not be saved across the context switch on 64s (preemption is allowed because that is driven by interrupts which do save the AMR). exit_vmx_usercopy() runs inside an unlocked user access region, and it calls preempt_enable() which will call schedule() if need_resched() was set while non-preemptible. This can cause tasks to run unprotected when the should not, and can cause the user copy to be improperly blocked when scheduling back to it. Fix this by avoiding the explicit resched for preempt kernels by generating an interrupt to reschedule the context if need_resched() got set. Reported-by: NSamuel Holland <samuel@sholland.org> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013151647.1857994-3-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
stop_machine_cpuslocked takes a mutex so it must be called in a preemptible context, so it can't simply be fixed by disabling preemption. This is not a bug, because CPU hotplug is locked, so this processor will call in to the stop machine function. So raw_smp_processor_id() could be used. This leaves a small chance that this thread will be migrated to another CPU, so the master work would be done by a CPU from a different context. Better for test coverage to make that a common case by just having the first CPU to call in become the master. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013151647.1857994-2-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
apply_to_page_range on kernel pages does not disable preemption, which is a requirement for hash's lazy mmu mode, which keeps track of the TLBs to flush with a per-cpu array. Reported-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013151647.1857994-1-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
This lock is taken while the raw kfence_freelist_lock is held, so it must also be a raw spinlock, as reported by lockdep when raw lock nesting checking is enabled. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013230710.1987253-3-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
With kfence enabled, there are several cases where HPTE and TLBIE locks are called from softirq context, for example: WARNING: inconsistent lock state 6.0.0-11845-g0cbbc95b12ac #1 Tainted: G N -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes: c000000002734de8 (native_tlbie_lock){+.?.}-{2:2}, at: .native_hpte_updateboltedpp+0x1a4/0x600 {IN-SOFTIRQ-W} state was registered at: .lock_acquire+0x20c/0x520 ._raw_spin_lock+0x4c/0x70 .native_hpte_invalidate+0x62c/0x840 .hash__kernel_map_pages+0x450/0x640 .kfence_protect+0x58/0xc0 .kfence_guarded_free+0x374/0x5a0 .__slab_free+0x3d0/0x630 .put_cred_rcu+0xcc/0x120 .rcu_core+0x3c4/0x14e0 .__do_softirq+0x1dc/0x7dc .do_softirq_own_stack+0x40/0x60 Fix this by consistently disabling irqs while taking either of these locks. Don't just disable bh because several of the more common cases already disable irqs, so this just makes the locks always irq-safe. Reported-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013230710.1987253-2-npiggin@gmail.com
-
由 Nicholas Piggin 提交于
Add lockdep annotation for the HPTE bit-spinlock. Modern systems don't take the tlbie lock, so this shows up some of the same lockdep warnings that were being reported by the ppc970. And they're not taken in exactly the same places so this is nice to have in its own right. Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013230710.1987253-1-npiggin@gmail.com
-
由 Haren Myneni 提交于
The hypervisor assigns VAS (Virtual Accelerator Switchboard) windows depends on cores configured in LPAR. The kernel uses OF reconfig notifier to reconfig VAS windows for DLPAR CPU event. In the case of shared CPU mode partition, the hypervisor assigns VAS windows depends on CPU entitled capacity, not based on vcpus. When the user changes CPU entitled capacity for the partition, drmgr uses /proc/ppc64/lparcfg interface to notify the kernel. This patch adds the following changes to update VAS resources for shared mode: - Call vas reconfig windows from lparcfg_write() - Ignore reconfig changes in the VAS notifier Signed-off-by: NHaren Myneni <haren@linux.ibm.com> [mpe: Rework error handling, report any errors as EIO] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/efa9c16e4a78dda4567a16f13dabfd73cb4674a2.camel@linux.ibm.com
-
由 Haren Myneni 提交于
irq_default_primary_handler() can be used only with IRQF_ONESHOT flag, but the flag disables IRQ before executing the thread handler and enables it after the interrupt is handled. But this IRQ disable sets the VAS IRQ OFF state in the hypervisor. In case if NX faults during this window, the hypervisor will not deliver the fault interrupt to the partition and the user space may wait continuously for the CSB update. So use VAS specific IRQ handler instead of calling the default primary handler. Increment pending_faults counter in IRQ handler and the bottom thread handler will process all faults based on this counter. In case if the another interrupt is received while the thread is running, it will be processed using this counter. The synchronization of top and bottom handlers will be done with IRQTF_RUNTHREAD flag and will re-enter to bottom half if this flag is set. Signed-off-by: NHaren Myneni <haren@linux.ibm.com> Reviewed-by: NFrederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aaad8813b4762a6753cfcd0b605a7574a5192ec7.camel@linux.ibm.com
-
由 Borislav Petkov 提交于
Currently, the patch application logic checks whether the revision needs to be applied on each logical CPU (SMT thread). Therefore, on SMT designs where the microcode engine is shared between the two threads, the application happens only on one of them as that is enough to update the shared microcode engine. However, there are microcode patches which do per-thread modification, see Link tag below. Therefore, drop the revision check and try applying on each thread. This is what the BIOS does too so this method is very much tested. Btw, change only the early paths. On the late loading paths, there's no point in doing per-thread modification because if is it some case like in the bugzilla below - removing a CPUID flag - the kernel cannot go and un-use features it has detected are there early. For that, one should use early loading anyway. [ bp: Fixes does not contain the oldest commit which did check for equality but that is good enough. ] Fixes: 8801b3fc ("x86/microcode/AMD: Rework container parsing") Reported-by: NȘtefan Talpalaru <stefantalpalaru@yahoo.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Tested-by: NȘtefan Talpalaru <stefantalpalaru@yahoo.com> Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
-
由 Pavel Kozlov 提交于
Since commit d9820ff7 ("ARC: mm: switch pgtable_t back to struct page *") a memory leakage problem occurs. Memory allocated for page table entries not released during process termination. This issue can be reproduced by a small program that allocates a large amount of memory. After several runs, you'll see that the amount of free memory has reduced and will continue to reduce after each run. All ARC CPUs are effected by this issue. The issue was introduced since the kernel stable release v5.15-rc1. As described in commit d9820ff7 after switch pgtable_t back to struct page *, a pointer to "struct page" and appropriate functions are used to allocate and free a memory page for PTEs, but the pmd_pgtable macro hasn't changed and returns the direct virtual address from the PMD (PGD) entry. Than this address used as a parameter in the __pte_free() and as a result this function couldn't release memory page allocated for PTEs. Fix this issue by changing the pmd_pgtable macro and returning pointer to struct page. Fixes: d9820ff7 ("ARC: mm: switch pgtable_t back to struct page *") Cc: Mike Rapoport <rppt@kernel.org> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: NPavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: NVineet Gupta <vgupta@kernel.org>
-
由 Lukas Bulwahn 提交于
Clean up config files by: - removing configs that were deleted in the past - removing configs not in tree and without recently pending patches - adding new configs that are replacements for old configs in the file For some detailed information, see Link. Link: https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulwahn@gmail.com/Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: NVineet Gupta <vgupta@kernel.org>
-
由 Randy Dunlap 提交于
Add 'volatile' to iounmap()'s argument to prevent build warnings. This make it the same as other major architectures. Placates these warnings: (12 such warnings) ../drivers/video/fbdev/riva/fbdev.c: In function 'rivafb_probe': ../drivers/video/fbdev/riva/fbdev.c:2067:42: error: passing argument 1 of 'iounmap' discards 'volatile' qualifier from pointer target type [-Werror=discarded-qualifiers] 2067 | iounmap(default_par->riva.PRAMIN); Fixes: 1162b070 ("ARC: I/O and DMA Mappings") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Vineet Gupta <vgupta@kernel.org> Cc: linux-snps-arc@lists.infradead.org Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: NVineet Gupta <vgupta@kernel.org>
-