- 17 1月, 2020 1 次提交
-
-
由 Shile Zhang 提交于
commit 1091670637be8bd34a39dd1ddcc0a10a7c88d4e2 upstream. Use a more generic name for additional table sorting usecases, such as the upcoming ORC table sorting feature. This tool is not tied to exception table sorting anymore. No functional changes intended. [ mingo: Rewrote the changelog. ] Signed-off-by: NShile Zhang <shile.zhang@linux.alibaba.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: linux-kbuild@vger.kernel.org Link: https://lkml.kernel.org/r/20191204004633.88660-6-shile.zhang@linux.alibaba.comSigned-off-by: NIngo Molnar <mingo@kernel.org> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com>
-
- 27 12月, 2019 1 次提交
-
-
由 Johannes Weiner 提交于
commit 8508cf3ffad4defa202b303e5b6379efc4cd9054 upstream. There are several definitions of those functions/macros in places that mess with fixed-point load averages. Provide an official version. [akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c] Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Tested-by: NSuren Baghdasaryan <surenb@google.com> Tested-by: NDaniel Drake <drake@endlessm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Weiner <jweiner@fb.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Enderborg <peter.enderborg@sony.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> [Joseph: use stat.mean instead of stat->rqs.mean to solve conflict] Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Acked-by: NCaspar Zhang <caspar@linux.alibaba.com> Conflicts: block/blk-iolatency.c
-
- 18 12月, 2019 6 次提交
-
-
由 Vincenzo Frascino 提交于
[ Upstream commit 552263456215ada7ee8700ce022d12b0cffe4802 ] clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or at run time. Fix the powerpc vdso implementation of clock_getres keeping a copy of hrtimer_resolution in vdso data and using that directly. Fixes: a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") Cc: stable@vger.kernel.org Signed-off-by: NVincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: NChristophe Leroy <christophe.leroy@c-s.fr> Acked-by: NShuah Khan <skhan@linuxfoundation.org> [chleroy: changed CLOCK_REALTIME_RES to CLOCK_HRTIMER_RES] Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a55eca3a5e85233838c2349783bcb5164dae1d09.1575273217.git.christophe.leroy@c-s.frSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Nathan Chancellor 提交于
[ Upstream commit c9029ef9c95765e7b63c4d9aa780674447db1ec0 ] Commit aea447141c7e ("powerpc: Disable -Wbuiltin-requires-header when setjmp is used") disabled -Wbuiltin-requires-header because of a warning about the setjmp and longjmp declarations. r367387 in clang added another diagnostic around this, complaining that there is no jmp_buf declaration. In file included from ../arch/powerpc/xmon/xmon.c:47: ../arch/powerpc/include/asm/setjmp.h:10:13: error: declaration of built-in function 'setjmp' requires the declaration of the 'jmp_buf' type, commonly provided in the header <setjmp.h>. [-Werror,-Wincomplete-setjmp-declaration] extern long setjmp(long *); ^ ../arch/powerpc/include/asm/setjmp.h:11:13: error: declaration of built-in function 'longjmp' requires the declaration of the 'jmp_buf' type, commonly provided in the header <setjmp.h>. [-Werror,-Wincomplete-setjmp-declaration] extern void longjmp(long *, long); ^ 2 errors generated. We are not using the standard library's longjmp/setjmp implementations for obvious reasons; make this clear to clang by using -ffreestanding on these files. Cc: stable@vger.kernel.org # 4.14+ Suggested-by: NSegher Boessenkool <segher@kernel.crashing.org> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191119045712.39633-3-natechancellor@gmail.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Cédric Le Goater 提交于
commit b67a95f2abff0c34e5667c15ab8900de73d8d087 upstream. The PCI INTx interrupts and other LSI interrupts are handled differently under a sPAPR platform. When the interrupt source characteristics are queried, the hypervisor returns an H_INT_ESB flag to inform the OS that it should be using the H_INT_ESB hcall for interrupt management and not loads and stores on the interrupt ESB pages. A default -1 value is returned for the addresses of the ESB pages. The driver ignores this condition today and performs a bogus IO mapping. Recent changes and the DEBUG_VM configuration option make the bug visible with : kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le #1 NIP: c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000 REGS: c0000000fa45f0d0 TRAP: 0700 Not tainted (5.4.0-0.rc6.git0.1.fc32.ppc64le) ... NIP ioremap_page_range+0x4c4/0x6e0 LR ioremap_page_range+0x74/0x6e0 Call Trace: ioremap_page_range+0x74/0x6e0 (unreliable) do_ioremap+0x8c/0x120 __ioremap_caller+0x128/0x140 ioremap+0x30/0x50 xive_spapr_populate_irq_data+0x170/0x260 xive_irq_domain_map+0x8c/0x170 irq_domain_associate+0xb4/0x2d0 irq_create_mapping+0x1e0/0x3b0 irq_create_fwspec_mapping+0x27c/0x3e0 irq_create_of_mapping+0x98/0xb0 of_irq_parse_and_map_pci+0x168/0x230 pcibios_setup_device+0x88/0x250 pcibios_setup_bus_devices+0x54/0x100 __of_scan_bus+0x160/0x310 pcibios_scan_phb+0x330/0x390 pcibios_init+0x8c/0x128 do_one_initcall+0x60/0x2c0 kernel_init_freeable+0x290/0x378 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x80 Fixes: bed81ee1 ("powerpc/xive: introduce H_INT_ESB hcall") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: NCédric Le Goater <clg@kaod.org> Tested-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191203163642.2428-1-clg@kaod.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alastair D'Silva 提交于
commit 29430fae82073d39b1b881a3cd507416a56a363f upstream. When calling flush_icache_range with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended. This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for. Signed-off-by: NAlastair D'Silva <alastair@d-silva.org> Cc: stable@vger.kernel.org Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191104023305.9581-2-alastair@au1.ibm.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Cédric Le Goater 提交于
commit 1ca3dec2b2dff9d286ce6cd64108bda0e98f9710 upstream. When the machine crash handler is invoked, all interrupts are masked but interrupts which have not been started yet do not have an ESB page mapped in the Linux address space. This crashes the 'crash kexec' sequence on sPAPR guests. To fix, force the mapping of the ESB page when an interrupt is being mapped in the Linux IRQ number space. This is done by setting the initial state of the interrupt to OFF which is not necessarily the case on PowerNV. Fixes: 243e2511 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: NCédric Le Goater <clg@kaod.org> Reviewed-by: NGreg Kurz <groug@kaod.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031063100.3864-1-clg@kaod.orgSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alastair D'Silva 提交于
commit f9ec11165301982585e5e5f606739b5bae5331f3 upstream. When calling __kernel_sync_dicache with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended. This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for. Signed-off-by: NAlastair D'Silva <alastair@d-silva.org> Cc: stable@vger.kernel.org Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191104023305.9581-3-alastair@au1.ibm.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 13 12月, 2019 1 次提交
-
-
由 Joel Stanley 提交于
[ Upstream commit b682c8692442711684befe413cf93cf01c5324ea ] The add_ssaaaa, sub_ddmmss, umul_ppmm and udiv_qrnnd macros originate from GCC's longlong.h which in turn was copied from GMP's longlong.h a few decades ago. This was found when compiling with clang: arch/powerpc/math-emu/fnmsub.c:46:2: error: invalid use of a cast in a inline asm context requiring an l-value: remove the cast or build with -fheinous-gnu-extensions FP_ADD_D(R, T, B); ^~~~~~~~~~~~~~~~~ ... ./arch/powerpc/include/asm/sfp-machine.h:283:27: note: expanded from macro 'sub_ddmmss' : "=r" ((USItype)(sh)), \ ~~~~~~~~~~^~~ Segher points out: this was fixed in GCC over 16 years ago ( https://gcc.gnu.org/r56600 ), and in GMP (where it comes from) presumably before that. Update the add_ssaaaa, sub_ddmmss, umul_ppmm and udiv_qrnnd macros to the latest GCC version in order to git rid of the invalid casts. These were taken as-is from GCC's longlong in order to make future syncs obvious. Other parts of sfp-machine.h were left as-is as the file contains more features than present in longlong.h. Link: https://github.com/ClangBuiltLinux/linux/issues/260Signed-off-by: NJoel Stanley <joel@jms.id.au> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Reviewed-by: NSegher Boessenkool <segher@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 05 12月, 2019 13 次提交
-
-
由 Gen Zhang 提交于
[ Upstream commit efa9ace68e487ddd29c2b4d6dd23242158f1f607 ] In dlpar_parse_cc_property(), 'prop->name' is allocated by kstrdup(). kstrdup() may return NULL, so it should be checked and handle error. And prop should be freed if 'prop->name' is NULL. Signed-off-by: NGen Zhang <blackgod016574@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Benjamin Herrenschmidt 提交于
[ Upstream commit 505a314fb28ce122091691c51426fa85c084e115 ] HMIs will crash the kernel due to BRANCH_LINK_TO_FAR(hmi_exception_realmode) Calling into the OPD instead of the actual code. Fixes: 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts") Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Use DOTSYM() rather than #ifdef] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 47918bc68b7427e961035949cc1501a864578a69 ] In update_lmb_associativity_index() we lookup dr_node using of_find_node_by_path() which takes a reference for us. In the non-error case we forget to drop the reference. Note that find_aa_index() does modify properties of the node, but doesn't need an extra reference held once it's returned. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit 0deae39cec6dab3a66794f3e9e83ca4dc30080f1 ] When the watchdog timer is set in interrupt mode, it causes a machine check when it times out. The purpose of this mode is to ease debugging, not to crash the kernel and reboot the machine. This patch implements a special handling for that, in order to not crash the kernel if the watchdog times out while in interrupt or within the idle task. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> [scottwood: added missing #include] Signed-off-by: NScott Wood <oss@buserror.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Alexey Kardashevskiy 提交于
[ Upstream commit c20577014f85f36d4e137d3d52a1f61225b4a3d2 ] The current implementation of the OPAL_PCI_EEH_FREEZE_STATUS call in skiboot's NPU driver does not touch the pci_error_type parameter so it might have garbage but the powernv code analyzes it nevertheless. This initializes pcierr and fstate to zero in all call sites. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NSam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Benjamin Herrenschmidt 提交于
[ Upstream commit 3cfb9ebe906b51f2942b1e251009bb251efd2ba6 ] The bamboo dts has a bug: it uses a non-naturally aligned range for PCI memory space. This isnt' supported by the code, thus causing PCI to break on this system. This is due to the fact that while the chip memory map has 1G reserved for PCI memory, it's only 512M aligned. The code doesn't know how to split that into 2 different PMMs and fails, so limit the region to 512M. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit 49a502ea23bf9dec47f8f3c3960909ff409cd1bb ] As several other arches including x86, this patch makes it explicit that a bad page fault is a NULL pointer dereference when the fault address is lower than PAGE_SIZE In the mean time, this page makes all bad_page_fault() messages shorter so that they remain on one single line. And it prefixes them by "BUG: " so that they get easily grepped. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> [mpe: Avoid pr_cont()] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit b18f0ae92b0a1db565c3e505fa87b6971ad3b641 ] This patch fixes early DEBUG messages in prom.c: - Use %px instead of %p to see the addresses - Cast memblock_phys_mem_size() with (unsigned long long) to avoid build failure when phys_addr_t is not 64 bits. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit 72e7bcc2cdf82bf03caaa5e6c9b0134c2fc2ee7d ] When building for ppc32 with clang these flags are unsupported: -ffixed-r2 and -mmultiple llvm's lib/Target/PowerPC/PPCRegisterInfo.cpp marks r2 as reserved on when building for SVR4ABI and !ppc64: // The SVR4 ABI reserves r2 and r13 if (Subtarget.isSVR4ABI()) { // We only reserve r2 if we need to use the TOC pointer. If we have no // explicit uses of the TOC pointer (meaning we're a leaf function with // no constant-pool loads, etc.) and we have no potential uses inside an // inline asm block, then we can treat r2 has an ordinary callee-saved // register. const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>(); if (!TM.isPPC64() || FuncInfo->usesTOCBasePtr() || MF.hasInlineAsm()) markSuperRegs(Reserved, PPC::R2); // System-reserved register markSuperRegs(Reserved, PPC::R13); // Small Data Area pointer register } This means we can safely omit -ffixed-r2 when building for 32-bit targets. The -mmultiple/-mno-multiple flags are not supported by clang, so platforms that might support multiple miss out on using multiple word instructions. We wrap these flags in cc-option so that when Clang gains support the kernel will be able use these flags. Clang 8 can then build a ppc44x_defconfig which boots in Qemu: make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- ppc44x_defconfig ./scripts/config -e CONFIG_DEVTMPFS -d DEVTMPFS_MOUNT make CC=clang-8 ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- qemu-system-ppc -M bamboo \ -kernel arch/powerpc/boot/zImage \ -dtb arch/powerpc/boot/dts/bamboo.dtb \ -initrd ~/ppc32-440-rootfs.cpio \ -nographic -serial stdio -monitor pty -append "console=ttyS0" Link: https://github.com/ClangBuiltLinux/linux/issues/261 Link: https://bugs.llvm.org/show_bug.cgi?id=39556 Link: https://bugs.llvm.org/show_bug.cgi?id=39555Signed-off-by: NJoel Stanley <joel@jms.id.au> Reviewed-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Madhavan Srinivasan 提交于
[ Upstream commit 2d46d4877b1afd14059393a48bdb8ce27955174c ] Raw event code has couple of fields "unit" and "cache" in it, to capture the "unit" to monitor for a given pmcxsel and cache reload qualifier to program in MMCR1. isa207_get_constraint() refers "unit" field to update the MMCRC (L2/L3) Event bus control fields with "cache" bits of the raw event code. These are power8 specific and not supported by PowerISA v3.0 pmu. So wrap the checks to be power8 specific. Also, "cache" bit field is referred to update MMCR1[16:17] and this check can be power8 specific. Fixes: 7ffd948f ('powerpc/perf: factor out power8 pmu functions') Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit 32c8c4c621897199e690760c2d57054f8b84b6e6 ] mfsrin() takes segment num from bits 31-28 (IBM bits 0-3). Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> [mpe: Clarify bit numbering] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Christophe Leroy 提交于
[ Upstream commit e93ba1b7eb5b188c749052df7af1c90821c5f320 ] This patch fixes the loop in p_block_mapped() and v_block_mapped() to scan the entire bat_addrs[] array. Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Eric Dumazet 提交于
[ Upstream commit 7de086909365cd60a5619a45af3f4152516fd75c ] We have seen many crashes on powerpc hosts while loading bpf programs. The problem here is that bpf_int_jit_compile() does a first pass to compute the program length. Then it allocates memory to store the generated program and calls bpf_jit_build_body() a second time (and a third time later) What I have observed is that the second bpf_jit_build_body() could end up using few more words than expected. If bpf_jit_binary_alloc() put the space for the program at the end of the allocated page, we then write on a non mapped memory. It appears that bpf_jit_emit_tail_call() calls bpf_jit_emit_common_epilogue() while ctx->seen might not be stable. Only after the second pass we can be sure ctx->seen wont be changed. Trying to avoid a second pass seems quite complex and probably not worth it. Fixes: ce076141 ("powerpc/bpf: Implement support for tail calls") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191101033444.143741-1-edumazet@google.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
- 01 12月, 2019 17 次提交
-
-
由 Michael Ellerman 提交于
commit af2e8c68b9c5403f77096969c516f742f5bb29e0 upstream. On some systems that are vulnerable to Spectre v2, it is up to software to flush the link stack (return address stack), in order to protect against Spectre-RSB. When exiting from a guest we do some house keeping and then potentially exit to C code which is several stack frames deep in the host kernel. We will then execute a series of returns without preceeding calls, opening up the possiblity that the guest could have poisoned the link stack, and direct speculative execution of the host to a gadget of some sort. To prevent this we add a flush of the link stack on exit from a guest. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> [dja: straightforward backport to v4.19] Signed-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Michael Ellerman 提交于
commit 39e72bf96f5847ba87cc5bd7a3ce0fed813dc9ad upstream. In commit ee13cb24 ("powerpc/64s: Add support for software count cache flush"), I added support for software to flush the count cache (indirect branch cache) on context switch if firmware told us that was the required mitigation for Spectre v2. As part of that code we also added a software flush of the link stack (return address stack), which protects against Spectre-RSB between user processes. That is all correct for CPUs that activate that mitigation, which is currently Power9 Nimbus DD2.3. What I got wrong is that on older CPUs, where firmware has disabled the count cache, we also need to flush the link stack on context switch. To fix it we create a new feature bit which is not set by firmware, which tells us we need to flush the link stack. We set that when firmware tells us that either of the existing Spectre v2 mitigations are enabled. Then we adjust the patching code so that if we see that feature bit we enable the link stack flush. If we're also told to flush the count cache in software then we fall through and do that also. On the older CPUs we don't need to do do the software count cache flush, firmware has disabled it, so in that case we patch in an early return after the link stack flush. The naming of some of the functions is awkward after this patch, because they're called "count cache" but they also do link stack. But we'll fix that up in a later commit to ease backporting. This is the fix for CVE-2019-18660. Reported-by: NAnthony Steinhauser <asteinhauser@google.com> Fixes: ee13cb24 ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Christopher M. Riedl 提交于
commit d8f0e0b073e1ec52a05f0c2a56318b47387d2f10 upstream. Add support for disabling the kernel implemented spectre v2 mitigation (count cache flush on context switch) via the nospectre_v2 and mitigations=off cmdline options. Suggested-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NChristopher M. Riedl <cmr@informatik.wtf> Reviewed-by: NAndrew Donnellan <ajd@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190524024647.381-1-cmr@informatik.wtfSigned-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 David Hildenbrand 提交于
[ Upstream commit cec1680591d6d5b10ecc10f370210089416e98af ] device_online() should be called with device_hotplug_lock() held. Link: http://lkml.kernel.org/r/20180925091457.28651-5-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Hildenbrand 提交于
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ] add_memory() currently does not take the device_hotplug_lock, however is aleady called under the lock from arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c to synchronize against CPU hot-remove and similar. In general, we should hold the device_hotplug_lock when adding memory to synchronize against online/offline request (e.g. from user space) - which already resulted in lock inversions due to device_lock() and mem_hotplug_lock - see 30467e0b ("mm, hotplug: fix concurrent memory hot-add deadlock"). add_memory()/add_memory_resource() will create memory block devices, so this really feels like the right thing to do. Holding the device_hotplug_lock makes sure that a memory block device can really only be accessed (e.g. via .online/.state) from user space, once the memory has been fully added to the system. The lock is not held yet in drivers/xen/balloon.c arch/powerpc/platforms/powernv/memtrace.c drivers/s390/char/sclp_cmd.c drivers/hv/hv_balloon.c So, let's either use the locked variants or take the lock. Don't export add_memory_resource(), as it once was exported to be used by XEN, which is never built as a module. If somebody requires it, we also have to export a locked variant (as device_hotplug_lock is never exported). Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit 9c87156cce5a63735d1218f0096a65c50a7a32aa ] When building with clang (8 trunk, 7.0 release) the frame size limit is hit: arch/powerpc/xmon/xmon.c:452:12: warning: stack frame size of 2576 bytes in function 'xmon_core' [-Wframe-larger-than=] Some investigation by Naveen indicates this is due to clang saving the addresses to printf format strings on the stack. While this issue is investigated, bump up the frame size limit for xmon when building with clang. Link: https://github.com/ClangBuiltLinux/linux/issues/252Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Felipe Rechia 提交于
[ Upstream commit e901378578c62202594cba0f6c076f3df365ec91 ] Fix a bug introduced by the creation of flush_all_to_thread() for processors that have SPE (Signal Processing Engine) and use it to compute floating-point operations. >From userspace perspective, the problem was seen in attempts of computing floating-point operations which should generate exceptions. For example: fork(); float x = 0.0 / 0.0; isnan(x); // forked process returns False (should be True) The operation above also should always cause the SPEFSCR FINV bit to be set. However, the SPE floating-point exceptions were turned off after a fork(). Kernel versions prior to the bug used flush_spe_to_thread(), which first saves SPEFSCR register values in tsk->thread and then calls giveup_spe(tsk). After commit 579e633e, the save_all() function was called first to giveup_spe(), and then the SPEFSCR register values were saved in tsk->thread. This would save the SPEFSCR register values after disabling SPE for that thread, causing the bug described above. Fixes 579e633e ("powerpc: create flush_all_to_thread()") Signed-off-by: NFelipe Rechia <felipe.rechia@datacom.com.br> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Nicholas Piggin 提交于
[ Upstream commit dd76ff5af35350fd6d5bb5b069e73b6017f66893 ] Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 81d1b54dec95209ab5e5be2cf37182885f998753 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. Currently we always use a small page at the text/data boundary, even when that's not necessary: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages This is because the check that the mapping crosses the __init_begin boundary is too strict, it also returns true when we map exactly up to the boundary. So fix it to check that the mapping would actually map past __init_begin, and with that we see: Mapped 0x0000000000000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 3b5657ed5b4e27ccf593a41ff3c5aa27dae8df18 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. But the current logic uses small pages for the entire text section, regardless of whether a larger page size would fit. eg. with the boundary at 16M we could use 2M pages, but instead we use 64K pages up to the 16M boundary: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages This is because the test is checking if addr is < __init_begin and addr + mapping_size is >= _stext. But that is true for all pages between _stext and __init_begin. Instead what we want to check is if we are crossing the text/data boundary, which is at __init_begin. With that fixed we see: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we're correctly using 2MB pages below __init_begin, but we still drop down to 64K pages unnecessarily at the boundary. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 5c6499b7041b43807dfaeda28aa87fc0e62558f7 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we try to split the kernel linear (1:1) mapping so that the kernel text is in a separate page to kernel data, so we can mark the former read-only. We could achieve that just by always using 64K pages for the linear mapping, but we try to be smarter. Instead we use huge pages when possible, and only switch to smaller pages when necessary. However we have an off-by-one bug in that logic, which causes us to calculate the wrong boundary between text and data. For example with the end of the kernel text at 16M we see: radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we mapped from 0 to 18M with 64K pages, even though the boundary between text and data is at 16M. With the fix we see we're correctly hitting the 16M boundary: radix-mmu: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Aravinda Prasad 提交于
[ Upstream commit c6c26fb55e8e4b3fc376be5611685990a17de27a ] This patch exports the raw per-CPU VPA data via debugfs. A per-CPU file is created which exports the VPA data of that CPU to help debug some of the VPA related issues or to analyze the per-CPU VPA related statistics. v3: Removed offline CPU check. v2: Included offline CPU check and other review comments. Signed-off-by: NAravinda Prasad <aravinda@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sam Bobroff 提交于
[ Upstream commit 473af09b56dc4be68e4af33220ceca6be67aa60d ] eeh_add_to_parent_pe() sometimes removes the EEH_PE_KEEP flag, but it incorrectly removes it from pe->type, instead of pe->state. However, rather than clearing it from the correct field, remove it. Inspection of the code shows that it can't ever have had any effect (even if it had been cleared from the correct field), because the field is never tested after it is cleared by the statement in question. The clear statement was added by commit 807a827d ("powerpc/eeh: Keep PE during hotplug"), but it didn't explain why it was necessary. Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sam Bobroff 提交于
[ Upstream commit bcbe3730531239abd45ab6c6af4a18078b37dd47 ] If a device is removed during EEH processing (either by a driver's handler or as part of recovery), it can lead to a null dereference in eeh_pe_report_edev(). To handle this, skip devices that have been removed. Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit e8e132e6885962582784b6fa16a80d07ea739c0f ] This will avoid auto-vectorisation when building with higher optimisation levels. We don't know if the machine can support VSX and even if it's present it's probably not going to be enabled at this point in boot. These flag were both added prior to GCC 4.6 which is the minimum compiler version supported by upstream, thanks to Segher for the details. Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit 1a855eaccf353f7ed1d51a3d4b3af727ccbd81ca ] As of commit 10c77dba ("powerpc/boot: Fix build failure in 32-bit boot wrapper") the opal code is hidden behind CONFIG_PPC64_BOOT_WRAPPER, but the boot wrapper avoids include/linux, so it does not get the normal Kconfig flags. We can drop the guard entirely as in commit f8e8e69c ("powerpc/boot: Only build OPAL code when necessary") the makefile only includes opal.c in the build if CONFIG_PPC64_BOOT_WRAPPER is set. Fixes: 10c77dba ("powerpc/boot: Fix build failure in 32-bit boot wrapper") Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dan Carpenter 提交于
[ Upstream commit 014704e6f54189a203cc14c7c0bb411b940241bc ] The "count < sizeof(struct os_area_db)" comparison is type promoted to size_t so negative values of "count" are treated as very high values and we accidentally return success instead of a negative error code. This doesn't really change runtime much but it fixes a static checker warning. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NGeoff Levand <geoff@infradead.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 24 11月, 2019 1 次提交
-
-
由 Michael Ellerman 提交于
[ Upstream commit b4d16ab58c41ff0125822464bdff074cebd0fe47 ] In the recent commit 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") we changed the way we initialise the decrementer clockevent(s). We no longer initialise the mult & shift values of decrementer_clockevent itself. This has the effect of breaking PR KVM, because it uses those values in kvmppc_emulate_dec(). The symptom is guest kernels spin forever mid-way through boot. For now fix it by assigning back to decrementer_clockevent the mult and shift values. Fixes: 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-