- 01 12月, 2019 38 次提交
-
-
由 Michael Ellerman 提交于
commit af2e8c68b9c5403f77096969c516f742f5bb29e0 upstream. On some systems that are vulnerable to Spectre v2, it is up to software to flush the link stack (return address stack), in order to protect against Spectre-RSB. When exiting from a guest we do some house keeping and then potentially exit to C code which is several stack frames deep in the host kernel. We will then execute a series of returns without preceeding calls, opening up the possiblity that the guest could have poisoned the link stack, and direct speculative execution of the host to a gadget of some sort. To prevent this we add a flush of the link stack on exit from a guest. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> [dja: straightforward backport to v4.19] Signed-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Michael Ellerman 提交于
commit 39e72bf96f5847ba87cc5bd7a3ce0fed813dc9ad upstream. In commit ee13cb24 ("powerpc/64s: Add support for software count cache flush"), I added support for software to flush the count cache (indirect branch cache) on context switch if firmware told us that was the required mitigation for Spectre v2. As part of that code we also added a software flush of the link stack (return address stack), which protects against Spectre-RSB between user processes. That is all correct for CPUs that activate that mitigation, which is currently Power9 Nimbus DD2.3. What I got wrong is that on older CPUs, where firmware has disabled the count cache, we also need to flush the link stack on context switch. To fix it we create a new feature bit which is not set by firmware, which tells us we need to flush the link stack. We set that when firmware tells us that either of the existing Spectre v2 mitigations are enabled. Then we adjust the patching code so that if we see that feature bit we enable the link stack flush. If we're also told to flush the count cache in software then we fall through and do that also. On the older CPUs we don't need to do do the software count cache flush, firmware has disabled it, so in that case we patch in an early return after the link stack flush. The naming of some of the functions is awkward after this patch, because they're called "count cache" but they also do link stack. But we'll fix that up in a later commit to ease backporting. This is the fix for CVE-2019-18660. Reported-by: NAnthony Steinhauser <asteinhauser@google.com> Fixes: ee13cb24 ("powerpc/64s: Add support for software count cache flush") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Christopher M. Riedl 提交于
commit d8f0e0b073e1ec52a05f0c2a56318b47387d2f10 upstream. Add support for disabling the kernel implemented spectre v2 mitigation (count cache flush on context switch) via the nospectre_v2 and mitigations=off cmdline options. Suggested-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NChristopher M. Riedl <cmr@informatik.wtf> Reviewed-by: NAndrew Donnellan <ajd@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190524024647.381-1-cmr@informatik.wtfSigned-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Waiman Long 提交于
commit cd5a2aa89e847bdda7b62029d94e95488d73f6b2 upstream. Since MDS and TAA mitigations are inter-related for processors that are affected by both vulnerabilities, the followiing confusing messages can be printed in the kernel log: MDS: Vulnerable MDS: Mitigation: Clear CPU buffers To avoid the first incorrect message, defer the printing of MDS mitigation after the TAA mitigation selection has been done. However, that has the side effect of printing TAA mitigation first before MDS mitigation. [ bp: Check box is affected/mitigations are disabled first before printing and massage. ] Suggested-by: NPawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: NWaiman Long <longman@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Mark Gross <mgross@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191115161445.30809-3-longman@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Waiman Long 提交于
commit 64870ed1b12e235cfca3f6c6da75b542c973ff78 upstream. For MDS vulnerable processors with TSX support, enabling either MDS or TAA mitigations will enable the use of VERW to flush internal processor buffers at the right code path. IOW, they are either both mitigated or both not. However, if the command line options are inconsistent, the vulnerabilites sysfs files may not report the mitigation status correctly. For example, with only the "mds=off" option: vulnerabilities/mds:Vulnerable; SMT vulnerable vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable The mds vulnerabilities file has wrong status in this case. Similarly, the taa vulnerability file will be wrong with mds mitigation on, but taa off. Change taa_select_mitigation() to sync up the two mitigation status and have them turned off if both "mds=off" and "tsx_async_abort=off" are present. Update documentation to emphasize the fact that both "mds=off" and "tsx_async_abort=off" have to be specified together for processors that are affected by both TAA and MDS to be effective. [ bp: Massage and add kernel-parameters.txt change too. ] Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort") Signed-off-by: NWaiman Long <longman@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-doc@vger.kernel.org Cc: Mark Gross <mgross@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alexander Kapshuk 提交于
commit 700c1018b86d0d4b3f1f2d459708c0cdf42b521d upstream. gawk 5.0.1 generates the following regexp warnings: GEN /home/sasha/torvalds/tools/objtool/arch/x86/lib/inat-tables.c awk: ../arch/x86/tools/gen-insn-attr-x86.awk:260: warning: regexp escape sequence `\:' is not a known regexp operator awk: ../arch/x86/tools/gen-insn-attr-x86.awk:350: (FILENAME=../arch/x86/lib/x86-opcode-map.txt FNR=41) warning: regexp escape sequence `\&' is not a known regexp operator Ealier versions of gawk are not known to generate these warnings. The gawk manual referenced below does not list characters ':' and '&' as needing escaping, so 'unescape' them. See https://www.gnu.org/software/gawk/manual/html_node/Escape-Sequences.html for more info. Running diff on the output generated by the script before and after applying the patch reported no differences. [ bp: Massage commit message. ] [ Caught the respective tools header discrepancy. ] Reported-by: Nkbuild test robot <lkp@intel.com> Signed-off-by: NAlexander Kapshuk <alexander.kapshuk@gmail.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190924044659.3785-1-alexander.kapshuk@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alexey Brodkin 提交于
commit 5effc09c4907901f0e71e68e5f2e14211d9a203f upstream. 8-letter strings representing ARC perf events are stores in two 32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc. And the same order of bytes in the word is used regardless CPU endianness. Which means in case of big-endian CPU core we need to swap bytes to get the same order as if it was on little-endian CPU. Otherwise we're seeing the following error message on boot: ------------------------->8---------------------- ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support] sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 sysfs_warn_dup+0x46/0x58 sysfs_add_file_mode_ns+0xb2/0x168 create_files+0x70/0x2a0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0 Failed to register pmu: arc_pct, reason -17 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 __warn+0x9c/0xd4 warn_slowpath_fmt+0x22/0x2c perf_event_sysfs_init+0x70/0xa0 ---[ end trace a75fb9a9837bd1ec ]--- ------------------------->8---------------------- What happens here we're trying to register more than one raw perf event with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters and encoded into two 32-bit words. In this particular case we deal with 2 events: * "IJMP____" which counts all jump & branch instructions * "IJMPC___" which counts only conditional jumps & branches Those strings are split in two 32-bit words this way "IJMP" + "____" & "IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core being big-endian then we read "PMJI" + "____" & "PMJI" + "___C". And since we interpret read array of ASCII letters as a null-terminated string on big-endian CPU we end up with 2 events of the same name "PMJI". Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Chester Lin 提交于
commit 1d31999cf04c21709f72ceb17e65b54a401330da upstream. adjust_lowmem_bounds() checks every memblocks in order to find the boundary between lowmem and highmem. However some memblocks could be marked as NOMAP so they are not used by kernel, which should be skipped while calculating the boundary. Signed-off-by: NChester Lin <clin@suse.com> Reviewed-by: NMike Rapoport <rppt@linux.ibm.com> Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk> Signed-off-by: NLee Jones <lee.jones@linaro.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Sean Christopherson 提交于
commit a78986aae9b2988f8493f9f65a587ee433e83bc3 upstream. Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and instead manually handle ZONE_DEVICE on a case-by-case basis. For things like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal pages, e.g. put pages grabbed via gup(). But for flows such as setting A/D bits or shifting refcounts for transparent huge pages, KVM needs to to avoid processing ZONE_DEVICE pages as the flows in question lack the underlying machinery for proper handling of ZONE_DEVICE pages. This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup() when running a KVM guest backed with /dev/dax memory, as KVM straight up doesn't put any references to ZONE_DEVICE pages acquired by gup(). Note, Dan Williams proposed an alternative solution of doing put_page() on ZONE_DEVICE pages immediately after gup() in order to simplify the auditing needed to ensure is_zone_device_page() is called if and only if the backing device is pinned (via gup()). But that approach would break kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til unmap() when accessing guest memory, unlike KVM's secondary MMU, which coordinates with mmu_notifier invalidations to avoid creating stale page references, i.e. doesn't rely on pages being pinned. [*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.plReported-by: NAdam Borowski <kilobyte@angband.pl> Analyzed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NDan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org Fixes: 3565fce3 ("mm, x86: get_user_pages() for dax mappings") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> [sean: backport to 4.x; resolve conflict in mmu.c] Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Nickhu 提交于
[ Upstream commit 9aaafac8cffa1c1edb66e19a63841b7c86be07ca ] There two bitfield bug for perfomance counter in bitfield.h: PFM_CTL_offSEL1 21 --> 16 PFM_CTL_offSEL2 27 --> 22 This commit fix it. Signed-off-by: NNickhu <nickhu@andestech.com> Acked-by: NGreentime Hu <greentime@andestech.com> Signed-off-by: NGreentime Hu <greentime@andestech.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Leonard Crestez 提交于
[ Upstream commit 1ad9fb750a104f51851c092edd7b3553f0218428 ] Bindings for "fixed-regulator" only explicitly support "gpio" property, not "gpios". Fix by correcting the property name. The enet PHYs on imx6sx-sdb needs to be explicitly reset after a power cycle, this can be handled by the phy-reset-gpios property. Sadly this is not handled on suspend: the fec driver turns phy-supply off but doesn't assert phy-reset-gpios again on resume. Since additional phy-level work is required to support powering off the phy in suspend fix the problem by just marking the regulator as "boot-on" "always-on" so that it's never turned off. This behavior is equivalent to older releases. Keep the phy-reset-gpios property on fec anyway because it is a correct description of board design. This issue was exposed by commit efdfeb079cc3 ("regulator: fixed: Convert to use GPIO descriptor only") which causes the "gpios" property to also be parsed. Before that commit the "gpios" property had no effect, PHY reset was only handled in the the bootloader. This fixes linux-next boot failures previously reported here: https://lore.kernel.org/patchwork/patch/982437/#1177900 https://lore.kernel.org/patchwork/patch/994091/#1178304Signed-off-by: NLeonard Crestez <leonard.crestez@nxp.com> Signed-off-by: NShawn Guo <shawnguo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Victor Kamensky 提交于
[ Upstream commit 98356eb0ae499c63e78073ccedd9a5fc5c563288 ] After 'a66649da arm64: fix vdso-offsets.h dependency' if one will try to build .i file in case of external kernel module, build fails complaining that prepare0 target is missing. This issue came up with SystemTap when it tries to build variety of .i files for its own generated kernel modules trying to figure given kernel features/capabilities. The issue is that prepare0 is defined in top level Makefile only if KBUILD_EXTMOD is not defined. .i file rule depends on prepare and in case KBUILD_EXTMOD defined top level Makefile contains empty rule for prepare. But after mentioned commit arch/arm64/Makefile would introduce dependency on prepare0 through its own prepare target. Fix it to put proper ifdef KBUILD_EXTMOD around code introduced by mentioned commit. It matches what top level Makefile does. Acked-by: NKevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: NVictor Kamensky <kamensky@cisco.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Hildenbrand 提交于
[ Upstream commit cec1680591d6d5b10ecc10f370210089416e98af ] device_online() should be called with device_hotplug_lock() held. Link: http://lkml.kernel.org/r/20180925091457.28651-5-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Hildenbrand 提交于
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ] add_memory() currently does not take the device_hotplug_lock, however is aleady called under the lock from arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c to synchronize against CPU hot-remove and similar. In general, we should hold the device_hotplug_lock when adding memory to synchronize against online/offline request (e.g. from user space) - which already resulted in lock inversions due to device_lock() and mem_hotplug_lock - see 30467e0b ("mm, hotplug: fix concurrent memory hot-add deadlock"). add_memory()/add_memory_resource() will create memory block devices, so this really feels like the right thing to do. Holding the device_hotplug_lock makes sure that a memory block device can really only be accessed (e.g. via .online/.state) from user space, once the memory has been fully added to the system. The lock is not held yet in drivers/xen/balloon.c arch/powerpc/platforms/powernv/memtrace.c drivers/s390/char/sclp_cmd.c drivers/hv/hv_balloon.c So, let's either use the locked variants or take the lock. Don't export add_memory_resource(), as it once was exported to be used by XEN, which is never built as a module. If somebody requires it, we also have to export a locked variant (as device_hotplug_lock is never exported). Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NPavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NRashmica Gupta <rashmica.g@gmail.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit 9c87156cce5a63735d1218f0096a65c50a7a32aa ] When building with clang (8 trunk, 7.0 release) the frame size limit is hit: arch/powerpc/xmon/xmon.c:452:12: warning: stack frame size of 2576 bytes in function 'xmon_core' [-Wframe-larger-than=] Some investigation by Naveen indicates this is due to clang saving the addresses to printf format strings on the stack. While this issue is investigated, bump up the frame size limit for xmon when building with clang. Link: https://github.com/ClangBuiltLinux/linux/issues/252Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Anton Ivanov 提交于
[ Upstream commit 917e2fd2c53eb3c4162f5397555cbd394390d4bc ] This fixes a long standing bug where large amounts of output could freeze the tty (most commonly seen on stdio console). While the bug has always been there it became more pronounced after moving to the new interrupt controller. The line semantics are now changed to have true IRQ write semantics which should further improve the tty/line subsystem stability and performance Signed-off-by: NAnton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: NRichard Weinberger <richard@nod.at> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Andrey Ryabinin 提交于
[ Upstream commit 19a2ca0fb560fd7be7b5293c6b652c6d6078dcde ] ARM64 has asm implementation of memchr(), memcmp(), str[r]chr(), str[n]cmp(), str[n]len(). KASAN don't see memory accesses in asm code, thus it can potentially miss many bugs. Ifdef out __HAVE_ARCH_* defines of these functions when KASAN is enabled, so the generic implementations from lib/string.c will be used. We can't just remove the asm functions because efistub uses them. And we can't have two non-weak functions either, so declare the asm functions as weak. Link: http://lkml.kernel.org/r/20180920135631.23833-2-aryabinin@virtuozzo.comSigned-off-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Reported-by: NKyeongdon Kim <kyeongdon.kim@lge.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David S. Miller 提交于
[ Upstream commit 6c2fc9cddc1ffdef8ada1dc8404e5affae849953 ] Such as: fs/ocfs2/file.c: In function ‘ocfs2_file_write_iter’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr)))) and drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function ‘ixgbevf_xdp_setup’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr)))) Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Felipe Rechia 提交于
[ Upstream commit e901378578c62202594cba0f6c076f3df365ec91 ] Fix a bug introduced by the creation of flush_all_to_thread() for processors that have SPE (Signal Processing Engine) and use it to compute floating-point operations. >From userspace perspective, the problem was seen in attempts of computing floating-point operations which should generate exceptions. For example: fork(); float x = 0.0 / 0.0; isnan(x); // forked process returns False (should be True) The operation above also should always cause the SPEFSCR FINV bit to be set. However, the SPE floating-point exceptions were turned off after a fork(). Kernel versions prior to the bug used flush_spe_to_thread(), which first saves SPEFSCR register values in tsk->thread and then calls giveup_spe(tsk). After commit 579e633e, the save_all() function was called first to giveup_spe(), and then the SPEFSCR register values were saved in tsk->thread. This would save the SPEFSCR register values after disabling SPE for that thread, causing the bug described above. Fixes 579e633e ("powerpc: create flush_all_to_thread()") Signed-off-by: NFelipe Rechia <felipe.rechia@datacom.com.br> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Vincent Chen 提交于
[ Upstream commit 827a438156e4c423b6875a092e272933952a2910 ] For 32bit, the upper 32-bit of phys_addr_t will be flushed to zero after AND with PAGE_MASK because the data type of PAGE_MASK is unsigned long. To fix this problem, the page alignment is done by subtracting the page offset instead of AND with PAGE_MASK. Signed-off-by: NVincent Chen <vincentc@andestech.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NPalmer Dabbelt <palmer@sifive.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Nicholas Piggin 提交于
[ Upstream commit dd76ff5af35350fd6d5bb5b069e73b6017f66893 ] Signed-off-by: NNicholas Piggin <npiggin@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 81d1b54dec95209ab5e5be2cf37182885f998753 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. Currently we always use a small page at the text/data boundary, even when that's not necessary: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages This is because the check that the mapping crosses the __init_begin boundary is too strict, it also returns true when we map exactly up to the boundary. So fix it to check that the mapping would actually map past __init_begin, and with that we see: Mapped 0x0000000000000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 3b5657ed5b4e27ccf593a41ff3c5aa27dae8df18 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we want to split the linear mapping at the text/data boundary so we can map the kernel text read only. But the current logic uses small pages for the entire text section, regardless of whether a larger page size would fit. eg. with the boundary at 16M we could use 2M pages, but instead we use 64K pages up to the 16M boundary: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages This is because the test is checking if addr is < __init_begin and addr + mapping_size is >= _stext. But that is true for all pages between _stext and __init_begin. Instead what we want to check is if we are crossing the text/data boundary, which is at __init_begin. With that fixed we see: Mapped 0x0000000000000000-0x0000000000e00000 with 2.00 MiB pages Mapped 0x0000000000e00000-0x0000000001000000 with 64.0 KiB pages Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we're correctly using 2MB pages below __init_begin, but we still drop down to 64K pages unnecessarily at the boundary. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit 5c6499b7041b43807dfaeda28aa87fc0e62558f7 ] When we have CONFIG_STRICT_KERNEL_RWX enabled, we try to split the kernel linear (1:1) mapping so that the kernel text is in a separate page to kernel data, so we can mark the former read-only. We could achieve that just by always using 64K pages for the linear mapping, but we try to be smarter. Instead we use huge pages when possible, and only switch to smaller pages when necessary. However we have an off-by-one bug in that logic, which causes us to calculate the wrong boundary between text and data. For example with the end of the kernel text at 16M we see: radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages ie. we mapped from 0 to 18M with 64K pages, even though the boundary between text and data is at 16M. With the fix we see we're correctly hitting the 16M boundary: radix-mmu: Mapped 0x0000000000000000-0x0000000001000000 with 64.0 KiB pages radix-mmu: Mapped 0x0000000001000000-0x0000000040000000 with 2.00 MiB pages radix-mmu: Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Aravinda Prasad 提交于
[ Upstream commit c6c26fb55e8e4b3fc376be5611685990a17de27a ] This patch exports the raw per-CPU VPA data via debugfs. A per-CPU file is created which exports the VPA data of that CPU to help debug some of the VPA related issues or to analyze the per-CPU VPA related statistics. v3: Removed offline CPU check. v2: Included offline CPU check and other review comments. Signed-off-by: NAravinda Prasad <aravinda@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David S. Miller 提交于
[ Upstream commit 46b8306480fb424abd525acc1763da1c63a27d8a ] If PARPORT_PC_FIFO is not enabled, do not provide the dma lock macros and lock definition. Otherwise: ./arch/sparc/include/asm/parport.h:24:24: warning: ‘dma_spin_lock’ defined but not used [-Wunused-variable] static DEFINE_SPINLOCK(dma_spin_lock); ^~~~~~~~~~~~~ ./include/linux/spinlock_types.h:81:39: note: in definition of macro ‘DEFINE_SPINLOCK’ #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x) Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Jithu Joseph 提交于
[ Upstream commit b61b8bba18fe2b63d38fdaf9b83de25e2d787dfe ] When the last CPU in an rdt_domain goes offline, its rdt_domain struct gets freed. Current pseudo-locking code is unaware of this scenario and tries to dereference the freed structure in a few places. Add checks to prevent pseudo-locking code from doing this. While further work is needed to seamlessly restore resource groups (not just pseudo-locking) to their configuration when the domain is brought back online, the immediate issue of invalid pointers is addressed here. Fixes: f4e80d67 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information") Fixes: 443810fe ("x86/intel_rdt: Create debugfs files for pseudo-locking testing") Fixes: 746e0859 ("x86/intel_rdt: Create character device exposing pseudo-locked region") Fixes: 33dc3e41 ("x86/intel_rdt: Make CPU information accessible for pseudo-locked regions") Signed-off-by: NJithu Joseph <jithu.joseph@intel.com> Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: gavin.hindman@intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/231f742dbb7b00a31cc104416860e27dba6b072d.1539384145.git.reinette.chatre@intel.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Thomas Richter 提交于
[ Upstream commit ec0c0bb489727de0d4dca6a00be6970ab8a3b30a ] Return an error when the function debug_register() fails allocating the debug handle. Also remove the registered debug handle when the initialization fails later on. Signed-off-by: NThomas Richter <tmricht@linux.ibm.com> Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Steven Rostedt (VMware) 提交于
[ Upstream commit c2712b858187f5bcd7b042fe4daa3ba3a12635c0 ] Andy had some concerns about using regs_get_kernel_stack_nth() in a new function regs_get_kernel_argument() as if there's any error in the stack code, it could cause a bad memory access. To be on the safe side, call probe_kernel_read() on the stack address to be extra careful in accessing the memory. A helper function, regs_get_kernel_stack_nth_addr(), was added to just return the stack address (or NULL if not on the stack), that will be used to find the address (and could be used by other functions) and read the address with kernel_probe_read(). Requested-by: NAndy Lutomirski <luto@amacapital.net> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Uros Bizjak 提交于
[ Upstream commit 5ebb272b2ea7e02911a03a893f8d922d49f9bb4a ] Register operand size of invvpid and invept instruction in 64-bit mode has always 64 bits. Adjust inline function argument type to reflect correct size. Signed-off-by: NUros Bizjak <ubizjak@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sean Christopherson 提交于
[ Upstream commit 7671ce21b13b9596163a29f4712cb2451a9b97dc ] In preparation of supporting checkpoint/restore for nested state, commit ca0bde28 ("kvm: nVMX: Split VMCS checks from nested_vmx_run()") modified check_vmentry_postreqs() to only perform the guest EFER consistency checks when nested_run_pending is true. But, in the normal nested VMEntry flow, nested_run_pending is only set after check_vmentry_postreqs(), i.e. the consistency check is being skipped. Alternatively, nested_run_pending could be set prior to calling check_vmentry_postreqs() in nested_vmx_run(), but placing the consistency checks in nested_vmx_enter_non_root_mode() allows us to split prepare_vmcs02() and interleave the preparation with the consistency checks without having to change the call sites of nested_vmx_enter_non_root_mode(). In other words, the rest of the consistency check code in nested_vmx_run() will be joining the postreqs checks in future patches. Fixes: ca0bde28 ("kvm: nVMX: Split VMCS checks from nested_vmx_run()") Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Cc: Jim Mattson <jmattson@google.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sean Christopherson 提交于
[ Upstream commit b7031fd40fcc741b0f9b0c04c8d844e445858b84 ] Reset the vm_{entry,exit}_controls_shadow variables as well as the segment cache after loading a new VMCS in vmx_switch_vmcs(). The shadows/cache track VMCS data, i.e. they're stale every time we switch to a new VMCS regardless of reason. This fixes a bug where stale control shadows would be consumed after a nested VMExit due to a failed consistency check. Suggested-by: NJim Mattson <jmattson@google.com> Signed-off-by: NSean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: NJim Mattson <jmattson@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Angelo Dureghello 提交于
[ Upstream commit 381fdd62c38344a771aed06adaf14aae65c47454 ] This patch fixes command_line array zero-terminated one byte over the end of the array, causing boot to hang. Signed-off-by: NAngelo Dureghello <angelo@sysam.it> Signed-off-by: NGreg Ungerer <gerg@linux-m68k.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sam Bobroff 提交于
[ Upstream commit 473af09b56dc4be68e4af33220ceca6be67aa60d ] eeh_add_to_parent_pe() sometimes removes the EEH_PE_KEEP flag, but it incorrectly removes it from pe->type, instead of pe->state. However, rather than clearing it from the correct field, remove it. Inspection of the code shows that it can't ever have had any effect (even if it had been cleared from the correct field), because the field is never tested after it is cleared by the statement in question. The clear statement was added by commit 807a827d ("powerpc/eeh: Keep PE during hotplug"), but it didn't explain why it was necessary. Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sam Bobroff 提交于
[ Upstream commit bcbe3730531239abd45ab6c6af4a18078b37dd47 ] If a device is removed during EEH processing (either by a driver's handler or as part of recovery), it can lead to a null dereference in eeh_pe_report_edev(). To handle this, skip devices that have been removed. Signed-off-by: NSam Bobroff <sbobroff@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit e8e132e6885962582784b6fa16a80d07ea739c0f ] This will avoid auto-vectorisation when building with higher optimisation levels. We don't know if the machine can support VSX and even if it's present it's probably not going to be enabled at this point in boot. These flag were both added prior to GCC 4.6 which is the minimum compiler version supported by upstream, thanks to Segher for the details. Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Joel Stanley 提交于
[ Upstream commit 1a855eaccf353f7ed1d51a3d4b3af727ccbd81ca ] As of commit 10c77dba ("powerpc/boot: Fix build failure in 32-bit boot wrapper") the opal code is hidden behind CONFIG_PPC64_BOOT_WRAPPER, but the boot wrapper avoids include/linux, so it does not get the normal Kconfig flags. We can drop the guard entirely as in commit f8e8e69c ("powerpc/boot: Only build OPAL code when necessary") the makefile only includes opal.c in the build if CONFIG_PPC64_BOOT_WRAPPER is set. Fixes: 10c77dba ("powerpc/boot: Fix build failure in 32-bit boot wrapper") Signed-off-by: NJoel Stanley <joel@jms.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dan Carpenter 提交于
[ Upstream commit 014704e6f54189a203cc14c7c0bb411b940241bc ] The "count < sizeof(struct os_area_db)" comparison is type promoted to size_t so negative values of "count" are treated as very high values and we accidentally return success instead of a negative error code. This doesn't really change runtime much but it fixes a static checker warning. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NGeoff Levand <geoff@infradead.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
- 24 11月, 2019 2 次提交
-
-
由 Reinette Chatre 提交于
[ Upstream commit 52eb74339a6233c69f4e3794b69ea7c98eeeae1b ] rdt_find_domain() returns an ERR_PTR() that is generated from a provided domain id when the value is negative. Care needs to be taken when creating an ERR_PTR() from this value because a subsequent check using IS_ERR() expects the error to be within the MAX_ERRNO range. Using an invalid domain id as an ERR_PTR() does work at this time since this is currently always -1. Using this undocumented assumption is fragile since future users of rdt_find_domain() may not be aware of thus assumption. Two related issues are addressed: - Ensure that rdt_find_domain() always returns a valid error value by forcing the error to be -ENODEV when a negative domain id is provided. - In a few instances the return value of rdt_find_domain() is just checked for NULL - fix these to include a check of ERR_PTR. Fixes: d89b7379 ("x86/intel_rdt/cqm: Add mon_data") Fixes: 521348b011d6 ("x86/intel_rdt: Introduce utility to obtain CDP peer") Signed-off-by: NReinette Chatre <reinette.chatre@intel.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: fenghua.yu@intel.com Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/b88cd4ff6a75995bf8db9b0ea546908fe50f69f3.1544479852.git.reinette.chatre@intel.comSigned-off-by: NSasha Levin <sashal@kernel.org>
-
由 Michael Ellerman 提交于
[ Upstream commit b4d16ab58c41ff0125822464bdff074cebd0fe47 ] In the recent commit 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") we changed the way we initialise the decrementer clockevent(s). We no longer initialise the mult & shift values of decrementer_clockevent itself. This has the effect of breaking PR KVM, because it uses those values in kvmppc_emulate_dec(). The symptom is guest kernels spin forever mid-way through boot. For now fix it by assigning back to decrementer_clockevent the mult and shift values. Fixes: 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NSasha Levin <sashal@kernel.org>
-