1. 29 December 2018 (6 commits)
    • x86/vdso: Pass --eh-frame-hdr to the linker · 56f7bfac
      Authored by Alistair Strachan
      commit cd01544a268ad8ee5b1dfe42c4393f1095f86879 upstream.
      
      Commit
      
        379d98dd ("x86: vdso: Use $LD instead of $CC to link")
      
      accidentally broke unwinding from userspace, because ld would strip the
      .eh_frame sections when linking.
      
      Originally, the compiler would implicitly add --eh-frame-hdr when
      invoking the linker, but when this Makefile was converted from invoking
      ld via the compiler, to invoking it directly (like vmlinux does),
      the flag was missed. (The EH_FRAME section is important for the VDSO
      shared libraries, but not for vmlinux.)
      
      Fix the problem by explicitly specifying --eh-frame-hdr, which restores
      parity with the old method.
      
      See relevant bug reports for additional info:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=201741
        https://bugzilla.redhat.com/show_bug.cgi?id=1659295
      
      Fixes: 379d98dd ("x86: vdso: Use $LD instead of $CC to link")
      Reported-by: Florian Weimer <fweimer@redhat.com>
      Reported-by: Carlos O'Donell <carlos@redhat.com>
      Reported-by: "H. J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: Alistair Strachan <astrachan@google.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Carlos O'Donell <carlos@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: kernel-team@android.com
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: X86 ML <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20181214223637.35954-1-astrachan@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mm: Fix decoy address handling vs 32-bit builds · 1e3b98b2
      Authored by Dan Williams
      commit 51c3fbd89d7554caa3290837604309f8d8669d99 upstream.
      
      A decoy address is used by set_mce_nospec() to update the cache attributes
      for a page that may contain poison (multi-bit ECC error) while attempting
      to minimize the possibility of triggering a speculative access to that
      page.
      
      When reserve_memtype() is handling a decoy address it needs to convert it
      to its real physical alias. The conversion, AND'ing with __PHYSICAL_MASK,
      is broken for a 32-bit physical mask and reserve_memtype() is passed the
      last physical page. Gert reports triggering the:
      
          BUG_ON(start >= end);
      
      ...assertion when running a 32-bit non-PAE build on a platform that has
      a driver resource at the top of physical memory:
      
          BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved
      
      Given that the decoy address scheme is only targeted at 64-bit builds and
      assumes that the top of physical address space is free for use as a decoy
      address range, simply bypass address sanitization in the 32-bit case.
      
      Lastly, there was no need to crash the system when this failure occurred,
      and no need to crash future systems if the assumptions of decoy addresses
      are ever violated. Change the BUG_ON() to a WARN() with an error return.
      
      Fixes: 510ee090 ("x86/mm/pat: Prepare {reserve, free}_memtype() for...")
      Reported-by: Gert Robben <t2@gert.gr>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Gert Robben <t2@gert.gr>
      Cc: stable@vger.kernel.org
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: platform-driver-x86@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/154454337985.789277.12133288391664677775.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mtrr: Don't copy uninitialized gentry fields back to userspace · c623326a
      Authored by Colin Ian King
      commit 32043fa065b51e0b1433e48d118821c71b5cd65d upstream.
      
      Currently the copy_to_user() of data in the gentry struct copies
      uninitialized data in the _pad field from the stack to userspace.
      
      Fix this by explicitly memset'ing gentry to zero; this also zeroes any
      compiler-added padding fields that may be in the struct (currently there
      are none).
      
      Detected by CoverityScan, CID#200783 ("Uninitialized scalar variable")
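      The fix boils down to zeroing the local struct before it is filled in
      and copied out; a sketch:

        struct mtrr_gentry gentry;

        /* clear _pad and any compiler-added padding before copy_to_user() */
        memset(&gentry, 0, sizeof(gentry));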
      
      Fixes: b263b31e ("x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
      Cc: security@kernel.org
      Link: https://lkml.kernel.org/r/20181218172956.1440-1-colin.king@canonical.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: Fix UAF in nested posted interrupt processing · 1972ca04
      Authored by Cfir Cohen
      commit c2dd5146e9fe1f22c77c1b011adf84eea0245806 upstream.
      
      nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It
      caches the kmap()ed page object and pointer, however, it doesn't handle
      errors correctly: it's possible to cache a valid pointer, then release
      the page and later dereference the dangling pointer.
      
      I was able to reproduce with the following steps:
      
      1. Call vmlaunch with valid posted_intr_desc_addr but an invalid
      MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed
      pi_desc_page and pi_desc. Later the invalid EFER value fails
      check_vmentry_postreqs() which fails the first vmlaunch.
      
      2. Call vmlaunch with a valid EFER but an invalid posted_intr_desc_addr
      (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages
      pi_desc_page is unmapped and released and pi_desc_page is set to NULL
      (the "shouldn't happen" clause). Due to the invalid
      posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and
      nested_get_vmcs12_pages() returns. It doesn't return an error value so
      vmlaunch proceeds. Note that at this time we have a dangling pointer in
      vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs.
      
      3. Issue an IPI in L2 guest code. This triggers a call to
      vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which
      dereferences the dangling pointer.
      
      Vulnerable code requires nested and enable_apicv variables to be set to
      true. The host CPU must also support posted interrupts.
      
      Fixes: 5e2f30b7 "KVM: nVMX: get rid of nested_get_page()"
      Cc: stable@vger.kernel.org
      Reviewed-by: Andy Honig <ahonig@google.com>
      Signed-off-by: Cfir Cohen <cfir@google.com>
      Reviewed-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs · 229468c6
      Authored by Eduardo Habkost
      commit 0e1b869fff60c81b510c2d00602d778f8f59dd9a upstream.
      
      Some guest OSes (including Windows 10) write to MSR 0xc001102c
      in some cases (possibly while trying to apply a CPU erratum).
      Make KVM ignore reads and writes to that MSR, so the guest won't
      crash.
      
      The MSR is documented as "Execution Unit Configuration (EX_CFG)",
      at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family
      15h Models 00h-0Fh Processors".
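      A sketch of the resulting handling (the constant name MSR_F15H_EX_CFG is
      an assumption here; the value is the documented 0xc001102c):

        #define MSR_F15H_EX_CFG 0xc001102c      /* assumed name for the EX_CFG MSR */

        /* kvm_set_msr_common(): silently accept writes */
        case MSR_F15H_EX_CFG:
                break;

        /* kvm_get_msr_common(): reads return 0 */
        case MSR_F15H_EX_CFG:
                msr_info->data = 0;
                break;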
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: X86: Fix NULL deref in vcpu_scan_ioapic · 76281d12
      Authored by Wanpeng Li
      commit dcbd3e49c2f0b2c2d8a321507ff8f3de4af76d7c upstream.
      
      Reported by syzkaller:
      
          CPU: 1 PID: 5962 Comm: syz-executor118 Not tainted 4.20.0-rc6+ #374
          Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
          RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline]
          RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline]
          RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline]
          RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline]
          RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074
          Call Trace:
      	 kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596
      	 vfs_ioctl fs/ioctl.c:46 [inline]
      	 file_ioctl fs/ioctl.c:509 [inline]
      	 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
      	 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
      	 __do_sys_ioctl fs/ioctl.c:720 [inline]
      	 __se_sys_ioctl fs/ioctl.c:718 [inline]
      	 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
      	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the testcase writes the Hyper-V SynIC HV_X64_MSR_SINT14
      MSR and triggers the scan-ioapic logic to load SynIC vectors into the EOI
      exit bitmap. However, the irqchip is not initialized by this simple
      testcase, so the ioapic/apic objects should not be accessed.
      
      This patch fixes it by also considering whether or not apic is present.
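      A sketch of the added guard (the exact helper may differ;
      kvm_apic_present() is assumed here):

        static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
        {
                /* no in-kernel local APIC: nothing to scan, and the
                 * apic/ioapic objects must not be dereferenced */
                if (!kvm_apic_present(vcpu))
                        return;

                /* ... existing EOI exit bitmap scanning ... */
        }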
      
      Reported-by: syzbot+39810e6c400efadfef71@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 21 December 2018 (6 commits)
    • ARM: 8816/1: dma-mapping: fix potential uninitialized return · 80eaec9b
      Authored by Nathan Jones
      [ Upstream commit c2a3831df6dc164af66d8d86cf356a90c021b86f ]
      
      While trying to use the dma_mmap_*() interface, it was noticed that this
      interface returns strange values when passed an incorrect length.
      
      If neither of the if() statements fire then the return value is
      uninitialized. In the worst case it returns 0 which means the caller
      will think the function succeeded.
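      The shape of the fix, as a hedged sketch (helper names below are
      hypothetical; only the error-default pattern matters):

        int ret = -ENXIO;       /* error by default; was previously uninitialized */

        if (range_is_valid(vma, size))          /* hypothetical check */
                ret = map_the_range(vma);       /* hypothetical mapping step */

        return ret;     /* 0 only if one of the branches actually succeeded */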
      
      Fixes: 1655cf88 ("ARM: dma-mapping: Remove traces of NOMMU code")
      Signed-off-by: Nathan Jones <nathanj439@gmail.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8815/1: V7M: align v7m_dma_inv_range() with v7 counterpart · 5cb96671
      Authored by Vladimir Murzin
      [ Upstream commit 3d0358d0ba048c5afb1385787aaec8fa5ad78fcc ]
      
      Chris has discovered and reported that v7_dma_inv_range() may corrupt
      memory if the address range is not aligned to the cache line size.
      
      Since the whole of cache-v7m.S was lifted from cache-v7.S, the same
      observation applies to v7m_dma_inv_range(). So the fix simply mirrors
      what has been done for v7, with a few M-class specifics.
      
      Cc: Chris Cole <chris@sageembedded.com>
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling · b3d52556
      Authored by Chris Cole
      [ Upstream commit a1208f6a822ac29933e772ef1f637c5d67838da9 ]
      
      This patch addresses possible memory corruption when
      v7_dma_inv_range(start_address, end_address) address parameters are not
      aligned to whole cache lines. This function issues "invalidate" cache
      management operations to all cache lines from start_address (inclusive)
      to end_address (exclusive). When start_address and/or end_address are
      not aligned, the start and/or end cache lines are first issued "clean &
      invalidate" operation. The assumption is this is done to ensure that any
      dirty data addresses outside the address range (but part of the first or
      last cache lines) are cleaned/flushed so that data is not lost, which
      could happen if just an invalidate is issued.
      
      The problem is that these first/last partial cache lines are issued
      "clean & invalidate" and then "invalidate". This second "invalidate" is
      not required and worse can cause "lost" writes to addresses outside the
      address range but part of the cache line. If another component writes to
      its part of the cache line between the "clean & invalidate" and
      "invalidate" operations, the write can get lost. This fix is to remove
      the extra "invalidate" operation when unaligned addresses are used.
      
      A kernel module is available that has a stress test to reproduce the
      issue and a unit test of the updated v7_dma_inv_range(). It can be
      downloaded from
      http://ftp.sageembedded.com/outgoing/linux/cache-test-20181107.tgz.
      
      v7_dma_inv_range() is called by dmac_[un]map_area(addr, len, direction)
      when the direction is DMA_FROM_DEVICE. One can (I believe) successfully
      argue that DMA from a device to main memory should use buffers aligned
      to cache line size, because the "clean & invalidate" might overwrite
      data that the device just wrote using DMA. But if a driver does use
      unaligned buffers, at least this fix will prevent memory corruption
      outside the buffer.
      Signed-off-by: Chris Cole <chris@sageembedded.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARC: io.h: Implement reads{x}()/writes{x}() · bf69dc3c
      Authored by Jose Abreu
      [ Upstream commit 10d443431dc2bb733cf7add99b453e3fb9047a2e ]
      
      Some ARC CPUs do not support unaligned loads/stores. Currently, the generic
      implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
      This can lead to malfunction of some drivers, as the generic functions do a
      plain dereference of a pointer that can be unaligned.
      
      Let's use the {get/put}_unaligned() helpers instead of a plain pointer
      dereference in order to fix this. The helpers allow getting and storing
      data at an unaligned address whilst respecting the CPU's internal alignment
      requirements. According to [1], the use of these helpers is costly in terms
      of performance, so we add an initial check for an already-aligned buffer so
      that the helpers can be avoided when possible.
      
      [1] Documentation/unaligned-memory-access.txt
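      A simplified sketch of the 32-bit read variant (the actual patch generates
      all reads{b,w,l}()/writes{b,w,l}() from a macro; names here are
      illustrative):

        static inline void example_readsl(const volatile void __iomem *addr,
                                          void *ptr, unsigned int count)
        {
                bool is_aligned = ((unsigned long)ptr % 4) == 0;
                u32 *buf = ptr;

                while (count--) {
                        u32 x = __raw_readl(addr);

                        if (is_aligned) {
                                *buf++ = x;             /* fast path */
                        } else {
                                put_unaligned(x, buf);  /* byte-safe store */
                                buf++;
                        }
                }
        }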
      
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Joao Pinto <jpinto@synopsys.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Tested-by: Vitor Soares <soares@synopsys.com>
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/earlyprintk/efi: Fix infinite loop on some screen widths · 985dea32
      Authored by YiFei Zhu
      [ Upstream commit 79c2206d369b87b19ac29cb47601059b6bf5c291 ]
      
      An affected screen resolution is 1366 x 768, whose width is not
      divisible by 8, the default font width. On such screens, when longer
      lines are earlyprintk'ed, the overflow-to-next-line can never trigger,
      because the left-most x-coordinate of the next character is always less
      than the screen width. Earlyprintk then loops forever trying to print
      the rest of the string, unable to do so because the line is already
      full.
      
      This patch makes the trigger consider the right-most x-coordinate,
      instead of the left-most, as the value to compare against the screen
      width threshold.
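      In code terms the wrap test becomes something like the following sketch
      (variable names are illustrative):

        /* wrap if the right edge of the next glyph would pass the screen edge */
        if (cur_x + font_width > screen_width) {
                cur_x = 0;
                cur_y += font_height;
        }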
      Signed-off-by: YiFei Zhu <zhuyifei1999@gmail.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Snowberg <eric.snowberg@oracle.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-12-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • locking/qspinlock, x86: Provide liveness guarantee · 26586875
      Authored by Peter Zijlstra
      commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.
      
      On x86 we cannot do fetch_or() with a single instruction and thus end up
      using a cmpxchg loop, which reduces determinism. Replace the fetch_or()
      with a composite operation: tas-pending + load.
      
      Using two instructions of course opens a window we previously did not
      have. Consider the scenario:
      
      	CPU0		CPU1		CPU2
      
       1)	lock
      	  trylock -> (0,0,1)
      
       2)			lock
      			  trylock /* fail */
      
       3)	unlock -> (0,0,0)
      
       4)					lock
      					  trylock -> (0,0,1)
      
       5)			  tas-pending -> (0,1,1)
      			  load-val <- (0,1,0) from 3
      
       6)			  clear-pending-set-locked -> (0,0,1)
      
      			  FAIL: _2_ owners
      
      where 5) is our new composite operation. When we consider each part of
      the qspinlock state as a separate variable (as we can when
      _Q_PENDING_BITS == 8) then the above is entirely possible, because
      tas-pending will only RmW the pending byte, so the later load is able
      to observe prior tail and lock state (but not earlier than its own
      trylock, which operates on the whole word, due to coherence).
      
      To avoid this we need 2 things:
      
       - the load must come after the tas-pending (obviously, otherwise it
         can trivially observe prior state).
      
       - the tas-pending must be a full-word RmW instruction; it cannot be an
         XCHGB, for example, so that we cannot observe other state prior to
         setting pending.
      
      On x86 we can realize this by using "LOCK BTS m32, r32" for
      tas-pending followed by a regular load.
      
      Note that observing later state is not a problem:
      
       - if we fail to observe a later unlock, we'll simply spin-wait for
         that store to become visible.
      
       - if we observe a later xchg_tail(), there is no difference from that
         xchg_tail() having taken place before the tas-pending.
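      A sketch of the resulting x86 helper, close to the mainline version (the
      4.19 backport reworks the GEN_BINARY_RMWcc plumbing, so treat this as
      illustrative):

        static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
        {
                u32 val;

                /* 1) full-word atomic RMW: LOCK BTS on the pending bit */
                asm volatile (LOCK_PREFIX "btsl %2, %1"
                              : "=@ccc" (val), "+m" (lock->val.counter)
                              : "Ir" (_Q_PENDING_OFFSET)
                              : "memory");
                val *= _Q_PENDING_VAL;

                /* 2) plain load of the remaining tail/locked state; it cannot
                 *    observe state older than our own full-word RMW above */
                val |= atomic_read(&lock->val) & ~_Q_PENDING_MASK;

                return val;
        }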
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: andrea.parri@amarulasolutions.com
      Cc: longman@redhat.com
      Fixes: 59fb586b ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
      Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      [bigeasy: GEN_BINARY_RMWcc macro redo]
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  3. 20 December 2018 (7 commits)
  4. 17 December 2018 (13 commits)
    • Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" · a9d79a07
      Authored by Igor Druzhinin
      [ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ]
      
      This reverts commit b3cf8528.
      
      That commit unintentionally broke Xen balloon memory hotplug with
      "hotplug_unpopulated" set to 1. Since the "System RAM" resource
      got assigned under a new "Unusable memory" resource in the IO/Mem tree,
      any attempt to online this memory would fail due to general kernel
      restrictions on having "System RAM" resources at the 1st level only.
      
      The original issue that the reverted commit tried to work around,
      fa564ad9 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f,
      30-3f, 60-7f)"), was also addressed by the follow-up 03a55173 ("x86/PCI:
      Move and shrink AMD 64-bit window to avoid conflict"), which made the
      original fix to Xen ballooning unnecessary.
      Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • x86/kvm/vmx: fix old-style function declaration · bf1b47f3
      Authored by Yi Wang
      [ Upstream commit 1e4329ee2c52692ea42cc677fb2133519718b34a ]
      
      An inline keyword that is not at the beginning of the function
      declaration may trigger the following build warnings, so let's fix it:
      
      arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
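      The warning is purely about declaration order; a before/after
      illustration with a hypothetical helper:

        /* before: triggers -Wold-style-declaration ('inline' after the type) */
        static void inline vmx_example_helper(struct kvm_vcpu *vcpu) { }

        /* after: storage-class specifier and 'inline' come first */
        static inline void vmx_example_helper(struct kvm_vcpu *vcpu) { }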
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: x86: fix empty-body warnings · d6b1692d
      Authored by Yi Wang
      [ Upstream commit 354cb410d87314e2eda344feea84809e4261570a ]
      
      We get the following warnings about empty statements when building
      with 'W=1':
      
      arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      
      Rework the debug helper macro to get rid of these warnings.
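      The usual rework is to make the stub expand to a real statement; a sketch
      assuming the helper is the apic_debug() macro:

        /* before: expands to nothing, so "if (x) apic_debug(...);" is empty */
        #define apic_debug(fmt, arg...)

        /* after: a no-op statement, safe inside an unbraced if/else */
        #define apic_debug(fmt, arg...) do {} while (0)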
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: VMX: Update shared MSRs to be saved/restored on MSR_EFER.LMA changes · 3c7670d5
      Authored by Liran Alon
      [ Upstream commit f48b4711dd6e1cf282f9dfd159c14a305909c97c ]
      
      When guest transitions from/to long-mode by modifying MSR_EFER.LMA,
      the list of shared MSRs to be saved/restored on guest<->host
      transitions is updated (See vmx_set_efer() call to setup_msrs()).
      
      On every entry to guest, vcpu_enter_guest() calls
      vmx_prepare_switch_to_guest(). This function should also take care
      of setting the shared MSRs to be saved/restored. However, the
      function does nothing in case we are already running with loaded
      guest state (vmx->loaded_cpu_state != NULL).
      
      This means that even when guest modifies MSR_EFER.LMA which results
      in updating the list of shared MSRs, it isn't being taken into account
      by vmx_prepare_switch_to_guest() because it happens while we are
      running with loaded guest state.
      
      To fix the above-mentioned issue, add a flag to mark that the list of
      shared MSRs has been updated, and modify vmx_prepare_switch_to_guest()
      to set shared MSRs when running with host state *OR* when the list of
      shared MSRs has been updated.
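      A sketch of that approach (field and array names below are assumed, and
      the function is abbreviated):

        /* struct vcpu_vmx gains a flag, set whenever setup_msrs() runs */
        bool guest_msrs_dirty;

        /* vmx_prepare_switch_to_guest(), simplified */
        if (!vmx->loaded_cpu_state || vmx->guest_msrs_dirty) {
                vmx->guest_msrs_dirty = false;
                for (i = 0; i < vmx->save_nmsrs; ++i)
                        kvm_set_shared_msr(vmx->guest_msrs[i].index,
                                           vmx->guest_msrs[i].data,
                                           vmx->guest_msrs[i].mask);
        }
        if (vmx->loaded_cpu_state)
                return;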
      
      Note that this issue was mistakenly introduced by commit
      678e315e ("KVM: vmx: add dedicated utility to access guest's
      kernel_gs_base") because previously vmx_set_efer() always called
      vmx_load_host_state(), which caused vmx_prepare_switch_to_guest() to
      set the shared MSRs.
      
      Fixes: 678e315e ("KVM: vmx: add dedicated utility to access guest's kernel_gs_base")
      Reported-by: Eyal Moscovici <eyal.moscovici@oracle.com>
      Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
      Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: at91: sama5d2: use the divided clock for SMC · b3159470
      Authored by Romain Izard
      [ Upstream commit 4ab7ca092c3c7ac8b16aa28eba723a8868f82f14 ]
      
      The SAMA5D2 is different from SAMA5D3 and SAMA5D4, as there are two
      different clocks for the peripherals in the SoC. The Static Memory
      controller is connected to the divided master clock.
      
      Unfortunately, the device tree does not correctly show this and uses the
      master clock directly. This clock is then used by the code for the NAND
      controller to calculate the timings for the controller, and we end up with
      slow NAND Flash access.
      
      Fix the device tree, and the performance of Flash access is improved.
      Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • s390/cpum_cf: Reject request for sampling in event initialization · d70a6605
      Authored by Thomas Richter
      [ Upstream commit 613a41b0d16e617f46776a93b975a1eeea96417c ]
      
      On s390 the command 'perf top' fails:
      [root@s35lp76 perf] # ./perf top -F100000  --stdio
         Error:
         cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
         	Try 'perf stat'
      [root@s35lp76 perf] #
      
      Using event -e rb0000 works as designed.  Event rb0000 is the event
      number of the sampling facility for basic sampling.
      
      During system start up the following PMUs are installed in the kernel's
      PMU list (from head to tail):
         cpum_cf --> s390 PMU counter facility device driver
         cpum_sf --> s390 PMU sampling facility device driver
         uprobe
         kprobe
         tracepoint
         task_clock
         cpu_clock
      
      Perf top executes following functions and calls perf_event_open(2) system
      call with different parameters many times:
      
      cmd_top
      --> __cmd_top
          --> perf_evlist__add_default
              --> __perf_evlist__add_default
                  --> perf_evlist__new_cycles (creates event type:0 (HW)
      			    		config 0 (CPU_CYCLES)
      	        --> perf_event_attr__set_max_precise_ip
      		    Uses perf_event_open(2) to detect correct
      		    precise_ip level. Fails 3 times on s390 which is ok.
      
      Then functions cmd_top
      --> __cmd_top
          --> perf_top__start_counters
              -->perf_evlist__config
      	   --> perf_can_comm_exec
                     --> perf_probe_api
      	           This functions test support for the following events:
      		   "cycles:u", "instructions:u", "cpu-clock:u" using
      		   --> perf_do_probe_api
      		       --> perf_event_open_cloexec
      		           Test the close on exec flag support with
      			   perf_event_open(2).
      	               perf_do_probe_api returns true if the event is
      		       supported.
      		       The function returns true because event cpu-clock is
      		       supported by the PMU cpu_clock.
      	               This is achieved by many calls to perf_event_open(2).
      
      Function perf_top__start_counters now calls perf_evsel__open() for every
      event, which is the default event cpu_cycles (config:0) of type HARDWARE
      (type:0) with a predefined frequency of 4000.
      
      Given the above order of the PMU list, the PMU cpum_cf gets called first
      and returns 0, which indicates support for this sampling. The event is
      fully allocated in the function perf_event_open() (file kernel/events/core.c
      near line 10521) and the following check fails:
      
              event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
      		                 NULL, NULL, cgroup_fd);
      	if (IS_ERR(event)) {
      		err = PTR_ERR(event);
      		goto err_cred;
      	}
      
              if (is_sampling_event(event)) {
      		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
      			err = -EOPNOTSUPP;
      			goto err_alloc;
      		}
      	}
      
      The check for the interrupt capabilities fails and the system call
      perf_event_open() returns -EOPNOTSUPP (-95).
      
      Add a check to return -ENODEV when sampling is requested in PMU cpum_cf.
      This allows common kernel code in the perf_event_open() system call to
      test the next PMU in above list.
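      The added check is small; a sketch (assuming it lands in
      cpumf_pmu_event_init()):

        /* the counter facility cannot sample: return -ENODEV so the core
         * retries the next PMU (cpum_sf) instead of failing the syscall */
        if (is_sampling_event(event))
                return -ENODEV;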
      
      Fixes: 97b1198f ("s390, perf: Use common PMU interrupt disabled code")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: dts: sdm845-mtp: Reserve reserved gpios · 1eb8dd51
      Authored by Bjorn Andersson
      [ Upstream commit 5f8d3ab136d0ccb59c4d628d8f85e0d8f2761d07 ]
      
      With the introduction of commit 3edfb7bd76bd ("gpiolib: Show correct
      direction from the beginning") the gpiolib will attempt to read the
      direction of all pins, which triggers a read from protected register
      regions.
      
      The pins 0 through 3 and 81 through 84 are protected, so mark these as
      reserved.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Reviewed-by: Stephen Boyd <sboyd@kernel.org>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Andy Gross <andy.gross@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: OMAP1: ams-delta: Fix possible use of uninitialized field · 136c5237
      Authored by Janusz Krzysztofik
      [ Upstream commit cec83ff1241ec98113a19385ea9e9cfa9aa4125b ]
      
      While playing with the initialization order of the modem device, it has
      been discovered that under some circumstances (early console init, I
      believe) its .pm() callback may be called before the
      uart_port->private_data pointer is initialized from
      plat_serial8250_port->private_data, resulting in a NULL pointer
      dereference.  Fix it by checking for an uninitialized pointer before
      using it in modem_pm().
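      A sketch of the guard (struct and function shape abbreviated; the type
      name is assumed):

        static void modem_pm(struct uart_port *port, unsigned int state,
                             unsigned int old)
        {
                struct modem_private_data *priv = port->private_data;

                /* the callback may run before private_data is populated from
                 * plat_serial8250_port; bail out instead of dereferencing NULL */
                if (!priv)
                        return;

                /* ... regulator handling ... */
        }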
      
      Fixes: aabf3173 ("ARM: OMAP1: ams-delta: update the modem to use regulator API")
      Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: am3517-som: Fix WL127x Wifi interrupt · 28f3050b
      Authored by Adam Ford
      [ Upstream commit 419b194cdedc76d0d3cd5b0900db0fa8177c4a52 ]
      
      At the same time the AM3517 EVM was gaining WiFi support,
      separate patches were introduced to move the interrupt
      from HIGH to RISING.  Because they overlapped, this was not
      done for the AM3517-EVM.  This patch fixes kernel 4.19+.
      
      Fixes: 6bf5e341 ("ARM: dts: am3517-som: Add WL127x Wifi")
      Signed-off-by: Adam Ford <aford173@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: logicpd-somlv: Fix interrupt on mmc3_dat1 · 9f7df2a3
      Authored by Adam Ford
      [ Upstream commit 3d8b804bc528d3720ec0c39c212af92dafaf6e84 ]
      
      The interrupt on mmc3_dat1 is wrong, which prevents this from
      appearing in /proc/interrupts.
      
      Fixes: ab8dd3ae ("ARM: DTS: Add minimal Support for Logic PD
      DM3730 SOM-LV") #Kernel 4.9+
      Signed-off-by: Adam Ford <aford173@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: LogicPD Torpedo: Fix mmc3_dat1 interrupt · 09372f3c
      Authored by Adam Ford
      [ Upstream commit 6809564d64ff1301d44c13334cc0e8369e825a20 ]
      
      When the Torpedo was first introduced back in kernel 4.2,
      the extended interrupt flag was set incorrectly.
      
      It was subsequently moved, so this patch corrects kernel
      4.18+.
      
      Fixes: a3886730 ("ARM: dts: Move move WiFi bindings to
      logicpd-torpedo-37xx-devkit") # v4.18+
      Signed-off-by: Adam Ford <aford173@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: am3517: Fix pinmuxing for CD on MMC1 · 886e00c5
      Authored by Adam Ford
      [ Upstream commit e7f4ffffa972db4820c23ff9831a6a4f3f6d8cc5 ]
      
      The MMC1 card detect is active low, not active high.  For some reason,
      this worked with different combinations of U-Boot and kernels,
      but it's supposed to be active low and is currently broken.
      
      Fixes: cfaa856a ("ARM: dts: am3517: Add pinmuxing, CD and
      WP for MMC1") #kernel 4.18+
      Signed-off-by: Adam Ford <aford173@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: OMAP2+: prm44xx: Fix section annotation on omap44xx_prm_enable_io_wakeup · 5a8fbba7
      Authored by Nathan Chancellor
      [ Upstream commit eef3dc34a1e0b01d53328b88c25237bcc7323777 ]
      
      When building the kernel with Clang, the following section mismatch
      warning appears:
      
      WARNING: vmlinux.o(.text+0x38b3c): Section mismatch in reference from
      the function omap44xx_prm_late_init() to the function
      .init.text:omap44xx_prm_enable_io_wakeup()
      The function omap44xx_prm_late_init() references
      the function __init omap44xx_prm_enable_io_wakeup().
      This is often because omap44xx_prm_late_init lacks a __init
      annotation or the annotation of omap44xx_prm_enable_io_wakeup is wrong.
      
      Remove the __init annotation from omap44xx_prm_enable_io_wakeup so there
      is no more mismatch.
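      The change amounts to dropping the annotation; illustrative only
      (signature abbreviated):

        /* before: placed in .init.text, yet called from omap44xx_prm_late_init() */
        static void __init omap44xx_prm_enable_io_wakeup(void);

        /* after: regular .text, safe to reference from non-__init code */
        static void omap44xx_prm_enable_io_wakeup(void);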
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  5. 13 December 2018 (8 commits)
    • x86/efi: Allocate e820 buffer before calling efi_exit_boot_service · e88ebc06
      Authored by Eric Snowberg
      commit b84a64fad40637b1c9fa4f4dbf847a23e29e672b upstream.
      
      The following commit:
      
        d6493401 ("x86/efi: Use efi_exit_boot_services()")
      
      introduced a regression on systems with large memory maps, causing them
      to hang on boot. The first "goto get_map" that was removed from
      exit_boot() ensured there was enough room for the memory map when
      efi_call_early(exit_boot_services) was called. This happens when
      (nr_desc > ARRAY_SIZE(params->e820_table)).
      
      Chain of events:
      
        exit_boot()
          efi_exit_boot_services()
            efi_get_memory_map                  <- at this point the mm can't grow over 8 desc
            priv_func()
              exit_boot_func()
                allocate_e820ext()              <- new mm grows over 8 desc from e820 alloc
            efi_call_early(exit_boot_services)  <- mm key doesn't match so retry
            efi_call_early(get_memory_map)      <- not enough room for new mm
            system hangs
      
      This patch allocates the e820 buffer before calling efi_exit_boot_services()
      and fixes the regression.
      
       [ mingo: minor cleanliness edits. ]
      Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
      Cc: Bhupesh Sharma <bhsharma@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jon Hunter <jonathanh@nvidia.com>
      Cc: Julien Thierry <julien.thierry@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181129171230.18699-2-ard.biesheuvel@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction · ce74d11a
      Authored by Masami Hiramatsu
      
      commit 43a1b0cb4cd6dbfd3cd9c10da663368394d299d8 upstream.
      
      After copy_optimized_instructions() copies several instructions
      to the working buffer it tries to fix up the real RIP address, but it
      adjusts the RIP-relative instruction with an incorrect RIP address
      for the 2nd and subsequent instructions due to a bug in the logic.
      
      This will break the kernel pretty badly (with likely outcomes such as
      a kernel freeze, a crash, or worse) because probed instructions can refer
      to the wrong data.
      
      For example putting kprobes on cpumask_next() typically hits this bug.
      
      cpumask_next() is normally like below if CONFIG_CPUMASK_OFFSTACK=y
      (in this case nr_cpumask_bits is an alias of nr_cpu_ids):
      
       <cpumask_next>:
      	48 89 f0		mov    %rsi,%rax
      	8b 35 7b fb e2 00	mov    0xe2fb7b(%rip),%esi # ffffffff82db9e64 <nr_cpu_ids>
      	55			push   %rbp
      ...
      
      If we put a kprobe on it and it gets jump-optimized, it gets
      patched by the kprobes code like this:
      
       <cpumask_next>:
      	e9 95 7d 07 1e		jmpq   0xffffffffa000207a
      	7b fb			jnp    0xffffffff81f8a2e2 <cpumask_next+2>
      	e2 00			loop   0xffffffff81f8a2e9 <cpumask_next+9>
      	55			push   %rbp
      
      This shows that the first two MOV instructions were copied to a
      trampoline buffer at 0xffffffffa000207a.
      
      Here is the disassembled result of the trampoline, skipping
      the optprobe template instructions:
      
      	# Dump of assembly code from 0xffffffffa000207a to 0xffffffffa00020ea:
      
      	54			push   %rsp
      	...
      	48 83 c4 08		add    $0x8,%rsp
      	9d			popfq
      	48 89 f0		mov    %rsi,%rax
      	8b 35 82 7d db e2	mov    -0x1d24827e(%rip),%esi # 0xffffffff82db9e67 <nr_cpu_ids+3>
      
      This dump shows that the second MOV accesses *(nr_cpu_ids+3) instead of
      the original *nr_cpu_ids. This leads to a kernel freeze because
      cpumask_next() always returns 0 and for_each_cpu() never ends.
      
      Fix this by adding 'len' correctly to the real RIP address while
      copying.
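      Conceptually, the copy loop has to fix up each instruction against the
      address it will actually execute at; a hedged sketch (simplified from
      arch/x86/kernel/kprobes/opt.c, not the literal diff):

        int len = 0, ret;

        while (len < RELATIVEJUMP_SIZE) {
                /* pass real + len, the RIP of *this* instruction in the
                 * trampoline, rather than the start of the buffer */
                ret = __copy_instruction(dest + len, src + len, real + len, &insn);
                if (!ret)
                        return -EINVAL;
                len += ret;
        }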
      
      [ mingo: Improved the changelog. ]
      Reported-by: Michael Rodin <michael@rodin.online>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org # v4.15+
      Fixes: 63fef14f ("kprobes/x86: Make insn buffer always ROX and use text_poke()")
      Link: http://lkml.kernel.org/r/153504457253.22602.1314289671019919596.stgit@devbox
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
    • Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved" · bd5d1c27
      Authored by Masayoshi Mizuma
      [ Upstream commit 9fd61bc95130d4971568b89c9548b5e0a4e18e0e ]
      
      commit 124049de ("x86/e820: put !E820_TYPE_RAM regions into
      memblock.reserved") breaks the movable_node kernel option because it
      changed the memory gap range to a reserved memblock.  So, the node is
      marked as a Normal zone even if the SRAT has hot-pluggable affinity.
      
          =====================================================================
          kernel: BIOS-e820: [mem 0x0000180000000000-0x0000180fffffffff] usable
          kernel: BIOS-e820: [mem 0x00001c0000000000-0x00001c0fffffffff] usable
          ...
          kernel: reserved[0x12]#11[0x0000181000000000-0x00001bffffffffff], 0x000003f000000000 bytes flags: 0x0
          ...
          kernel: ACPI: SRAT: Node 2 PXM 6 [mem 0x180000000000-0x1bffffffffff] hotplug
          kernel: ACPI: SRAT: Node 3 PXM 7 [mem 0x1c0000000000-0x1fffffffffff] hotplug
          ...
          kernel: Movable zone start for each node
          kernel:  Node 3: 0x00001c0000000000
          kernel: Early memory node ranges
          ...
          =====================================================================
      
      The original issue is fixed by the former patches, so let's revert commit
      124049de ("x86/e820: put !E820_TYPE_RAM regions into
      memblock.reserved").
      
      Link: http://lkml.kernel.org/r/20181002143821.5112-4-msys.mizuma@gmail.com
      Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oscar Salvador <osalvador@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: dts: rockchip: remove vdd_log from rock960 to fix a stability issues · e92cc52e
      Authored by Sasha Levin
      [ Upstream commit 13682e524167cbd7e2a26c5e91bec765f0f96273 ]
      
      When the performance governor is set as default, the rock960 hangs
      around one minute after booting, whatever the activity is (idle, key
      pressed, loaded, ...).
      
      Based on the commit log found at https://patchwork.kernel.org/patch/10092377/
      
      "vdd_log has no consumer and therefore will not be set to a specific
      voltage. Still the PWM output pin gets configured and thence the vdd_log
      output voltage will changed from it's default. Depending on the idle
      state of the PWM this will slightly over or undervoltage the logic supply
      of the RK3399 and cause instability with GbE (undervoltage) and PCIe
      (overvoltage). Since the default value set by a voltage divider is the
      correct supply voltage and we don't need to change it during runtime we
      remove the rail from the devicetree completely so the PWM pin will not
      be configured."
      
      After removing the vdd-log from the rock960-specific DT, the board
      no longer hangs and shows stable behavior.
      
      Apply the same change for the rock960 by removing the vdd-log from the
      DT.
      
      Fixes: 874846f1 ("arm64: dts: rockchip: add 96boards RK3399 Ficus board")
      Cc: stable@vger.kernel.org
      Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      [sl: adjust filename for 4.19]
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8806/1: kprobes: Fix false positive with FORTIFY_SOURCE · 3fe0c68a
      Authored by Kees Cook
      commit e46daee53bb50bde38805f1823a182979724c229 upstream.
      
      The arm compiler internally interprets an inline assembly label
      as an unsigned long value, not a pointer. As a result, under
      CONFIG_FORTIFY_SOURCE, the address of a label has a size of 4 bytes,
      which was tripping the runtime checks. Instead, we can just cast the label
      (as done with the size calculations earlier).
      
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1639397
      Reported-by: William Cohen <wcohen@redhat.com>
      Fixes: 6974f0c4 ("include/linux/string.h: add the option of fortified string.h functions")
      Cc: stable@vger.kernel.org
      Acked-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: William Cohen <wcohen@redhat.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: hibernate: Avoid sending cross-calling with interrupts disabled · 1d9ca566
      Authored by Will Deacon
      commit b4aecf78083d8c6424657c1746c7c3de6e61669f upstream.
      
      Since commit 3b8c9f1c ("arm64: IPI each CPU after invalidating the
      I-cache for kernel mappings"), a call to flush_icache_range() will use
      an IPI to cross-call other online CPUs so that any stale instructions
      are flushed from their pipelines. This triggers a WARN during the
      hibernation resume path, where flush_icache_range() is called with
      interrupts disabled and is therefore prone to deadlock:
      
        | Disabling non-boot CPUs ...
        | CPU1: shutdown
        | psci: CPU1 killed.
        | CPU2: shutdown
        | psci: CPU2 killed.
        | CPU3: shutdown
        | psci: CPU3 killed.
        | WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
        | Modules linked in:
        | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1
      
      Since all secondary CPUs have been taken offline prior to invalidating
      the I-cache, there's actually no need for an IPI and we can simply call
      __flush_icache_range() instead.
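      The fix is essentially a one-line substitution in the resume path; a
      sketch:

        /* before: flush_icache_range() IPIs other CPUs, deadlock-prone here */
        /* after: the local-only variant is enough, all secondaries are offline */
        __flush_icache_range(start, end);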
      
      Cc: <stable@vger.kernel.org>
      Fixes: 3b8c9f1c ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
      Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
      Tested-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • parisc: Enable -ffunction-sections for modules on 32-bit kernel · 6c9ac388
      Authored by Helge Deller
      commit 1e8249b8a4e960018e4baca6b523b8a4500af600 upstream.
      
      Frank Schreiner reported that since kernel 4.18 he sees sysfs warnings
      when loading modules on a 32-bit kernel. Here is one such example:
      
       sysfs: cannot create duplicate filename '/module/nfs/sections/.text'
       CPU: 0 PID: 98 Comm: modprobe Not tainted 4.18.0-2-parisc #1 Debian 4.18.10-2
       Backtrace:
        [<1017ce2c>] show_stack+0x3c/0x50
        [<107a7210>] dump_stack+0x28/0x38
        [<103f900c>] sysfs_warn_dup+0x88/0xac
        [<103f8b1c>] sysfs_add_file_mode_ns+0x164/0x1d0
        [<103f9e70>] internal_create_group+0x11c/0x304
        [<103fa0a0>] sysfs_create_group+0x48/0x60
        [<1022abe8>] load_module.constprop.35+0x1f9c/0x23b8
        [<1022b278>] sys_finit_module+0xd0/0x11c
        [<101831dc>] syscall_exit+0x0/0x14
      
      This warning is triggered by the fact that, due to commit 24b6c225
      ("parisc: Build kernel without -ffunction-sections"), we now get multiple
      .text sections in kernel modules, for which sysfs_create_group() can't
      create multiple virtual files.
      
      This patch works around the problem by re-enabling the -ffunction-sections
      compiler option for modules, while keeping it disabled for the non-module
      kernel code.
      Reported-by: Frank Scheiner <frank.scheiner@web.de>
      Fixes: 24b6c225 ("parisc: Build kernel without -ffunction-sections")
      Cc: <stable@vger.kernel.org> # v4.18+
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • riscv: fix warning in arch/riscv/include/asm/module.h · d69eb98a
      Authored by David Abdurachmanov
      [ Upstream commit 0138ebb90c633f76bc71617f8f23635ce41c84fd ]
      
      Fixes the warning: 'struct module' declared inside parameter list will not
      be visible outside of this definition or declaration.
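      The usual fix for this class of warning is a forward declaration ahead of
      the prototypes that take a struct module pointer; a sketch of what the
      header ends up with:

        /* make the type name visible at file scope, so "struct module *" in
         * the prototypes below refers to the real kernel-wide type */
        struct module;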
      Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
      Acked-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>