- 21 Aug 2018, 1 commit
-
-
Submitted by Andy Lutomirski
Currently, if the vDSO ends up containing an indirect branch or call, GCC will emit the "external thunk" style of retpoline, and it will fail to link. Fix it by building the vDSO with inline retpoline thunks.

I haven't seen any reports of this triggering on an unpatched kernel.

Fixes: 76b04384 ("x86/retpoline: Add initial retpoline support")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Matt Rickard <matt@softrans.com.au>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Vas Dias <jason.vas.dias@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/c76538cd3afbe19c6246c2d1715bc6a60bd63985.1534448381.git.luto@kernel.org
-
- 18 Aug 2018, 1 commit
-
-
Submitted by Sean Christopherson
It turns out that we should *not* invert all not-present mappings, because the all-zeroes case is obviously special.

clear_page() does not undergo the XOR logic to invert the address bits, i.e. PTE, PMD and PUD entries that have not been individually written will have val=0 and so will trigger __pte_needs_invert(). As a result, {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok because the page at physical address 0 is reserved early in boot specifically to mitigate L1TF, so explicitly exempt them from the inversion when reading the PFN.

Manifested as an unexpected mprotect(..., PROT_NONE) failure when called on a VMA that has VM_PFNMAP and was mmap'd as something other than PROT_NONE but never used. mprotect() sends the PROT_NONE request down prot_none_walk(), which walks the PTEs to check the PFNs. prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns -EACCES because it thinks mprotect() is trying to adjust a high MMIO address.

[ This is a very modified version of Sean's original patch, but all credit goes to Sean for doing this and also pointing out that sometimes the __pte_needs_invert() function only gets the protection bits, not the full eventual pte. But zero remains special even in just protection bits, so that's ok. - Linus ]

Fixes: f22cc87f ("x86/speculation/l1tf: Invert all not present mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
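A minimal sketch of the resulting check, modeled on arch/x86/include/asm/pgtable-invert.h: exempting val == 0 means entries on clear_page()'d page-table pages keep a PFN of zero instead of being inverted.

    /* Sketch; the real helper lives in arch/x86/include/asm/pgtable-invert.h
     * and relies on _PAGE_PRESENT from asm/pgtable_types.h. */
    static inline bool __pte_needs_invert(u64 val)
    {
            /* A fully zeroed entry was never written, so it stays exempt. */
            return val && !(val & _PAGE_PRESENT);
    }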
-
- 17 Aug 2018, 1 commit
-
-
Submitted by Yannik Sembritzki
The split of .system_keyring into .builtin_trusted_keys and .secondary_trusted_keys broke kexec, thereby preventing kernels signed by keys which are now in the secondary keyring from being kexec'd. Fix this by passing VERIFY_USE_SECONDARY_KEYRING to verify_pefile_signature().

Fixes: d3bfe841 ("certs: Add a secondary system keyring that can be added to dynamically")
Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: kexec@lists.infradead.org
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
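A hedged sketch of the shape of the fix: the kexec PE verifier hands verify_pefile_signature() the VERIFY_USE_SECONDARY_KEYRING cookie (which the verification core resolves to the secondary keyring) instead of NULL, which meant builtin keys only.

    /* Sketch of the signature hook in arch/x86/kernel/kexec-bzimage64.c
     * after the fix. */
    static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
    {
            return verify_pefile_signature(kernel, kernel_len,
                                           VERIFY_USE_SECONDARY_KEYRING,
                                           VERIFYING_KEXEC_PE_SIGNATURE);
    }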
-
- 16 Aug 2018, 2 commits
-
-
Submitted by Guenter Roeck
i8259.h uses inb/outb and thus needs to include asm/io.h to avoid the following build error, as seen with x86_64:defconfig and CONFIG_SMP=n.

    In file included from drivers/rtc/rtc-cmos.c:45:0:
    arch/x86/include/asm/i8259.h: In function 'inb_pic':
    arch/x86/include/asm/i8259.h:32:24: error: implicit declaration of function 'inb'
    arch/x86/include/asm/i8259.h: In function 'outb_pic':
    arch/x86/include/asm/i8259.h:45:2: error: implicit declaration of function 'outb'

Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Suggested-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Fixes: 447ae316 ("x86: Don't include linux/irq.h from asm/hardirq.h")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
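The fix itself is a one-line include; a sketch of the top of the header after the change:

    /* arch/x86/include/asm/i8259.h (sketch) */
    #include <linux/delay.h>
    #include <asm/io.h>     /* declares inb()/outb(), used by inb_pic()/outb_pic() */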
-
Submitted by Guenter Roeck
allmodconfig with CONFIG_KVM_INTEL=n results in the following build error.

    ERROR: "l1tf_vmx_mitigation" [arch/x86/kvm/kvm.ko] undefined!

Fixes: 5b76a3cf ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry")
Reported-by: Meelis Roos <mroos@linux.ee>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 15 Aug 2018, 2 commits
-
-
Submitted by Vlastimil Babka
The function has an inline "return false;" definition with CONFIG_SMP=n, but the "real" definition is also visible, leading to a "redefinition of ‘apic_id_is_primary_thread’" compiler error. Guard it with #ifdef CONFIG_SMP.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 6a4d2657 ("x86/smp: Provide topology_is_primary_thread()")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
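A sketch of the guard, with the body modeled on arch/x86/kernel/apic/apic.c; the point is only that the out-of-line definition disappears when the !SMP inline stub is in effect.

    #ifdef CONFIG_SMP
    /* Out-of-line definition; !SMP builds see only the inline stub. */
    bool apic_id_is_primary_thread(unsigned int apicid)
    {
            u32 mask;

            if (smp_num_siblings == 1)
                    return true;
            /* Isolate the SMT bit(s) in the APIC ID and check for 0. */
            mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
            return !(apicid & mask);
    }
    #endif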
-
Submitted by Vlastimil Babka
The introduction of generic_max_swapfile_size and arch-specific versions has broken linking on x86 with CONFIG_SWAP=n due to an undefined reference to 'generic_max_swapfile_size'. Fix it by compiling the x86-specific max_swapfile_size() only with CONFIG_SWAP=y.

Reported-by: Tomas Pruzina <pruzinat@gmail.com>
Fixes: 377eeaa8 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
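A sketch of the guard in arch/x86/mm/init.c; the L1TF clamp is simplified here, since only the conditional compilation is the point of the fix.

    #ifdef CONFIG_SWAP
    unsigned long max_swapfile_size(void)
    {
            unsigned long pages = generic_max_swapfile_size();

            if (boot_cpu_has_bug(X86_BUG_L1TF)) {
                    /* Simplified: clamp the swap size to MAX_PA/2. */
                    pages = min_t(unsigned long long, pages, l1tf_pfn_limit());
            }
            return pages;
    }
    #endif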
-
- 13 Aug 2018, 9 commits
-
-
Submitted by Helge Deller
This patchset fixes and improves stack unwinding a lot:

 1. Show backward stack traces with up to 30 callsites
 2. Add callinfo to ENTRY_CFI() such that every assembler function will get an entry in the unwind table
 3. Use constants instead of numbers in call_on_stack()
 4. Do not depend on CONFIG_KALLSYMS to generate backtraces
 5. Speed up backtrace generation

Make sure you have this patch for GNU as installed: https://sourceware.org/ml/binutils/2018-07/msg00474.html Without this patch, unwind info in the kernel is often wrong for various functions.

Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by John David Anglin
Now that mb() is an instruction barrier, it will slow performance if we issue unnecessary barriers. The spinlock defines have a number of unnecessary barriers. The __ldcw() define is both a hardware and compiler barrier, so the mb() barriers in the routines using __ldcw() serve no purpose.

The only barrier needed is the one in arch_spin_unlock(): we need to ensure all accesses are complete prior to releasing the lock.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
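A hedged sketch of the resulting lock/unlock pair, modeled on arch/parisc/include/asm/spinlock.h: acquisition relies on the barrier implicit in __ldcw(), and only the release carries an explicit mb().

    static inline void arch_spin_lock(arch_spinlock_t *x)
    {
            volatile unsigned int *a = __ldcw_align(x);

            while (__ldcw(a) == 0)          /* __ldcw() is itself a barrier */
                    while (*a == 0)
                            cpu_relax();
    }

    static inline void arch_spin_unlock(arch_spinlock_t *x)
    {
            volatile unsigned int *a = __ldcw_align(x);

            mb();   /* complete all prior accesses before releasing */
            *a = 1;
    }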
-
Submitted by John David Anglin
Now that we use a sync prior to releasing the locks in syscall.S, we don't need the PA 2.0 ordered stores used to release some locks. Using an ordered store potentially slows the release and subsequent code. There are a number of other ordered stores and loads that serve no purpose; I have converted these to normal stores.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Nick Desaulniers
As part of the effort to reduce the code duplication between _THIS_IP_ and current_text_addr(), let's consolidate callers of current_text_addr() to use _THIS_IP_.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Helge Deller <deller@gmx.de>
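For reference, _THIS_IP_ is the generic, compiler-portable way to take the current instruction address; this is its definition from <linux/kernel.h>:

    /* Take the address of a local label via GCC's &&label extension. */
    #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })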
-
Submitted by Helge Deller
Some parts of the HAVE_REGS_AND_STACK_ACCESS_API feature are needed for the rseq syscall. This patch adds the most important parts, and as long as we don't support kprobes, we should be fine.

Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Helge Deller
parisc is the only Linux architecture which has defined its own value for ENOTSUP. All other architectures #define ENOTSUP as EOPNOTSUPP in their libc headers. Having a value for ENOTSUP which is different from EOPNOTSUPP often causes problems with userspace programs which expect both to be the same. One such example is a build error in the libuv package, as can be seen in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900237.

Since we dropped HP-UX support, there is no real benefit in keeping a separate value for ENOTSUP. This patch drops the parisc value for ENOTSUP from the kernel sources. glibc needs no patch; it reuses the exported headers.

Signed-off-by: Helge Deller <deller@gmx.de>
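The before/after convention, sketched (the old parisc value of 252 is taken from the historical uapi header and is an assumption worth double-checking):

    /* before: arch/parisc/include/uapi/asm/errno.h carried its own value */
    #define ENOTSUP         252             /* removed by this patch */

    /* after: userspace falls back to the convention used everywhere else */
    #define ENOTSUP         EOPNOTSUPP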
-
Submitted by Christoph Hellwig
Switch to the generic noncoherent direct mapping implementation. Fix sync_single_for_cpu to skip the cache flush unless the transfer is to the device, to match the more tested unmap_single path, which should have the same cache coherency implications.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Christoph Hellwig
Currently the S/G list based DMA ops use flush_kernel_vmap_range, which contains a few UP optimizations, while the rest of the DMA operations use flush_kernel_dcache_range. The single vs. S/G operations are supposed to have the same effect, so they should use the same routines. Use the more conservative version for now, but if people more familiar with parisc think the vmap version is generally fine for DMA, we should switch all interfaces over to it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Christoph Hellwig
The only difference is that pcxl supports DMA coherent allocations, while pcx only supports non-consistent allocations and otherwise fails. But dma_alloc* is not in the fast path, and merging these two allows an easy migration path to the generic dma-noncoherent implementation, so do it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Helge Deller <deller@gmx.de>
-
- 11 Aug 2018, 2 commits
-
-
Submitted by Andrey Ryabinin
With gcc-8, -fsanitize=null became very noisy. GCC started to complain about things like &a->b, where 'a' is a NULL pointer. There is no NULL dereference; we just calculate the address of a struct member. It's technically undefined behavior, so UBSAN is correct to report it. But as long as there is no real NULL dereference, I think we should be fine; the -fno-delete-null-pointer-checks compiler flag should protect us from any consequences. So let's just not use -fsanitize=null, as it's not useful for us. If there is a real NULL deref, we will see a crash. Even if userspace mapped something at NULL (root can do this), things like SMAP should catch the issue.

Link: http://lkml.kernel.org/r/20180802153209.813-1-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
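A minimal example of the pattern gcc-8's -fsanitize=null instruments: taking a member's address from a NULL base is undefined behavior even though nothing is loaded or stored.

    struct foo { int a; int b; };

    int *member_addr(struct foo *p)     /* p may be NULL at runtime */
    {
            /* No dereference happens here: this computes
             * NULL + offsetof(struct foo, b), which is what UBSAN flags. */
            return &p->b;
    }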
-
Submitted by Joerg Roedel
The user page-table gets the updated kernel mappings in pti_finalize(), which runs after the RO+X permissions got applied to the kernel page-table in mark_readonly(). But with CONFIG_DEBUG_WX enabled, the user page-table is already checked in mark_readonly() for insecure mappings. This causes false-positive warnings, because the user page-table did not get the updated mappings yet.

Move the W+X check for the user page-table into pti_finalize(), after it has updated all required mappings.

[ tglx: Folded !NX supported fix ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Waiman Long <llong@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "David H. Gutteridge" <dhgutteridge@sympatico.ca>
Cc: joro@8bytes.org
Link: https://lkml.kernel.org/r/1533727000-9172-1-git-send-email-joro@8bytes.org
-
- 10 Aug 2018, 3 commits
-
-
Submitted by Josh Poimboeuf
The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
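A sketch of the relaxed check, modeled on check_online_cpus() in arch/x86/kernel/cpu/microcode/core.c: rather than requiring every present CPU to be online, only the primary thread of each core must be.

    static int check_online_cpus(void)
    {
            unsigned int cpu;

            /*
             * It's fine for SMT to be disabled as long as all the
             * primary threads are still online.
             */
            for_each_present_cpu(cpu) {
                    if (topology_is_primary_thread(cpu) && !cpu_online(cpu)) {
                            pr_err("Not all CPUs online, aborting microcode update.\n");
                            return -EINVAL;
                    }
            }

            return 0;
    }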
-
Submitted by Paul Burton
Commit 33679a50 ("MIPS: uasm: Remove needless ISA abstraction") removed use of the MIPS_ISA preprocessor macro, but left a couple of unused definitions of it behind. Remove the dead code.

Signed-off-by: Paul Burton <paul.burton@mips.com>
-
Submitted by Joerg Roedel
This new symbol needs to be in the workaround list for buggy binutils, otherwise the build with gcc-4.6 fails.

Fixes: 39d668e0 ("x86/mm/pti: Make pti_clone_kernel_text() compile on 32 bit")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linux-Next Mailing List <linux-next@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180809094449.ddmnrkz7qkvo3j2x@suse.de
-
- 09 Aug 2018, 8 commits
-
-
Submitted by Michael Büsch
Use the standard WARN_ON instead. If a small kernel is desired, WARN_ON can be disabled globally. Also remove SSB_DEBUG; besides WARN_ON it only adds a tiny debug check. Include this check unconditionally.

Signed-off-by: Michael Buesch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
-
Submitted by Masahiro Yamada
host-progs has been kept as an alias of hostprogs-y for a long time (at least since the beginning of the Git era), with the clear prompt:

    Usage of host-progs is deprecated. Please replace with hostprogs-y!

Enough time for the migration has passed.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
-
Submitted by Dongjiu Geng
In order to remove the additional check before calling ghes_notify_sea(), add a stub definition for !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call ghes_notify_sea() and let the APEI driver handle the SEA notification.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
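A sketch of the stub pattern in include/acpi/ghes.h (the exact error value returned by the stub is an assumption):

    #ifdef CONFIG_ACPI_APEI_SEA
    int ghes_notify_sea(void);
    #else
    /* Callers can invoke this unconditionally; it is a no-op without APEI SEA. */
    static inline int ghes_notify_sea(void) { return -ENOENT; }
    #endif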
-
Submitted by Gerald Schaefer
Commit c9b5ad54 ("s390/mm: tag normal pages vs pages used in page tables") accidentally changed the logic in arch_set_page_states(), which is used by the suspend/resume code. set_page_stable(page, order) was changed to set_page_stable_dat(page, 0). After this, only the first page of higher-order pages will be set to stable, and a write to one of the unstable pages will result in an addressing exception. Fix this by using "order" again, instead of "0".

Fixes: c9b5ad54 ("s390/mm: tag normal pages vs pages used in page tables")
Cc: stable@vger.kernel.org # 4.14+
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
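The essence of the fix, sketched with the surrounding loop condensed from arch/s390/mm/page-states.c (shape of the loop is an assumption):

    /* Walk each free block and tag every one of its 1 << order pages. */
    list_for_each(l, &zone->free_area[order].free_list[t]) {
            page = list_entry(l, struct page, lru);
            if (make_stable)
                    set_page_stable_dat(page, order); /* was: ..., 0 */
            else
                    set_page_unused(page, order);
    }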
-
Submitted by Andi Kleen
The mmio tracer sets io mapping PTEs and PMDs to non-present when enabled, without inverting the address bits, which makes the PTE entry vulnerable to L1TF. Make it use the right low level macros to actually invert the address bits to protect against L1TF. In principle this could be avoided because MMIO tracing is not likely to be enabled on production machines, but the fix is straightforward and for consistency's sake it's better to get rid of the open coded PTE manipulation.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Submitted by John David Anglin
For years I thought all parisc machines executed loads and stores in order. However, Jeff Law recently indicated on gcc-patches that this is not correct. There are various degrees of out-of-order execution all the way back to the PA7xxx processor series (hit-under-miss). The PA8xxx series has full out-of-order execution for both integer operations and loads and stores. This is described in the following article: http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml

For this reason, we need to define mb() and insert a memory barrier before the store that unlocks spinlocks. This ensures that all memory accesses are complete prior to unlocking. The ldcw instruction performs the same function on entry.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
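A hedged sketch of the barrier definitions this implies, modeled on arch/parisc/include/asm/barrier.h: a full barrier built on the PA-RISC "sync" instruction.

    /* "sync" orders all prior memory accesses; also a compiler barrier. */
    #define synchronize_caches()    __asm__ __volatile__("sync" : : : "memory")

    #define mb()    do { synchronize_caches(); } while (0)
    #define rmb()   mb()
    #define wmb()   mb()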
-
Submitted by Helge Deller
Enable the -mlong-calls compiler option by default, because otherwise in most cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F relocations. This fixes building the 64-bit defconfig.

Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
-
Submitted by Paul Burton
In nlm_fmn_send() we have a loop which attempts to send a message multiple times in order to handle the transient failure condition of a lack of available credit. When examining the status register to detect the failure, we check for a condition that can never be true, which falls foul of gcc 8's -Wtautological-compare:

    In file included from arch/mips/netlogic/common/irq.c:65:
    ./arch/mips/include/asm/netlogic/xlr/fmn.h: In function 'nlm_fmn_send':
    ./arch/mips/include/asm/netlogic/xlr/fmn.h:304:22: error: bitwise comparison always evaluates to false [-Werror=tautological-compare]
       if ((status & 0x2) == 1)
                          ^~

If this condition were true, all the code in that path would do is print a message to the kernel console. Since failures seem somewhat expected here (making the console message questionable anyway) and the condition has clearly never evaluated true, we simply remove it, rather than attempting to fix it to check status correctly.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20174/
Cc: Ganesan Ramalingam <ganesanr@broadcom.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
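Why the comparison is tautological, shown in isolation: masking with 0x2 can only yield 0 or 2, so equality with 1 never holds.

    unsigned int status = 0x3;  /* any value */

    /* (status & 0x2) is either 0 or 2, never 1, so gcc 8 rejects the
     * comparison as always-false -- which is why the branch was deleted. */
    if ((status & 0x2) == 1)
            ;       /* unreachable */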
-
- 08 Aug 2018, 11 commits
-
-
Submitted by Gustavo A. R. Silva
Return statements in functions returning bool should use true or false instead of an integer value. This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Andi Kleen
set_memory_np() is used to mark kernel mappings not present, but it has its own open coded mechanism which does not have the L1TF protection of inverting the address bits. Replace the open coded PTE manipulation with the L1TF protecting low level PTE routines. Passes the CPA self test.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Submitted by Andi Kleen
Some cases in THP, like:

 - MADV_FREE
 - mprotect
 - split

mark the PMD temporarily non-present to prevent races. The window for an L1TF attack in these contexts is very small, but it wants to be fixed for correctness' sake. Use the proper low level functions for pmd/pud_mknotpresent() to address this.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
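A sketch of the inversion-aware helper, modeled on arch/x86/include/asm/pgtable.h after the fix: rebuilding the entry from its PFN routes it through the low level routines that apply the L1TF inversion, instead of clearing the present bit in place.

    static inline pmd_t pmd_mknotpresent(pmd_t pmd)
    {
            /* pfn_pmd() applies the address-bit inversion as needed. */
            return pfn_pmd(pmd_pfn(pmd),
                    __pgprot(pmd_flags(pmd) & ~(_PAGE_PRESENT | _PAGE_PROTNONE)));
    }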
-
Submitted by Andi Kleen
For kernel mappings, PAGE_PROTNONE is not necessarily set for a non-present mapping, but the inversion logic explicitly checks for !PRESENT and PROT_NONE. Remove the PROT_NONE check and make the inversion unconditional for all not-present mappings.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Submitted by Paul Burton
When building the VDSO with clang, it appears to invoke ld without specifying endianness, even though clang itself was provided with a -EB or -EL flag. This results in the build failing due to a mismatch between the objects that are the input to ld and the output it is attempting to create:

    VDSO    arch/mips/vdso/vdso.so.dbg.raw
    mips-linux-ld: arch/mips/vdso/elf.o: compiled for a big endian system and target is little endian
    mips-linux-ld: arch/mips/vdso/elf.o: endianness incompatible with that of the selected emulation
    mips-linux-ld: failed to merge target specific data of file arch/mips/vdso/elf.o
    ...

Work around this problem by explicitly specifying the link endianness using -Wl,-EB or -Wl,-EL when -EB or -EL are part of KBUILD_CFLAGS. This resolves the build failure when using clang and doesn't have any negative effect on gcc.

Signed-off-by: Paul Burton <paul.burton@mips.com>
-
Submitted by Paul Burton
When building using clang, always specify -EB or -EL in order to ensure we target the desired endianness. Since clang cross compiles using a single compiler build with multiple targets, our -dumpmachine tests, which don't specify clang's --target argument, check output based upon the build machine rather than the machine our build will target. This means our detection of whether to specify -EB fails miserably & we never do. Providing the endianness flag unconditionally for clang resolves this issue & simplifies the clang path somewhat.

Signed-off-by: Paul Burton <paul.burton@mips.com>
-
Submitted by Joerg Roedel
On 32 bit, the kernel sections are not huge-page aligned. When we clone them on PMD level we inevitably map some areas that are normal kernel memory and may contain secrets to user-space. To prevent that we need to clone the kernel image on PTE level for 32 bit.

Also make the page-table cloning code more general so that it can handle PMD and PTE level cloning. This can be generalized further in the future to also handle clones on the P4D level.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Waiman Long <llong@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "David H. Gutteridge" <dhgutteridge@sympatico.ca>
Cc: joro@8bytes.org
Link: https://lkml.kernel.org/r/1533637471-30953-4-git-send-email-joro@8bytes.org
-
Submitted by Joerg Roedel
The function sets the global bit on cloned PMD entries, which only makes sense when the permissions are identical between the user and the kernel page-table. Further, only write permissions are cleared for entry-text and kernel-text sections, which are not writeable at the end of the boot process.

The reason this RW clearing exists is that in the early PTI implementations the cloned kernel areas were set up during early boot, before the kernel text is set to read only, and not touched afterwards. This is no longer true: the cloned areas are still set up early to get the entry code working for interrupts and other things, but after the kernel text has been set RO the clone is repeated, which copies the RO PMDs/PTEs over to the user visible clone. That means the initial clearing of the writable bit can be avoided.

[ tglx: Amended changelog ]

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Waiman Long <llong@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "David H. Gutteridge" <dhgutteridge@sympatico.ca>
Cc: joro@8bytes.org
Link: https://lkml.kernel.org/r/1533637471-30953-3-git-send-email-joro@8bytes.org
-
Submitted by Peter Zijlstra
Nadav reported that on guests we're failing to rewrite the indirect calls to CALLEE_SAVE paravirt functions. In particular the pv_queued_spin_unlock() call is left unpatched and that is all over the place. This obviously wrecks Spectre-v2 mitigation (for paravirt guests), which relies on not actually having indirect calls around.

The reason is an incorrect clobber test in paravirt_patch_call(): this function rewrites an indirect call with a direct call to the _SAME_ function, so there is no possible way the clobbers can be different. Therefore remove this clobber check. Also put WARNs on the other patch failure case (not enough room for the instruction), which I've not seen trigger in my (limited) testing.

Three live kernel image disassemblies for lock_sock_nested (as a small function that illustrates the problem nicely). PRE is the current situation for guests, POST is with this patch applied and NATIVE is with or without the patch for !guests.

PRE:

    (gdb) disassemble lock_sock_nested
    Dump of assembler code for function lock_sock_nested:
    0xffffffff817be970 <+0>:  push %rbp
    0xffffffff817be971 <+1>:  mov %rdi,%rbp
    0xffffffff817be974 <+4>:  push %rbx
    0xffffffff817be975 <+5>:  lea 0x88(%rbp),%rbx
    0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
    0xffffffff817be981 <+17>: mov %rbx,%rdi
    0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
    0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
    0xffffffff817be98f <+31>: test %eax,%eax
    0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
    0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
    0xffffffff817be99d <+45>: mov %rbx,%rdi
    0xffffffff817be9a0 <+48>: callq *0xffffffff822299e8
    0xffffffff817be9a7 <+55>: pop %rbx
    0xffffffff817be9a8 <+56>: pop %rbp
    0xffffffff817be9a9 <+57>: mov $0x200,%esi
    0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
    0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip>
    0xffffffff817be9ba <+74>: mov %rbp,%rdi
    0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
    0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
    End of assembler dump.

POST:

    (gdb) disassemble lock_sock_nested
    Dump of assembler code for function lock_sock_nested:
    0xffffffff817be970 <+0>:  push %rbp
    0xffffffff817be971 <+1>:  mov %rdi,%rbp
    0xffffffff817be974 <+4>:  push %rbx
    0xffffffff817be975 <+5>:  lea 0x88(%rbp),%rbx
    0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
    0xffffffff817be981 <+17>: mov %rbx,%rdi
    0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
    0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
    0xffffffff817be98f <+31>: test %eax,%eax
    0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
    0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
    0xffffffff817be99d <+45>: mov %rbx,%rdi
    0xffffffff817be9a0 <+48>: callq 0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
    0xffffffff817be9a5 <+53>: xchg %ax,%ax
    0xffffffff817be9a7 <+55>: pop %rbx
    0xffffffff817be9a8 <+56>: pop %rbp
    0xffffffff817be9a9 <+57>: mov $0x200,%esi
    0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
    0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063aa0 <__local_bh_enable_ip>
    0xffffffff817be9ba <+74>: mov %rbp,%rdi
    0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
    0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
    End of assembler dump.

NATIVE:

    (gdb) disassemble lock_sock_nested
    Dump of assembler code for function lock_sock_nested:
    0xffffffff817be970 <+0>:  push %rbp
    0xffffffff817be971 <+1>:  mov %rdi,%rbp
    0xffffffff817be974 <+4>:  push %rbx
    0xffffffff817be975 <+5>:  lea 0x88(%rbp),%rbx
    0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
    0xffffffff817be981 <+17>: mov %rbx,%rdi
    0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
    0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
    0xffffffff817be98f <+31>: test %eax,%eax
    0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
    0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
    0xffffffff817be99d <+45>: mov %rbx,%rdi
    0xffffffff817be9a0 <+48>: movb $0x0,(%rdi)
    0xffffffff817be9a3 <+51>: nopl 0x0(%rax)
    0xffffffff817be9a7 <+55>: pop %rbx
    0xffffffff817be9a8 <+56>: pop %rbp
    0xffffffff817be9a9 <+57>: mov $0x200,%esi
    0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
    0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip>
    0xffffffff817be9ba <+74>: mov %rbp,%rdi
    0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
    0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
    End of assembler dump.

Fixes: 63f70270 ("[PATCH] i386: PARAVIRT: add common patching machinery")
Fixes: 3010a066 ("x86/paravirt, objtool: Annotate indirect calls")
Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
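A hedged sketch of paravirt_patch_call() after the change, condensed from arch/x86/kernel/paravirt.c (the clobber parameters are dropped here for brevity): the call is rewritten unconditionally, and a too-small patch site now warns instead of being silently left as an indirect call.

    struct branch { u8 opcode; u32 delta; } __attribute__((packed));

    unsigned paravirt_patch_call(void *insnbuf, const void *target,
                                 unsigned long addr, unsigned len)
    {
            struct branch *b = insnbuf;
            unsigned long delta = (unsigned long)target - (addr + 5);

            if (len < 5) {
                    WARN_ONCE(1, "Failing to patch indirect CALL in %ps\n",
                              (void *)addr);
                    return len;     /* call too long for patch site */
            }

            b->opcode = 0xe8;       /* direct call */
            b->delta = delta;

            return 5;
    }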
-
Submitted by Paul Burton
The code in __write_64bit_c0_split() is used by MIPS32 kernels running on MIPS64 CPUs to write a 64-bit value to a 64-bit coprocessor 0 register using a single 64-bit dmtc0 instruction. It does this by combining the 2x 32-bit registers used to hold the 64-bit value into a single register, which in the existing code involves three steps:

 1) Zero-extend register A, which holds bits 31:0 of our data, since it may have previously held a sign-extended value.
 2) Shift register B, which holds bits 63:32 of our data in bits 31:0, left by 32 bits, such that the bits forming our data are in the position they'll be in the final 64-bit value & bits 31:0 of the register are zero.
 3) Or the two registers together to form the 64-bit value in one 64-bit register.

From MIPS r2 onwards we have a dins instruction which can effectively perform all 3 of those steps using a single instruction. Add a path for MIPS r2 & beyond which uses dins to take bits 31:0 from register B & insert them into bits 63:32 of register A, giving us our full 64-bit value in register A with one instruction.

Since we know that MIPS r2 & above support the sel field for the dmtc0 instruction, we don't bother special casing sel==0. Omitting the sel field would assemble to exactly the same instruction as when we explicitly specify that it equals zero.

Signed-off-by: Paul Burton <paul.burton@mips.com>
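A conceptual before/after of the register-combining step, written as comments rather than the literal kernel macro (the register names are illustrative, and the zero-extend is shown as a shift pair):

    /* MIPS r1 and earlier -- the three steps:
     *   dsll   lo, lo, 32     # \
     *   dsrl   lo, lo, 32     # / 1) zero-extend the low half
     *   dsll   hi, hi, 32     #   2) move the high half into bits 63:32
     *   or     lo, lo, hi     #   3) combine into one 64-bit register
     *
     * MIPS r2 and later -- one instruction:
     *   dins   lo, hi, 32, 32 # insert hi[31:0] into lo[63:32]
     */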
-
Submitted by Paul Burton
Commit c22c8043 ("MIPS: Fix input modify in __write_64bit_c0_split()") modified the __write_64bit_c0_split() constraints such that we have both an input & an output which we hope to assign to the same registers, and modify the output rather than incorrectly clobbering an input. The way in which we use both an output & an input parameter with the input constrained to share the output registers is a little convoluted & also problematic for clang, which complains if the input & output values have different widths. For example:

    In file included from kernel/fork.c:98:
    ./arch/mips/include/asm/mmu_context.h:149:19: error: unsupported inline asm: input with type 'unsigned long' matching output with type 'unsigned long long'
            write_c0_entryhi(cpu_asid(cpu, next));
            ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
    ./arch/mips/include/asm/mmu_context.h:93:2: note: expanded from macro 'cpu_asid'
            (cpu_context((cpu), (mm)) & cpu_asid_mask(&cpu_data[cpu]))
            ^
    ./arch/mips/include/asm/mipsregs.h:1617:65: note: expanded from macro 'write_c0_entryhi'
    #define write_c0_entryhi(val)  __write_ulong_c0_register($10, 0, val)
                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
    ./arch/mips/include/asm/mipsregs.h:1430:39: note: expanded from macro '__write_ulong_c0_register'
            __write_64bit_c0_register(reg, sel, val);       \
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
    ./arch/mips/include/asm/mipsregs.h:1400:41: note: expanded from macro '__write_64bit_c0_register'
            __write_64bit_c0_split(register, sel, value);   \
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
    ./arch/mips/include/asm/mipsregs.h:1498:13: note: expanded from macro '__write_64bit_c0_split'
            : "r,0" (val));                                 \
              ^~~

We can both fix this build failure & simplify the code somewhat by assigning the __tmp variable with the input value in C prior to our inline assembly, and then using a single read-write output operand (i.e. a constraint beginning with +) to provide this value to our assembly.

Signed-off-by: Paul Burton <paul.burton@mips.com>
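A hedged sketch of the shape of the fix, not the literal macro: seed a 64-bit temporary in C, then give the asm a single "+r" read-write operand instead of an output with a matching-constraint input of a different width.

    do {
            /* Plain C assignment replaces the matched "r,0" input operand. */
            unsigned long long __tmp = (unsigned int)(val);

            __asm__ __volatile__(
                    /* ... combine halves via %L0/%M0 and dmtc0, as before ... */
                    ""
                    : "+r" (__tmp));
    } while (0);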
-