提交 · 6fdc6dd90272ce7e75d744f71535cfbd8d77da81 · openanolis / cloud-kernel

11 4月, 2017 2 次提交

x86/vdso: Plug race between mapping and ELF header setup · 6fdc6dd9

由 Thomas Gleixner 提交于 4月 10, 2017

The vsyscall32 sysctl can racy against a concurrent fork when it switches
from disabled to enabled:

    arch_setup_additional_pages()
	if (vdso32_enabled)
           --> No mapping
                                        sysctl.vsysscall32()
                                          --> vdso32_enabled = true
    create_elf_tables()
      ARCH_DLINFO_IA32
        if (vdso32_enabled) {
           --> Add VDSO entry with NULL pointer

Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for
the newly forked process or not.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6fdc6dd9

x86/vdso: Ensure vdso32_enabled gets set to valid values only · c06989da

由 Mathias Krause 提交于 4月 10, 2017

vdso_enabled can be set to arbitrary integer values via the kernel command
line 'vdso32=' parameter or via 'sysctl abi.vsyscall32'.

load_vdso32() only maps VDSO if vdso_enabled == 1, but ARCH_DLINFO_IA32
merily checks for vdso_enabled != 0. As a consequence the AT_SYSINFO_EHDR
auxiliary vector for the VDSO_ENTRY is emitted with a NULL pointer which
causes a segfault when the application tries to use the VDSO.

Restrict the valid arguments on the command line and the sysctl to 0 and 1.

Fixes: b0b49f26 ("x86, vdso: Remove compat vdso support")
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Roland McGrath <roland@redhat.com>
Link: http://lkml.kernel.org/r/1491424561-7187-1-git-send-email-minipli@googlemail.com
Link: http://lkml.kernel.org/r/20170410151723.518412863@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c06989da

05 4月, 2017 1 次提交

x86/signals: Fix lower/upper bound reporting in compat siginfo · cfac6dfa

由 Joerg Roedel 提交于 4月 04, 2017

Put the right values from the original siginfo into the
userspace compat-siginfo.

This fixes the 32-bit MPX "tabletest" testcase on 64-bit kernels.
Signed-off-by: NJoerg Roedel <jroedel@suse.de>
Acked-by: NDave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.8+
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a4455082 ('x86/signals: Add missing signal_compat code for x86 features')
Link: http://lkml.kernel.org/r/1491322501-5054-1-git-send-email-joro@8bytes.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

cfac6dfa

03 4月, 2017 1 次提交

nios2: reserve boot memory for device tree · 921d701e

由 Tobias Klauser 提交于 4月 02, 2017

Make sure to reserve the boot memory for the flattened device tree.
Otherwise it might get overwritten, e.g. when initial_boot_params is
copied, leading to a corrupted FDT and a boot hang/crash:

  bootconsole [early0] enabled
  Early console on uart16650 initialized at 0xf8001600
  OF: fdt: Error -11 processing FDT
  Kernel panic - not syncing: setup_cpuinfo: No CPU found in devicetree!

  ---[ end Kernel panic - not syncing: setup_cpuinfo: No CPU found in devicetree!

Guenter Roeck says:

> I think I found the problem. In unflatten_and_copy_device_tree(), with added
> debug information:
>
> OF: fdt: initial_boot_params=c861e400, dt=c861f000 size=28874 (0x70ca)
>
> ... and then initial_boot_params is copied to dt, which results in corrupted
> fdt since the memory overlaps. Looks like the initial_boot_params memory
> is not reserved and (re-)allocated by early_init_dt_alloc_memory_arch().

Cc: stable@vger.kernel.org
Reported-by: NGuenter Roeck <linux@roeck-us.net>
Reference: http://lkml.kernel.org/r/20170226210338.GA19476@roeck-us.netTested-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
Acked-by: NLey Foon Tan <ley.foon.tan@intel.com>

921d701e

01 4月, 2017 4 次提交

kasan: do not sanitize kexec purgatory · 13a6798e

由 Mike Galbraith 提交于 3月 31, 2017

Fixes this:

  kexec: Undefined symbol: __asan_load8_noabort
  kexec-bzImage64: Loading purgatory failed

Link: http://lkml.kernel.org/r/1489672155.4458.7.camel@gmx.deSigned-off-by: NMike Galbraith <efault@gmx.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

13a6798e

mm: fix section name for .data..ro_after_init · 906f2a51

由 Kees Cook 提交于 3月 31, 2017

A section name for .data..ro_after_init was added by both:

    commit d07a980c ("s390: add proper __ro_after_init support")

and

    commit d7c19b06 ("mm: kmemleak: scan .data.ro_after_init")

The latter adds incorrect wrapping around the existing s390 section, and
came later.  I'd prefer the s390 naming, so this moves the s390-specific
name up to the asm-generic/sections.h and renames the section as used by
kmemleak (and in the future, kernel/extable.c).

Link: http://lkml.kernel.org/r/20170327192213.GA129375@beastSigned-off-by: NKees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390 parts]
Acked-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Cc: Eddie Kovsky <ewk@edkovsky.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

906f2a51

M
xtensa: wire up statx system call · 1493aa65
由 Max Filippov 提交于 3月 31, 2017
```
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
```
1493aa65

xtensa: fix stack dump output · e640cc30

由 Max Filippov 提交于 3月 31, 2017

Use %pB in pr_cont format string instead of calling print_symbol
separately. It turns

  [   19.166249] Call Trace:
  [   19.167265]  [<a000e50a>]
  [   19.167843]  __warn+0x92/0xa0
  [   19.169656]  [<a000e554>]
  [   19.170059]  warn_slowpath_fmt+0x3c/0x40
  [   19.171934]  [<a02d5bd8>]

back into

  [   18.123240] Call Trace:
  [   18.125039]  [<a000e4f6>] __warn+0x92/0xa0
  [   18.126961]  [<a000e540>] warn_slowpath_fmt+0x3c/0x40
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

e640cc30

31 3月, 2017 8 次提交

x86/boot: Include missing header file · 6b1cc946

由 Zhengyi Shen 提交于 3月 29, 2017

Sparse complains about missing forward declarations:

arch/x86/boot/compressed/error.c:8:6:
	warning: symbol 'warn' was not declared. Should it be static?
arch/x86/boot/compressed/error.c:15:6:
	warning: symbol 'error' was not declared. Should it be static?

Include the missing header file.
Signed-off-by: NZhengyi Shen <shenzhengyi@gmail.com>
Acked-by: NKess Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1490770820-24472-1-git-send-email-shenzhengyi@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6b1cc946

x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs · 29f72ce3

由 Yazen Ghannam 提交于 3月 30, 2017

MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.

Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.

Tested on AMD Fam15h and Fam17h systems.

  WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
  kobject: (ffff88085bb256c0): attempted to be registered with empty name!
  ...
  Call Trace:
   kobject_add_internal
   kobject_add
   kobject_create_and_add
   threshold_create_device
   threshold_init_device
Signed-off-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

29f72ce3

ARC: fix build warnings with !CONFIG_KPROBES · 4c6fabda

由 Vineet Gupta 提交于 3月 30, 2017

|   CC      lib/nmi_backtrace.o
| In file included from ../include/linux/kprobes.h:43:0,
|                  from ../lib/nmi_backtrace.c:17:
| ../arch/arc/include/asm/kprobes.h:57:13: warning: 'trap_is_kprobe' defined but not used [-Wunused-function]
|  static void trap_is_kprobe(unsigned long address, struct pt_regs *regs)
|              ^~~~~~~~~~~~~~

The warning started with 7d134b2c ("kprobes: move kprobe declarations
to asm-generic/kprobes.h") which started including <asm/kprobes.h>
unconditionally into <linux/kprobes.h> exposing a stub function for
!CONFIG_KPROBES to rest of world. Fix that by making the stub a macro
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

4c6fabda

ARCv2: SLC: Make sure busy bit is set properly on SLC flushing · c70c4733

由 Alexey Brodkin 提交于 3月 29, 2017

As reported in STAR 9001165532, an SLC control reg read (for checking
busy state) right after SLC invalidate command may incorrectly return
NOT busy causing software to NOT spin-wait while operation is underway.
(and for some reason this only happens if L1 cache is also disabled - as
required by IOC programming model)

Suggested workaround is to do an additional Control Reg read, which
ensures the 2nd read gets the right status.

Cc: stable@vger.kernel.org  #4.10
Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com>
[vgupta: reworte changelog a bit]
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

c70c4733

xtensa: make __pa work with uncached KSEG addresses · 2b83878d

由 Max Filippov 提交于 3月 29, 2017

When __pa is applied to virtual address in uncached KSEG region the
result is incorrect. Fix it by checking if the original address is in
the uncached KSEG and adjusting the result. It looks better than masking
off bits because pfn_valid would correctly work with new __pa results
and it may be made working in noMMU case, once we get definition for
uncached memory view.

This is required for the dma_common_mmap and DMA debug code to work
correctly: they both indirectly use __pa with coherent DMA addresses.
In case of DMA debug the visible effect is false reports that an address
mapped for DMA is accessed by CPU.

Cc: stable@vger.kernel.org
Tested-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

2b83878d

arm64: drop non-existing vdso-offsets.h from .gitignore · 9b3403ae

由 Masahiro Yamada 提交于 3月 31, 2017

Since commit a66649da ("arm64: fix vdso-offsets.h dependency"),
include/generated/vdso-offsets.h is directly generated without
arch/arm64/kernel/vdso/vdso-offsets.h.
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

9b3403ae

arm64: remove redundant header file in current.h · 34d04f25

由 Shaokun Zhang 提交于 3月 30, 2017

Commint 9d84fb27 ("arm64: restore get_current() optimisation") has
removed read_sysreg() and asm/sysreg.h is redundant.

This patch removes asm/sysreg.h header file.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NShaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

34d04f25

arm64: fix NULL dereference in have_cpu_die() · 335d2c2d

由 Mark Salter 提交于 3月 24, 2017

Commit 5c492c3f ("arm64: smp: Add function to determine if cpus are
stuck in the kernel") added a helper function to determine if die() is
supported in cpu_ops. This function assumes a cpu will have a valid
cpu_ops entry, but that may not be the case for cpu0 is spin-table or
parking protocol is used to boot secondary cpus. In that case, there
is a NULL dereference if have_cpu_die() is called by cpu0. So add a
check for a valid cpu_ops before dereferencing it.

Fixes: 5c492c3f ("arm64: smp: Add function to determine if cpus are stuck in the kernel")
Signed-off-by: NMark Salter <msalter@redhat.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

335d2c2d

30 3月, 2017 5 次提交

x86/build: Mostly disable '-maccumulate-outgoing-args' · 3f135e57

由 Josh Poimboeuf 提交于 3月 16, 2017

The GCC '-maccumulate-outgoing-args' flag is enabled for most configs,
mostly because of issues which are no longer relevant.  For most
configs, and with most recent versions of GCC, it's no longer needed.

Clarify which cases need it, and only enable it for those cases.  Also
produce a compile-time error for the ftrace graph + mcount + '-Os' case,
which will otherwise cause runtime failures.

The main benefit of '-maccumulate-outgoing-args' is that it prevents an
ugly prologue for functions which have aligned stacks.  But removing the
option also has some benefits: more readable argument saves, smaller
text size, and (presumably) slightly improved performance.

Here are the object size savings for 32-bit and 64-bit defconfig
kernels:

      text	   data	    bss	     dec	    hex	filename
  10006710	3543328	1773568	15323606	 e9d1d6	vmlinux.x86-32.before
   9706358	3547424	1773568	15027350	 e54c96	vmlinux.x86-32.after

      text	   data	    bss	     dec	    hex	filename
  10652105	4537576	 843776	16033457	 f4a6b1	vmlinux.x86-64.before
  10639629	4537576	 843776	16020981	 f475f5	vmlinux.x86-64.after

That comes out to a 3% text size improvement on x86-32 and a 0.1% text
size improvement on x86-64.
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170316193133.zrj6gug53766m6nn@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>

3f135e57

s390/uaccess: get_user() should zero on failure (again) · d09c5373

由 Heiko Carstens 提交于 3月 27, 2017

Commit fd2d2b19 ("s390: get_user() should zero on failure")
intended to fix s390's get_user() implementation which did not zero
the target operand if the read from user space faulted. Unfortunately
the patch has no effect: the corresponding inline assembly specifies
that the operand is only written to ("=") and the previous value is
discarded.

Therefore the compiler is free to and actually does omit the zero
initialization.

To fix this simply change the contraint modifier to "+", so the
compiler cannot omit the initialization anymore.

Fixes: c9ca7841 ("s390/uaccess: provide inline variants of get_user/put_user")
Fixes: fd2d2b19 ("s390: get_user() should zero on failure")
Cc: stable@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

d09c5373

parisc: Avoid stalled CPU warnings after system shutdown · 476e75a4

由 Helge Deller 提交于 3月 29, 2017

Commit 73580dac ("parisc: Fix system shutdown halt") introduced an endless
loop for systems which don't provide a software power off function.  But the
soft lockup detector will detect this and report stalled CPUs after some time.
Avoid those unwanted warnings by disabling the soft lockup detector.

Fixes: 73580dac ("parisc: Fix system shutdown halt")
Signed-off-by: NHelge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.9+

476e75a4

parisc: Clean up fixup routines for get_user()/put_user() · d19f5e41

由 Helge Deller 提交于 3月 25, 2017

Al Viro noticed that userspace accesses via get_user()/put_user() can be
simplified a lot with regard to usage of the exception handling.

This patch implements a fixup routine for get_user() and put_user() in such
that the exception handler will automatically load -EFAULT into the register
%r8 (the error value) in case on a fault on userspace.  Additionally the fixup
routine will zero the target register on fault in case of a get_user() call.
The target register is extracted out of the faulting assembly instruction.

This patch brings a few benefits over the old implementation:
1. Exception handling gets much cleaner, easier and smaller in size.
2. Helper functions like fixup_get_user_skip_1 (all of fixup.S) can be dropped.
3. No need to hardcode %r9 as target register for get_user() any longer. This
   helps the compiler register allocator and thus creates less assembler
   statements.
4. No dependency on the exception_data contents any longer.
5. Nested faults will be handled cleanly.
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: NHelge Deller <deller@gmx.de>

d19f5e41

parisc: Fix access fault handling in pa_memcpy() · 554bfece

由 Helge Deller 提交于 3月 29, 2017

pa_memcpy() is the major memcpy implementation in the parisc kernel which is
used to do any kind of userspace/kernel memory copies.

Al Viro noticed various bugs in the implementation of pa_mempcy(), most notably
that in case of faults it may report back to have copied more bytes than it
actually did.

Fixing those bugs is quite hard in the C-implementation, because the compiler
is messing around with the registers and we are not guaranteed that specific
variables are always in the same processor registers. This makes proper fault
handling complicated.

This patch implements pa_memcpy() in assembler. That way we have correct fault
handling and adding a 64-bit copy routine was quite easy.

Runtime tested with 32- and 64bit kernels.
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
Signed-off-by: NHelge Deller <deller@gmx.de>

554bfece

29 3月, 2017 7 次提交

sparc/ptrace: Preserve previous registers for short regset write · d3805c54

由 Dave Martin 提交于 3月 27, 2017

Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d3805c54

mips/ptrace: Preserve previous registers for short regset write · d614fd58

由 Dave Martin 提交于 3月 27, 2017

Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d614fd58

metag/ptrace: Reject partial NT_METAG_RPIPE writes · 7195ee31

由 Dave Martin 提交于 3月 27, 2017

It's not clear what behaviour is sensible when doing partial write of
NT_METAG_RPIPE, so just don't bother.

This patch assumes that userspace will never rely on a partial SETREGSET
in this case, since it's not clear what should happen anyway.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Acked-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7195ee31

metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS · 5fe81fe9

由 Dave Martin 提交于 3月 27, 2017

Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill TXSTATUS, a well-defined default value is used, based on the
task's current value.
Suggested-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5fe81fe9

metag/ptrace: Preserve previous registers for short regset write · a78ce80d

由 Dave Martin 提交于 3月 27, 2017

Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Acked-by: NJames Hogan <james.hogan@imgtec.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a78ce80d

h8300/ptrace: Fix incorrect register transfer count · 502585c7

由 Dave Martin 提交于 3月 27, 2017

regs_set() and regs_get() are vulnerable to an off-by-1 buffer overrun
if CONFIG_CPU_H8S is set, since this adds an extra entry to
register_offset[] but not to user_regs_struct.

So, iterate over user_regs_struct based on its actual size, not based on
the length of register_offset[].
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

502585c7

c6x/ptrace: Remove useless PTRACE_SETREGSET implementation · fb411b83

由 Dave Martin 提交于 3月 27, 2017

gpr_set won't work correctly and can never have been tested, and the
correct behaviour is not clear due to the endianness-dependent task
layout.

So, just remove it.  The core code will now return -EOPNOTSUPPORT when
trying to set NT_PRSTATUS on this architecture until/unless a correct
implementation is supplied.
Signed-off-by: NDave Martin <Dave.Martin@arm.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fb411b83

28 3月, 2017 3 次提交

KVM: x86: cleanup the page tracking SRCU instance · 2beb6dad

由 Paolo Bonzini 提交于 3月 27, 2017

SRCU uses a delayed work item.  Skip cleaning it up, and
the result is use-after-free in the work item callbacks.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Suggested-by: NDmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 0eb05bf2Reviewed-by: NXiao Guangrong <xiaoguangrong.eric@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

2beb6dad

KVM: nVMX: fix nested EPT detection · 7ad658b6

由 Ladi Prosek 提交于 3月 23, 2017

The nested_ept_enabled flag introduced in commit 7ca29de2 was not
computed correctly. We are interested only in L1's EPT state, not the
the combined L0+L1 value.

In particular, if L0 uses EPT but L1 does not, nested_ept_enabled must
be false to make sure that PDPSTRs are loaded based on CR3 as usual,
because the special case described in 26.3.2.4 Loading Page-Directory-
Pointer-Table Entries does not apply.

Fixes: 7ca29de2 ("KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT")
Cc: qemu-stable@nongnu.org
Reported-by: NWanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Signed-off-by: NLadi Prosek <lprosek@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7ad658b6

x86/mce: Don't print MCEs when mcelog is active · cc66afea

由 Andi Kleen 提交于 3月 27, 2017

Since:

  cd9c57ca ("x86/MCE: Dump MCE to dmesg if no consumers")

all MCEs are printed even when mcelog is running. Fix the regression to
not print to dmesg when mcelog is running as it is a consumer too.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
[ Massage commit message. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.10..
Fixes: cd9c57ca ("x86/MCE: Dump MCE to dmesg if no consumers")
Link: http://lkml.kernel.org/r/20170327093304.10683-2-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

cc66afea

25 3月, 2017 1 次提交

ARC: vdk: Fix support of UIO · ae9955ae

由 Alexey Brodkin 提交于 3月 23, 2017

MotherBoard section has its "ranges" set to 0xE000_0000-0xF000_0000.
But UIO node maps 4 different areas in different memory locations
and all outside MB's ranges.

That obviously breaks UIO mappings in runtime.

Cc: Ruud Derwig <rderwig@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: NAlexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

ae9955ae

24 3月, 2017 7 次提交

x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization · a46f60d7

由 Baoquan He 提交于 3月 24, 2017

Currently KASLR is enabled on three regions: the direct mapping of physical
memory, vamlloc and vmemmap. However the EFI region is also mistakenly
included for VA space randomization because of misusing EFI_VA_START macro
and assuming EFI_VA_START < EFI_VA_END.

(This breaks kexec and possibly other things that rely on stable addresses.)

The EFI region is reserved for EFI runtime services virtual mapping which
should not be included in KASLR ranges. In Documentation/x86/x86_64/mm.txt,
we can see:

  ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space

EFI uses the space from -4G to -64G thus EFI_VA_START > EFI_VA_END,
Here EFI_VA_START = -4G, and EFI_VA_END = -64G.

Changing EFI_VA_START to EFI_VA_END in mm/kaslr.c fixes this problem.
Signed-off-by: NBaoquan He <bhe@redhat.com>
Reviewed-by: NBhupesh Sharma <bhsharma@redhat.com>
Acked-by: NDave Young <dyoung@redhat.com>
Acked-by: NThomas Garnier <thgarnie@google.com>
Cc: <stable@vger.kernel.org> #4.8+
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1490331592-31860-1-git-send-email-bhe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

a46f60d7

KVM: VMX: Fix enable VPID conditions · 08d839c4

由 Wanpeng Li 提交于 3月 23, 2017

This can be reproduced by running L2 on L1, and disable VPID on L0
if w/o commit "KVM: nVMX: Fix nested VPID vmx exec control", the L2
crash as below:

KVM: entry failed, hardware error 0x7
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306c3
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000

Reference SDM 30.3 INVVPID:

Protected Mode Exceptions
- #UD
  - If not in VMX operation.
  - If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0).
  - If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does
    not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0).

So we should check both VPID enable bit in vmx exec control and INVVPID support bit
in vmx capability MSRs to enable VPID. This patch adds the guarantee to not enable
VPID if either INVVPID or single-context/all-context invalidation is not exposed in
vmx capability MSRs.
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NJim Mattson <jmattson@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

08d839c4

KVM: nVMX: Fix nested VPID vmx exec control · 63cb6d5f

由 Wanpeng Li 提交于 3月 20, 2017

This can be reproduced by running kvm-unit-tests/vmx.flat on L0 w/ vpid disabled.

Test suite: VPID
Unhandled exception 6 #UD at ip 00000000004051a6
error_code=0000      rflags=00010047      cs=00000008
rax=0000000000000000 rcx=0000000000000001 rdx=0000000000000047 rbx=0000000000402f79
rbp=0000000000456240 rsi=0000000000000001 rdi=0000000000000000
r8=000000000000000a  r9=00000000000003f8 r10=0000000080010011 r11=0000000000000000
r12=0000000000000003 r13=0000000000000708 r14=0000000000000000 r15=0000000000000000
cr0=0000000080010031 cr2=0000000000000000 cr3=0000000007fff000 cr4=0000000000002020
cr8=0000000000000000
STACK: @4051a6 40523e 400f7f 402059 40028f

We should hide and forbid VPID in L1 if it is disabled on L0. However, nested VPID
enable bit is set unconditionally during setup nested vmx exec controls though VPID
is not exposed through nested VMX capablity. This patch fixes it by don't set nested
VPID enable bit if it is disabled on L0.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 5c614b35 (KVM: nVMX: nested VPID emulation)
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

63cb6d5f

KVM: x86: correct async page present tracepoint · 24dccf83

由 Wanpeng Li 提交于 3月 20, 2017

After async pf setup successfully, there is a broadcast wakeup w/ special
token 0xffffffff which tells vCPU that it should wake up all processes
waiting for APFs though there is no real process waiting at the moment.

The async page present tracepoint print prematurely and fails to catch the
special token setup. This patch fixes it by moving the async page present
tracepoint after the special token setup.

Before patch:

qemu-system-x86-8499  [006] ...1  5973.473292: kvm_async_pf_ready: token 0x0 gva 0x0

After patch:

qemu-system-x86-8499  [006] ...1  5973.473292: kvm_async_pf_ready: token 0xffffffff gva 0x0

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

24dccf83

kvm: vmx: Flush TLB when the APIC-access address changes · fb6c8198

由 Jim Mattson 提交于 3月 16, 2017

Quoting from the Intel SDM, volume 3, section 28.3.3.4: Guidelines for
Use of the INVEPT Instruction:

If EPT was in use on a logical processor at one time with EPTP X, it
is recommended that software use the INVEPT instruction with the
"single-context" INVEPT type and with EPTP X in the INVEPT descriptor
before a VM entry on the same logical processor that enables EPT with
EPTP X and either (a) the "virtualize APIC accesses" VM-execution
control was changed from 0 to 1; or (b) the value of the APIC-access
address was changed.

In the nested case, the burden falls on L1, unless L0 enables EPT in
vmcs02 when L1 doesn't enable EPT in vmcs12.
Signed-off-by: NJim Mattson <jmattson@google.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

fb6c8198

KVM: x86: use pic/ioapic destructor when destroy vm · c761159c

由 Peter Xu 提交于 3月 15, 2017

We have specific destructors for pic/ioapic, we'd better use them when
destroying the VM as well.
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

c761159c

KVM: x86: check existance before destroy · 950712eb

由 Peter Xu 提交于 3月 15, 2017

Mostly used for split irqchip mode. In that case, these two things are
not inited at all, so no need to release.
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>

950712eb

23 3月, 2017 1 次提交

sched/clock, x86/perf: Fix "perf test tsc" · 698eff63

由 Peter Zijlstra 提交于 3月 17, 2017

People reported that commit:

  5680d809 ("sched/clock: Provide better clock continuity")

broke "perf test tsc".

That commit added another offset to the reported clock value; so
take that into account when computing the provided offset values.
Reported-by: NAdrian Hunter <adrian.hunter@intel.com>
Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 5680d809 ("sched/clock: Provide better clock continuity")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

698eff63

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功