1. 19 December 2017 (1 commit)
    • x86/stacktrace: Make zombie stack traces reliable · 6454b3bd
      Committed by Josh Poimboeuf
      Commit:
      
        1959a601 ("x86/dumpstack: Pin the target stack when dumping it")
      
      changed the behavior of stack traces for zombies.  Before that commit,
      /proc/<pid>/stack reported the last execution path of the zombie before
      it died:
      
        [<ffffffff8105b877>] do_exit+0x6f7/0xa80
        [<ffffffff8105bc79>] do_group_exit+0x39/0xa0
        [<ffffffff8105bcf0>] __wake_up_parent+0x0/0x30
        [<ffffffff8152dd09>] system_call_fastpath+0x16/0x1b
        [<00007fd128f9c4f9>] 0x7fd128f9c4f9
        [<ffffffffffffffff>] 0xffffffffffffffff
      
      After the commit, it just reports an empty stack trace.
      
      The new behavior is actually probably more correct.  If the stack
      refcount has gone down to zero, then the task has already gone through
      do_exit() and isn't going to run anymore.  The stack could be freed at
      any time and is basically gone, so reporting an empty stack makes sense.
      
      However, save_stack_trace_tsk_reliable() treats such a missing stack
      condition as an error.  That can cause livepatch transition stalls if
      there are any unreaped zombies.  Instead, just treat it as a reliable,
      empty stack.
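
      The fix boils down to bailing out early when the task's stack can no
      longer be pinned.  A minimal sketch of that check (the surrounding
      function body is not quoted here):

      	/*
      	 * Sketch: if the stack refcount already dropped to zero (e.g. an
      	 * unreaped zombie), report an empty trace and treat it as reliable.
      	 */
      	if (!try_get_task_stack(tsk))
      		return 0;
      	/* ... walk and record the stack ... */
      	put_task_stack(tsk);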
      Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Fixes: af085d90 ("stacktrace/x86: add function for detecting reliable stack traces")
      Link: http://lkml.kernel.org/r/e4b09e630e99d0c1080528f0821fc9d9dbaeea82.1513631620.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 18 December 2017 (1 commit)
    • x86/mm: Unbreak modules that use the DMA API · 9d5f38ba
      Committed by Tom Lendacky
      Commit d8aa7eea ("x86/mm: Add Secure Encrypted Virtualization (SEV)
      support") changed sme_active() from an inline function that referenced
      sme_me_mask to a non-inlined function in order to make the sev_enabled
      variable a static variable.  This function was marked EXPORT_SYMBOL_GPL
      because at the time the patch was submitted, sme_me_mask was marked
      EXPORT_SYMBOL_GPL.
      
      Commit 87df2617 ("x86/mm: Unbreak modules that rely on external
      PAGE_KERNEL availability") changed the sme_me_mask variable from
      EXPORT_SYMBOL_GPL to EXPORT_SYMBOL, allowing external modules to build
      with CONFIG_AMD_MEM_ENCRYPT=y.  Now, however, with sev_active() no
      longer an inline function and marked as EXPORT_SYMBOL_GPL, external
      modules that use the DMA API are once again broken in 4.15.  Since the
      DMA API is meant to be used by external modules, this needs to be changed.
      
      Change the sme_active() and sev_active() functions from EXPORT_SYMBOL_GPL
      to EXPORT_SYMBOL.
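
      In code terms the change is only the export macro.  A sketch, assuming
      the 4.15-era function bodies in arch/x86/mm/mem_encrypt.c:

      	bool sme_active(void)
      	{
      		return sme_me_mask && !sev_enabled;
      	}
      	EXPORT_SYMBOL(sme_active);		/* was EXPORT_SYMBOL_GPL */

      	bool sev_active(void)
      	{
      		return sme_me_mask && sev_enabled;
      	}
      	EXPORT_SYMBOL(sev_active);		/* was EXPORT_SYMBOL_GPL */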
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Link: https://lkml.kernel.org/r/20171215162011.14125.7113.stgit@tlendack-t1.amdoffice.net
  3. 16 December 2017 (1 commit)
    • x86/build: Make isoimage work on Debian · 5f0e3fe6
      Committed by Matthew Wilcox
      Debian does not ship a 'mkisofs' symlink to genisoimage.  All modern
      distros ship genisoimage, so just use that directly.  That requires
      renaming the 'genisoimage' function.  Also neaten up the 'for' loop
      while I'm in here.
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 15 December 2017 (6 commits)
  5. 12 December 2017 (2 commits)
  6. 11 December 2017 (1 commit)
    • x86/mm/kmmio: Fix mmiotrace for page unaligned addresses · 6d60ce38
      Committed by Karol Herbst
      If something calls ioremap() with an address not aligned to PAGE_SIZE, the
      returned address might be not aligned as well. This led to a probe
      registered on exactly the returned address, but the entire page was armed
      for mmiotracing.
      
      On calling iounmap() the address passed to unregister_kmmio_probe() was
      PAGE_SIZE aligned by the caller leading to a complete freeze of the
      machine.
      
      We should always page align addresses while (un)registering mappings,
      because the mmiotracer works on top of pages, not mappings.  We still
      keep track of the probes based on their real addresses and lengths
      though, because the mmiotracer still needs to know which memory regions
      are mapped.
      
      Also move the call to mmiotrace_iounmap() prior to page aligning the
      address, so that all probes are unregistered properly; otherwise the
      kernel ends up failing memory allocations randomly after disabling the
      mmiotracer.
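
      The alignment itself is the usual PAGE_MASK/PAGE_ALIGN idiom.  A rough
      sketch of what registration has to cover, with arm_one_page() as a
      hypothetical stand-in for the kmmio fault-page handling:

      	/* Arm every page covered by the probe, not just p->addr itself. */
      	unsigned long page_start = p->addr & PAGE_MASK;
      	unsigned long page_end   = PAGE_ALIGN(p->addr + p->len);
      	unsigned long addr;

      	for (addr = page_start; addr < page_end; addr += PAGE_SIZE)
      		arm_one_page(addr);	/* hypothetical helper */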
      Tested-by: Lyude <lyude@redhat.com>
      Signed-off-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Pekka Paalanen <ppaalanen@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: nouveau@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/20171127075139.4928-1-kherbst@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 07 December 2017 (5 commits)
  8. 06 December 2017 (4 commits)
  9. 28 November 2017 (2 commits)
  10. 25 November 2017 (2 commits)
    • x86/tlb: Disable interrupts when changing CR4 · 9d0b6232
      Committed by Nadav Amit
      CR4 modifications are implemented as RMW operations which update a shadow
      variable and write the result to CR4. The RMW operation is protected by
      preemption disable, but there is no enforcement or debugging mechanism.
      
      CR4 modifications also happen in interrupt context via
      __native_flush_tlb_global().  This implementation does not affect the
      CR4 operation of an interrupted thread context, because the CR4 toggle
      restores the original content and does not modify the shadow variable.
      
      So the current situation seems to be safe, but a recent patch tried to add
      an actual RMW operation in interrupt context, which will cause subtle
      corruptions.
      
      To prevent that and make the CR4 handling future proof:
      
       - Add a lockdep assertion to __cr4_set() which will catch interrupt
         enabled invocations
      
       - Disable interrupts in the cr4 manipulator inlines
      
       - Rename cr4_toggle_bits() to cr4_toggle_bits_irqsoff(). This is called
         from __switch_to_xtra() where interrupts are already disabled and
         performance matters.
      
      All other call sites are not performance critical, so the extra overhead of
      an additional local_irq_save/restore() pair is not a problem. If new call
      sites care about performance then the necessary _irqsoff() variants can be
      added.
      
      [ tglx: Condensed the patch by moving the irq protection inside the
        	manipulator functions. Updated changelog ]
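
      A sketch of the resulting manipulator shape (simplified; the real
      inlines live in arch/x86/include/asm/tlbflush.h and may differ in
      detail):

      	static inline void cr4_set_bits(unsigned long mask)
      	{
      		unsigned long cr4, flags;

      		local_irq_save(flags);
      		cr4 = this_cpu_read(cpu_tlbstate.cr4);
      		if ((cr4 | mask) != cr4)
      			__cr4_set(cr4 | mask);	/* asserts IRQs are off */
      		local_irq_restore(flags);
      	}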
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Luck <tony.luck@intel.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: nadav.amit@gmail.com
      Cc: linux-edac@vger.kernel.org
      Link: https://lkml.kernel.org/r/20171125032907.2241-3-namit@vmware.com
    • x86/tlb: Refactor CR4 setting and shadow write · 0c3292ca
      Committed by Nadav Amit
      Refactor the write to CR4 and its shadow value. This is done in
      preparation for the addition of an assertion to check that IRQs are
      disabled during CR4 update.
      
      No functional change.
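
      The refactoring concentrates the shadow update and the hardware write
      in a single helper, roughly:

      	/* One place where both the shadow value and the real CR4 are written. */
      	static inline void __cr4_set(unsigned long cr4)
      	{
      		this_cpu_write(cpu_tlbstate.cr4, cr4);
      		__write_cr4(cr4);
      	}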
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: nadav.amit@gmail.com
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: linux-edac@vger.kernel.org
      Link: https://lkml.kernel.org/r/20171125032907.2241-2-namit@vmware.com
  11. 24 November 2017 (4 commits)
  12. 23 November 2017 (1 commit)
    • x86/entry/64: Add missing irqflags tracing to native_load_gs_index() · ca37e57b
      Committed by Andy Lutomirski
      Running this code with IRQs enabled (where dummy_lock is a spinlock):
      
      static void check_load_gs_index(void)
      {
      	/* This will fail. */
      	load_gs_index(0xffff);
      
      	spin_lock(&dummy_lock);
      	spin_unlock(&dummy_lock);
      }
      
      Will generate a lockdep warning.  The issue is that the actual write
      to %gs would cause an exception with IRQs disabled, and the exception
      handler would, as an inadvertent side effect, update irqflag tracing
      to reflect the IRQs-off status.  native_load_gs_index() would then
      turn IRQs back on and return with irqflag tracing still thinking that
      IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
      error and warn.
      
      Fix it by adding the missing tracing.
      
      Apparently nothing did this in a context where it mattered.  I haven't
      tried to find a code path that would actually exhibit the warning if
      appropriately nasty user code were running.
      
      I suspect that the security impact of this bug is very, very low --
      production systems don't run with lockdep enabled, and the warning is
      mostly harmless anyway.
      
      Found during a quick audit of the entry code to try to track down an
      unrelated bug that Ingo found in some still-in-development code.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  13. 22 November 2017 (2 commits)
    • x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow · f68d62a5
      Committed by Andrey Ryabinin
      [ Note, this commit is a cherry-picked version of:
      
          d17a1d97: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
      
        ... for easier x86 entry code testing and back-porting. ]
      
      The KASAN shadow is currently mapped using vmemmap_populate() since that
      provides a semi-convenient way to map pages into init_top_pgt.  However,
      since that no longer zeroes the mapped pages, it is not suitable for
      KASAN, which requires zeroed shadow memory.
      
      Add kasan_populate_shadow() interface and use it instead of
      vmemmap_populate().  Besides, this allows us to take advantage of
      gigantic pages and use them to populate the shadow, which should save us
      some memory wasted on page tables and reduce TLB pressure.
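
      Conceptually the new interface walks the shadow range and maps freshly
      zeroed pages.  A simplified sketch, with early_alloc_zeroed() and
      early_map_page() as hypothetical helpers standing in for the real
      page-table plumbing:

      	void __init kasan_populate_shadow(unsigned long start, unsigned long end)
      	{
      		unsigned long addr;

      		/* The real code also tries 1G/2M mappings; this uses 4K only. */
      		for (addr = start; addr < end; addr += PAGE_SIZE) {
      			void *p = early_alloc_zeroed(PAGE_SIZE);	/* hypothetical */
      			early_map_page(addr, __pa(p), PAGE_KERNEL);	/* hypothetical */
      		}
      	}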
      
      Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Bob Picco <bob.picco@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing · 548c3050
      Committed by Andy Lutomirski
      When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
      before it.  This means that users of entry_SYSCALL_64_after_hwframe()
      were responsible for invoking TRACE_IRQS_OFF, and the one and only
      user (Xen, added in the same commit) got it wrong.
      
      I think this would manifest as a warning if a Xen PV guest with
      CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
      context tracking bit is to cause lockdep to get invoked before we
      turn IRQs back on.)  I haven't tested that for real yet because I
      can't get a kernel configured like that to boot at all on Xen PV.
      
      Move TRACE_IRQS_OFF below the label.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 8a9949bc ("x86/xen/64: Rearrange the SYSCALL entries")
      Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  14. 21 November 2017 (5 commits)
  15. 17 November 2017 (3 commits)
    • x86/smpboot: Fix __max_logical_packages estimate · b4c0a732
      Committed by Prarit Bhargava
      A system booted with a small number of cores enabled per package
      panics because the estimate of __max_logical_packages is too low.
      
      This occurs when the total number of active cores across all packages is
      less than the maximum core count for a single package. e.g.:
      
        On a 4 package system with 20 cores/package where only 4 cores are
        enabled on each package, the value of __max_logical_packages is
        calculated as DIV_ROUND_UP(16, 20) = 1 and not 4.
      
      Calculate __max_logical_packages after the cpu enumeration has completed.
      Use the boot cpu's data to extrapolate the number of packages.
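
      A rough sketch of the post-enumeration estimate; the exact fields the
      patch reads from the boot CPU may differ:

      	/* cores per package, as seen on the boot CPU after enumeration */
      	int ncpus = boot_cpu_data.x86_max_cores * smp_num_siblings;

      	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);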
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com
    • x86/topology: Avoid wasting 128k for package id array · 30bb9811
      Committed by Andi Kleen
      Analyzing large early boot allocations unveiled the logical package id
      storage as a prominent memory waste. Since commit 1f12e32f
      ("x86/topology: Create logical package id") every 64-bit system allocates a
      128k array to convert logical package ids.
      
      This happens because the array is sized for MAX_LOCAL_APIC which is always
      32k on 64bit systems, and it needs 4 bytes for each entry.
      
      This is fairly wasteful, especially for the common case of having only
      one socket, which uses exactly 4 bytes out of the 128K.
      
      There is no user of the package id map which is performance critical, so
      the lookup is not required to be O(1). Store the logical processor id in
      cpu_data and use a loop based lookup.
      
      To keep the mapping stable across cpu hotplug operations, add a flag to
      cpu_data which is set when the CPU is brought up the first time. When the
      flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
      subsequent bringups.
      
      [ tglx: Rename the flag to 'initialized', use proper pointers instead of
        	repeated cpu_data(x) evaluation and massage changelog. ]
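
      The O(n) lookup described above is then just a scan over cpu_data; a
      sketch of its likely shape:

      	int topology_phys_to_logical_pkg(unsigned int phys_pkg)
      	{
      		int cpu;

      		/* A linear scan is fine: no caller is performance critical. */
      		for_each_possible_cpu(cpu) {
      			struct cpuinfo_x86 *c = &cpu_data(cpu);

      			if (c->initialized && c->phys_proc_id == phys_pkg)
      				return c->logical_proc_id;
      		}
      		return -1;
      	}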
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
    • perf/x86/intel/uncore: Cache logical pkg id in uncore driver · d46b4c1c
      Committed by Andi Kleen
      The SNB-EP uncore driver is the only user of topology_phys_to_logical_pkg
      in a performance critical path.
      
      Change it to query the logical pkg ID only once at initialization time
      and then cache it in the box structure.  This allows changing the
      logical package management without affecting the performance critical
      path.
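
      The caching amounts to one extra field filled in at box initialization
      time and read in the hot path; a sketch (the field name follows the
      commit title, the exact init site is not quoted here):

      	/* Resolve the logical package id once, when the box is set up. */
      	box->pkgid = topology_phys_to_logical_pkg(
      				topology_physical_package_id(cpu));

      	/* Hot-path code then reads box->pkgid instead of redoing the lookup. */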
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: He Chen <he.chen@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Piotr Luc <piotr.luc@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20171114124257.22013-2-prarit@redhat.com