提交 · 16a6d9be90373fb0b521850cd0185a4d460dd152 · openanolis / cloud-kernel

20 1月, 2017 1 次提交

sched/clock: Fix hotplug crash · acb04058

由 Peter Zijlstra 提交于 1月 19, 2017

Mike reported that he could trigger the WARN_ON_ONCE() in
set_sched_clock_stable() using hotplug.

This exposed a fundamental problem with the interface, we should never
mark the TSC stable if we ever find it to be unstable. Therefore
set_sched_clock_stable() is a broken interface.

The reason it existed is that not having it is a pain, it means all
relevant architecture code needs to call clear_sched_clock_stable()
where appropriate.

Of the three architectures that select HAVE_UNSTABLE_SCHED_CLOCK ia64
and parisc are trivial in that they never called
set_sched_clock_stable(), so add an unconditional call to
clear_sched_clock_stable() to them.

For x86 the story is a lot more involved, and what this patch tries to
do is ensure we preserve the status quo. So even is Cyrix or Transmeta
have usable TSC they never called set_sched_clock_stable() so they now
get an explicit mark unstable.
Reported-by: NMike Galbraith <efault@gmx.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 9881b024 ("sched/clock: Delay switching sched_clock to stable")
Link: http://lkml.kernel.org/r/20170119133633.GB6536@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

acb04058

19 1月, 2017 1 次提交

sched/x86: Remove unnecessary TBM3 check to update topology · 02cfdc95

由 Tim Chen 提交于 1月 18, 2017

Scheduling to the max performance core is enabled by
default for Turbo Boost Maxt Technology 3.0 capable platforms.

Remove the useless sysctl_sched_itmt_enabled check to
update sched topology for adding the prioritized core scheduling
flag.
Signed-off-by: NTim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: jolsa@redhat.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/1484778629-4404-1-git-send-email-tim.c.chen@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

02cfdc95

18 1月, 2017 1 次提交

x86/ioapic: Restore IO-APIC irq_chip retrigger callback · 020eb3da

由 Ruslan Ruslichenko 提交于 1月 17, 2017

commit d32932d0 removed the irq_retrigger callback from the IO-APIC
chip and did not add it to the new IO-APIC-IR irq chip.

Unfortunately the software resend fallback is not enabled on X86, so edge
interrupts which are received during the lazy disabled state of the
interrupt line are not retriggered and therefor lost.

Restore the callbacks.

[ tglx: Massaged changelog ]

Fixes: d32932d0  ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Signed-off-by: NRuslan Ruslichenko <rruslich@cisco.com>
Cc: xe-linux-external@cisco.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

020eb3da

14 1月, 2017 2 次提交

sched/clock, clocksource: Add optional cs::mark_unstable() method · 12907fbb

由 Thomas Gleixner 提交于 12月 15, 2016

PeterZ reported that we'd fail to mark the TSC unstable when the
clocksource watchdog finds it unsuitable.

Allow a clocksource to run a custom action when its being marked
unstable and hook up the TSC unstable code.
Reported-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

12907fbb

x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc() · 695085b4

由 Len Brown 提交于 1月 13, 2017

The Intel Denverton microserver uses a 25 MHz TSC crystal,
so we can derive its exact [*] TSC frequency
using CPUID and some arithmetic, eg.:

  TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)

[*] 'exact' is only as good as the crystal, which should be +/- 20ppm
Signed-off-by: NLen Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/306899f94804aece6d8fa8b4223ede3b48dbb59c.1484287748.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

695085b4

12 1月, 2017 2 次提交

x86/unwind: Disable KASAN checks for non-current tasks · 84936118

由 Josh Poimboeuf 提交于 1月 09, 2017

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

In such cases, it's possible that the unwinder may read a KASAN-poisoned
region of the stack.  Account for that by using READ_ONCE_NOCHECK() when
reading the stack of another task.

Use READ_ONCE() when reading the stack of the current task, since KASAN
warnings can still be useful for finding bugs in that case.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

84936118

x86/unwind: Silence warnings for non-current tasks · 900742d8

由 Josh Poimboeuf 提交于 1月 09, 2017

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

Since stack "corruption" on another task's stack isn't necessarily a
bug, silence the warnings when unwinding tasks other than current.
Reported-by: NDave Jones <davej@codemonkey.org.uk>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

900742d8

10 1月, 2017 4 次提交

x86/microcode/intel: Use correct buffer size for saving microcode data · 2e86222c

由 Junichi Nomura 提交于 1月 09, 2017

In generic_load_microcode(), curr_mc_size is the size of the last
allocated buffer and since we have this performance "optimization"
there to vmalloc a new buffer only when the current one is bigger,
curr_mc_size ends up becoming the size of the biggest buffer we've seen
so far.

However, we end up saving the microcode patch which matches our CPU
and its size is not curr_mc_size but the respective mc_size during the
iteration while we're staring at it.

So save that mc_size into a separate variable and use it to store the
previously found microcode buffer.

Without this fix, we could get oops like this:

  BUG: unable to handle kernel paging request at ffffc9000e30f000
  IP: __memcpy+0x12/0x20
  ...
  Call Trace:
  ? kmemdup+0x43/0x60
  __alloc_microcode_buf+0x44/0x70
  save_microcode_patch+0xd4/0x150
  generic_load_microcode+0x1b8/0x260
  request_microcode_user+0x15/0x20
  microcode_write+0x91/0x100
  __vfs_write+0x34/0x120
  vfs_write+0xc1/0x130
  SyS_write+0x56/0xc0
  do_syscall_64+0x6c/0x160
  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 06b8534c ("x86/microcode: Rework microcode loading")
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/4f33cbfd-44f2-9bed-3b66-7446cd14256f@ce.jp.nec.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2e86222c

x86/microcode/intel: Fix allocation size of struct ucode_patch · 9fcf5ba2

由 Junichi Nomura 提交于 1月 09, 2017

We allocate struct ucode_patch here. @size is the size of microcode data
and used for kmemdup() later in this function.

Fixes: 06b8534c ("x86/microcode: Rework microcode loading")
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/7a730dc9-ac17-35c4-fe76-dfc94e5ecd95@ce.jp.nec.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9fcf5ba2

x86/microcode/intel: Add a helper which gives the microcode revision · 4167709b

由 Borislav Petkov 提交于 1月 09, 2017

Since on Intel we're required to do CPUID(1) first, before reading
the microcode revision MSR, let's add a special helper which does the
required steps so that we don't forget to do them next time, when we
want to read the microcode revision.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-4-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4167709b

x86/microcode: Use native CPUID to tickle out microcode revision · f3e2a51f

由 Borislav Petkov 提交于 1月 09, 2017

Intel supplies the microcode revision value in MSR 0x8b
(IA32_BIOS_SIGN_ID) after CPUID(1) has been executed. Execute it each
time before reading that MSR.

It used to do sync_core() which did do CPUID but

  c198b121 ("x86/asm: Rewrite sync_core() to use IRET-to-self")

changed the sync_core() implementation so we better make the microcode
loading case explicit, as the SDM documents it.
Reported-and-tested-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-3-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f3e2a51f

06 1月, 2017 1 次提交

x86/CPU/AMD: Fix Bulldozer topology · a33d3317

由 Borislav Petkov 提交于 1月 05, 2017

The following commit:

  8196dab4 ("x86/cpu: Get rid of compute_unit_id")

... broke the initial strategy for Bulldozer-based cores' topology,
where we consider each thread of a compute unit a standalone core
and not a HT or SMT thread.

Revert to the firmware-supplied core_id numbering and do not make
them thread siblings as we don't consider them for such even if they
technically are, more or less.
Reported-and-tested-by: NBrice Goglin <Brice.Goglin@inria.fr>
Tested-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8196dab4 ("x86/cpu: Get rid of compute_unit_id")
Link: http://lkml.kernel.org/r/20170105092638.5247-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

a33d3317

05 1月, 2017 1 次提交

x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option · dd853fd2

由 Lukasz Odzioba 提交于 12月 28, 2016

A negative number can be specified in the cmdline which will be used as
setup_clear_cpu_cap() argument. With that we can clear/set some bit in
memory predceeding boot_cpu_data/cpu_caps_cleared which may cause kernel
to misbehave. This patch adds lower bound check to setup_disablecpuid().

Boris Petkov reproduced a crash:

  [    1.234575] BUG: unable to handle kernel paging request at ffffffff858bd540
  [    1.236535] IP: memcpy_erms+0x6/0x10
Signed-off-by: NLukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andi.kleen@intel.com
Cc: bp@alien8.de
Cc: dave.hansen@linux.intel.com
Cc: luto@kernel.org
Cc: slaoub@gmail.com
Fixes: ac72e788 ("x86: add generic clearcpuid=... option")
Link: http://lkml.kernel.org/r/1482933340-11857-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

dd853fd2

27 12月, 2016 1 次提交

x86/mce/AMD: Make the init code more robust · 0dad3a30

由 Thomas Gleixner 提交于 12月 26, 2016

If mce_device_init() fails then the mce device pointer is NULL and the
AMD mce code happily dereferences it.

Add a sanity check.
Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0dad3a30

25 12月, 2016 4 次提交

clocksource: Use a plain u64 instead of cycle_t · a5a1d1c2

由 Thomas Gleixner 提交于 12月 21, 2016

There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>

a5a1d1c2

cpu/hotplug: Cleanup state names · 73c1b41e

由 Thomas Gleixner 提交于 12月 21, 2016

When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

73c1b41e

x86/msr: Remove bogus cleanup from the error path · 59fefd08

由 Thomas Gleixner 提交于 12月 22, 2016

The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.

Fixes: 8fba38c9 ("x86/msr: Convert to hotplug state machine")
Reported-by: Nkernel test robot <fengguang.wu@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>

59fefd08

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

由 Linus Torvalds 提交于 12月 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c0f6ba6

24 12月, 2016 1 次提交

Revert "x86/unwind: Detect bad stack return address" · c280f773

由 Josh Poimboeuf 提交于 12月 22, 2016

Revert the following commit:

  b6959a36 ("x86/unwind: Detect bad stack return address")

... because Andrey Konovalov reported an unwinder warning:

  WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467

The unwind was initiated from an interrupt which occurred while running in the
generated code for a kprobe.  The unwinder printed the warning because it
expected regs->ip to point to a valid text address, but instead it pointed to
the generated code.

Eventually we may want come up with a way to identify generated kprobe
code so the unwinder can know that it's a valid return address.  Until
then, just remove the warning.
Reported-by: NAndrey Konovalov <andreyknvl@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c280f773

23 12月, 2016 1 次提交

x86/paravirt: Mark unused patch_default label · cef4402d

由 Peter Zijlstra 提交于 12月 16, 2016

A bugfix commit:

  45dbea5f ("x86/paravirt: Fix native_patch()")

... introduced a harmless warning:

  arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
  arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]

Fix it by annotating the label as __maybe_unused.
Reported-by: NArnd Bergmann <arnd@arndb.de>
Reported-by: NPiotr Gregor <piotrgregor@rsyncme.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 45dbea5f ("x86/paravirt: Fix native_patch()")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

cef4402d

21 12月, 2016 1 次提交

x86/microcode/AMD: Reload proper initrd start address · 8877ebdd

由 Borislav Petkov 提交于 12月 20, 2016

When we switch to virtual addresses and, especially after
reserve_initrd()->relocate_initrd() have run, we have the updated initrd
address in initrd_start. Use initrd_start then instead of the address
which has been passed to us through boot params. (That still gets used
when we're running the very early routines on the BSP).
Reported-and-tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20161220144012.lc4cwrg6dphqbyqu@pd.tnicSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8877ebdd

20 12月, 2016 2 次提交

x86/alternatives: Do not use sync_core() to serialize I$ · 34bfab0e

由 Borislav Petkov 提交于 12月 03, 2016

We use sync_core() in the alternatives code to stop speculative
execution of prefetched instructions because we are potentially changing
them and don't want to execute stale bytes.

What it does on most machines is call CPUID which is a serializing
instruction. And that's expensive.

However, the instruction cache is serialized when we're on the local CPU
and are changing the data through the same virtual address. So then, we
don't need the serializing CPUID but a simple control flow change. Last
being accomplished with a CALL/RET which the noinline causes.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NAndy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>

34bfab0e

x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic · 59107e2f

由 Vitaly Kuznetsov 提交于 12月 02, 2016

There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects NMI to the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).

To properly unload VMBus making it possible to start over during kdump we
need to do the following:

 - Send an 'unload' message to the hypervisor. This can be done on any CPU
   so we do this the crashing CPU.

 - Receive the 'unload finished' reply message. WS2012R2 delivers this
   message to the CPU which was used to establish VMBus connection during
   module load and this CPU may differ from the CPU sending 'unload'.

Receiving a VMBus message means the following:

 - There is a per-CPU slot in memory for one message. This slot can in
   theory be accessed by any CPU.

 - We get an interrupt on the CPU when a message was placed into the slot.

 - When we read the message we need to clear the slot and signal the fact
   to the hypervisor. In case there are more messages to this CPU pending
   the hypervisor will deliver the next message. The signaling is done by
   writing to an MSR so this can only be done on the appropriate CPU.

To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:

 - We're crashing on a CPU which is different from the one which was used
   to initially contact the hypervisor.

 - The CPU which was used for the initial contact is blocked with interrupts
   disabled and there is a message pending in the message slot.

In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.

The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This will allow us to rely on VMBus
interrupt handler being able to receive the 'unload finish' message in
case it is delivered to a different CPU.

The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: NK. Y. Srinivasan <kys@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

59107e2f

19 12月, 2016 12 次提交

swiotlb: Convert swiotlb_force from int to enum · ae7871be

由 Geert Uytterhoeven 提交于 12月 16, 2016

Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ae7871be

x86, swiotlb: Simplify pci_swiotlb_detect_override() · 6c206e4d

由 Geert Uytterhoeven 提交于 12月 16, 2016

At the end of the function, the local variable use_swiotlb has always
the same value as the global variable swiotlb. Hence drop the local
variable completely.
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

6c206e4d

x86/microcode/intel: Replace sync_core() with native_cpuid() · 484d0e5c

由 Andy Lutomirski 提交于 12月 09, 2016

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1".  I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.
Reported-by: NHenrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NBorislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

484d0e5c

x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6 · 3df8d920

由 Andy Lutomirski 提交于 12月 15, 2016

A typo (or mis-merge?) resulted in leaf 6 only being probed if
cpuid_level >= 7.

Fixes: 2ccd71f1 ("x86/cpufeature: Move some of the scattered feature bits to x86_capability")
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NBorislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/6ea30c0e9daec21e488b54761881a6dfcf3e04d0.1481825597.git.luto@kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

3df8d920

x86/unwind: Dump stack data on warnings · 8b5e99f0

由 Josh Poimboeuf 提交于 12月 16, 2016

The unwinder warnings are good at finding unexpected unwinder issues,
but they often don't give enough data to be able to fully diagnose them.
Print a one-time stack dump when a warning is detected.
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/15607370e3ddb1732b6a73d5c65937864df16ac8.1481904011.git.jpoimboe@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8b5e99f0

x86/unwind: Adjust last frame check for aligned function stacks · 8023e0e2

由 Josh Poimboeuf 提交于 12月 16, 2016

Somehow, CONFIG_PARAVIRT=n convinces gcc to change the
x86_64_start_kernel() prologue from:

  0000000000000129 <x86_64_start_kernel>:
   129:	55                   	push   %rbp
   12a:	48 89 e5             	mov    %rsp,%rbp

to:

  0000000000000124 <x86_64_start_kernel>:
   124:	4c 8d 54 24 08       	lea    0x8(%rsp),%r10
   129:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
   12d:	41 ff 72 f8          	pushq  -0x8(%r10)
   131:	55                   	push   %rbp
   132:	48 89 e5             	mov    %rsp,%rbp

This is an unusual pattern which aligns rsp (though in this case it's
already aligned) and saves the start_cpu() return address again on the
stack before storing the frame pointer.

The unwinder assumes the last stack frame header is at a certain offset,
but the above code breaks that assumption, resulting in the following
warning:

  WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value           (null)

Fix it by checking for the last task stack frame at the aligned offset
in addition to the normal unaligned offset.

Fixes: acb4608a ("x86/unwind: Create stack frames for saved syscall registers")
Reported-by: NBorislav Petkov <bp@alien8.de>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/9d7b4eb8cf55a7d6002cb738f25c23e7429c99a0.1481904011.git.jpoimboe@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8023e0e2

x86/init: Remove i8042_detect() from platform ops · 32786fdc

由 Dmitry Torokhov 提交于 12月 09, 2016

Now that i8042 uses flag in legacy platform data, i8042_detect() is
no longer used and can be removed.
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Acked-by: NMarcos Paulo de Souza <marcos.souza.org@gmail.com>
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/1481317061-31486-4-git-send-email-dmitry.torokhov@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

32786fdc

x86/init: Add i8042 state to the platform data · 93ffa9a4

由 Dmitry Torokhov 提交于 12月 09, 2016

Add i8042 state to the platform data to help i8042 driver make decision
whether to probe for i8042 or not. We recognize 3 states: platform/subarch
ca not possible have i8042 (as is the case with Inrel MID platform),
firmware (such as ACPI) reports that i8042 is absent from the device,
or i8042 may be present and the driver should probe for it.

The intent is to allow i8042 driver abort initialization on x86 if PNP data
(absence of both keyboard and mouse PNP devices) agrees with firmware data.

It will also allow us to remove i8042_detect later.
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Acked-by: NMarcos Paulo de Souza <marcos.souza.org@gmail.com>
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/1481317061-31486-2-git-send-email-dmitry.torokhov@gmail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

93ffa9a4

x86/microcode/AMD: Use native_cpuid() in load_ucode_amd_bsp() · 2b4c9156

由 Boris Ostrovsky 提交于 12月 18, 2016

When CONFIG_PARAVIRT is selected, cpuid() becomes a call. Since
for 32-bit kernels load_ucode_amd_bsp() is executed before paging
is enabled the call cannot be completed (as kernel virtual addresses
are not reachable yet).

Use native_cpuid() instead which is an asm wrapper for the CPUID
instruction.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Jürgen Gross <jgross@suse.com>
Link: http://lkml.kernel.org/r/1481906392-3847-1-git-send-email-boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/20161218164414.9649-5-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2b4c9156

x86/microcode/AMD: Do not load when running on a hypervisor · a15a7535

由 Borislav Petkov 提交于 12月 18, 2016

Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.

By having it turned off by default, the APs won't run into the loader
either.

Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".

Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Tested-by: NJuergen Gross <jgross@suse.com>
Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: NJuergen Gross <jgross@suse.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a15a7535

x86/microcode/AMD: Sanitize apply_microcode_early_amd() · 200d3553

由 Borislav Petkov 提交于 12月 18, 2016

Make it simply return bool to denote whether it found a container or not
and return the pointer to the container and its size in the handed-in
container pointer instead, as returning a struct was just silly.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Jürgen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-3-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

200d3553

x86/microcode/AMD: Make find_proper_container() sane again · 8feaa64a

由 Borislav Petkov 提交于 12月 18, 2016

Fixup signature and retvals, return the container struct through the
passed in pointer, not as a function return value.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Jürgen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-2-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8feaa64a

18 12月, 2016 2 次提交

x86/tsc: Limit the adjust value further · 8c9b9d87

由 Thomas Gleixner 提交于 12月 18, 2016

Adjust value 0x80000000 and other values larger than that render the TSC
deadline timer disfunctional.

We have not yet any information about this from Intel, but experimentation
clearly proves that this is a 32/64 bit and sign extension issue.

If adjust values larger than that are actually required, which might be the
case for physical CPU hotplug, then we need to disable the deadline timer
on the affected package/CPUs and use the local APIC timer instead.

That requires some surgery in the APIC setup code, so we just limit the
ADJUST register value into the known to work range for now and revisit this
when Intel comes forth with proper information.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>

8c9b9d87

x86/tsc: Annotate printouts as firmware bug · 16588f65

由 Thomas Gleixner 提交于 12月 18, 2016

Make it more obvious that the BIOS is screwed up.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>

16588f65

15 12月, 2016 3 次提交

x86/tsc: Force TSC_ADJUST register to value >= zero · 5bae1562

由 Thomas Gleixner 提交于 12月 13, 2016

Roland reported that his DELL T5810 sports a value add BIOS which
completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with
random negative TSC_ADJUST values, different on all CPUs. That renders the
TSC useless because the sycnchronization check fails.

Roland tested the new TSC_ADJUST mechanism. While it manages to readjust
the TSCs he needs to disable the TSC deadline timer, otherwise the machine
just stops booting.

Deeper investigation unearthed that the TSC deadline timer is sensitive to
the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an
interrupt storm caused by the TSC deadline timer.

This does not make any sense and it's hard to imagine what kind of hardware
wreckage is behind that misfeature, but it's reliably reproducible on other
systems which have TSC_ADJUST and TSC deadline timer.

While it would be understandable that a big enough negative value which
moves the resulting TSC readout into the negative space could have the
described effect, this happens even with a adjust value of -1, which keeps
the TSC readout definitely in the positive space. The compare register for
the TSC deadline timer is set to a positive value larger than the TSC, but
despite not having reached the deadline the interrupt is raised
immediately. If this happens on the boot CPU, then the machine dies
silently because this setup happens before the NMI watchdog is armed.

Further experiments showed that any other adjustment of TSC_ADJUST works as
expected as long as it stays in the positive range. The direction of the
adjustment has no influence either. See the lkml link for further analysis.

Yet another proof for the theory that timers are designed by janitors and
the underlying (obviously undocumented) mechanisms which allow BIOSes to
wreckage them are considered a feature. Well done Intel - NOT!

To address this wreckage add the following sanity measures:

- If the TSC_ADJUST value on the boot cpu is not 0, set it to 0

- If the TSC_ADJUST value on any cpu is negative, set it to 0

- Prevent the cross package synchronization mechanism from setting negative
  TSC_ADJUST values.
Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5bae1562

x86/tsc: Validate TSC_ADJUST after resume · 6a369583

由 Thomas Gleixner 提交于 12月 13, 2016

Some 'feature' BIOSes fiddle with the TSC_ADJUST register during
suspend/resume which renders the TSC unusable.

Add sanity checks into the resume path and restore the
original value if it was adjusted.
Reported-and-tested-by: NRoland Scheidegger <rscheidegger_lists@hispeed.ch>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Allen Hung <allen_hung@dell.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6a369583

x86/acpi: Use proper macro for invalid node · 4370a3ef

由 Boris Ostrovsky 提交于 12月 12, 2016

Use NUMA_NO_NODE instead of -1.
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: pavel@ucw.cz
Link: http://lkml.kernel.org/r/1481570993-13941-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4370a3ef

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功