提交 · 695085b4bc7603551db0b3da897b8bf9893ca218 · OpenHarmony / kernel_linux

14 1月, 2017 1 次提交

x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc() · 695085b4

由 Len Brown 提交于 1月 13, 2017

The Intel Denverton microserver uses a 25 MHz TSC crystal,
so we can derive its exact [*] TSC frequency
using CPUID and some arithmetic, eg.:

  TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)

[*] 'exact' is only as good as the crystal, which should be +/- 20ppm
Signed-off-by: NLen Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/306899f94804aece6d8fa8b4223ede3b48dbb59c.1484287748.git.len.brown@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

695085b4

12 1月, 2017 4 次提交

x86/entry: Fix the end of the stack for newly forked tasks · ff3f7e24

由 Josh Poimboeuf 提交于 1月 09, 2017

When unwinding a task, the end of the stack is always at the same offset
right below the saved pt_regs, regardless of which syscall was used to
enter the kernel.  That convention allows the unwinder to verify that a
stack is sane.

However, newly forked tasks don't always follow that convention, as
reported by the following unwinder warning seen by Dave Jones:

  WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value           (null)

The warning was due to the following call chain:

  (ftrace handler)
  call_usermodehelper_exec_async+0x5/0x140
  ret_from_fork+0x22/0x30

The problem is that ret_from_fork() doesn't create a stack frame before
calling other functions.  Fix that by carefully using the frame pointer
macros.

In addition to conforming to the end of stack convention, this also
makes related stack traces more sensible by making it clear to the user
that ret_from_fork() was involved.
Reported-by: NDave Jones <davej@codemonkey.org.uk>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

ff3f7e24

x86/unwind: Include __schedule() in stack traces · 2c96b2fe

由 Josh Poimboeuf 提交于 1月 09, 2017

In the following commit:

  0100301b ("sched/x86: Rewrite the switch_to() code")

... the layout of the 'inactive_task_frame' struct was designed to have
a frame pointer header embedded in it, so that the unwinder could use
the 'bp' and 'ret_addr' fields to report __schedule() on the stack (or
ret_from_fork() for newly forked tasks which haven't actually run yet).

Finish the job by changing get_frame_pointer() to return a pointer to
inactive_task_frame's 'bp' field rather than 'bp' itself.  This allows
the unwinder to start one frame higher on the stack, so that it properly
reports __schedule().
Reported-by: NMiroslav Benes <mbenes@suse.cz>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/598e9f7505ed0aba86e8b9590aa528c6c7ae8dcd.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

2c96b2fe

x86/unwind: Disable KASAN checks for non-current tasks · 84936118

由 Josh Poimboeuf 提交于 1月 09, 2017

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

In such cases, it's possible that the unwinder may read a KASAN-poisoned
region of the stack.  Account for that by using READ_ONCE_NOCHECK() when
reading the stack of another task.

Use READ_ONCE() when reading the stack of the current task, since KASAN
warnings can still be useful for finding bugs in that case.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

84936118

x86/unwind: Silence warnings for non-current tasks · 900742d8

由 Josh Poimboeuf 提交于 1月 09, 2017

There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

Since stack "corruption" on another task's stack isn't necessarily a
bug, silence the warnings when unwinding tasks other than current.
Reported-by: NDave Jones <davej@codemonkey.org.uk>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

900742d8

10 1月, 2017 5 次提交

x86/microcode/intel: Use correct buffer size for saving microcode data · 2e86222c

由 Junichi Nomura 提交于 1月 09, 2017

In generic_load_microcode(), curr_mc_size is the size of the last
allocated buffer and since we have this performance "optimization"
there to vmalloc a new buffer only when the current one is bigger,
curr_mc_size ends up becoming the size of the biggest buffer we've seen
so far.

However, we end up saving the microcode patch which matches our CPU
and its size is not curr_mc_size but the respective mc_size during the
iteration while we're staring at it.

So save that mc_size into a separate variable and use it to store the
previously found microcode buffer.

Without this fix, we could get oops like this:

  BUG: unable to handle kernel paging request at ffffc9000e30f000
  IP: __memcpy+0x12/0x20
  ...
  Call Trace:
  ? kmemdup+0x43/0x60
  __alloc_microcode_buf+0x44/0x70
  save_microcode_patch+0xd4/0x150
  generic_load_microcode+0x1b8/0x260
  request_microcode_user+0x15/0x20
  microcode_write+0x91/0x100
  __vfs_write+0x34/0x120
  vfs_write+0xc1/0x130
  SyS_write+0x56/0xc0
  do_syscall_64+0x6c/0x160
  entry_SYSCALL64_slow_path+0x25/0x25

Fixes: 06b8534c ("x86/microcode: Rework microcode loading")
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/4f33cbfd-44f2-9bed-3b66-7446cd14256f@ce.jp.nec.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2e86222c

x86/microcode/intel: Fix allocation size of struct ucode_patch · 9fcf5ba2

由 Junichi Nomura 提交于 1月 09, 2017

We allocate struct ucode_patch here. @size is the size of microcode data
and used for kmemdup() later in this function.

Fixes: 06b8534c ("x86/microcode: Rework microcode loading")
Signed-off-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/7a730dc9-ac17-35c4-fe76-dfc94e5ecd95@ce.jp.nec.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

9fcf5ba2

x86/microcode/intel: Add a helper which gives the microcode revision · 4167709b

由 Borislav Petkov 提交于 1月 09, 2017

Since on Intel we're required to do CPUID(1) first, before reading
the microcode revision MSR, let's add a special helper which does the
required steps so that we don't forget to do them next time, when we
want to read the microcode revision.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-4-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

4167709b

x86/microcode: Use native CPUID to tickle out microcode revision · f3e2a51f

由 Borislav Petkov 提交于 1月 09, 2017

Intel supplies the microcode revision value in MSR 0x8b
(IA32_BIOS_SIGN_ID) after CPUID(1) has been executed. Execute it each
time before reading that MSR.

It used to do sync_core() which did do CPUID but

  c198b121 ("x86/asm: Rewrite sync_core() to use IRET-to-self")

changed the sync_core() implementation so we better make the microcode
loading case explicit, as the SDM documents it.
Reported-and-tested-by: NJun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-3-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

f3e2a51f

x86/CPU: Add native CPUID variants returning a single datum · 5dedade6

由 Borislav Petkov 提交于 1月 09, 2017

... similarly to the cpuid_<reg>() variants.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170109114147.5082-2-bp@alien8.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5dedade6

09 1月, 2017 1 次提交

x86/boot: Add missing declaration of string functions · fac69d0e

由 Nicholas Mc Guire 提交于 1月 07, 2017

Add the missing declarations of basic string functions to string.h to allow
a clean build.

Fixes: 5be86566 ("String-handling functions for the new x86 setup code.")
Signed-off-by: NNicholas Mc Guire <hofrat@osadl.org>
Link: http://lkml.kernel.org/r/1483781911-21399-1-git-send-email-hofrat@osadl.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

fac69d0e

06 1月, 2017 1 次提交

x86/CPU/AMD: Fix Bulldozer topology · a33d3317

由 Borislav Petkov 提交于 1月 05, 2017

The following commit:

  8196dab4 ("x86/cpu: Get rid of compute_unit_id")

... broke the initial strategy for Bulldozer-based cores' topology,
where we consider each thread of a compute unit a standalone core
and not a HT or SMT thread.

Revert to the firmware-supplied core_id numbering and do not make
them thread siblings as we don't consider them for such even if they
technically are, more or less.
Reported-and-tested-by: NBrice Goglin <Brice.Goglin@inria.fr>
Tested-by: NYazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8196dab4 ("x86/cpu: Get rid of compute_unit_id")
Link: http://lkml.kernel.org/r/20170105092638.5247-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

a33d3317

05 1月, 2017 3 次提交

x86/platform/intel-mid: Rename 'spidev' to 'mrfld_spidev' · 159d3726

由 Andy Shevchenko 提交于 1月 02, 2017

The current implementation supports only Intel Merrifield platforms. Don't mess
with the rest of the Intel MID family by not registering device with wrong
properties.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170102092450.87229-1-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

159d3726

x86/cpu: Fix typo in the comment for Anniedale · 754c73cf

由 Andy Shevchenko 提交于 1月 02, 2017

The proper spelling of Anniedale SoC with 'e' in the middle. Fix typo in the
comment line in intel-family.h header.
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170102092229.87036-1-andriy.shevchenko@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

754c73cf

x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option · dd853fd2

由 Lukasz Odzioba 提交于 12月 28, 2016

A negative number can be specified in the cmdline which will be used as
setup_clear_cpu_cap() argument. With that we can clear/set some bit in
memory predceeding boot_cpu_data/cpu_caps_cleared which may cause kernel
to misbehave. This patch adds lower bound check to setup_disablecpuid().

Boris Petkov reproduced a crash:

  [    1.234575] BUG: unable to handle kernel paging request at ffffffff858bd540
  [    1.236535] IP: memcpy_erms+0x6/0x10
Signed-off-by: NLukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andi.kleen@intel.com
Cc: bp@alien8.de
Cc: dave.hansen@linux.intel.com
Cc: luto@kernel.org
Cc: slaoub@gmail.com
Fixes: ac72e788 ("x86: add generic clearcpuid=... option")
Link: http://lkml.kernel.org/r/1482933340-11857-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

dd853fd2

03 1月, 2017 1 次提交

parisc: Add line-break when printing segfault info · b4a9eb4c

由 Helge Deller 提交于 1月 02, 2017

Add a leading line break else printed line gets too long.
Signed-off-by: NHelge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.9

b4a9eb4c

02 1月, 2017 1 次提交

openrisc: Add _text symbol to fix ksym build error · 086cc1c3

由 Stafford Horne 提交于 12月 14, 2016

The build robot reports:

   .tmp_kallsyms1.o: In function `kallsyms_relative_base':
>> (.rodata+0x8a18): undefined reference to `_text'

This is when using 'make alldefconfig'. Adding this _text symbol to mark
the start of the kernel as in other architecture fixes this.
Signed-off-by: NStafford Horne <shorne@gmail.com>
Acked-by: NJonas Bonn <jonas@southpole.se>

086cc1c3

30 12月, 2016 3 次提交

parisc: Drop TIF_RESTORE_SIGMASK and switch to generic code · 1fe0a7e0

由 Helge Deller 提交于 12月 27, 2016

Commit 7e781418 ("signal: consolidate {TS,TLF}_RESTORE_SIGMASK code")
introduced code with which the "restore sigmask" flag lives in task_struct
instead of ti->flags. Let's use this optimization on parisc too.
Signed-off-by: NHelge Deller <deller@gmx.de>

1fe0a7e0

parisc: Mark cr16 clocksource unstable on SMP systems · 41744213

由 Helge Deller 提交于 12月 26, 2016

The cr16 interval timer of each CPU is not syncronized to other cr16
timers in other CPUs in a SMP system. So, delay the registration of the
cr16 clocksource until all CPUs have been detected and then - if we are
on a SMP machine - mark the cr16 clocksource as unstable and lower it's
rating before registering it at the clocksource framework.

This patch fixes the stalled CPU warnings which we have seen since
introduction of the cr16 clocksource.
Signed-off-by: NHelge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.8+

41744213

mm: optimize PageWaiters bit use for unlock_page() · b91e1302

由 Linus Torvalds 提交于 12月 27, 2016

In commit 62906027 ("mm: add PageWaiters indicating tasks are
waiting for a page bit") Nick Piggin made our page locking no longer
unconditionally touch the hashed page waitqueue, which not only helps
performance in general, but is particularly helpful on NUMA machines
where the hashed wait queues can bounce around a lot.

However, the "clear lock bit atomically and then test the waiters bit"
sequence turns out to be much more expensive than it needs to be,
because you get a nasty stall when trying to access the same word that
just got updated atomically.

On architectures where locking is done with LL/SC, this would be trivial
to fix with a new primitive that clears one bit and tests another
atomically, but that ends up not working on x86, where the only atomic
operations that return the result end up being cmpxchg and xadd.  The
atomic bit operations return the old value of the same bit we changed,
not the value of an unrelated bit.

On x86, we could put the lock bit in the high bit of the byte, and use
"xadd" with that bit (where the overflow ends up not touching other
bits), and look at the other bits of the result.  However, an even
simpler model is to just use a regular atomic "and" to clear the lock
bit, and then the sign bit in eflags will indicate the resulting state
of the unrelated bit #7.

So by moving the PageWaiters bit up to bit #7, we can atomically clear
the lock bit and test the waiters bit on x86 too.  And architectures
with LL/SC (which is all the usual RISC suspects), the particular bit
doesn't matter, so they are fine with this approach too.

This avoids the extra access to the same atomic word, and thus avoids
the costly stall at page unlock time.

The only downside is that the interface ends up being a bit odd and
specialized: clear a bit in a byte, and test the sign bit.  Nick doesn't
love the resulting name of the new primitive, but I'd rather make the
name be descriptive and very clear about the limitation imposed by
trying to work across all relevant architectures than make it be some
generic thing that doesn't make the odd semantics explicit.

So this introduces the new architecture primitive

    clear_bit_unlock_is_negative_byte();

and adds the trivial implementation for x86.  We have a generic
non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
combination) which can be overridden by any architecture that can do
better.  According to Nick, Power has the same hickup x86 has, for
example, but some other architectures may not even care.

All these optimizations mean that my page locking stress-test (which is
just executing a lot of small short-lived shell scripts: "make test" in
the git source tree) no longer makes our page locking look horribly bad.
Before all these optimizations, just the unlock_page() costs were just
over 3% of all CPU overhead on "make test".  After this, it's down to
0.66%, so just a quarter of the cost it used to be.

(The difference on NUMA is bigger, but there this micro-optimization is
likely less noticeable, since the big issue on NUMA was not the accesses
to 'struct page', but the waitqueue accesses that were already removed
by Nick's earlier commit).
Acked-by: NNick Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b91e1302

27 12月, 2016 2 次提交

x86/mce/AMD: Make the init code more robust · 0dad3a30

由 Thomas Gleixner 提交于 12月 26, 2016

If mce_device_init() fails then the mce device pointer is NULL and the
AMD mce code happily dereferences it.

Add a sanity check.
Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0dad3a30

arm64: don't pull uaccess.h into *.S · b4b8664d

由 Al Viro 提交于 12月 26, 2016

Split asm-only parts of arm64 uaccess.h into a new header and use that
from *.S.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b4b8664d

26 12月, 2016 2 次提交

powerpc: Fix build warning on 32-bit PPC · 8ae679c4

由 Larry Finger 提交于 12月 22, 2016

I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:

    AS      arch/powerpc/kernel/misc_32.o
  arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]

This problem is evident after commit 989cea5c ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created.  That was with commit 9994a338 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").

The offending line does not make a lot of sense.  This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.

Thanks to Nicholas Piggin for suggesting this solution.

Fixes: 9994a338 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: NLarry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8ae679c4

ktime: Cleanup ktime_set() usage · 8b0e1953

由 Thomas Gleixner 提交于 12月 25, 2016

ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>

8b0e1953

25 12月, 2016 6 次提交

clocksource: Use a plain u64 instead of cycle_t · a5a1d1c2

由 Thomas Gleixner 提交于 12月 21, 2016

There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>

a5a1d1c2

cpu/hotplug: Cleanup state names · 73c1b41e

由 Thomas Gleixner 提交于 12月 21, 2016

When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

73c1b41e

x86/msr: Remove bogus cleanup from the error path · 59fefd08

由 Thomas Gleixner 提交于 12月 22, 2016

The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.

Fixes: 8fba38c9 ("x86/msr: Convert to hotplug state machine")
Reported-by: Nkernel test robot <fengguang.wu@intel.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>

59fefd08

perf/x86/intel/cstate: Prevent hotplug callback leak · 834fcd29

由 Thomas Gleixner 提交于 12月 22, 2016

If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.

Replace the nonsensical state names with proper ones while at it.

Fixes: 77c34ef1 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org

834fcd29

ARM/imx/mmcd: Fix broken cpu hotplug handling · a051f220

由 Thomas Gleixner 提交于 12月 21, 2016

The cpu hotplug support of this perf driver is broken in several ways:

1) It adds a instance before setting up the state.

2) The state for the instance is different from the state of the
   callback. It's just a randomly chosen state.

3) The instance registration is not error checked so nobody noticed that
   the call can never succeed.

4) The state for the multi install callbacks is chosen randomly and
   overwrites existing state. This is now prevented by the core code so the
   call is guaranteed to fail.

5) The error exit path in the init function leaves the instance registered
   and then frees the memory which contains the enqueued hlist node.

6) The remove function is removing the state and not the instance.

Fix it by:

- Setting up the state before adding instances. Use a dynamically allocated
  state for it.

- Installing instances after the state has been set up

- Removing the instance in the error path before freeing memory

- Removing the instance not the state in the driver remove callback

While at is use raw_cpu_processor_id(), because cpu_processor_id() cannot
be used in preemptible context, and set the driver data after successful
registration of the pmu.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NShawn Guo <shawnguo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Frank Li <frank.li@nxp.com>
Cc: Zhengyu Shen <zhengyu.shen@nxp.com>
Link: http://lkml.kernel.org/r/20161221192111.596204211@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a051f220

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

由 Linus Torvalds 提交于 12月 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c0f6ba6

24 12月, 2016 1 次提交

Revert "x86/unwind: Detect bad stack return address" · c280f773

由 Josh Poimboeuf 提交于 12月 22, 2016

Revert the following commit:

  b6959a36 ("x86/unwind: Detect bad stack return address")

... because Andrey Konovalov reported an unwinder warning:

  WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467

The unwind was initiated from an interrupt which occurred while running in the
generated code for a kprobe.  The unwinder printed the warning because it
expected regs->ip to point to a valid text address, but instead it pointed to
the generated code.

Eventually we may want come up with a way to identify generated kprobe
code so the unwinder can know that it's a valid return address.  Until
then, just remove the warning.
Reported-by: NAndrey Konovalov <andreyknvl@google.com>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c280f773

23 12月, 2016 3 次提交

perf/x86: Fix overlap counter scheduling bug · 1134c2b5

由 Peter Zijlstra 提交于 11月 09, 2016

Jiri reported the overlap scheduling exceeding its max stack.

Looking at the constraint that triggered this, it turns out the
overlap marker isn't needed.

The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
the counter mask of such an event is not a subset of any other counter
mask of a constraint with an equal or higher weight".

Esp. that latter part is of interest here I think, our overlapping mask
is 0x0e, that has 3 bits set and is the highest weight mask in on the
PMU, therefore it will be placed last. Can we still create a scenario
where we would need to rewind that?

The scenario for AMD Fam15h is we're having masks like:

	0x3F -- 111111
	0x38 -- 111000
	0x07 -- 000111

	0x09 -- 001001

And we mark 0x09 as overlapping, because it is not a direct subset of
0x38 or 0x07 and has less weight than either of those. This means we'll
first try and place the 0x09 event, then try and place 0x38/0x07 events.
Now imagine we have:

	3 * 0x07 + 0x09

and the initial pick for the 0x09 event is counter 0, then we'll fail to
place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
event, and then re-try all 0x07 events, which will now work.

The masks on the PMU in question are:

  0x01 - 0001
  0x03 - 0011
  0x0e - 1110
  0x0c - 1100

But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
0x1) are of heavier weight, it should all work out.
Reported-by: NJiri Olsa <jolsa@kernel.org>
Tested-by: NJiri Olsa <jolsa@kernel.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Liang Kan <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20161109155153.GQ3142@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

1134c2b5

perf/x86/pebs: Fix handling of PEBS buffer overflows · daa864b8

由 Stephane Eranian 提交于 12月 22, 2016

This patch solves a race condition between PEBS and the PMU handler.

In case multiple PEBS events are sampled at the same time,
it is possible to have GLOBAL_STATUS bit 62 set indicating
PEBS buffer overflow and also seeing at most 3 PEBS counters
having their bits set in the status register. This is a sign
that there was at least one PEBS record pending at the time
of the PMU interrupt. PEBS counters must only be processed
via the drain_pebs() calls, and not via the regular sample
processing loop coming after that the function, otherwise
phony regular samples may be generated in the sampling buffer
not marked with the EXACT tag.

Another possibility is to have one PEBS event and at least
one non-PEBS event whic hoverflows while PEBS has armed. In this
case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
status bit for the PEBS counter will be on Skylake.

To avoid this problem, we systematically ignore the PEBS-enabled
counters from the GLOBAL_STATUS mask and we always process PEBS
events via drain_pebs().

The problem manifested itself by having non-exact samples when
sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
not have the EXACT flag set.

Note that this problem is only present on Skylake processor.
This fix is harmless on older processors.
Reported-by: NPeter Zijlstra <peterz@infradead.org>
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

daa864b8

x86/paravirt: Mark unused patch_default label · cef4402d

由 Peter Zijlstra 提交于 12月 16, 2016

A bugfix commit:

  45dbea5f ("x86/paravirt: Fix native_patch()")

... introduced a harmless warning:

  arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
  arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]

Fix it by annotating the label as __maybe_unused.
Reported-by: NArnd Bergmann <arnd@arndb.de>
Reported-by: NPiotr Gregor <piotrgregor@rsyncme.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 45dbea5f ("x86/paravirt: Fix native_patch()")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

cef4402d

21 12月, 2016 6 次提交

x86/microcode/AMD: Reload proper initrd start address · 8877ebdd

由 Borislav Petkov 提交于 12月 20, 2016

When we switch to virtual addresses and, especially after
reserve_initrd()->relocate_initrd() have run, we have the updated initrd
address in initrd_start. Use initrd_start then instead of the address
which has been passed to us through boot params. (That still gets used
when we're running the very early routines on the BSP).
Reported-and-tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20161220144012.lc4cwrg6dphqbyqu@pd.tnicSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

8877ebdd

ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory() · 8d3523fb

由 Lv Zheng 提交于 12月 14, 2016

Since all users are cleaned up, remove the 2 deprecated APIs due to no
users.
As a Linux variable rather than an ACPICA variable, acpi_gbl_permanent_mmap
is renamed to acpi_permanent_mmap to have a consistent coding style across
entire Linux ACPI subsystem.
Signed-off-by: NLv Zheng <lv.zheng@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

8d3523fb

ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users · 6b11d1d6

由 Lv Zheng 提交于 12月 14, 2016

This patch removes the users of the deprectated APIs:
 acpi_get_table_with_size()
 early_acpi_os_unmap_memory()
The following APIs should be used instead of:
 acpi_get_table()
 acpi_put_table()

The deprecated APIs are invented to be a replacement of acpi_get_table()
during the early stage so that the early mapped pointer will not be stored
in ACPICA core and thus the late stage acpi_get_table() won't return a
wrong pointer. The mapping size is returned just because it is required by
early_acpi_os_unmap_memory() to unmap the pointer during early stage.

But as the mapping size equals to the acpi_table_header.length
(see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), when
such a convenient result is returned, driver code will start to use it
instead of accessing acpi_table_header to obtain the length.

Thus this patch cleans up the drivers by replacing returned table size with
acpi_table_header.length, and should be a no-op.
Reported-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NLv Zheng <lv.zheng@intel.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

6b11d1d6

parisc: Optimize timer interrupt function · 160494d3

由 Helge Deller 提交于 12月 20, 2016

Restructure the timer interrupt function to better cope with missed timer irqs.
Optimize the calculation when the next interrupt should happen and skip irqs if
they would happen too shortly after exit of the irq function.

The update_process_times() call is done anyway at every timer irq, so we can
safely drop the prof_counter and prof_multiplier variables from the per_cpu
structure.
Signed-off-by: NHelge Deller <deller@gmx.de>

160494d3

ARM: dts: hix5hd2: don't change the existing compatible string · 48fed73a

由 Dongpo Li 提交于 12月 20, 2016

The SoC hix5hd2 compatible string has the suffix "-gmac" and
we should not change it.
We should only add the generic compatible string "hisi-gmac-v1".

Fixes: 0855950b ("ARM: dts: hix5hd2: add gmac generic compatible and clock names")
Signed-off-by: NDongpo Li <lidongpo@hisilicon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

48fed73a

powerpc: fsl/fman: remove fsl,fman from of_device_ids[] · ae6021d4

由 Madalin Bucur 提交于 12月 19, 2016

The fsl/fman drivers will use of_platform_populate() on all
supported platforms. Call of_platform_populate() to probe the
FMan sub-nodes.
Signed-off-by: NIgal Liberman <igal.liberman@freescale.com>
Signed-off-by: NMadalin Bucur <madalin.bucur@nxp.com>
Acked-by: NScott Wood <oss@buserror.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ae6021d4

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多