提交 · a351e9b9fc24e982ec2f0e76379a49826036da12 · openanolis / cloud-kernel

30 4月, 2017 1 次提交

Prevent timer value 0 for MWAITX · 88d879d2

由 Janakarajan Natarajan 提交于 4月 25, 2017

Newer hardware has uncovered a bug in the software implementation of
using MWAITX for the delay function. A value of 0 for the timer is meant
to indicate that a timeout will not be used to exit MWAITX. On newer
hardware this can result in MWAITX never returning, resulting in NMI
soft lockup messages being printed. On older hardware, some of the other
conditions under which MWAITX can exit masked this issue. The AMD APM
does not currently document this and will be updated.

Please refer to http://marc.info/?l=kvm&m=148950623231140 for
information regarding NMI soft lockup messages on an AMD Ryzen 1800X.
This has been root-caused as a 0 passed to MWAITX causing it to wait
indefinitely.

This change has the added benefit of avoiding the unnecessary setup of
MONITORX/MWAITX when the delay value is zero.
Signed-off-by: NJanakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Link: http://lkml.kernel.org/r/1493156643-29366-1-git-send-email-Janakarajan.Natarajan@amd.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

88d879d2

24 4月, 2017 2 次提交

sparc: Update syscall tables. · f6ebf0bb

由 David S. Miller 提交于 4月 23, 2017

Hook up statx.

Ignore pkeys system calls, we don't have protection keeys
on SPARC.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6ebf0bb

D
sparc64: Fill in rest of HAVE_REGS_AND_STACK_ACCESS_API · b7c02b73
由 David S. Miller 提交于 4月 23, 2017
```
This lets us enable KPROBE_EVENTS.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
b7c02b73

21 4月, 2017 1 次提交

ARCv2: entry: save Accumulator register pair (r58:59) if present · 3d5e8012

由 Vineet Gupta 提交于 4月 20, 2017

Accumulator is present in configs with FPU and/or DSP MPY (mpy > 6)

Instead of doing this in pt_regs (and thus every kernel entry/exit),
this could have been done in context switch (and for user task only) as
currently kernel doesn't clobber these registers for its own accord.
However we will soon start using 64-bit multiply instructions for kernel
which can clobber these. Also gcc folks also plan to start using these
as GPRs, hence better to always save/restore them
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

3d5e8012

19 4月, 2017 4 次提交

x86/build: convert function graph '-Os' error to warning · a5859c6d

由 Josh Poimboeuf 提交于 4月 18, 2017

For pre-4.6.0 versions of GCC, which don't have '-mfentry', the
'-maccumulate-outgoing-args' option is required for function graph
tracing in order to avoid GCC bug 42109.

However, GCC ignores '-maccumulate-outgoing-args' when '-Os' is
also set.

Currently we force a build error to prevent that scenario, but that
breaks randconfigs.  So change the error to a warning which also
disables CONFIG_CC_OPTIMIZE_FOR_SIZE.
Reported-by: NAndi Kleen <andi@firstfloor.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kbuild test robot <fengguang.wu@intel.com>
Cc: kbuild-all@01.org
Link: http://lkml.kernel.org/r/20170418214429.o7fbwbmf4nqosezy@trebleSigned-off-by: NIngo Molnar <mingo@kernel.org>

a5859c6d

x86/mce: Make the MCE notifier a blocking one · 0dc9c639

由 Vishal Verma 提交于 4月 18, 2017

The NFIT MCE handler callback (for handling media errors on NVDIMMs)
takes a mutex to add the location of a memory error to a list. But since
the notifier call chain for machine checks (x86_mce_decoder_chain) is
atomic, we get a lockdep splat like:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
  in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
  [..]
  Call Trace:
   dump_stack
   ___might_sleep
   __might_sleep
   mutex_lock_nested
   ? __lock_acquire
   nfit_handle_mce
   notifier_call_chain
   atomic_notifier_call_chain
   ? atomic_notifier_call_chain
   mce_gen_pool_process

Convert the notifier to a blocking one which gets to run only in process
context.

Boris: remove the notifier call in atomic context in print_mce(). For
now, let's print the MCE on the atomic path so that we can make sure
they go out and get logged at least.

Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: NVishal Verma <vishal.l.verma@intel.com>
Acked-by: NTony Luck <tony.luck@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.comSigned-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

0dc9c639

sparc64: Fix hugepage page table free · 544f8f93

由 Nitin Gupta 提交于 4月 17, 2017

Make sure the start adderess is aligned to PMD_SIZE
boundary when freeing page table backing a hugepage
region. The issue was causing segfaults when a region
backed by 64K pages was unmapped since such a region
is in general not PMD_SIZE aligned.
Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

544f8f93

sparc64: Use LOCKDEP_SMALL, not PROVE_LOCKING_SMALL · 395102db

由 Daniel Jordan 提交于 4月 10, 2017

CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the
kernel text, data, and bss fit in the required 32MB limit, but this
option is not set for every config that enables lockdep.

A 4.10 kernel fails to boot with the console output

    Kernel: Using 8 locked TLB entries for main kernel image.
    hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f
    Program terminated

with these config options

    CONFIG_LOCKDEP=y
    CONFIG_LOCK_STAT=y
    CONFIG_PROVE_LOCKING=n

To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and
enable this option with CONFIG_LOCKDEP=y so we get the reduced memory
usage every time lockdep is turned on.

Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if
CONFIG_LOCKDEP is set to 'y'.  When other lockdep-related config options
that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or
CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also
enabled.

Fixes: e6b5f1be ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc")
Signed-off-by: NDaniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: NBabu Moger <babu.moger@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

395102db

18 4月, 2017 2 次提交

powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y · be5c5e84

由 Michael Ellerman 提交于 4月 18, 2017

Prior to commit 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi
interrupts"), the branch from hmi_exception_early() to hmi_exception_realmode()
was just a bl hmi_exception_realmode, which the linker would turn into a bl to
the local entry point of hmi_exception_realmode. This was broken when
CONFIG_RELOCATABLE=y because hmi_exception_realmode() is not in the low part of
the kernel text that is copied down to 0x0.

But in fixing that, we added a new bug on little endian kernels. Because the
branch is now a bctrl when CONFIG_RELOCATABLE=y, we branch to the global entry
point of hmi_exception_realmode(). The global entry point must be called with
r12 containing the address of hmi_exception_realmode(), because it uses that
value to calculate the TOC value (r2).

This may manifest as a checkstop, because we take a junk value from r12 which
came from HSRR1, add a small constant to it and then use that as the TOC
pointer. The HSRR1 value will have 0x9 as the top nibble, which puts it above
RAM and somewhere in MMIO space.

Fix it by changing the BRANCH_LINK_TO_FAR() macro to always use r12 to load the
label we're branching to. This means r12 will be setup correctly on LE, fixing
this bug, and r12 is also volatile across function calls on BE so it's a good
choice anyway.

Fixes: 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts")
Reported-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: NNicholas Piggin <npiggin@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

be5c5e84

powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 9e1ba4f2

由 Ravi Bangoria 提交于 4月 11, 2017

If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
OOPS:

  Bad kernel stack pointer cd93c840 at c000000000009868
  Oops: Bad kernel stack pointer, sig: 6 [#1]
  ...
  GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
  ...
  NIP [c000000000009868] resume_kernel+0x2c/0x58
  LR [c000000000006208] program_check_common+0x108/0x180

On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
not emulate actual store in emulate_step() because it may corrupt the exception
frame. So the kernel does the actual store operation in exception return code
i.e. resume_kernel().

resume_kernel() loads the saved stack pointer from memory using lwz, which only
loads the low 32-bits of the address, causing the kernel crash.

Fix this by loading the 64-bit value instead.

Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: NAnanth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
[mpe: Change log massage, add stable tag]
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

9e1ba4f2

16 4月, 2017 1 次提交

parisc: Fix get_user() for 64-bit value on 32-bit kernel · 3f795cef

由 Helge Deller 提交于 4月 16, 2017

This fixes a bug in which the upper 32-bits of a 64-bit value which is
read by get_user() was lost on a 32-bit kernel.
While touching this code, split out pre-loading of %sr2 space register
and clean up code indent.

Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: NHelge Deller <deller@gmx.de>

3f795cef

15 4月, 2017 2 次提交

parisc: fix bugs in pa_memcpy · 409c1b25

由 Mikulas Patocka 提交于 4月 14, 2017

The patch 554bfece ("parisc: Fix access
fault handling in pa_memcpy()") reimplements the pa_memcpy function.
Unfortunatelly, it makes the kernel unbootable. The crash happens in the
function ide_complete_cmd where memcpy is called with the same source
and destination address.

This patch fixes a few bugs in pa_memcpy:

* When jumping to .Lcopy_loop_16 for the first time, don't skip the
  instruction "ldi 31,t0" (this bug made the kernel unbootable)
* Use the COND macro when comparing length, so that the comparison is
  64-bit (a theoretical issue, in case the length is greater than
  0xffffffff)
* Don't use the COND macro after the "extru" instruction (the PA-RISC
  specification says that the upper 32-bits of extru result are undefined,
  although they are set to zero in practice)
* Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
* Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
  .Lcopy8_fault)

Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 554bfece ("parisc: Fix access fault handling in pa_memcpy()")
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NHelge Deller <deller@gmx.de>

409c1b25

ARC: [plat-eznps] Fix build error · 6492f09e

由 Noam Camus 提交于 4月 04, 2017

Make ATOMIC_INIT available for all ARC platforms (including plat-eznps)

Cc: <stable@vger.kernel.org>	# 4.9+
Signed-off-by: NNoam Camus <noamca@mellanox.com>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

6492f09e

14 4月, 2017 3 次提交

ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram · 34a477e5

由 Josh Poimboeuf 提交于 4月 13, 2017

On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function
graph tracing and then suspend to RAM, it will triple fault and reboot when
it resumes.

The first fault happens when booting a secondary CPU:

startup_32_smp()
  load_ucode_ap()
    prepare_ftrace_return()
      ftrace_graph_is_dead()
        (accesses 'kill_ftrace_graph')

The early head_32.S code calls into load_ucode_ap(), which has an an
ftrace hook, so it calls prepare_ftrace_return(), which calls
ftrace_graph_is_dead(), which tries to access the global
'kill_ftrace_graph' variable with a virtual address, causing a fault
because the CPU is still in real mode.

The fix is to add a check in prepare_ftrace_return() to make sure it's
running in protected mode before continuing.  The check makes sure the
stack pointer is a virtual kernel address.  It's a bit of a hack, but
it's not very intrusive and it works well enough.

For reference, here are a few other (more difficult) ways this could
have potentially been fixed:

- Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging
  is enabled.  (No idea what that would break.)

- Track down load_ucode_ap()'s entire callee tree and mark all the
  functions 'notrace'.  (Probably not realistic.)

- Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu()
  or __cpu_up(), and ensure that the pause facility can be queried from
  real mode.
Reported-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Tested-by: NPaul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@kernel.org
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

34a477e5

perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32() · f2200ac3

由 Peter Zijlstra 提交于 4月 11, 2017

When the perf_branch_entry::{in_tx,abort,cycles} fields were added,
intel_pmu_lbr_read_32() wasn't updated to initialize them.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Fixes: 135c5612 ("perf/x86/intel: Support Haswell/v4 LBR format")
Signed-off-by: NIngo Molnar <mingo@kernel.org>

f2200ac3

ia64: restore symbol versions for symbols defined in assembly · d8a6e3ae

由 Jan Beulich 提交于 4月 13, 2017

The ia64 build generates many warnings like this:

   WARNING: EXPORT symbol "empty_zero_page" [vmlinux] version generation failed, symbol will not be versioned.

Besides adding the necessary header this also requires fiddling with
some explicit .S -> .o rules.

Cc: IA64-ML <linux-ia64@vger.kernel.org>
Signed-off-by: NJan Beulich <jbeulich@suse.com>
Signed-off-by: NTony Luck <tony.luck@intel.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d8a6e3ae

13 4月, 2017 6 次提交

x86/efi: Don't try to reserve runtime regions · 6f6266a5

由 Omar Sandoval 提交于 4月 12, 2017

Reserving a runtime region results in splitting the EFI memory
descriptors for the runtime region. This results in runtime region
descriptors with bogus memory mappings, leading to interesting crashes
like the following during a kexec:

  general protection fault: 0000 [#1] SMP
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1 #53
  Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM05   09/30/2016
  RIP: 0010:virt_efi_set_variable()
  ...
  Call Trace:
   efi_delete_dummy_variable()
   efi_enter_virtual_mode()
   start_kernel()
   ? set_init_arg()
   x86_64_start_reservations()
   x86_64_start_kernel()
   start_cpu()
  ...
  Kernel panic - not syncing: Fatal exception

Runtime regions will not be freed and do not need to be reserved, so
skip the memmap modification in this case.
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 8e80632f ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
Link: http://lkml.kernel.org/r/20170412152719.9779-2-matt@codeblueprint.co.ukSigned-off-by: NIngo Molnar <mingo@kernel.org>

6f6266a5

MIPS: PCI: add controllers before the specified head · edb0b6a0

由 Mathias Kresin 提交于 3月 26, 2017

With commit 23dac14d ("MIPS: PCI: Use struct list_head lists") new
controllers are added after the specified head where they where added
before the specified head previously.

Use list_add_tail to restore the former order.

This patches fixes the following PCI error on lantiq:

  pci 0000:01:00.0: BAR 0: error updating (0x1c000004 != 0x000000)

Fixes: 23dac14d ("MIPS: PCI: Use struct list_head lists")
Signed-off-by: NMathias Kresin <dev@kresin.me>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15808/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

edb0b6a0

x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions · 11e63f6d

由 Dan Williams 提交于 4月 06, 2017

Before we rework the "pmem api" to stop abusing __copy_user_nocache()
for memcpy_to_pmem() we need to fix cases where we may strand dirty data
in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
for arbitrary data transfers from userspace. There is no guarantee that
these transfers, performed by dax_iomap_actor(), will have aligned
destinations or aligned transfer lengths. Backstop the usage
__copy_user_nocache() with explicit cache management in these unaligned
cases.

Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
that is saved for a later patch that moves the entirety of the "pmem
api" into the pmem driver directly.

Fixes: 5de490da ("pmem: add copy_from_iter_pmem() and clear_pmem()")
Cc: <stable@vger.kernel.org>
Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

11e63f6d

MIPS: KGDB: Use kernel context for sleeping threads · 162b270c

由 James Hogan 提交于 3月 30, 2017

KGDB is a kernel debug stub and it can't be used to debug userland as it
can only safely access kernel memory.

On MIPS however KGDB has always got the register state of sleeping
processes from the userland register context at the beginning of the
kernel stack. This is meaningless for kernel threads (which never enter
userland), and for user threads it prevents the user seeing what it is
doing while in the kernel:

(gdb) info threads
  Id   Target Id         Frame
  ...
  3    Thread 2 (kthreadd) 0x0000000000000000 in ?? ()
  2    Thread 1 (init)   0x000000007705c4b4 in ?? ()
  1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201

Get the register state instead from the (partial) kernel register
context stored in the task's thread_struct for resume() to restore. All
threads now correctly appear to be in context_switch():

(gdb) info threads
  Id   Target Id         Frame
  ...
  3    Thread 2 (kthreadd) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
  2    Thread 1 (init)   context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
  1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201

Call clobbered registers which aren't saved and exception registers
(BadVAddr & Cause) which can't be easily determined without stack
unwinding are reported as 0. The PC is taken from the return address,
such that the state presented matches that found immediately after
returning from resume().

Fixes: 88547001 ("[MIPS] kgdb: add arch support for the kernel's kgdb core")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15829/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

162b270c

MIPS: smp-cps: Fix potentially uninitialised value of core · bac06cf0

由 Matt Redfearn 提交于 3月 31, 2017

Turning on DEBUG in smp-cps.c, or compiling the kernel with
CONFIG_DYNAMIC_DEBUG enabled results the build error:

arch/mips/kernel/smp-cps.c: In function 'play_dead':
./include/linux/dynamic_debug.h:126:3: error: 'core' may be used
uninitialized in this function [-Werror=maybe-uninitialized]

Fix this by always initialising the variable.

Fixes: 0d2808f3 ("MIPS: smp-cps: Add support for CPU hotplug of MIPSr6 processors")
Signed-off-by: NMatt Redfearn <matt.redfearn@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15848/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

bac06cf0

mm: Tighten x86 /dev/mem with zeroing reads · a4866aa8

由 Kees Cook 提交于 4月 05, 2017

Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
disallowed. However, on x86, the first 1MB was always allowed for BIOS
and similar things, regardless of it actually being System RAM. It was
possible for heap to end up getting allocated in low 1MB RAM, and then
read by things like x86info or dd, which would trip hardened usercopy:

usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)

This changes the x86 exception for the low 1MB by reading back zeros for
System RAM areas instead of blindly allowing them. More work is needed to
extend this to mmap, but currently mmap doesn't go through usercopy, so
hardened usercopy won't Oops the kernel.
Reported-by: NTommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: NTommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: NKees Cook <keescook@chromium.org>

a4866aa8

12 4月, 2017 4 次提交

MIPS: KASLR: Add missing header files · ec62a3bd

由 Matt Redfearn 提交于 3月 31, 2017

After the split of linux/sched.h, KASLR stopped building.

Fix this by including the correct header file for init_thread_union
Signed-off-by: NMatt Redfearn <matt.redfearn@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Steven J. Hill <Steven.Hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15849/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

ec62a3bd

MIPS: Avoid BUG warning in arch_check_elf · c46f59e9

由 James Cowgill 提交于 4月 11, 2017

arch_check_elf contains a usage of current_cpu_data that will call
smp_processor_id() with preemption enabled and therefore triggers a
"BUG: using smp_processor_id() in preemptible" warning when an fpxx
executable is loaded.

As a follow-up to commit b244614a ("MIPS: Avoid a BUG warning during
prctl(PR_SET_FP_MODE, ...)"), apply the same fix to arch_check_elf by
using raw_current_cpu_data instead. The rationale quoted from the previous
commit:

"It is assumed throughout the kernel that if any CPU has an FPU, then
all CPUs would have an FPU as well, so it is safe to perform the check
with preemption enabled - change the code to use raw_ variant of the
check to avoid the warning."

Fixes: 46490b57 ("MIPS: kernel: elf: Improve the overall ABI and FPU mode checks")
Signed-off-by: NJames Cowgill <James.Cowgill@imgtec.com>
CC: <stable@vger.kernel.org> # 4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15951/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

c46f59e9

MIPS: Fix modversioning of _mcount symbol · e0211327

由 James Cowgill 提交于 4月 11, 2017

In commit 827456e7 ("MIPS: Export _mcount alongside its definition")
the EXPORT_SYMBOL macro exporting _mcount was moved from C code into
assembly. Unlike C, exported assembly symbols need to have a function
prototype in asm/asm-prototypes.h for modversions to work properly.
Without this, modpost prints out this warning:

     WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed,
     symbol will not be versioned.

Fix by including asm/ftrace.h (where _mcount is declared) in
asm/asm-prototypes.h.

Fixes: 827456e7 ("MIPS: Export _mcount alongside its definition")
Signed-off-by: NJames Cowgill <James.Cowgill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15952/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

e0211327

s390/mm: fix CMMA vs KSM vs others · a8f60d1f

由 Christian Borntraeger 提交于 4月 09, 2017

On heavy paging with KSM I see guest data corruption. Turns out that
KSM will add pages to its tree, where the mapping return true for
pte_unused (or might become as such later).  KSM will unmap such pages
and reinstantiate with different attributes (e.g. write protected or
special, e.g. in replace_page or write_protect_page)). This uncovered
a bug in our pagetable handling: We must remove the unused flag as
soon as an entry becomes present again.

Cc: stable@vger.kernel.org
Signed-of-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>

a8f60d1f

11 4月, 2017 5 次提交

MIPS: generic: fix out-of-tree defconfig target builds · 337b775b

由 Marcin Nowakowski 提交于 3月 13, 2017

When specifying a generic defconfig target with O=... option set, make
is invoked in the output location before a target makefile wrapper is
created. Ensure that the correct makefile is used by specifying the
kernel source makefile during make invocation.

This fixes the either of the following errors:

$ make sead3_defoncifg ARCH=mips O=test
make[1]: Entering directory '/mnt/ssd/MIPS/linux-next/test'
make[2]: *** No rule to make target '32r2el_defconfig'.  Stop.
arch/mips/Makefile:506: recipe for target 'sead3_defconfig' failed
make[1]: *** [sead3_defconfig] Error 2
make[1]: Leaving directory '/mnt/ssd/MIPS/linux-next/test'
Makefile:152: recipe for target 'sub-make' failed
make: *** [sub-make] Error 2

$ make 32r2el_defconfig ARCH=mips O=test
make[1]: Entering directory '/mnt/ssd/MIPS/linux-next/test'
Using ../arch/mips/configs/generic_defconfig as base
Merging ../arch/mips/configs/generic/32r2.config
Merging ../arch/mips/configs/generic/el.config
Merging ../arch/mips/configs/generic/board-sead-3.config
!
! merged configuration written to .config (needs make)
!
make[2]: *** No rule to make target 'olddefconfig'.  Stop.
arch/mips/Makefile:489: recipe for target '32r2el_defconfig' failed
make[1]: *** [32r2el_defconfig] Error 2
make[1]: Leaving directory '/mnt/ssd/MIPS/linux-next/test'
Makefile:152: recipe for target 'sub-make' failed
make: *** [sub-make] Error 2

Fixes: eed0eabd ('MIPS: generic: Introduce generic DT-based board support')
Fixes: 3f5f0a44 ('MIPS: generic: Convert SEAD-3 to a generic board')
Signed-off-by: NMarcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15464/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

337b775b

x86/intel_rdt: Fix locking in rdtgroup_schemata_write() · 7f00f388

由 Jiri Olsa 提交于 4月 11, 2017

The schemata lock is released before freeing the resource's temporary
tmp_cbms allocation. That's racy versus another write which allocates and
uses new temporary storage, resulting in memory leaks, freeing in use
memory, double a free or any combination of those.

Move the unlock after the release code.

Fixes: 60ec2440 ("x86/intel_rdt: Add schemata file")
Signed-off-by: NJiri Olsa <jolsa@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Shaohua Li <shli@fb.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170411071446.15241-1-jolsa@kernel.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

7f00f388

x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection() · 1c99a687

由 Markus Trippelsdorf 提交于 4月 07, 2017

Since commit:

  4bcc595c "printk: reinstate KERN_CONT for printing"

... the debug output of signal_fault(), do_trap() and do_general_protection()
looks garbled, e.g.:

 traps: conftest[9335] trap invalid opcode ip:400428 sp:7ffeaba1b0d8 error:0
  in conftest[400000+1000]

(note the unintended line break.)

Fix the bug by adding KERN_CONTs.
Signed-off-by: NMarkus Trippelsdorf <markus@trippelsdorf.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

1c99a687

x86/vdso: Plug race between mapping and ELF header setup · 6fdc6dd9

由 Thomas Gleixner 提交于 4月 10, 2017

The vsyscall32 sysctl can racy against a concurrent fork when it switches
from disabled to enabled:

    arch_setup_additional_pages()
	if (vdso32_enabled)
           --> No mapping
                                        sysctl.vsysscall32()
                                          --> vdso32_enabled = true
    create_elf_tables()
      ARCH_DLINFO_IA32
        if (vdso32_enabled) {
           --> Add VDSO entry with NULL pointer

Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for
the newly forked process or not.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

6fdc6dd9

x86/vdso: Ensure vdso32_enabled gets set to valid values only · c06989da

由 Mathias Krause 提交于 4月 10, 2017

vdso_enabled can be set to arbitrary integer values via the kernel command
line 'vdso32=' parameter or via 'sysctl abi.vsyscall32'.

load_vdso32() only maps VDSO if vdso_enabled == 1, but ARCH_DLINFO_IA32
merily checks for vdso_enabled != 0. As a consequence the AT_SYSINFO_EHDR
auxiliary vector for the VDSO_ENTRY is emitted with a NULL pointer which
causes a segfault when the application tries to use the VDSO.

Restrict the valid arguments on the command line and the sysctl to 0 and 1.

Fixes: b0b49f26 ("x86, vdso: Remove compat vdso support")
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Acked-by: NAndy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Roland McGrath <roland@redhat.com>
Link: http://lkml.kernel.org/r/1491424561-7187-1-git-send-email-minipli@googlemail.com
Link: http://lkml.kernel.org/r/20170410151723.518412863@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c06989da

10 4月, 2017 3 次提交

MIPS: cevt-r4k: Fix out-of-bounds array access · 9d7f29cd

由 James Hogan 提交于 4月 05, 2017

calculate_min_delta() may incorrectly access a 4th element of buf2[]
which only has 3 elements. This may trigger undefined behaviour and has
been reported to cause strange crashes in start_kernel() sometime after
timer initialization when built with GCC 5.3, possibly due to
register/stack corruption:

sched_clock: 32 bits at 200MHz, resolution 5ns, wraps every 10737418237ns
CPU 0 Unable to handle kernel paging request at virtual address ffffb0aa, epc == 8067daa8, ra == 8067da84
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #51
task: 8065e3e0 task.stack: 80644000
$ 0   : 00000000 00000001 00000000 00000000
$ 4   : 8065b4d0 00000000 805d0000 00000010
$ 8   : 00000010 80321400 fffff000 812de408
$12   : 00000000 00000000 00000000 ffffffff
$16   : 00000002 ffffffff 80660000 806a666c
$20   : 806c0000 00000000 00000000 00000000
$24   : 00000000 00000010
$28   : 80644000 80645ed0 00000000 8067da84
Hi    : 00000000
Lo    : 00000000
epc   : 8067daa8 start_kernel+0x33c/0x500
ra    : 8067da84 start_kernel+0x318/0x500
Status: 11000402 KERNEL EXL
Cause : 4080040c (ExcCode 03)
BadVA : ffffb0aa
PrId  : 0501992c (MIPS 1004Kc)
Modules linked in:
Process swapper/0 (pid: 0, threadinfo=80644000, task=8065e3e0, tls=00000000)
Call Trace:
[<8067daa8>] start_kernel+0x33c/0x500
Code: 24050240  0c0131f9  24849c64 <a200b0a8> 41606020  000000c0  0c1a45e6 00000000  0c1a5f44

UBSAN also detects the same issue:

================================================================
UBSAN: Undefined behaviour in arch/mips/kernel/cevt-r4k.c:85:41
load of address 80647e4c with insufficient space
for an object of type 'unsigned int'
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #47
Call Trace:
[<80028f70>] show_stack+0x88/0xa4
[<80312654>] dump_stack+0x84/0xc0
[<8034163c>] ubsan_epilogue+0x14/0x50
[<803417d8>] __ubsan_handle_type_mismatch+0x160/0x168
[<8002dab0>] r4k_clockevent_init+0x544/0x764
[<80684d34>] time_init+0x18/0x90
[<8067fa5c>] start_kernel+0x2f0/0x500
=================================================================

buf2[] is intentionally only 3 elements so that the last element is the
median once 5 samples have been inserted, so explicitly prevent the
possibility of comparing against the 4th element rather than extending
the array.

Fixes: 1fa40555 ("MIPS: cevt-r4k: Dynamically calculate min_delta_ns")
Reported-by: NRabin Vincent <rabinv@axis.com>
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Tested-by: NRabin Vincent <rabinv@axis.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.7.x-
Patchwork: https://patchwork.linux-mips.org/patch/15892/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

9d7f29cd

MIPS: perf: fix deadlock · f2b42866

由 Rabin Vincent 提交于 4月 05, 2017

mipsxx_pmu_handle_shared_irq() calls irq_work_run() while holding the
pmuint_rwlock for read.  irq_work_run() can, via perf_pending_event(),
call try_to_wake_up() which can try to take rq->lock.

However, perf can also call perf_pmu_enable() (and thus take the
pmuint_rwlock for write) while holding the rq->lock, from
finish_task_switch() via perf_event_context_sched_in().

This leads to an ABBA deadlock:

 PID: 3855   TASK: 8f7ce288  CPU: 2   COMMAND: "process"
  #0 [89c39ac8] __delay at 803b5be4
  #1 [89c39ac8] do_raw_spin_lock at 8008fdcc
  #2 [89c39af8] try_to_wake_up at 8006e47c
  #3 [89c39b38] pollwake at 8018eab0
  #4 [89c39b68] __wake_up_common at 800879f4
  #5 [89c39b98] __wake_up at 800880e4
  #6 [89c39bc8] perf_event_wakeup at 8012109c
  #7 [89c39be8] perf_pending_event at 80121184
  #8 [89c39c08] irq_work_run_list at 801151f0
  #9 [89c39c38] irq_work_run at 80115274
 #10 [89c39c50] mipsxx_pmu_handle_shared_irq at 8002cc7c

 PID: 1481   TASK: 8eaac6a8  CPU: 3   COMMAND: "process"
  #0 [8de7f900] do_raw_write_lock at 800900e0
  #1 [8de7f918] perf_event_context_sched_in at 80122310
  #2 [8de7f938] __perf_event_task_sched_in at 80122608
  #3 [8de7f958] finish_task_switch at 8006b8a4
  #4 [8de7f998] __schedule at 805e4dc4
  #5 [8de7f9f8] schedule at 805e5558
  #6 [8de7fa10] schedule_hrtimeout_range_clock at 805e9984
  #7 [8de7fa70] poll_schedule_timeout at 8018e8f8
  #8 [8de7fa88] do_select at 8018f338
  #9 [8de7fd88] core_sys_select at 8018f5cc
 #10 [8de7fee0] sys_select at 8018f854
 #11 [8de7ff28] syscall_common at 80028fc8

The lock seems to be there to protect the hardware counters so there is
no need to hold it across irq_work_run().
Signed-off-by: NRabin Vincent <rabinv@axis.com>
Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

f2b42866

MIPS: Malta: Fix i8259 irqchip setup · 9eec1c01

由 Matt Redfearn 提交于 4月 06, 2017

Since commit 4cfffcfa ("irqchip/mips-gic: Fix local interrupts"),
the gic driver has been allocating virq's for local interrupts during
its initialisation. Unfortunately on Malta platforms, these are the
first IRQs to be allocated and so are allocated virqs 1-3. The i8259
driver uses a legacy irq domain which expects to map virqs 0-15. Probing
of that driver therefore fails because some of those virqs are already
taken, with the warning:

WARNING: CPU: 0 PID: 0 at kernel/irq/irqdomain.c:344
irq_domain_associate+0x1e8/0x228
error: virq1 is already associated
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc6-00011-g4cfffcfa #368
Stack : 00000000 00000000 807ae03a 0000004d 00000000 806c1010 0000000b ffff0a01
        80725467 807258f4 806a64a4 00000000 00000000 807a9acc 00000100 80713e68
        806d5598 8017593c 8072bf90 8072bf94 806ac358 00000000 806abb60 80713ce4
        00000100 801b22d4 806d5598 8017593c 807ae03a 00000000 80713ce4 80720000
        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        ...
Call Trace:
[<8010c480>] show_stack+0x88/0xa4
[<80376758>] dump_stack+0x88/0xd0
[<8012c4a8>] __warn+0x104/0x118
[<8012c4ec>] warn_slowpath_fmt+0x30/0x3c
[<8017edfc>] irq_domain_associate+0x1e8/0x228
[<8017efd0>] irq_domain_add_legacy+0x7c/0xb0
[<80764c50>] __init_i8259_irqs+0x64/0xa0
[<80764ca4>] i8259_of_init+0x18/0x74
[<8076ddc0>] of_irq_init+0x19c/0x310
[<80752dd8>] arch_init_irq+0x28/0x19c
[<80750a08>] start_kernel+0x2a8/0x434

Fix this by reserving the required i8259 virqs in malta platform code
before probing any irq chips.

Fixes: 4cfffcfa ("irqchip/mips-gic: Fix local interrupts")
Signed-off-by: NMatt Redfearn <matt.redfearn@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15919/Signed-off-by: NRalf Baechle <ralf@linux-mips.org>

9eec1c01

07 4月, 2017 6 次提交

Revert "Revert "arm64: hugetlb: partial revert of "" · 6ae979ab

由 Will Deacon 提交于 3月 31, 2017

The use of the contiguous bit by our hugetlb implementation violates
the break-before-make requirements of the architecture and can lead to
silent data corruption or TLB conflict aborts. Once again, disable these
hugetlb sizes whilst it gets worked out.

This reverts commit ab2e1b89.

Conflicts:
	arch/arm64/mm/hugetlbpage.c
Signed-off-by: NWill Deacon <will.deacon@arm.com>

6ae979ab

powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() · 4749228f

由 Michael Ellerman 提交于 4月 06, 2017

In crc32c_vpmsum() we call enable_kernel_altivec() without first
disabling preemption, which is not allowed:

  WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
  Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
  CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac756 #381
  ...
  NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
  LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
  Call Trace:
    0xc138fd09 (unreliable)
    crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
    crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
    crypto_shash_update+0x88/0x1c0
    crc32c+0x64/0x90 [libcrc32c]
    dm_bm_checksum+0x48/0x80 [dm_persistent_data]
    sb_check+0x84/0x120 [dm_thin_pool]
    dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
    dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
    __create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
    dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
    pool_ctr+0x4cc/0xb60 [dm_thin_pool]
    dm_table_add_target+0x16c/0x3c0
    table_load+0x184/0x400
    ctl_ioctl+0x2f0/0x560
    dm_ctl_ioctl+0x38/0x50
    do_vfs_ioctl+0xd8/0x920
    SyS_ioctl+0x68/0xc0
    system_call+0x38/0xfc

It used to be sufficient just to call pagefault_disable(), because that
also disabled preemption. But the two were decoupled in commit 8222dbe2
("sched/preempt, mm/fault: Decouple preemption from the page fault
logic") in mid 2015.

So add the missing preempt_disable/enable(). We should also call
disable_kernel_fp(), although it does nothing by default, there is a
debug switch to make it active and all enables should be paired with
disables.

Fixes: 6dd7a82c ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

4749228f

sparc: remove unused wp_works_ok macro · 86e1066f

由 Mathias Krause 提交于 4月 05, 2017

It's unused for ages, used to be required for ksyms.c back in the v1.1
times.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86e1066f

sparc32: Export vac_cache_size to fix build error · 9d262d95

由 Guenter Roeck 提交于 4月 01, 2017

sparc32:allmodconfig fails to build with the following error.

ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined!

Fixes: cb886455 ("infiniband: Fix alignment of mmap cookies ...")
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Doug Ledford <dledford@redhat.com>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9d262d95

sparc64: Fix memory corruption when THP is enabled · 76811263

由 Nitin Gupta 提交于 3月 31, 2017

The memory corruption was happening due to incorrect
TLB/TSB flushing of hugepages.
Reported-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76811263

sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write() · 9ae34dbd

由 Tom Hromatka 提交于 3月 31, 2017

This commit moves sparc64's prototype of pmd_write() outside
of the CONFIG_TRANSPARENT_HUGEPAGE ifdef.

In 2013, commit a7b9403f ("sparc64: Encode huge PMDs using PTE
encoding.") exposed a path where pmd_write() could be called without
CONFIG_TRANSPARENT_HUGEPAGE defined.  This can result in the panic below.

The diff is awkward to read, but the changes are straightforward.
pmd_write() was moved outside of #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
Also, __HAVE_ARCH_PMD_WRITE was defined.

kernel BUG at include/asm-generic/pgtable.h:576!
              \|/ ____ \|/
              "@'/ .. \`@"
              /_| \__/ |_\
                 \__U_/
oracle_8114_cdb(8114): Kernel bad sw trap 5 [#1]
CPU: 120 PID: 8114 Comm: oracle_8114_cdb Not tainted
4.1.12-61.7.1.el6uek.rc1.sparc64 #1
task: fff8400700a24d60 ti: fff8400700bc4000 task.ti: fff8400700bc4000
TSTATE: 0000004411e01607 TPC: 00000000004609f8 TNPC: 00000000004609fc Y:
00000005    Not tainted
TPC: <gup_huge_pmd+0x198/0x1e0>
g0: 000000000001c000 g1: 0000000000ef3954 g2: 0000000000000000 g3: 0000000000000001
g4: fff8400700a24d60 g5: fff8001fa5c10000 g6: fff8400700bc4000 g7: 0000000000000720
o0: 0000000000bc5058 o1: 0000000000000240 o2: 0000000000006000 o3: 0000000000001c00
o4: 0000000000000000 o5: 0000048000080000 sp: fff8400700bc6ab1 ret_pc: 00000000004609f0
RPC: <gup_huge_pmd+0x190/0x1e0>
l0: fff8400700bc74fc l1: 0000000000020000 l2: 0000000000002000 l3: 0000000000000000
l4: fff8001f93250950 l5: 000000000113f800 l6: 0000000000000004 l7: 0000000000000000
i0: fff8400700ca46a0 i1: bd0000085e800453 i2: 000000026a0c4000 i3: 000000026a0c6000
i4: 0000000000000001 i5: fff800070c958de8 i6: fff8400700bc6b61 i7: 0000000000460dd0
I7: <gup_pud_range+0x170/0x1a0>
Call Trace:
 [0000000000460dd0] gup_pud_range+0x170/0x1a0
 [0000000000460e84] get_user_pages_fast+0x84/0x120
 [00000000006f5a18] iov_iter_get_pages+0x98/0x240
 [00000000005fa744] do_direct_IO+0xf64/0x1e00
 [00000000005fbbc0] __blockdev_direct_IO+0x360/0x15a0
 [00000000101f74fc] ext4_ind_direct_IO+0xdc/0x400 [ext4]
 [00000000101af690] ext4_ext_direct_IO+0x1d0/0x2c0 [ext4]
 [00000000101af86c] ext4_direct_IO+0xec/0x220 [ext4]
 [0000000000553bd4] generic_file_read_iter+0x114/0x140
 [00000000005bdc2c] __vfs_read+0xac/0x100
 [00000000005bf254] vfs_read+0x54/0x100
 [00000000005bf368] SyS_pread64+0x68/0x80
Signed-off-by: NTom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ae34dbd

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功