提交 · 4d7eaa12f33c07436939926600638b0d6ab73999 · openeuler / raspberrypi-kernel

08 3月, 2014 5 次提交

x86: Ignore NMIs that come in during early boot · 5fa10196

由 H. Peter Anvin 提交于 3月 07, 2014

Don Zickus reports:

A customer generated an external NMI using their iLO to test kdump
worked.  Unfortunately, the machine hung.  Disabling the nmi_watchdog
made things work.

I speculated the external NMI fired, caused the machine to panic (as
expected) and the perf NMI from the watchdog came in and was latched.
My guess was this somehow caused the hang.

   ----

It appears that the latched NMI stays latched until the early page
table generation on 64 bits, which causes exceptions to happen which
end in IRET, which re-enable NMI.  Therefore, ignore NMIs that come in
during early execution, until we have proper exception handling.
Reported-and-tested-by: NDon Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1394221143-29713-1-git-send-email-dzickus@redhat.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> # v3.5+, older with some backport effort

5fa10196

ARM: 7992/1: boot: compressed: ignore bswapsdi2.S · 38e0b088

由 Mark Rutland 提交于 2月 26, 2014

Commit 017f161a (ARM: 7877/1: use built-in byte swap function) added
bswapsdi2.{o,S} to arch/arm/boot/compressed/Makefile, but didn't update
the .gitignore. Thus after a a build git status shows bswapsdi2.S as a
new file, which is a little annoying.

This patch updates arch/arm/boot/compressed/.gitignore to ignore
bswapsdi2.S, as we already do for ashldi3.S and others.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Acked-by: NNicolas Pitre <nico@linaro.org>
Acked-by: NKim Phillips <kim.phillips@freescale.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

38e0b088

ARM: 7991/1: sa1100: fix compile problem on Collie · 052450fd

由 Linus Walleij 提交于 2月 25, 2014

Due to a problem in the MFD Kconfig it was not possible to
compile the UCB battery driver for the Collie SA1100 system,
in turn making it impossible to compile in the battery driver.
(See patch "mfd: include all drivers in subsystem menu".)

After fixing the MFD Kconfig (separate patch) a compile error
appears in the Collie battery driver due to the <mach/collie.h>
implicitly requiring <mach/hardware.h> through <linux/gpio.h>
via <mach/gpio.h> prior to commit
40ca061b "ARM: 7841/1: sa1100: remove complex GPIO interface".

Fix this up by including the required header into
<mach/collie.h>.

Cc: stable@vger.kernel.org
Cc: Andrea Adami <andrea.adami@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

052450fd

ARM: fix noMMU kallsyms symbol filtering · 006fa259

由 Russell King 提交于 2月 26, 2014

With noMMU, CONFIG_PAGE_OFFSET was not being set correctly.  As there's
no MMU, PAGE_OFFSET should be equal to PHYS_OFFSET in all cases.  This
commit makes that explicit.

Since we do this, we don't need to mess around in asm/memory.h with
ifdefs to sort this out, so let's get rid of that, and there's no point
offering the "Memory split" option for noMMU as that's meaningless
there.

Fixes: b9b32bf7 ("ARM: use linker magic for vectors and vector stubs")
Cc: <stable@vger.kernel.org>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

006fa259

ARC: Use correct PTAG register for icache flush · b053940d

由 Vineet Gupta 提交于 3月 07, 2014

This fixes a subtle issue with cache flush which could potentially cause
random userspace crashes because of stale icache lines.

This error crept in when consolidating the cache flush code

Fixes: bd12976c (ARC: cacheflush refactor #3: Unify the {d,i}cache)
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org  # 3.13
Cc: arc-linux-dev@synopsys.com
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b053940d

07 3月, 2014 3 次提交

powerpc: Align p_dyn, p_rela and p_st symbols · a5b2cf5b

由 Anton Blanchard 提交于 3月 04, 2014

The 64bit relocation code places a few symbols in the text segment.
These symbols are only 4 byte aligned where they need to be 8 byte
aligned. Add an explicit alignment.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Tested-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a5b2cf5b

powerpc/tm: Fix crash when forking inside a transaction · 621b5060

由 Michael Neuling 提交于 3月 03, 2014

When we fork/clone we currently don't copy any of the TM state to the new
thread.  This results in a TM bad thing (program check) when the new process is
switched in as the kernel does a tmrechkpt with TEXASR FS not set.  Also, since
R1 is from userspace, we trigger the bad kernel stack pointer detection.  So we
end up with something like this:

   Bad kernel stack pointer 0 at c0000000000404fc
   cpu 0x2: Vector: 700 (Program Check) at [c00000003ffefd40]
       pc: c0000000000404fc: restore_gprs+0xc0/0x148
       lr: 0000000000000000
       sp: 0
      msr: 9000000100201030
     current = 0xc000001dd1417c30
     paca    = 0xc00000000fe00800   softe: 0        irq_happened: 0x01
       pid   = 0, comm = swapper/2
   WARNING: exception is not recoverable, can't continue

The below fixes this by flushing the TM state before we copy the task_struct to
the clone.  To do this we go through the tmreclaim patch, which removes the
checkpointed registers from the CPU and transitions the CPU out of TM suspend
mode.  Hence we need to call tmrechkpt after to restore the checkpointed state
and the TM mode for the current task.

To make this fail from userspace is simply:
	tbegin
	li	r0, 2
	sc
	<boom>

Kudos to Adhemerval Zanella Neto for finding this.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
cc: Adhemerval Zanella Neto <azanella@br.ibm.com>
cc: stable@vger.kernel.org
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

621b5060

x86, trace: Further robustify CR2 handling vs tracing · d4078e23

由 Peter Zijlstra 提交于 3月 05, 2014

Building on commit 0ac09f9f ("x86, trace: Fix CR2 corruption when
tracing page faults") this patch addresses another few issues:

 - Now that read_cr2() is lifted into trace_do_page_fault(), we should
   pass the address to trace_page_fault_entries() to avoid it
   re-reading a potentially changed cr2.

 - Put both trace_do_page_fault() and trace_page_fault_entries() under
   CONFIG_TRACING.

 - Mark both fault entry functions {,trace_}do_page_fault() as notrace
   to avoid getting __mcount or other function entry trace callbacks
   before we've observed CR2.

 - Mark __do_page_fault() as noinline to guarantee the function tracer
   does get to see the fault.

Cc: <jolsa@redhat.com>
Cc: <vincent.weaver@maine.edu>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140306145300.GO9987@twins.programming.kicks-ass.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

d4078e23

05 3月, 2014 3 次提交

x86, trace: Fix CR2 corruption when tracing page faults · 0ac09f9f

由 Jiri Olsa 提交于 2月 28, 2014

The trace_do_page_fault function trigger tracepoint
and then handles the actual page fault.

This could lead to error if the tracepoint caused page
fault. The original cr2 value gets lost and the original
page fault handler kills current process with SIGSEGV.

This happens if you record page faults with callchain
data, the user part of it will cause tracepoint handler
to page fault:

  # perf record -g -e exceptions:page_fault_user ls

Fixing this by saving the original cr2 value
and using it after tracepoint handler is done.

v2: Moving the cr2 read before exception_enter, because
    it could trigger tracepoint as well.
Reported-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
Reported-by: NVince Weaver <vincent.weaver@maine.edu>
Tested-by: NVince Weaver <vincent.weaver@maine.edu>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1402211701380.6395@vincent-weaver-1.um.maine.edu
Link: http://lkml.kernel.org/r/20140228160526.GD1133@krava.brq.redhat.com

0ac09f9f

x86/efi: Quirk out SGI UV · a5d90c92

由 Borislav Petkov 提交于 3月 04, 2014

Alex reported hitting the following BUG after the EFI 1:1 virtual
mapping work was merged,

 kernel BUG at arch/x86/mm/init_64.c:351!
 invalid opcode: 0000 [#1] SMP
 Call Trace:
  [<ffffffff818aa71d>] init_extra_mapping_uc+0x13/0x15
  [<ffffffff818a5e20>] uv_system_init+0x22b/0x124b
  [<ffffffff8108b886>] ? clockevents_register_device+0x138/0x13d
  [<ffffffff81028dbb>] ? setup_APIC_timer+0xc5/0xc7
  [<ffffffff8108b620>] ? clockevent_delta2ns+0xb/0xd
  [<ffffffff818a3a92>] ? setup_boot_APIC_clock+0x4a8/0x4b7
  [<ffffffff8153d955>] ? printk+0x72/0x74
  [<ffffffff818a1757>] native_smp_prepare_cpus+0x389/0x3d6
  [<ffffffff818957bc>] kernel_init_freeable+0xb7/0x1fb
  [<ffffffff81535530>] ? rest_init+0x74/0x74
  [<ffffffff81535539>] kernel_init+0x9/0xff
  [<ffffffff81541dfc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81535530>] ? rest_init+0x74/0x74

Getting this thing to work with the new mapping scheme would need more
work, so automatically switch to the old memmap layout for SGI UV.
Acked-by: NRuss Anderson <rja@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com
Signed-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>

a5d90c92

c6x: fix build failure caused by cache.h · ae72758f

由 Mark Salter 提交于 11月 11, 2013

A patch to linux/irqflags.h uncovered a problem with c6x asm/cache.h
which causes a build failure:

/arch/c6x/include/asm/cache.h:63:20: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘c6x_cache_init’
 extern void __init c6x_cache_init(void);

The asm/cache.h was relying on linux/irqflags.h to pull in linux/init.h
but the recent patch changed that. The c6x header should have included
linux/init.h all along.
Signed-off-by: NMark Salter <msalter@redhat.com>

ae72758f

04 3月, 2014 1 次提交

sh: prefix sh-specific "CCR" and "CCR2" by "SH_" · a5f6ea29

由 Geert Uytterhoeven 提交于 3月 03, 2014

Commit bcf24e1d ("mmc: omap_hsmmc: use the generic config for
omap2plus devices"), enabled the build for other platforms for compile
testing.

sh-allmodconfig now fails with:

    include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant
    make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1

This happens because SuperH #defines "CCR", which is one of the enum
values in include/linux/omap-dma.h.  There's a similar issue with "CCR2"
on sh2a.

As "CCR" and "CCR2" are too generic names for global #defines, prefix
them with "SH_" to fix this.
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a5f6ea29

03 3月, 2014 1 次提交

ARM: XEN depends on having a MMU · 7693decc

由 Uwe Kleine-König 提交于 3月 03, 2014

arch/arm/xen/enlighten.c (and maybe others) use MMU-specific functions
like pte_mkspecial which are only available on MMU builds. So let XEN
depend on MMU.
Suggested-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: NStefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

7693decc

01 3月, 2014 1 次提交

arm64: Fix !CONFIG_SMP kernel build · b57fc9e8

由 Catalin Marinas 提交于 2月 28, 2014

Commit fb4a9602 (arm64: kernel: fix per-cpu offset restore on
resume) uses per_cpu_offset() unconditionally during CPU wakeup,
however, this is only defined for the SMP case.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NDave P Martin <Dave.Martin@arm.com>

b57fc9e8

28 2月, 2014 12 次提交

arm64: mm: Add double logical invert to pte accessors · 84fe6826

由 Steve Capper 提交于 2月 25, 2014

Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.
Signed-off-by: NSteve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

84fe6826

powerpc/powernv: Fix indirect XSCOM unmangling · e0cf9576

由 Benjamin Herrenschmidt 提交于 2月 28, 2014

We need to unmangle the full address, not just the register
number, and we also need to support the real indirect bit
being set for in-kernel uses.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.13]

e0cf9576

powerpc/powernv: Fix opal_xscom_{read,write} prototype · 2f3f38e4

由 Benjamin Herrenschmidt 提交于 2月 28, 2014

The OPAL firmware functions opal_xscom_read and opal_xscom_write
take a 64-bit argument for the XSCOM (PCB) address in order to
support the indirect mode on P8.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.13]

2f3f38e4

powerpc/powernv: Refactor PHB diag-data dump · af87d2fe

由 Gavin Shan 提交于 2月 25, 2014

As Ben suggested, the patch prints PHB diag-data with multiple
fields in one line and omits the line if the fields of that
line are all zero.

With the patch applied, the PHB3 diag-data dump looks like:

PHB3 PHB#3 Diag-data (Version: 1)

  brdgCtl:     00000002
  RootSts:     0000000f 00400000 b0830008 00100147 00002000
  nFir:        0000000000000000 0030006e00000000 0000000000000000
  PhbSts:      0000001c00000000 0000000000000000
  Lem:         0000000000100000 42498e327f502eae 0000000000000000
  InAErr:      8000000000000000 8000000000000000 0402030000000000 0000000000000000
  PE[  8] A/B: 8480002b00000000 8000000000000000

[ The current diag data is so big that it overflows the printk
  buffer pretty quickly in cases when we get a handful of errors
  at once which can happen. --BenH
]
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

af87d2fe

powerpc/powernv: Dump PHB diag-data immediately · 94716604

由 Gavin Shan 提交于 2月 25, 2014

The PHB diag-data is important to help locating the root cause for
EEH errors such as frozen PE or fenced PHB. However, the EEH core
enables IO path by clearing part of HW registers before collecting
this data causing it to be corrupted.

This patch fixes this by dumping the PHB diag-data immediately when
frozen/fenced state on PE or PHB is detected for the first time in
eeh_ops::get_state() or next_error() backend.
Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

94716604

powerpc: Increase stack redzone for 64-bit userspace to 512 bytes · 573ebfa6

由 Paul Mackerras 提交于 2月 26, 2014

The new ELFv2 little-endian ABI increases the stack redzone -- the
area below the stack pointer that can be used for storing data --
from 288 bytes to 512 bytes.  This means that we need to allow more
space on the user stack when delivering a signal to a 64-bit process.

To make the code a bit clearer, we define new USER_REDZONE_SIZE and
KERNEL_REDZONE_SIZE symbols in ptrace.h.  For now, we leave the
kernel redzone size at 288 bytes, since increasing it to 512 bytes
would increase the size of interrupt stack frames correspondingly.

Gcc currently only makes use of 288 bytes of redzone even when
compiling for the new little-endian ABI, and the kernel cannot
currently be compiled with the new ABI anyway.

In the future, hopefully gcc will provide an option to control the
amount of redzone used, and then we could reduce it even more.

This also changes the code in arch_compat_alloc_user_space() to
preserve the expanded redzone.  It is not clear why this function would
ever be used on a 64-bit process, though.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
CC: <stable@vger.kernel.org> [v3.13]
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

573ebfa6

powerpc/ftrace: bugfix for test_24bit_addr · a95fc585

由 Liu Ping Fan 提交于 2月 26, 2014

The branch target should be the func addr, not the addr of func_descr_t.
So using ppc_function_entry() to generate the right target addr.
Signed-off-by: NLiu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a95fc585

powerpc/crashdump : Fix page frame number check in copy_oldmem_page · f5295bd8

由 Laurent Dufour 提交于 2月 24, 2014

In copy_oldmem_page, the current check using max_pfn and min_low_pfn to
decide if the page is backed or not, is not valid when the memory layout is
not continuous.

This happens when running as a QEMU/KVM guest, where RTAS is mapped higher
in the memory. In that case max_pfn points to the end of RTAS, and a hole
between the end of the kdump kernel and RTAS is not backed by PTEs. As a
consequence, the kdump kernel is crashing in copy_oldmem_page when accessing
in a direct way the pages in that hole.

This fix relies on the memblock's service memblock_is_region_memory to
check if the read page is part or not of the directly accessible memory.
Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f5295bd8

powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly · 41dd03a9

由 Tony Breeds 提交于 2月 20, 2014

Currently we're storing a host endian RTAS token in
rtas_stop_self_args.token.  We then pass that directly to rtas.  This is
fine on big endian however on little endian the token is not what we
expect.

This will typically result in hitting:
	panic("Alas, I survived.\n");

To fix this we always use the stop-self token in host order and always
convert it to be32 before passing this to rtas.
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Cc: stable@vger.kernel.org
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

41dd03a9

kvm, vmx: Really fix lazy FPU on nested guest · 1b385cbd

由 Paolo Bonzini 提交于 2月 27, 2014

Commit e504c909 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13)
highlighted a real problem, but the fix was subtly wrong.

nested_read_cr0 is the CR0 as read by L2, but here we want to look at
the CR0 value reflecting L1's setup.  In other words, L2 might think
that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually
running it with TS=1, we should inject the fault into L1.

The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, use
it.

Fixes: e504c909Reported-by: NKashyap Chamarty <kchamart@redhat.com>
Reported-by: NStefan Bader <stefan.bader@canonical.com>
Tested-by: NKashyap Chamarty <kchamart@redhat.com>
Tested-by: NAnthoine Bourgeois <bourgeois@bertin.fr>
Cc: stable@vger.kernel.org
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1b385cbd

kvm: x86: fix emulator buffer overflow (CVE-2014-0049) · a08d3b3b

由 Andrew Honig 提交于 2月 27, 2014

The problem occurs when the guest performs a pusha with the stack
address pointing to an mmio address (or an invalid guest physical
address) to start with, but then extending into an ordinary guest
physical address. When doing repeated emulated pushes
emulator_read_write sets mmio_needed to 1 on the first one. On a
later push when the stack points to regular memory,
mmio_nr_fragments is set to 0, but mmio_is_needed is not set to 0.

As a result, KVM exits to userspace, and then returns to
complete_emulated_mmio. In complete_emulated_mmio
vcpu->mmio_cur_fragment is incremented. The termination condition of
vcpu->mmio_cur_fragment == vcpu->mmio_nr_fragments is never achieved.
The code bounces back and fourth to userspace incrementing
mmio_cur_fragment past it's buffer. If the guest does nothing else it
eventually leads to a a crash on a memcpy from invalid memory address.

However if a guest code can cause the vm to be destroyed in another
vcpu with excellent timing, then kvm_clear_async_pf_completion_queue
can be used by the guest to control the data that's pointed to by the
call to cancel_work_item, which can be used to gain execution.

Fixes: f78146b0Signed-off-by: NAndrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org (3.5+)
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

a08d3b3b

arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT · b20c9f29

由 Marc Zyngier 提交于 2月 26, 2014

Commit 1fcf7ce0 (arm: kvm: implement CPU PM notifier) added
support for CPU power-management, using a cpu_notifier to re-init
KVM on a CPU that entered CPU idle.

The code assumed that a CPU entering idle would actually be powered
off, loosing its state entierely, and would then need to be
reinitialized. It turns out that this is not always the case, and
some HW performs CPU PM without actually killing the core. In this
case, we try to reinitialize KVM while it is still live. It ends up
badly, as reported by Andre Przywara (using a Calxeda Midway):

[    3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760
[    3.663897] unexpected data abort in Hyp mode at: 0xc067d150
[    3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0

The trick here is to detect if we've been through a full re-init or
not by looking at HVBAR (VBAR_EL2 on arm64). This involves
implementing the backend for __hyp_get_vectors in the main KVM HYP
code (rather small), and checking the return value against the
default one when the CPU notifier is called on CPU_PM_EXIT.
Reported-by: NAndre Przywara <osp@andrep.de>
Tested-by: NAndre Przywara <osp@andrep.de>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <rob.herring@linaro.org>
Acked-by: NChristoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

b20c9f29

27 2月, 2014 2 次提交

perf/x86: Fix event scheduling · 26e61e89

由 Peter Zijlstra 提交于 2月 21, 2014

Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures,
with perf WARN_ON()s triggering. He also provided traces of the failures.

This is I think the relevant bit:

	>    pec_1076_warn-2804  [000] d...   147.926153: x86_pmu_disable: x86_pmu_disable
	>    pec_1076_warn-2804  [000] d...   147.926153: x86_pmu_state: Events: {
	>    pec_1076_warn-2804  [000] d...   147.926156: x86_pmu_state:   0: state: .R config: ffffffffffffffff (          (null))
	>    pec_1076_warn-2804  [000] d...   147.926158: x86_pmu_state:   33: state: AR config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926159: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1
	>    pec_1076_warn-2804  [000] d...   147.926161: x86_pmu_state: Assignment: {
	>    pec_1076_warn-2804  [000] d...   147.926162: x86_pmu_state:   0->33 tag: 1 config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926163: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926166: collect_events: Adding event: 1 (ffff880119ec8800)

So we add the insn:p event (fd[23]).

At this point we should have:

  n_events = 2, n_added = 1, n_txn = 1

	>    pec_1076_warn-2804  [000] d...   147.926170: collect_events: Adding event: 0 (ffff8800c9e01800)
	>    pec_1076_warn-2804  [000] d...   147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00)

We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]).
These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so
that's not visible.

	group_sched_in()
	  pmu->start_txn() /* nop - BP pmu */
	  event_sched_in()
	     event->pmu->add()

So here we should end up with:

  0: n_events = 3, n_added = 2, n_txn = 2
  4: n_events = 4, n_added = 3, n_txn = 3

But seeing the below state on x86_pmu_enable(), the must have failed,
because the 0 and 4 events aren't there anymore.

Looking at group_sched_in(), since the BP is the leader, its
event_sched_in() must have succeeded, for otherwise we would not have
seen the sibling adds.

But since neither 0 or 4 are in the below state; their event_sched_in()
must have failed; but I don't see why, the complete state: 0,0,1:p,4
fits perfectly fine on a core2.

However, since we try and schedule 4 it means the 0 event must have
succeeded!  Therefore the 4 event must have failed, its failure will
have put group_sched_in() into the fail path, which will call:

	event_sched_out()
	  event->pmu->del()

on 0 and the BP event.

Now x86_pmu_del() will reduce n_events; but it will not reduce n_added;
giving what we see below:

 n_event = 2, n_added = 2, n_txn = 2

	>    pec_1076_warn-2804  [000] d...   147.926177: x86_pmu_enable: x86_pmu_enable
	>    pec_1076_warn-2804  [000] d...   147.926177: x86_pmu_state: Events: {
	>    pec_1076_warn-2804  [000] d...   147.926179: x86_pmu_state:   0: state: .R config: ffffffffffffffff (          (null))
	>    pec_1076_warn-2804  [000] d...   147.926181: x86_pmu_state:   33: state: AR config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926182: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2
	>    pec_1076_warn-2804  [000] d...   147.926184: x86_pmu_state: Assignment: {
	>    pec_1076_warn-2804  [000] d...   147.926186: x86_pmu_state:   0->33 tag: 1 config: 0 (ffff88011ac99800)
	>    pec_1076_warn-2804  [000] d...   147.926188: x86_pmu_state:   1->0 tag: 1 config: 1 (ffff880119ec8800)
	>    pec_1076_warn-2804  [000] d...   147.926188: x86_pmu_state: }
	>    pec_1076_warn-2804  [000] d...   147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0

So the problem is that x86_pmu_del(), when called from a
group_sched_in() that fails (for whatever reason), and without x86_pmu
TXN support (because the leader is !x86_pmu), will corrupt the n_added
state.
Reported-and-Tested-by: NVince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20140221150312.GF3104@twins.programming.kicks-ass.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

26e61e89

KVM: MMU: drop read-only large sptes when creating lower level sptes · 404381c5

由 Marcelo Tosatti 提交于 2月 24, 2014

Read-only large sptes can be created due to read-only faults as
follows:

- QEMU pagetable entry that maps guest memory is read-only
due to COW.
- Guest read faults such memory, COW is not broken, because
it is a read-only fault.
- Enable dirty logging, large spte not nuked because it is read-only.
- Write-fault on such memory causes guest to loop endlessly
(which must go down to level 1 because dirty logging is enabled).

Fix by dropping large spte when necessary.
Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

404381c5

26 2月, 2014 2 次提交

x86, kaslr: add missed "static" declarations · e290e8c5

由 Kees Cook 提交于 2月 09, 2014

This silences build warnings about unexported variables and functions.
Signed-off-by: NKees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20140209215644.GA30339@www.outflux.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e290e8c5

x86, kaslr: export offset in VMCOREINFO ELF notes · b6085a86

由 Eugene Surovegin 提交于 1月 23, 2014

Include kASLR offset in VMCOREINFO ELF notes to assist in debugging.

[ hpa: pushing this for v3.14 to avoid having a kernel version with
  kASLR where we can't debug output. ]
Signed-off-by: NEugene Surovegin <surovegin@google.com>
Link: http://lkml.kernel.org/r/20140123173120.GA25474@www.outflux.netSigned-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

b6085a86

22 2月, 2014 10 次提交

perf/x86/uncore: Fix IVT/SNB-EP uncore CBOX NID filter table · 337397f3

由 Stephane Eranian 提交于 2月 19, 2014

This patch updates the CBOX PMU filters mapping tables for SNB-EP
and IVT (model 45 and 62 respectively).

The NID umask always comes in addition to another umask.
When set, the NID filter is applied.

The current mapping tables were missing some code/umask
combinations to account for the NID umask. This patch
fixes that.

Cc: mingo@elte.hu
Cc: ak@linux.intel.com
Reviewed-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140219131018.GA24475@quadSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

337397f3

perf/x86: Correctly use FEATURE_PDCM · c9b08884

由 Peter Zijlstra 提交于 2月 03, 2014

The current code simply assumes Intel Arch PerfMon v2+ to have
the IA32_PERF_CAPABILITIES MSR; the SDM specifies that we should check
CPUID[1].ECX[15] (aka, FEATURE_PDCM) instead.

This was found by KVM which implements v2+ but didn't provide the
capabilities MSR. Change the code to DTRT; KVM will also implement the
MSR and return 0.

Cc: pbonzini@redhat.com
Reported-by: N"Michael S. Tsirkin" <mst@redhat.com>
Suggested-by: NEduardo Habkost <ehabkost@redhat.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140203132903.GI8874@twins.programming.kicks-ass.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

c9b08884

perf, nmi: Fix unknown NMI warning · a3ef2229

由 Markus Metzger 提交于 2月 14, 2014

When using BTS on Core i7-4*, I get the below kernel warning.

$ perf record -c 1 -e branches:u ls
Message from syslogd@labpc1501 at Nov 11 15:49:25 ...
 kernel:[  438.317893] Uhhuh. NMI received for unknown reason 31 on CPU 2.

Message from syslogd@labpc1501 at Nov 11 15:49:25 ...
 kernel:[  438.317920] Do you have a strange power saving mode enabled?

Message from syslogd@labpc1501 at Nov 11 15:49:25 ...
 kernel:[  438.317945] Dazed and confused, but trying to continue

Make intel_pmu_handle_irq() take the full exit path when returning early.

Cc: eranian@google.com
Cc: peterz@infradead.org
Cc: mingo@kernel.org
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1392425048-5309-1-git-send-email-andi@firstfloor.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

a3ef2229

M
xtensa: wire up sched_setattr and sched_getattr syscalls · f63b6d75
由 Max Filippov 提交于 2月 18, 2014
```
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
```
f63b6d75

xtensa: xtfpga: set ethoc clock frequency · 2bc2fde6

由 Max Filippov 提交于 1月 29, 2014

Connect xtfpga board ethernet MAC to the clock in the DTS. Set up MAC
base frequency in the platform data in case of build w/o CONFIG_OF.
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

2bc2fde6

xtensa: xtfpga: use common clock framework · cdc9af7c

由 Max Filippov 提交于 1月 29, 2014

With this change the board needs to set up single clock object, users of
this clock will get correct frequency automatically.
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

cdc9af7c

M
xtensa: support common clock framework · bda8932d
由 Max Filippov 提交于 1月 29, 2014
```
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
```
bda8932d

xtensa: no need to select USE_GENERIC_SMP_HELPERS · 4e3b4df8

由 Paul Bolle 提交于 2月 09, 2014

Commit f615136c ("xtensa: add SMP support") added "select
USE_GENERIC_SMP_HELPERS". But the Kconfig symbol USE_GENERIC_SMP_HELPERS
was already removed in v3.13, so that select is a nop. Drop it.
Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

4e3b4df8

xtensa: fsf: drop nonexistent GPIO32 support · 8e9356c6

由 Max Filippov 提交于 2月 07, 2014

The toolchain for xtensa FSF core never supported GPIO32, drop it on the
linux side too.
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
Acked-by: NBaruch Siach <baruch@tkos.co.il>

8e9356c6

xtensa: don't pass high memory to bootmem allocator · e9d6dca5

由 Max Filippov 提交于 1月 31, 2014

This fixes panic when booting on machine with more than 128M memory
passed from the bootloader.
Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>

e9d6dca5