提交 · b76146851eeba6ad9fef982e8cf7cd8ebd4d30e1 · openeuler / raspberrypi-kernel

24 9月, 2014 1 次提交

perf/x86/intel: Remove incorrect model number from Haswell perf · b7614685

由 Andi Kleen 提交于 9月 02, 2014

71 is a Broadwell, not a Haswell. The model number was added
by mistake earlier.

Remove it for now, until it can be re-added later with
real Broadwell support.

In practice it does not cause a lot of issues because the Broadwell
PMU is very similar to Haswell, but some details were wrong,
and it's better to handle it correctly.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1409683455-29168-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

b7614685

09 9月, 2014 2 次提交

perf/x86: Fix section mismatch in split uncore driver · a08b6769

由 Andi Kleen 提交于 8月 29, 2014

The new split Intel uncore driver code that recently went
into tip added a section mismatch, which the build process
complains about.

uncore_pmu_register() can be called from uncore_pci_probe,()
which is not __init and can be called from pci driver ->probe.
I'm not fully sure if it's actually possible to call the probe
function later, but it seems safer to mark uncore_pmu_register
not __init.

This also fixes the warning.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1409332858-29039-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

a08b6769

perf/x86/intel: Mark initialization code as such · 066ce64c

由 Mathias Krause 提交于 8月 26, 2014

A few of the initialization functions are missing the __init annotation.
Fix this and thereby allow ~680 additional bytes of code to be released
after initialization.
Signed-off-by: NMathias Krause <minipli@googlemail.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1409071785-26015-1-git-send-email-minipli@googlemail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

066ce64c

13 8月, 2014 15 次提交

perf/x86/uncore: Rename IvyTown to IvyBridge-EP · ddcd0973

由 Peter Zijlstra 提交于 8月 12, 2014

Keeping track of all the various CPU names is hard enough; adding extra
silly names for no reason is just not helping. If we know the base arch
name (IvyBridge) then we can do the client/server parts with the well
known {,EP,EX} postfixes, no need to remember endless amounts of
unrelated and pointless names for this.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-8559jke61dsyr7d0i74iutli@git.kernel.org
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

ddcd0973

perf/x86/uncore: Export basic memory events for IVT IMC PMU · 85a16ef6

由 Stephane Eranian 提交于 8月 12, 2014

This patch exposes two basic events for Ivytown IMC uncore PMU:

- cas_count_read: number of full-cache line reads to memory controller
- cas_count_write: number of full-cache line writes to memory controller

Those events use the same encoding as for SNB-EP, so reuse the same
event table. See specification in:

http://www.intel.com/content/dam/www/public/us/en/documents/manuals/xeon-e5-2600-v2-uncore-manual.pdf

By aggregating all the read and write events from all the memory controllers
of each processor socket, one can determine the total memory bandwidth utilization.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140812060031.GA25239@quad
Cc: zheng.z.yan@intel.com
Cc: ak@linux.intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

85a16ef6

perf/x86: Clean up __intel_pmu_pebs_event() code · c8aab2e0

由 Stephane Eranian 提交于 8月 11, 2014

This patch makes the code more readable. It also renames
precise_store_data_hsw() to precise_datala_hsw() because
the function is called for both loads and stores on HSW.
The patch also gets rid of the hardcoded store events
codes in that same function.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1407785233-32193-5-git-send-email-eranian@google.com
Cc: ak@linux.intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

c8aab2e0

perf/x86: Fix data source encoding issues for load latency/precise store · 770eee1f

由 Stephane Eranian 提交于 8月 11, 2014

This patch fixes issues introuduce by Andi's previous patch 'Revamp PEBS'
series.

This patch fixes the following:

 - precise_store_data_hsw() encode the mem op type whenever we can
 - precise_store_data_hsw set the default data source correctly

 - 0 is not a valid init value for data source. Define PERF_MEM_NA as the
   default value

This bug was actually introduced by

    commit 722e76e6
    Author: Stephane Eranian <eranian@google.com>
    Date:   Thu May 15 17:56:44 2014 +0200

        fix Haswell precise store data source encoding
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1407785233-32193-4-git-send-email-eranian@google.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: ak@linux.intel.com
Signed-off-by: NIngo Molnar <mingo@kernel.org>

770eee1f

perf/x86: Don't mark DataLA addresses as store · f3908b8c

由 Andi Kleen 提交于 8月 11, 2014

Haswell supports reporting the data address for a range
of PEBS events, including:

	UOPS_RETIRED.ALL
	MEM_UOPS_RETIRED.STLB_MISS_LOADS
	MEM_UOPS_RETIRED.STLB_MISS_STORES
	MEM_UOPS_RETIRED.LOCK_LOADS
	MEM_UOPS_RETIRED.SPLIT_LOADS
	MEM_UOPS_RETIRED.SPLIT_STORES
	MEM_UOPS_RETIRED.ALL_LOADS
	MEM_UOPS_RETIRED.ALL_STORES
	MEM_LOAD_UOPS_RETIRED.L1_HIT
	MEM_LOAD_UOPS_RETIRED.L2_HIT
	MEM_LOAD_UOPS_RETIRED.L3_HIT
	MEM_LOAD_UOPS_RETIRED.L1_MISS
	MEM_LOAD_UOPS_RETIRED.L2_MISS
	MEM_LOAD_UOPS_RETIRED.L3_MISS
	MEM_LOAD_UOPS_RETIRED.HIT_LFB
	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
	MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
	MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM

This facility was already enabled earlier with the original Haswell
perf changes.

However these addresses were always reports as stores by perf, which is wrong,
as they could be loads too.  The hardware does not distinguish loads and stores
for these instructions, so there's no (cheap) way for the profiler
to find out.

Change the type to PERF_MEM_OP_NA instead.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Reviewed-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1407785233-32193-3-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

f3908b8c

perf/x86: Revamp PEBS event selection · 86a04461

由 Andi Kleen 提交于 8月 11, 2014

The basic idea is that it does not make sense to list all PEBS
events individually. The list is very long, sometimes outdated
and the hardware doesn't need it. If an event does not support
PEBS it will just not count, there is no security issue.

We need to only list events that something special, like
supporting load or store addresses.

This vastly simplifies the PEBS event selection. It also
speeds up the scheduling because the scheduler doesn't
have to walk as many constraints.

Bugs fixed:

 - We do not allow setting forbidden flags with PEBS anymore
   (SDM 18.9.4), except for the special cycle event.
   This is done using a new constraint macro that also
   matches on the event flags.

 - Correct DataLA and load/store/na flags reporting on Haswell
   [Requires a followon patch]

 - We did not allow all PEBS events on Haswell:
   We were missing some valid subevents in d1-d2 (MEM_LOAD_UOPS_RETIRED.*,
   MEM_LOAD_UOPS_RETIRED_L3_HIT_RETIRED.*)

This includes the changes proposed by Stephane earlier and obsoletes
his patchkit (except for some changes on pre Sandy Bridge/Silvermont
CPUs)

I only did Sandy Bridge and Silvermont and later so far, mostly because these
are the parts I could directly confirm the hardware behavior with hardware
architects. Also I do not believe the older CPUs have any
missing events in their PEBS list, so there's no pressing
need to change them.

I did not implement the flag proposed by Peter to allow
setting forbidden flags. If really needed this could
be implemented on to of this patch.

v2: Fix broken store events on SNB/IVB (Stephane Eranian)
v3: More fixes. Rename some arguments (Stephane Eranian)
v4: List most Haswell events individually again to report
memory operation type correctly.
Add new flags to describe load/store/na for datala.
Update description.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Reviewed-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1407785233-32193-2-git-send-email-eranian@google.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

86a04461

perf/x86: Fix :pp without LBR · 03de874a

由 Andi Kleen 提交于 8月 07, 2014

This fixes a side effect of Kan's earlier patch to probe the LBRs at boot
time. Normally when the LBRs are disabled cycles:pp is disabled too.
So for example cycles:pp doesn't work.

However this is not needed with PEBSv2 and later (Haswell) because
it does not need LBRs to correct the IP-off-by-one.

So add an extra check for PEBSv2 that also allows :pp
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: kan.liang@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1407456534-15747-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

03de874a

perf/x86: Use extended offcore mask on Haswell · 36bbb2f2

由 Andi Kleen 提交于 7月 31, 2014

HSW-EP has a larger offcore mask than the client Haswell CPUs.
It is the same mask as on Sandy/IvyBridge-EP. All of
Haswell was using the client mask, so some bits were missing.

On the client parts some bits were also missing compared
to Sandy/IvyBridge, in particular the bits to match on a L4
cache hit.

The Haswell core in both client and server incarnations
accepts the same bits (but some are nops), so we can use
the same mask.

So use the snbep extended mask, which is a superset of the
client and the server, for all of Haswell.

This allows specifying a number of extra offcore events, like
for example for HSW-EP.

% perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true

which were <not supported> before.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Reviewed-by: eranian@google.com
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1406840722-25416-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

36bbb2f2

perf/x86/uncore: Fix coccinelle warnings · 17a60345

由 Fengguang Wu 提交于 8月 04, 2014

arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c:961:2-3: Unneeded semicolon
arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c:1100:2-3: Unneeded semicolon
arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c:1138:2-3: Unneeded semicolon

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/n/tip-ovfvr4nbqjo7nzc16y2lpjy9@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

17a60345

perf/x86/uncore: move NHM-EX/WSM-EX specific code to seperate file · c1e46580

由 Yan, Zheng 提交于 7月 30, 2014

Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406704935-27708-4-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

c1e46580

perf/x86/uncore: Move SNB/IVB-EP specific code to seperate file · 8268fdfc

由 Yan, Zheng 提交于 7月 30, 2014

Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406704935-27708-3-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

8268fdfc

perf/x86/uncore: Move NHM/SNB/IVB specific code to seperate file · 92807ffd

由 Yan, Zheng 提交于 7月 30, 2014

Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1406704935-27708-2-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

92807ffd

perf/x86/uncore: Declare some functions and variables · 514b2346

由 Yan, Zheng 提交于 7月 30, 2014

Prepare for moving hardware specific code to seperate files.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: eranian@google.com
Cc: andi@firstfloor.org
Link: http://lkml.kernel.org/r/1406704935-27708-1-git-send-email-zheng.z.yan@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

514b2346

perf/x86/intel: Update Intel models · 0f7c29ce

由 Peter Zijlstra 提交于 7月 30, 2014

The model number descriptions got a bit messy, clean them up.
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-oo3xclxdoy8s7ubssn929vaj@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

0f7c29ce

PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use · 9baa3c34

由 Benoit Taine 提交于 8月 08, 2014

We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to
meet kernel coding style guidelines.  This issue was reported by checkpatch.

A simplified version of the semantic patch that makes this change is as
follows (http://coccinelle.lip6.fr/):

// <smpl>

@@
identifier i;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer z;
@@

- DEFINE_PCI_DEVICE_TABLE(i)
+ const struct pci_device_id i[]
= z;

// </smpl>

[bhelgaas: add semantic patch]
Signed-off-by: NBenoit Taine <benoit.taine@lip6.fr>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

9baa3c34

09 8月, 2014 1 次提交

arch/x86: replace strict_strto calls · 164109e3

由 Daniel Walter 提交于 8月 08, 2014

Replace obsolete strict_strto calls with appropriate kstrto calls
Signed-off-by: NDaniel Walter <dwalter@google.com>
Acked-by: NBorislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

164109e3

06 8月, 2014 1 次提交

x86: MCE: Add raw_lock conversion again · ed5c41d3

由 Thomas Gleixner 提交于 8月 05, 2014

Commit ea431643 ("x86/mce: Fix CMCI preemption bugs") breaks RT by
the completely unrelated conversion of the cmci_discover_lock to a
regular (non raw) spinlock.  This lock was annotated in commit
59d958d2 ("locking, x86: mce: Annotate cmci_discover_lock as raw")
with a proper explanation why.

The argument for converting the lock back to a regular spinlock was:

 - it does percpu ops without disabling preemption. Preemption is not
   disabled due to the mistaken use of a raw spinlock.

Which is complete nonsense.  The raw_spinlock is disabling preemption in
the same way as a regular spinlock.  In mainline spinlock maps to
raw_spinlock, in RT spinlock becomes a "sleeping" lock.

raw_spinlock has on RT exactly the same semantics as in mainline.  And
because this lock is taken in non preemptible context it must be raw on
RT.

Undo the locking brainfart.
Reported-by: NClark Williams <williams@redhat.com>
Reported-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ed5c41d3

31 7月, 2014 1 次提交

x86/mm: Rip out complicated, out-of-date, buggy TLB flushing · e9f4e0a9

由 Dave Hansen 提交于 7月 31, 2014

I think the flush_tlb_mm_range() code that tries to tune the
flush sizes based on the CPU needs to get ripped out for
several reasons:

1. It is obviously buggy. It uses mm->total_vm to judge the
task's footprint in the TLB. It should certainly be using
some measure of RSS, *NOT* ->total_vm since only resident
memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the
intel_tlb_flushall_shift_set() function. Thus, it has been
demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads to it to never use invlpg.
4. The assumptions about TLB refill costs are wrong:
http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com
(more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59
I believe the sample times were too short. Running the
benchmark in a loop yields times that vary quite a bit.

Note that this leaves us with a static ceiling of 1 page. This
is a conservative, dumb setting, and will be revised in a later
patch.

This also removes the code which attempts to predict whether we
are flushing data or instructions. We expect instruction flushes
to be relatively rare and not worth tuning for explicitly.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.comAcked-by: NRik van Riel <riel@redhat.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e9f4e0a9

23 7月, 2014 1 次提交

x86, cpu: Fix cache topology for early P4-SMT · 2a226155

由 Peter Zijlstra 提交于 7月 22, 2014

P4 systems with cpuid level < 4 can have SMT, but the cache topology
description available (cpuid2) does not include SMP information.

Now we know that SMT shares all cache levels, and therefore we can
mark all available cache levels as shared.

We do this by setting cpu_llc_id to ->phys_proc_id, since that's
the same for each SMT thread. We can do this unconditional since if
there's no SMT its still true, the one CPU shares cache with only
itself.

This fixes a problem where such CPUs report an incorrect LLC CPU mask.

This in turn fixes a crash in the scheduler where the topology was
build wrong, it assumes the LLC mask to include at least the SMT CPUs.

Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: NBruno Wolff III <bruno@wolff.to>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140722133514.GM12054@laptop.lanSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

2a226155

22 7月, 2014 1 次提交

x86, MCE: Robustify mcheck_init_device · 51cbe7e7

由 Borislav Petkov 提交于 6月 20, 2014

BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.

So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.
Reported-and-Tested-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.comSigned-off-by: NBorislav Petkov <bp@suse.de>

51cbe7e7

16 7月, 2014 5 次提交

perf/x86/intel: Avoid spamming kernel log for BTS buffer failure · 44851541

由 David Rientjes 提交于 6月 30, 2014

It's unnecessary to excessively spam the kernel log anytime the BTS buffer
cannot be allocated, so make this allocation __GFP_NOWARN.

The user probably will want to at least find some artifact that the
allocation has failed in the past, probably due to fragmentation because
of its large size, when it's not allocated at bootstrap. Thus, add a
WARN_ONCE() so something is left behind for them to understand why perf
commnads that require PEBS is not working properly.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

44851541

perf/x86/amd: Try to fix some mem allocation failure handling · 503d3291

由 Zhouyi Zhou 提交于 6月 11, 2014

According to Peter's advice, put the failure handling to a goto chain.
Compiled in x86_64, could you check if there is anything that I missed.
Signed-off-by: NZhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1402459743-20513-1-git-send-email-zhouzhouyi@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

503d3291

perf/x86/intel: Protect LBR and extra_regs against KVM lying · 338b522c

由 Kan Liang 提交于 7月 14, 2014

With -cpu host, KVM reports LBR and extra_regs support, if the host has
support.

When the guest perf driver tries to access LBR or extra_regs MSR,
it #GPs all MSR accesses,since KVM doesn't handle LBR and extra_regs support.
So check the related MSRs access right once at initialization time to avoid
the error access at runtime.

For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
(for host kernel).
And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
Start the guest with -cpu host.
Run perf record with --branch-any or --branch-filter in guest to trigger LBR
Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
trigger offcore_rsp #GP
Signed-off-by: NKan Liang <kan.liang@intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

338b522c

perf/x86/intel/uncore: Fix SNB-EP/IVT Cbox filter mappings · 7711fe4f

由 Stephane Eranian 提交于 6月 30, 2014

This patch fixes the SNB-EP and IVT Cbox filter mapping
table. The table controls which filters are supported by
which events. There were several mistakes in those tables
causing some filters to be ignored, such as NID on
TOR_INSERTS.
Signed-off-by: NStephane Eranian <eranian@google.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: zheng.z.yan@intel.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140630144624.GA2604@quadSigned-off-by: NIngo Molnar <mingo@kernel.org>

7711fe4f

perf/x86/intel: Use proper dTLB-load-misses event on IvyBridge · 1996388e

由 Vince Weaver 提交于 7月 14, 2014

This was discussed back in February:

	https://lkml.org/lkml/2014/2/18/956

But I never saw a patch come out of it.

On IvyBridge we share the SandyBridge cache event tables, but the
dTLB-load-miss event is not compatible.  Patch it up after
the fact to the proper DTLB_LOAD_MISSES.DEMAND_LD_MISS_CAUSES_A_WALK
Signed-off-by: NVince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1407141528200.17214@vincent-weaver-1.umelst.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>

1996388e

15 7月, 2014 2 次提交

x86, amd: Cleanup init_amd · 26bfa5f8

由 Borislav Petkov 提交于 6月 24, 2014

Distribute family-specific code to corresponding functions.

Also,

* move the direct mapping splitting around the TSEG SMM area to
bsp_init_amd().

* kill ancient comment about what we should do for K5.

* merge amd_k7_smp_check() into its only caller init_amd_k7 and drop
cpu_has_mp macro.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403609105-8332-3-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

26bfa5f8

x86/cpufeature: Add bug flags to /proc/cpuinfo · 80a208bd

由 Borislav Petkov 提交于 6月 24, 2014

Dump the flags which denote we have detected and/or have applied bug
workarounds to the CPU we're executing on, in a similar manner to the
feature flags.

The advantage is that those are not accumulating over time like the CPU
features.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403609105-8332-2-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

80a208bd

05 7月, 2014 1 次提交

perf/x86: Micro-optimize nhmex_rbox_get_constraint() · 2172c1f5

由 Rasmus Villemoes 提交于 6月 19, 2014

Flipping the LSB doesn't require four lines of code. This shaves a few
bytes of the generated code, including a branch.
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1403183731-15402-1-git-send-email-linux@rasmusvillemoes.dkSigned-off-by: NIngo Molnar <mingo@kernel.org>

2172c1f5

02 7月, 2014 1 次提交

perf/x86/intel: ignore CondChgd bit to avoid false NMI handling · b292d7a1

由 HATAYAMA Daisuke 提交于 6月 25, 2014

Currently, any NMI is falsely handled by a NMI handler of NMI watchdog
if CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR is set.

For example, we use external NMI to make system panic to get crash
dump, but in this case, the external NMI is falsely handled do to the
issue.

This commit deals with the issue simply by ignoring CondChgd bit.

Here is explanation in detail.

On x86 NMI watchdog uses performance monitoring feature to
periodically signal NMI each time performance counter gets overflowed.

intel_pmu_handle_irq() is called as a NMI_LOCAL handler from a NMI
handler of NMI watchdog, perf_event_nmi_handler(). It identifies an
owner of a given NMI by looking at overflow status bits in
MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of the bits are set, then it
handles the given NMI as its own NMI.

The problem is that the intel_pmu_handle_irq() doesn't distinguish
CondChgd bit from other bits. Unlike the other status bits, CondChgd
bit doesn't represent overflow status for performance counters. Thus,
CondChgd bit cannot be thought of as a mark indicating a given NMI is
NMI watchdog's.

As a result, if CondChgd bit is set, any NMI is falsely handled by the
NMI handler of NMI watchdog. Also, if type of the falsely handled NMI
is either NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
action is never performed until CondChgd bit is cleared.

I noticed this behavior on systems with Ivy Bridge processors: Intel
Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR has already been set
in the beginning at boot. Then the CondChgd bit is immediately cleared
by next wrmsr to MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain
0.

On the other hand, on older processors such as Nehalem, Xeon E7540,
CondChgd bit is not set in the beginning at boot.

I'm not sure about exact behavior of CondChgd bit, in particular when
this bit is set. Although I read Intel System Programmer's Manual to
figure out that, the descriptions I found are:

  In 18.9.1:

  "The MSR_PERF_GLOBAL_STATUS MSR also provides a ¡sticky bit¢ to
   indicate changes to the state of performancmonitoring hardware"

  In Table 35-2 IA-32 Architectural MSRs

  63 CondChg: status bits of this register has changed.

These are different from the bahviour I see on the actual system as I
explained above.

At least, I think ignoring CondChgd bit should be enough for NMI
watchdog perspective.
Signed-off-by: NHATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: NDon Zickus <dzickus@redhat.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

b292d7a1

24 6月, 2014 1 次提交

x86, MCE: Robustify mcheck_init_device · 27c93415

由 Borislav Petkov 提交于 6月 20, 2014

27c93415

23 6月, 2014 1 次提交

x86, MCE: Kill CPU_POST_DEAD · 38356c1f

由 Borislav Petkov 提交于 5月 22, 2014

In conjunction with cleaning up CPU hotplug, we want to get rid of
CPU_POST_DEAD. Kill this instance here and rediscover CMCI banks at the
end of CPU_DEAD.

Link: http://lkml.kernel.org/r/http://lkml.kernel.org/r/1400750624-19238-1-git-send-email-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de>

38356c1f

19 6月, 2014 1 次提交

x86, cpufeature: Convert more "features" to bugs · 9b13a93d

由 Borislav Petkov 提交于 6月 18, 2014

X86_FEATURE_FXSAVE_LEAK, X86_FEATURE_11AP and
X86_FEATURE_CLFLUSH_MONITOR are not really features but synthetic bits
we use for applying different bug workarounds. Call them what they
really are, and make sure they get the proper cross-CPU behavior (OR
rather than AND).
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403042783-23278-1-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

9b13a93d

09 6月, 2014 1 次提交

Revert "x86/smpboot: Initialize secondary CPU only if master CPU will wait for it" · bb077d60

由 Linus Torvalds 提交于 6月 08, 2014

This reverts commit 3e1a878b.

It came in very late, and already has one reported failure: Sitsofe
reports that the current tree fails to boot on his EeePC, and bisected
it down to this.  Rather than waste time trying to figure out what's
wrong, just revert it.
Reported-by: NSitsofe Wheeler <sitsofe@gmail.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bb077d60

05 6月, 2014 4 次提交

x86/smpboot: Initialize secondary CPU only if master CPU will wait for it · 3e1a878b

由 Igor Mammedov 提交于 6月 05, 2014

Hang is observed on virtual machines during CPU hotplug,
especially in big guests with many CPUs. (It reproducible
more often if host is over-committed).

It happens because master CPU gives up waiting on
secondary CPU and allows it to run wild. As result
AP causes locking or crashing system. For example
as described here:

   https://lkml.org/lkml/2014/3/6/257

If master CPU have sent STARTUP IPI successfully,
and AP signalled to master CPU that it's ready
to start initialization, make master CPU wait
indefinitely till AP is onlined.
To ensure that AP won't ever run wild, make it
wait at early startup till master CPU confirms its
intention to wait for AP. If AP doesn't respond in 10
seconds, the master CPU will timeout and cancel
AP onlining.
Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Acked-by: NToshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-4-git-send-email-imammedo@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

3e1a878b

perf/x86: Add conditional branch filtering support · 37548914

由 Anshuman Khandual 提交于 5月 22, 2014

This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.
Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: NStephane Eranian <eranian@google.com>
Reviewed-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: mpe@ellerman.id.au
Cc: benh@kernel.crashing.org
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400743210-32289-3-git-send-email-khandual@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

37548914

perf/x86: Use common PMU interrupt disabled code · c184c980

由 Vince Weaver 提交于 5月 16, 2014

Make the x86 perf code use the new common PMU interrupt disabled code.

Typically most x86 machines have working PMU interrupts, although
some older p6-class machines had this problem.
Signed-off-by: NVince Weaver <vincent.weaver@maine.edu>
Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161715560.11099@vincent-weaver-1.umelst.maine.eduSigned-off-by: NIngo Molnar <mingo@kernel.org>

c184c980

hwpoison: remove unused global variable in do_machine_check() · 65eb7182

由 Chen Yucong 提交于 6月 04, 2014

Remove an unused global variable mce_entry and relative operations in
do_machine_check().
Signed-off-by: NChen Yucong <slaoub@gmail.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

65eb7182