Commits · d2beea4a3419e63804094e9ac4b6d1518bc17a9b · openanolis / cloud-kernel

13 Sep, 2013 · 2 commits

perf/x86/intel: Clean-up/reduce PEBS code · d2beea4a

Authored by Peter Zijlstra on Sep 12, 2013

Get rid of some pointless duplication introduced by the Haswell code.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-8q6y4davda9aawwv5yxe7klp@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


perf/x86: Report TSX transaction abort cost as weight · 748e86aa

Authored by Andi Kleen on Sep 05, 2013

Use the existing weight reporting facility to report the transaction
abort cost, that is the number of cycles wasted in aborts.
Haswell reports this in the PEBS record.

This was in fact the original user for weight.

This is a very useful sort key to concentrate on the most
costly aborts and a good metric for TSX tuning.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1378438661-24765-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


02 Sep, 2013 · 2 commits

perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node() · 7bfb7e6b

Authored by Joe Perches on Aug 29, 2013

Use the convenience function instead of __GFP_ZERO.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f58599ae1a8d7b32d37e9cf283e95fba6452f7f6.1377809875.git.joe@perches.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


perf/x86: Add Silvermont (22nm Atom) support · 1fa64180

Authored by Yan, Zheng on Jul 18, 2013

Compared to old atom, Silvermont has offcore and has more events
that support PEBS.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1374138144-17278-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


27 Jun, 2013 · 1 commit

perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable() · 983433b5

Authored by Stephane Eranian on Jun 21, 2013

Make sure intel_pmu_pebs_disable() and intel_pmu_pebs_enable()
are symmetrical w.r.t. PEBS-LL and precise store.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371824448-7306-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


19 Jun, 2013 · 3 commits

perf/x86/intel: Add mem-loads/stores support for Haswell · f9134f36

Authored by Andi Kleen on Jun 17, 2013

mem-loads is basically the same as Sandy Bridge,
but we use a separate string for changes later.

Haswell doesn't support the full precise store mode,
so we emulate it using the "DataLA" facility.
This allows us to do everything, but for data sources we
can only detect whether an access hit L1 or not.

There is no explicit enable bit anymore, so we have
to tie it to a perf internal only flag.

The address is supported for all memory related PEBS
events with DataLA. Instead of only logging for the
load and store events we allow logging it for all
(it will be simply 0 if the current event does not
support it)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Andi Kleen <ak@linux.jf.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1371515812-9646-7-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


perf/x86/intel: Add Haswell PEBS support · 3044318f

Authored by Andi Kleen on Jun 17, 2013

Add simple PEBS support for Haswell.

The constraints are similar to SandyBridge with a few new
events.
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Andi Kleen <ak@linux.jf.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1371515812-9646-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


perf/x86/intel: Add Haswell PEBS record support · 130768b8

Authored by Andi Kleen on Jun 17, 2013

Add support for the Haswell extended (fmt2) PEBS format.

It has a superset of the nhm (fmt1) PEBS fields, but has a
longer record so we need to adjust the code paths.

The main advantage is the new "EventingRip" support which
directly gives the instruction, not off-by-one instruction. So
with precise == 2 we use that directly and don't try to use LBRs
and walking basic blocks. This lowers the overhead of using
precise significantly.

Some other features are added in later patches.
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Andi Kleen <ak@linux.jf.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1371515812-9646-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


01 Apr, 2013 · 3 commits

perf/x86: Add support for PEBS Precise Store · 9ad64c0f

Authored by Stephane Eranian on Jan 24, 2013

This patch adds support for PEBS Precise Store
which is available on Intel Sandy Bridge and
Ivy Bridge processors.

To use Precise store, the proper PEBS event
must be used: mem_trans_retired:precise_stores.
For the perf tool, the generic mem-stores event
exported via sysfs can be used directly.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


perf/x86: Add memory profiling via PEBS Load Latency · f20093ee

Authored by Stephane Eranian on Jan 24, 2013

This patch adds support for memory profiling using the
PEBS Load Latency facility.

Load accesses are sampled by HW and the instruction
address, data address, load latency, data source, tlb,
locked information can be saved in the sampling buffer
if using the PERF_SAMPLE_COST (for latency),
PERF_SAMPLE_ADDR, PERF_SAMPLE_DATA_SRC types.

To enable PEBS Load Latency, users have to use the
model specific event:

 - on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD
 - on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD

To make things easier, this patch also exports a generic
alias via sysfs: mem-loads. It exports the right event
encoding based on the host CPU and can be used directly
by the perf tool.

Loosely based on Intel's Lin Ming patch posted on LKML
in July 2011.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


perf/x86: Add flags to event constraints · 9fac2cf3

Authored by Stephane Eranian on Jan 24, 2013

This patch adds a flags field to each event constraint.
It can be used to store event specific features which can
then later be used by scheduling code or low-level x86 code.

The flags are propagated into event->hw.flags during the
get_event_constraint() call. They are cleared during the
put_event_constraint() call.

This mechanism is going to be used by the PEBS-LL patches.
It avoids defining yet another table to hold event specific
information.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
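
The flag propagation described in this commit message can be pictured with a small userspace sketch. This is not the kernel code: the struct layouts, the `sketch_` function names, and the `SKETCH_FLAG_PEBS_LDLAT` bit are hypothetical stand-ins chosen for illustration. It only shows the idea that flags are copied into the event's hardware state when a constraint is obtained and cleared again when it is released.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_constraint {
    unsigned long code;   /* event encoding this constraint matches */
    unsigned int flags;   /* event-specific feature bits */
};

struct sketch_hw_event {
    unsigned long config; /* encoding requested by the event */
    unsigned int flags;   /* populated while a constraint is held */
};

#define SKETCH_FLAG_PEBS_LDLAT 0x1u /* hypothetical flag bit */

/* Look up the matching constraint and copy its flags into the event. */
static const struct sketch_constraint *
sketch_get_constraint(const struct sketch_constraint *table, size_t n,
                      struct sketch_hw_event *hw)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].code == hw->config) {
            hw->flags |= table[i].flags;
            return &table[i];
        }
    }
    return NULL; /* no constraint applies to this event */
}

/* Releasing the constraint clears the propagated flags again. */
static void sketch_put_constraint(struct sketch_hw_event *hw)
{
    hw->flags = 0;
}
```

Because the flags live in the constraint table itself, no second per-event lookup table is needed, which is the design point the message makes.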


21 Mar, 2013 · 1 commit

perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer() · 0e48026a

Authored by Stephane Eranian on Mar 19, 2013

This patch fixes an uninitialized pt_regs struct in drain BTS
function. The pt_regs struct is propagated all the way to the
code_get_segment() function from perf_instruction_pointer()
and may get garbage.

We cannot simply inherit the actual pt_regs from the interrupt
because BTS must be flushed on context-switch or when the
associated event is disabled. And there we do not have a pt_regs
handy.

Setting pt_regs to all zeroes may not be the best option but it
is not clear what else to do given where the drain_bts_buffer()
is called from.

In V2, we move the memset() later in the code to avoid doing it
when we end up returning early without doing the actual BTS
processing. Also dropped the reg.val initialization because it
is redundant with the memset() as suggested by PeterZ.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: peterz@infradead.org
Cc: sqazi@google.com
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Link: http://lkml.kernel.org/r/20130319151038.GA25439@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>


18 Mar, 2013 · 1 commit

perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2

Authored by Linus Torvalds on Mar 17, 2013

Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") fixed a crash when doing PEBS performance profiling
after resuming, but in using init_debug_store_on_cpu() to restore the
DS_AREA MSR it also resulted in a new WARN_ON() triggering.

init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
cross-calls to do the MSR update. Which is not really valid at the
early resume stage, and the warning is quite reasonable. Now, it all
happens to _work_, for the simple reason that smp_call_function_single()
ends up just doing the call directly on the CPU when the CPU number
matches, but we really should just do the wrmsr() directly instead.

This duplicates the wrmsr() logic, but hopefully we can just remove the
wrmsr_on_cpu() version eventually.
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


16 Mar, 2013 · 1 commit

perf,x86: fix kernel crash with PEBS/BTS after suspend/resume · 1d9d8639

Authored by Stephane Eranian on Mar 15, 2013

This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps its power-on/resume value of 0, causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


19 Sep, 2012 · 1 commit

perf/x86: Fix Intel Ivy Bridge support · 20a36e39

Authored by Stephane Eranian on Sep 11, 2012

This patch updates the existing Intel IvyBridge (model 58)
support with proper PEBS event constraints. It cannot reuse
the same as SandyBridge because some events (0xd3) are
specific to IvyBridge.

Also there is no UOPS_DISPATCHED.THREAD on IVB, so do not
populate the PERF_COUNT_HW_STALLED_CYCLES_BACKEND mapping.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/20120910230701.GA5898@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>


31 Jul, 2012 · 1 commit

perf/x86: Fix USER/KERNEL tagging of samples properly · d07bdfd3

Authored by Peter Zijlstra on Jul 10, 2012

Some PMUs don't provide a full register set for their sample,
specifically 'advanced' PMUs like AMD IBS and Intel PEBS which provide
'better' than regular interrupt accuracy.

In this case we use the interrupt regs as basis and over-write some
fields (typically IP) with different information.

The perf core however uses user_mode() to distinguish user/kernel
samples, user_mode() relies on regs->cs. If the interrupt skid pushed
us over a boundary the new IP might not be in the same domain as the
interrupt.

Commit ce5c1fe9 ("perf/x86: Fix USER/KERNEL tagging of samples")
tried to fix this by making the perf core use kernel_ip(). This
however is wrong (TM), as pointed out by Linus, since it doesn't allow
for VM86 and non-zero based segments in IA32 mode.

Therefore, provide a new helper to set the regs->ip field,
set_linear_ip(), which massages the regs into a suitable state
assuming the provided IP is in fact a linear address.

Also modify perf_instruction_pointer() and perf_callchain_user() to
deal with segment base offsets.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1341910954.3462.102.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>


06 Jul, 2012 · 1 commit

perf/x86: Rename Intel specific macros · 15c7ad51

Authored by Robert Richter on Jun 20, 2012

There are macros that are Intel specific and not x86 generic. Rename
them into INTEL_*.

This patch removes X86_PMC_IDX_GENERIC and does:

 $ sed -i -e 's/X86_PMC_MAX_/INTEL_PMC_MAX_/g'           \
         arch/x86/include/asm/kvm_host.h                 \
         arch/x86/include/asm/perf_event.h               \
         arch/x86/kernel/cpu/perf_event.c                \
         arch/x86/kernel/cpu/perf_event_p4.c             \
         arch/x86/kvm/pmu.c
 $ sed -i -e 's/X86_PMC_IDX_FIXED/INTEL_PMC_IDX_FIXED/g' \
         arch/x86/include/asm/perf_event.h               \
         arch/x86/kernel/cpu/perf_event.c                \
         arch/x86/kernel/cpu/perf_event_intel.c          \
         arch/x86/kernel/cpu/perf_event_intel_ds.c       \
         arch/x86/kvm/pmu.c
 $ sed -i -e 's/X86_PMC_MSK_/INTEL_PMC_MSK_/g'           \
         arch/x86/include/asm/perf_event.h               \
         arch/x86/kernel/cpu/perf_event.c
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1340217996-2254-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


06 Jun, 2012 · 3 commits

perf/x86: Don't assume there can be only 4 PEBS events · 70ab7003

Authored by Andi Kleen on Jun 05, 2012

On Sandy Bridge in non HT mode there are 8 counters available.
Since every counter can write a PEBS record, assuming a maximum of
4 is incorrect. Use the reported counter number -- with an
upper limit for a static array -- instead.

Also I made the warning messages a bit more informational.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1338944211-28275-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
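
The "reported counter number with an upper limit" rule can be sketched as a simple clamp. This is an illustrative userspace sketch only: `SKETCH_MAX_PEBS_EVENTS` and its value of 8 are assumptions (the message only says that 8 counters are available on Sandy Bridge without HT), and `sketch_pebs_event_limit()` is a hypothetical helper, not a kernel function.

```c
/* Assumed size of the static per-CPU array; the value is illustrative. */
#define SKETCH_MAX_PEBS_EVENTS 8

/*
 * Return how many counters may be treated as PEBS-capable: the number
 * the hardware reports, capped by the size of the static array.
 */
static int sketch_pebs_event_limit(int reported_counters)
{
    if (reported_counters < 0)
        return 0;
    if (reported_counters > SKETCH_MAX_PEBS_EVENTS)
        return SKETCH_MAX_PEBS_EVENTS;
    return reported_counters;
}
```

With this shape, a CPU reporting 4 counters keeps its old behaviour, while one reporting 8 is no longer truncated to a hard-coded 4.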


perf/x86: Update SNB PEBS constraints · 212d95df

Authored by Peter Zijlstra on Jun 05, 2012

Afaict there's no need to (incompletely) iterate the
MEM_UOPS_RETIRED.* umask state.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>


perf/x86: Update SNB PEBS constraints · 8440ccb4

Authored by Peter Zijlstra on Jun 05, 2012

Afaict there's no need to (incompletely) iterate the
MEM_UOPS_RETIRED.* umask state.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1338884803.28282.153.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>


09 May, 2012 · 1 commit

perf: Pass last sampling period to perf_sample_data_init() · fd0d000b

Authored by Robert Richter on Apr 02, 2012

We always need to pass the last sample period to
perf_sample_data_init(), otherwise the event distribution will be
wrong. Thus, modify the function interface to take the required period
as an argument. So basically a pattern like this:

        perf_sample_data_init(&data, ~0ULL);
        data.period = event->hw.last_period;

will now be like that:

        perf_sample_data_init(&data, ~0ULL, event->hw.last_period);

Avoids uninitialized data.period and simplifies code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>


05 Mar, 2012 · 2 commits

perf/x86: Add LBR software filter support for Intel CPUs · 3e702ff6

Authored by Stephane Eranian on Feb 09, 2012

This patch adds an internal software filter to complement
the (optional) LBR hardware filter.

The software filter is necessary:

 - as a substitute when there is no HW LBR filter (e.g., Atom, Core)
 - to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere)
 - to provide finer grain filtering (e.g., all processors)

Sometimes the LBR HW filter cannot distinguish between two types
of branches. For instance, to capture syscall as CALLS, it is necessary
to enable the LBR_FAR filter which will also capture JMP instructions.
Thus, a second pass is necessary to filter those out, this is what the
SW filter can do.

The SW filter is built on top of the internal x86 disassembler. It
is a best effort filter especially for user level code. It is subject
to the availability of the text page of the program.

The SW filter is enabled on all Intel processors. It is bypassed
when the user is capturing all branches at all priv levels.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
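
The "second pass" idea behind the software filter can be sketched in userspace as follows. This is a simplified illustration, not the kernel implementation: the entry layout, the `sketch_branch_type` enum, and `sketch_lbr_sw_filter()` are hypothetical, and the branch type is assumed to be already known, whereas the message says the real filter derives it by disassembling the instruction at the branch source.

```c
#include <stddef.h>

/* Hypothetical branch classes; the real filter decodes the instruction. */
enum sketch_branch_type {
    SKETCH_BR_CALL,
    SKETCH_BR_SYSCALL,
    SKETCH_BR_JMP,
    SKETCH_BR_RET
};

struct sketch_lbr_entry {
    unsigned long from;            /* branch source address */
    unsigned long to;              /* branch target address */
    enum sketch_branch_type type;  /* assumed pre-classified here */
};

/*
 * Keep only entries whose type bit is set in wanted_mask, compacting
 * the array in place. Returns the number of entries kept.
 */
static size_t sketch_lbr_sw_filter(struct sketch_lbr_entry *e, size_t n,
                                   unsigned int wanted_mask)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (wanted_mask & (1u << e[i].type))
            e[kept++] = e[i];
    }
    return kept;
}
```

In the syscall example from the message, the hardware pass would hand over calls, syscalls and far jumps together, and a mask containing only the call and syscall bits would drop the jumps.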


perf/x86: Implement PERF_SAMPLE_BRANCH for Intel CPUs · 60ce0fbd

Authored by Stephane Eranian on Feb 09, 2012

This patch implements PERF_SAMPLE_BRANCH support for Intel
x86 processors. It connects PERF_SAMPLE_BRANCH to the actual LBR.

The patch adds the hooks in the PMU irq handler to save the LBR
on counter overflow for both regular and PEBS modes.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


03 Feb, 2012 · 1 commit

perf: Remove deprecated WARN_ON_ONCE() · 84f2b9b2

Authored by Stephane Eranian on Feb 02, 2012

With the new throttling/unthrottling code introduced with
commit:

  e050e3f0 ("perf: Fix broken interrupt rate throttling")

we occasionally hit two WARN_ON_ONCE() checks in:

  - intel_pmu_pebs_enable()
  - intel_pmu_lbr_enable()
  - x86_pmu_start()

The assertions are no longer problematic. There is a valid
path where they can trigger but it is harmless.

The assertion can be triggered with:

  $ perf record -e instructions:pp ....

Leading to paths:

  intel_pmu_pebs_enable
  intel_pmu_enable_event
  x86_perf_event_set_period
  x86_pmu_start
  perf_adjust_freq_unthr_context
  perf_event_task_tick
  scheduler_tick

And:

  intel_pmu_lbr_enable
  intel_pmu_enable_event
  x86_perf_event_set_period
  x86_pmu_start
  perf_adjust_freq_unthr_context
  perf_event_task_tick
  scheduler_tick

cpuc->enabled is always on because when we get to
perf_adjust_freq_unthr_context() the PMU is not totally
disabled. Furthermore when we need to adjust a period,
we only stop the event we need to change and not the
entire PMU. Thus, when we re-enable, cpuc->enabled is
already set. Note that when we stop the event, both
pebs and lbr are stopped if necessary (and possible).
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/20120202110401.GA30911@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>


14 Nov, 2011 · 1 commit

perf/x86: Fix PEBS instruction unwind · 57d1c0c0

Authored by Peter Zijlstra on Oct 07, 2011

Masami spotted that we always try to decode the instruction stream as
64bit instructions when running a 64bit kernel, this doesn't work for
ia32-compat proglets.

Use TIF_IA32 to detect if we need to use the 32bit instruction
decoder.
Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


26 Sep, 2011 · 1 commit

x86, perf: Clean up perf_event cpu code · de0428a7

Authored by Kevin Winchester on Aug 30, 2011

The CPU support for perf events on x86 was implemented via included C files
with #ifdefs. Clean this up by creating a new header file and compiling
the vendor-specific files as needed.
Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314747665-2090-1-git-send-email-kjwinchester@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>


01 Jul, 2011 · 2 commits

perf: Remove the perf_output_begin(.sample) argument · a7ac67ea

Authored by Peter Zijlstra on Jun 27, 2011

Since only samples call perf_output_sample() it's much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17

Authored by Peter Zijlstra on Jun 27, 2011

The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>


16 Mar, 2011 · 2 commits

perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints · 7d5d02da

Authored by Lin Ming on Mar 09, 2011

PEBS_EVENT_CONSTRAINT() is just a duplicate of INTEL_UEVENT_CONSTRAINT().
Remove it and use INTEL_UEVENT_CONSTRAINT() instead.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299684089-22835-3-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf, x86: Clean up SandyBridge PEBS events · eefaaac4

Authored by Lin Ming on Mar 09, 2011

Use INTEL_EVENT_CONSTRAINT() for the events where all umasks support PEBS.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299684089-22835-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


04 Mar, 2011 · 1 commit

perf_events: Update PEBS event constraints · 17e31629

Authored by Stephane Eranian on Mar 02, 2011

This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.

This patch also reorganizes the PEBS format/constraint detection code. It is
now based on processor model and not PEBS format. Two processors may use the
same PEBS format without having the same list of PEBS events.

In this second version, we simplified the initialization of the PEBS
constraints by leveraging the existing switch() statement in perf_event_intel.c.
We also renamed the constraint tables to be more consistent with regular
constraints.

In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
to both Intel Nehalem and Westmere. I missed those in the earlier patches.
Events were tested using libpfm4 perf_examples.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


02 Mar, 2011 · 1 commit

perf, x86: Add Intel SandyBridge CPU support · b06b3d49

Authored by Lin Ming on Mar 02, 2011

This patch adds basic SandyBridge support, including hardware
cache events and PEBS events support.

It has been tested on SandyBridge CPUs with perf stat and also
with PEBS based profiling - both work fine.

The patch does not affect other models.

v2 -> v3:
 - fix PEBS event 0xd0 with right umask combinations
 - move snb pebs constraint assignment to intel_pmu_init

v1 -> v2:
 - add more raw and PEBS events constraints
 - use offcore events for LLC-* cache events
 - remove the call to Nehalem workaround enable_all function
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1299072424.2175.24.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


22 Oct, 2010 · 6 commits

perf, x86: Use NUMA aware allocations for PEBS/BTS/DS allocations · 96681fc3

Authored by Peter Zijlstra on Oct 19, 2010

For performance reasons it's best to use node-local memory for
per-cpu buffers.

This logic comes from a much larger patch proposed by Stephane.
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.514465326@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf, x86: Clean up reserve_ds_buffers() signature · f80c9e30

Authored by Peter Zijlstra on Oct 19, 2010

Now that reserve_ds_buffers() never fails, change it to return
void and remove all code dealing with the error return.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.462621937@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf, x86: Less disastrous PEBS/BTS buffer allocation failure · 6809b6ea

Authored by Peter Zijlstra on Oct 19, 2010

Currently PEBS/BTS buffers are allocated when we instantiate the first
event; when this fails, everything fails.

This is a problem because esp. BTS tries to allocate a rather large
buffer (64K), which can easily fail.

This patch changes the logic such that when either buffer allocation
fails, we simply don't allow events that would use these facilities,
but continue functioning for all other events.

This logic comes from a much larger patch proposed by Stephane.
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.354429461@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
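
The degraded-mode behaviour described in this message can be sketched in userspace as below. It is an illustration under assumptions, not the kernel code: `sketch_ds_state`, the `sketch_` helpers, and the injected allocator are all hypothetical, and `sketch_alloc_small_only()` merely simulates a large allocation failing.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-CPU state: one buffer per facility. */
struct sketch_ds_state {
    void *pebs_buf;
    void *bts_buf;
    bool pebs_active;
    bool bts_active;
};

typedef void *(*sketch_alloc_fn)(size_t size);

/* Demonstration allocator: pretends anything above 4 KiB fails. */
static void *sketch_alloc_small_only(size_t size)
{
    return size > 4096 ? NULL : malloc(size);
}

/*
 * Never fails as a whole: each facility is simply marked inactive
 * when its own buffer could not be allocated.
 */
static void sketch_reserve_ds_buffers(struct sketch_ds_state *st,
                                      sketch_alloc_fn alloc,
                                      size_t pebs_size, size_t bts_size)
{
    st->pebs_buf = alloc(pebs_size);
    st->pebs_active = st->pebs_buf != NULL;
    st->bts_buf = alloc(bts_size);
    st->bts_active = st->bts_buf != NULL;
}

/* Reject only the events that need a facility without a buffer. */
static bool sketch_event_allowed(const struct sketch_ds_state *st,
                                 bool needs_pebs, bool needs_bts)
{
    if (needs_pebs && !st->pebs_active)
        return false;
    if (needs_bts && !st->bts_active)
        return false;
    return true;
}
```

With a 64K BTS request failing, PEBS events and plain counting events are still accepted in this sketch; only BTS events are refused.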


perf, x86: Extract DS alloc/free functions · 65af94ba

Authored by Peter Zijlstra on Oct 19, 2010

Again, mostly a cleanup to unclutter the reserve_ds_buffers() code.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.304495776@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf, x86: Extract PEBS/BTS allocation functions · 5ee25c87

Authored by Peter Zijlstra on Oct 19, 2010

Mostly a cleanup.. it reduces code indentation and makes the code flow
of reserve_ds_buffers() clearer.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.253453452@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


perf, x86: Extract PEBS/BTS buffer free routines · b39f88ac

Authored by Peter Zijlstra on Oct 19, 2010

So that we may grow additional call-sites..
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <20101019134808.196793164@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


13 Sep, 2010 · 1 commit

perf_events: Fix BTS interrupt handling to avoid being dazed by NMI (v2) · b0b2072d

Authored by Stephane Eranian on Sep 10, 2010

Fix a bug introduced with commit de725dec and the change in the
meaning of the return value of intel_pmu_handle_irq(). With the
current code, when you are using the BTS, you get 'dazed by NMI'
each time the BTS buffer fills up.

BTS does interrupt on the PMU vector, thus NMI. You need to take
this into account in the return value of the function.

This version fixes initial patch which was missing changes to
perf_event_intel_ds.c.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: davem@davemloft.net
Cc: fweisbec@gmail.com
Cc: perfmon2-devel@lists.sf.net
Cc: eranian@gmail.com
Cc: robert.richter@amd.com
LKML-Reference: <4c8a1686.aae9d80a.5aa4.5e35@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


10 Sep, 2010 · 1 commit

perf: Rework the PMU methods · a4eaf7f1

Authored by Peter Zijlstra on Jun 16, 2010

Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.

The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.

This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).

It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).

The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:

 1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

 2) We store the counter state and ignore all read/overflow events
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
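
The separation between "scheduled on the PMU" and "actually counting" that this message introduces can be sketched as a tiny state machine. This is a userspace illustration only: the struct, the `sketch_pmu_*` functions, and the `SKETCH_EF_START` flag are hypothetical names, and the real callbacks do considerably more work.

```c
#include <stdbool.h>

#define SKETCH_EF_START 0x1 /* hypothetical: start counting on add */

struct sketch_counter {
    bool scheduled; /* occupies a slot on the PMU */
    bool stopped;   /* scheduled but not counting */
};

/* Resume counting; only meaningful for a scheduled counter. */
static void sketch_pmu_start(struct sketch_counter *c, int flags)
{
    (void)flags;
    if (c->scheduled)
        c->stopped = false;
}

/* Stop counting while keeping the counter scheduled. */
static void sketch_pmu_stop(struct sketch_counter *c, int flags)
{
    (void)flags;
    c->stopped = true;
}

/* Schedule the counter; it starts only if the caller asks for it. */
static int sketch_pmu_add(struct sketch_counter *c, int flags)
{
    c->scheduled = true;
    c->stopped = true;
    if (flags & SKETCH_EF_START)
        sketch_pmu_start(c, 0);
    return 0;
}

/* Stop the counter and give its slot back. */
static void sketch_pmu_del(struct sketch_counter *c, int flags)
{
    sketch_pmu_stop(c, flags);
    c->scheduled = false;
}
```

Adding without the start flag yields the "frozen" state the message mentions, and throttling becomes an ordinary stop followed later by a start.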


openanolis / cloud-kernel · synced successfully more than 1 year ago
