- 03 4月, 2010 3 次提交
-
-
由 Peter Zijlstra 提交于
Stephane noticed that the ANY flag was in generic arch code, and Cyrill reported that it broke the P4 code. Solve this by merging x86_pmu::raw_event into x86_pmu::hw_config and provide intel_pmu and amd_pmu specific versions of this callback. The intel_pmu one deals with the ANY flag, the amd_pmu adds the few extra event bits AMD64 has. Reported-by: NStephane Eranian <eranian@google.com> Reported-by: NCyrill Gorcunov <gorcunov@gmail.com> Acked-by: NRobert Richter <robert.richter@amd.com> Acked-by: NCyrill Gorcunov <gorcunov@gmail.com> Acked-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269968113.5258.442.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Robert Richter 提交于
ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This patch adds macros for the bit masks and removes local defines. The function intel_pmu_raw_event() becomes x86_pmu_raw_event() which is generic for x86 models and same also for p6. Duplicate code is removed. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100330092821.GH11907@erda.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Robert Richter 提交于
The big rename: cdd6c482 perf: Do the big rename: Performance Counters -> Performance Events accidentally renamed some members of stucts that were named after registers in the spec. To avoid confusion this patch reverts some changes. The related specs are MSR descriptions in AMD's BKDGs and the ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32 Architectures Software Developer's Manuals. This patch does: $ sed -i -e 's:num_events:num_counters:g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/oprofile/op_model_ppro.c $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 3月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Implement the workaround for Intel Errata AAK100 and AAP53. Also, remove the Core-i7 name for Nehalem events since there are also Westmere based i7 chips. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1269608924.12097.147.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 3月, 2010 1 次提交
-
-
由 Cyrill Gorcunov 提交于
The netburst PMU is way different from the "architectural perfomance monitoring" specification that current CPUs use. P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle perfomance monitoring events. A few implementational details: 1) We need a separate x86_pmu::hw_config helper in struct x86_pmu since register bit-fields are quite different from P6, Core and later cpu series. 2) For the same reason is a x86_pmu::schedule_events helper introduced. 3) hw_perf_event::config consists of packed ESCR+CCCR values. It's allowed since in reality both registers only use a half of their size. Of course before making a real write into a particular MSR we need to unpack the value and extend it to a proper size. 4) The tuple of packed ESCR+CCCR in hw_perf_event::config doesn't describe the memory address of ESCR MSR register so that we need to keep a mapping between these tuples used and available ESCR (various P4 events may use same ESCRs but not simultaneously), for this sake every active event has a per-cpu map of hw_perf_event::idx <--> ESCR addresses. 5) Since hw_perf_event::idx is an offset to counter/control register we need to lift X86_PMC_MAX_GENERIC up, otherwise kernel strips it down to 8 registers and event armed may never be turned off (ie the bit in active_mask is set but the loop never reaches this index to check), thanks to Peter Zijlstra Restrictions: - No cascaded counters support (do we ever need them?) - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS doesn't work for now) - There are events with same counters which can't work simultaneously (need to use intersected ones due to broken counter 1) - No PERF_COUNT_HW_CACHE_ events yet Todo: - Implement dependent events - Need proper hashing for event opcodes (no linear search, good for debugging stage but not in real loads) - Some events counted during a clock cycle -- need to set threshold for them and count every clock cycle just to get summary statistics (ie to behave the same way as other PMUs do) - Need to swicth to use event_constraints - To support RAW events we need to encode a global list of P4 events into p4_templates - Cache events need to be added Event support status matrix: Event status ----------------------------- cycles works cache-references works cache-misses works branch-misses works bus-cycles partially (does not work on 64bit cpu with HT enabled) instruction doesnt work (needs dependent event [mop tagging]) branches doesnt work Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: NLin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100311165439.GB5129@lenovo> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 3月, 2010 16 次提交
-
-
由 Peter Zijlstra 提交于
We don't need checking_{wr,rd}msr() calls, since we should know what cpu we're running on and not use blindly poke at msrs. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The documentation says we have to enable PEBS before we enable the PMU proper. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Some CPUs have errata where the LBR is not cleared on Power-On. So always clear the LBRs before use. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.966563424@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
This CPU has just too many handycaps to be really useful. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100305154128.890278662@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Saner PERF_CAPABILITIES support, which also exposes pebs_trap. Use that latter to make PEBS's use of LBR conditional since a fault-like pebs should already report the correct IP. ( As of this writing there is no known hardware that implements !pebs_trap ) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.770650663@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Use the LBR to fix up the PEBS IP+1 issue. As said, PEBS reports the next instruction, here we use the LBR to find the last branch and from that construct the actual IP. If the IP matches the LBR-TO, we use LBR-FROM, otherwise we use the LBR-TO address as the beginning of the last basic block and decode forward. Once we find a match to the current IP, we use the previous location. This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction that caused the event (barring CPU errata). The fixup can fail due to various reasons: 1) LBR contains invalid data (quite possible) 2) part of the basic block got paged out 3) the reported IP isn't part of the basic block (see 1) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.619375431@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Implement simple suport Intel Last-Branch-Record, it supports all hardware that implements FREEZE_LBRS_ON_PMI, but does not (yet) implement the LBR config register. The Intel LBR is a FIFO of From,To addresses describing the last few branches the hardware took. This patch does not add perf interface to the LBR, but merely provides an interface for internal use. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.544191154@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
This patch implements support for Intel Precise Event Based Sampling, which is an alternative counter mode in which the counter triggers a hardware assist to collect information on events. The hardware assist takes a trap like snapshot of a subset of the machine registers. This data is written to the Intel Debug-Store, which can be programmed with a data threshold at which to raise a PMI. With the PEBS hardware assist being trap like, the reported IP is always one instruction after the actual instruction that triggered the event. This implements a simple PEBS model that always takes a single PEBS event at a time. This is done so that the interaction with the rest of the system is as expected (freq adjust, period randomization, lbr, callchains, etc.). It adds an ABI element: perf_event_attr::precise, which indicates that we wish to use this (constrained, but precise) mode. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <20100304140100.392111285@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result in a double disable, cure this by using x86_pmu_{start,stop} for throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
There is no concurrency on these variables, so don't use LOCK'ed ops. As to the intel_pmu_handle_irq() status bit clean, nobody uses that so remove it all together. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.240023029@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Pass the full perf_event into the x86_pmu functions so that those may make use of more than the hw_perf_event, and while doing this, remove the superfluous second argument. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.165166129@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The second and third argument to x86_perf_event_update() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.089468871@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The second and third argument to x86_perf_event_set_period() are superfluous since they are simple expressions of the first argument. Hence remove them. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100304140100.006500906@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Explicitly use intel_pmu_{disable,enable}_all() in intel_pmu_handle_irq() to avoid the NMI race conditions in perf_{disable,enable} Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug notifiers. This has the advantage of reducing the static weak interface as well as exposing all hotplug actions to the PMU. Use this to fix x86 hotplug usage where we did things in ONLINE which should have been done in UP_PREPARE or STARTING. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mundt <lethal@linux-sh.org> Cc: paulus@samba.org Cc: eranian@google.com Cc: robert.richter@amd.com Cc: fweisbec@gmail.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <20100305154128.736225361@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
This makes it easier to extend perf_sample_data and fixes a bug on arm and sparc, which failed to set ->raw to NULL, which can cause crashes when combined with PERF_SAMPLE_RAW. It also optimizes PowerPC and tracepoint, because the struct initialization is forced to zero out the whole structure. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NJean Pihet <jpihet@mvista.com> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@kernel.org LKML-Reference: <20100304140100.315416040@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 3月, 2010 1 次提交
-
-
由 Akinobu Mita 提交于
Rename for_each_bit to for_each_set_bit in the kernel source tree. To permit for_each_clear_bit(), should that ever be added. The patch includes a macro to map the old for_each_bit() onto the new for_each_set_bit(). This is a (very) temporary thing to ease the migration. [akpm@linux-foundation.org: add temporary for_each_bit()] Suggested-by: NAlexey Dobriyan <adobriyan@gmail.com> Suggested-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 3月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Patch 1da53e02 ("perf_events, x86: Improve x86 event scheduling") lost us one of the fixed purpose counters and then ed8777fc ("perf_events, x86: Fix event constraint masks") broke it even further. Widen the fixed event mask to event+umask and specify the full config for each of the 3 fixed purpose counters. Then let the init code fill out the placement for the GP regs based on the cpuid info. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 2月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Split amd,p6,intel into separate files so that we can easily deal with CONFIG_CPU_SUP_* things, needed to make things build now that perf_event.c relies on symbols from amd.c Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-