- 27 9月, 2012 1 次提交
-
-
powerpc/perf: Sample only if SIAR-Valid bit is set in P7+ On POWER7+ two new bits (mmcra[35] and mmcra[36]) indicate whether the contents of SIAR and SDAR are valid. For marked instructions on P7+, we must save the contents of SIAR and SDAR registers only if these new bits are set. This code/check for the SIAR-Valid bit is specific to P7+, so rather than waste a CPU-feature bit use the PVR flag. Note that Carl Love proposed a similar change for oprofile: https://lkml.org/lkml/2012/6/22/309Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 28 3月, 2012 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
970 and Power4 don't support "continuous sampling" which means that when we aren't in marked instruction sampling mode (marked events), SIAR isn't updated with the last instruction sampled before the perf interrupt. On those processors, we must thus use the exception SRR0 value as the sampled instruction pointer. Those processors also don't support the SIPR and SIHV bits in MMCRA which means we need some kind of heuristic to decide if SIAR values represent kernel or user addresses. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 21 12月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
Put the logic to compute the event index into a per pmu method. This is required because the x86 rules are weird and wonderful and don't match the capabilities of the current scheme. AFAIK only powerpc actually has a usable userspace read of the PMCs but I'm not at all sure anybody actually used that. ARM is restored to the default since it currently does not support userspace access at all. And all software events are provided with a method that reports their index as 0 (disabled). Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Stephane Eranian <eranian@google.com> Cc: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/n/tip-dfydxodki16lylkt3gl2j7cw@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 3月, 2010 1 次提交
-
-
由 Scott Wood 提交于
This implements perf_event support for the Freescale embedded performance monitor, based on the existing perf_event.c that supports server/classic chips. Some limitations: - Performance monitor interrupts are regular EE interrupts, and thus you can't profile places with interrupts disabled. We may want to implement soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs. - When trying to schedule multiple event groups at once, and using restricted events, situations could arise where scheduling fails even though it would be possible. Consider three groups, each with two events. One group has restricted events, the others don't. The two non-restricted groups are scheduled, then one is removed, which happens to occupy the two counters that can't do restricted events. The remaining non-restricted group will not be moved to the non-restricted-capable counters to make room if the restricted group tries to be scheduled. Signed-off-by: NScott Wood <scottwood@freescale.com> Acked-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 22 9月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
This fixes two places in the powerpc perf_event (perf_counter) code where 'list_entry' needs to be changed to 'group_entry', but were missed in commit 65abc865 ("perf_counter: Rename list_entry -> group_entry, counter_list -> group_list"). This also changes 'event' back to 'counter' in a couple of contexts: * Field and function names that deal with the limited-function counters: it's really the hardware counters whose function is limited, not the events that they count. Hence: MAX_LIMITED_HWEVENTS -> MAX_LIMITED_HWCOUNTERS limited_event -> limited_counter freeze/thaw_limited_events -> freeze/thaw_limited_counters * The machine-specific PMU description struct (struct power_pmu): this renames 'n_event' back to 'n_counter' since it really describes how many hardware counters the machine has. (Renaming this back avoids a compile error in each of the machine-specific PMU back-ends where they initialize their power_pmu struct.) Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <19128.4280.813369.589704@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 9月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: NStephane Eranian <eranian@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul Mackerras <paulus@samba.org> Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 6月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Update the mmap control page with the needed information to use the userspace RDPMC instruction for self monitoring. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 6月, 2009 3 次提交
-
-
由 Paul Mackerras 提交于
At present, the powerpc generic (processor-independent) perf_counter code has list of processor back-end modules, and at initialization, it looks at the PVR (processor version register) and has a switch statement to select a suitable processor-specific back-end. This is going to become inconvenient as we add more processor-specific back-ends, so this inverts the order: now each back-end checks whether it applies to the current processor, and registers itself if so. Furthermore, instead of looking at the PVR, back-ends now check the cur_cpu_spec->oprofile_cpu_type string and match on that. Lastly, each back-end now specifies a name for itself so the core can print a nice message when a back-end registers itself. This doesn't provide any support for unregistering back-ends, but that wouldn't be hard to do and would allow back-ends to be modules. Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linuxppc-dev@ozlabs.org Cc: benh@kernel.crashing.org LKML-Reference: <19000.55529.762227.518531@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Mackerras 提交于
This changes the powerpc perf_counter back-end to use unsigned long types for hardware register values and for the value/mask pairs used in checking whether a given set of events fit within the hardware constraints. This is in preparation for adding support for the PMU on some 32-bit powerpc processors. On 32-bit processors the hardware registers are only 32 bits wide, and the PMU structure is generally simpler, so 32 bits should be ample for expressing the hardware constraints. On 64-bit processors, unsigned long is 64 bits wide, so using unsigned long vs. u64 (unsigned long long) makes no actual difference. This makes some other very minor changes: adjusting whitespace to line things up in initialized structures, and simplifying some code in hw_perf_disable(). Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linuxppc-dev@ozlabs.org Cc: benh@kernel.crashing.org LKML-Reference: <19000.55473.26174.331511@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Mackerras 提交于
This enables the perf_counter subsystem on 32-bit powerpc. Since we don't have any support for hardware counters on 32-bit powerpc yet, only software counters can be used. Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as 64-bit, the main thing this does is add an implementation of set_perf_counter_pending(). This needs to arrange for perf_counter_do_pending() to be called when interrupts are enabled. Rather than add code to local_irq_restore as 64-bit does, the 32-bit set_perf_counter_pending() generates an interrupt by setting the decrementer to 1 so that a decrementer interrupt will become pending in 1 or 2 timebase ticks (if a decrementer interrupt isn't already pending). When interrupts are enabled, timer_interrupt() will be called, and some new code in there calls perf_counter_do_pending(). We use a per-cpu array of flags to indicate whether we need to call perf_counter_do_pending() or not. This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT, which is selected by processor families for which we have hardware PMU support (currently only PPC64), and PPC_PERF_CTRS, which enables the powerpc-specific perf_counter back-end. Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linuxppc-dev@ozlabs.org Cc: benh@kernel.crashing.org LKML-Reference: <19000.55404.103840.393470@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 6月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
At present, every architecture that supports perf_counters has to declare set_perf_counter_pending() in its arch-specific headers. This consolidates the declarations into a single declaration in one common place, include/linux/perf_counter.h. On powerpc, we continue to provide a static inline definition of set_perf_counter_pending() in the powerpc hw_irq.h. Also, this removes from the x86 perf_counter.h the unused null definitions of {test,clear}_perf_counter_pending. Reported-by: NMike Frysinger <vapier.adi@gmail.com> Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: benh@kernel.crashing.org LKML-Reference: <18998.13388.920691.523227@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 6月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
This adds tables of event codes for the generalized cache events for all the currently supported powerpc processors: POWER{4,5,5+,6,7} and PPC970*, plus powerpc-specific code to use these tables when a generalized cache event is requested. Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <18992.36430.933526.742969@drongo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 5月, 2009 2 次提交
-
-
由 Paul Mackerras 提交于
This uses values from the MMCRA, SIAR and SDAR registers on powerpc to supply more precise information for overflow events, including a data address when PERF_RECORD_ADDR is specified. Since POWER6 uses different bit positions in MMCRA from earlier processors, this converts the struct power_pmu limited_pmc5_6 field, which only had 0/1 values, into a flags field and defines bit values for its previous use (PPMU_LIMITED_PMC5_6) and a new flag (PPMU_ALT_SIPR) to indicate that the processor uses the POWER6 bit positions rather than the earlier positions. It also adds definitions in reg.h for the new and old positions of the bit that indicates that the SIAR and SDAR values come from the same instruction. For the data address, the SDAR value is supplied if we are not doing instruction sampling. In that case there is no guarantee that the address given in the PERF_RECORD_ADDR subrecord will correspond to the instruction whose address is given in the PERF_RECORD_IP subrecord. If instruction sampling is enabled (e.g. because this counter is counting a marked instruction event), then we only supply the SDAR value for the PERF_RECORD_ADDR subrecord if it corresponds to the instruction whose address is in the PERF_RECORD_IP subrecord. Otherwise we supply 0. [ Impact: support more PMU hardware features on PowerPC ] Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18955.37028.48861.555309@drongo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Mackerras 提交于
Although the perf_counter API allows 63-bit raw event codes, internally in the powerpc back-end we had been using 32-bit event codes. This expands them to 64 bits so that we can add bits for specifying threshold start/stop events and instruction sampling modes later. This also corrects the return value of can_go_on_limited_pmc; we were returning an event code rather than just a 0/1 value in some circumstances. That didn't particularly matter while event codes were 32-bit, but now that event codes are 64-bit it might, so this fixes it. [ Impact: extend PowerPC perfcounter interfaces from u32 to u64 ] Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18955.36874.472452.353104@drongo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 4月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
POWER5+ and POWER6 have two hardware counters with limited functionality: PMC5 counts instructions completed in run state and PMC6 counts cycles in run state. (Run state is the state when a hardware RUN bit is 1; the idle task clears RUN while waiting for work to do and sets it when there is work to do.) These counters can't be written to by the kernel, can't generate interrupts, and don't obey the freeze conditions. That means we can only use them for per-task counters (where we know we'll always be in run state; we can't put a per-task counter on an idle task), and only if we don't want interrupts and we do want to count in all processor modes. Obviously some counters can't go on a limited hardware counter, but there are also situations where we can only put a counter on a limited hardware counter - if there are already counters on that exclude some processor modes and we want to put on a per-task cycle or instruction counter that doesn't exclude any processor mode, it could go on if it can use a limited hardware counter. To keep track of these constraints, this adds a flags argument to the processor-specific get_alternatives() functions, with three bits defined: one to say that we can accept alternative event codes that go on limited counters, one to say we only want alternatives on limited counters, and one to say that this is a per-task counter and therefore events that are gated by run state are equivalent to those that aren't (e.g. a "cycles" event is equivalent to a "cycles in run state" event). These flags are computed for each counter and stored in the counter->hw.counter_base field (slightly wonky name for what it does, but it was an existing unused field). Since the limited counters don't freeze when we freeze the other counters, we need some special handling to avoid getting skew between things counted on the limited counters and those counted on normal counters. To minimize this skew, if we are using any limited counters, we read PMC5 and PMC6 immediately after setting and clearing the freeze bit. This is done in a single asm in the new write_mmcr0() function. The code here is specific to PMC5 and PMC6 being the limited hardware counters. Being more general (e.g. having a bitmap of limited hardware counter numbers) would have meant more complex code to read the limited counters when freezing and unfreezing the normal counters, with conditional branches, which would have increased the skew. Since it isn't necessary for the code to be more general at this stage, it isn't. This also extends the back-ends for POWER5+ and POWER6 to be able to handle up to 6 counters rather than the 4 they previously handled. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <18936.19035.163066.892208@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 1月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
This provides the architecture-specific functions needed to access PMU hardware on the 64-bit PowerPC processors. It has been designed for the IBM POWER family (POWER 4/4+/5/5+/6 and PPC970) but will hopefully also suit other 64-bit PowerPC machines (although probably not Cell given how different it is in this area). This doesn't include back-ends for any specific processors. This implements a system which allows back-ends to express the constraints that their hardware has on what events can be counted simultaneously. The constraints are expressed as a 64-bit mask + 64-bit value for each event, and the encoding is capable of expressing the constraints arising from having a set of multiplexers feeding an event bus, with some events being available through multiple multiplexer settings, such as we get on POWER4 and PPC970. Furthermore, the back-end can supply alternative event codes for each event, and the constraint checking code will try all possible combinations of alternative event codes to try to find a combination that will fit. Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 09 1月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
... with an empty/dummy asm/perf_counter.h so it builds. Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 17 4月, 2005 1 次提交
-
-
由 Linus Torvalds 提交于
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
-