- 03 6月, 2016 3 次提交
-
-
由 Julien Grall 提交于
pmu->irq_affinity will not be freed if an error occurred within arm_pmu_device_probe after of_pmu_irq_cfg has been called. Note that in the case of_pmu_irq_cfg is returning an error, pmu->irq_affinity will not be set, but it should be NULL as pmu was kzalloc'd. Therefore the result kfree(NULL) is benign. Signed-off-by: NJulien Grall <julien.grall@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Julien Grall 提交于
The global variable __oprofile_cpu_pmu is set before the PMU is fully initialized. If an error occurs before the end of the initialization, the PMU will be freed and the variable will contain an invalid pointer. This will result in a kernel crash when perf will be used. Fix it by moving the setting of __oprofile_cpu_pmu when the PMU is fully initialized (i.e when it is no longer possible to fail). Cc: <stable@vger.kernel.org> Signed-off-by: NJulien Grall <julien.grall@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Julien Grall 提交于
The only function called by of_pmu_irq_cfg that will increment the reference count on dn is of_parse_phandle. Each time we successfully parse a possible CPU from an interrupt-affinity property, we increment the refcount of that CPU node once via of_parse_handle. After validating the CPU is possible, we decrement the refcount once. Subsequently, we decrement the refcount again, either as part of an early break if we don't have a matching SPI, or as part of the end of the loop body. This will lead to decrementing twice the refcounnt. Remove the second pairs of call to of_node_put as nobody is using dn between the first and second call to of_node_put. Signed-off-by: NJulien Grall <julien.grall@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 05 5月, 2016 1 次提交
-
-
由 Mark Rutland 提交于
Commit: 26657848 ("perf/core: Verify we have a single perf_hw_context PMU") forcefully prevents multiple PMUs from sharing perf_hw_context, as this generally doesn't make sense. It is a common bug for uncore PMUs to use perf_hw_context rather than perf_invalid_context, which this detects. However, systems exist with heterogeneous CPUs (and hence heterogeneous HW PMUs), for which sharing perf_hw_context is necessary, and possible in some limited cases. To make this work we have to perform some gymnastics, as we did in these commits: 66eb579e ("perf: allow for PMU-specific event filtering") c904e32a ("arm: perf: filter unschedulable events") To allow those systems to work, we must allow PMUs for heterogeneous CPUs to share perf_hw_context, though we must still disallow sharing otherwise to detect the common misuse of perf_hw_context. This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates the core logic to account for this, and makes use of it in the arm_pmu code that is used for systems with heterogeneous CPUs. Comments are added to make the rationale clear and hopefully avoid accidental abuse. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostejSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 4月, 2016 1 次提交
-
-
由 Lorenzo Pieralisi 提交于
Commit da4e4f18 ("drivers/perf: arm_pmu: implement CPU_PM notifier") added code in the arm perf infrastructure that allows the kernel to save/restore perf counters whenever the CPU enters a low-power state. The kernel saves/restores the counters for each active event through the armpmu_{stop/start} ARM pmu API, so that the low-power state enter/exit cycle is emulated through pmu start/stop operations for each event in use. However, calling armpmu_start() for each active event on power up executes code that requires RCU locking (perf_event_update_userpage()) to be functional, so, given that the core may call the CPU_PM notifiers while running the idle thread in an quiescent RCU state this is not allowed as detected through the following splat when kernel is run with CONFIG_PROVE_LOCKING enabled: [ 49.293286] [ 49.294761] =============================== [ 49.298895] [ INFO: suspicious RCU usage. ] [ 49.303031] 4.6.0-rc3+ #421 Not tainted [ 49.306821] ------------------------------- [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used illegally while idle! [ 49.318530] [ 49.318530] other info that might help us debug this: [ 49.318530] [ 49.326451] [ 49.326451] RCU used illegally from idle CPU! [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0 [ 49.337209] RCU used illegally from extended quiescent state! [ 49.342892] 2 locks held by swapper/2/0: [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at: [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80 [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>] perf_event_update_userpage+0x0/0x260 This patch wraps the armpmu_start() call (that indirectly calls perf_event_update_userpage()) on CPU_PM notifier power state exit (or failed entry) within the RCU_NONIDLE() macro so that the RCU subsystem is made aware the calling cpu is not idle from an RCU perspective for the armpmu_start() call duration, therefore fixing the issue. Fixes: da4e4f18 ("drivers/perf: arm_pmu: implement CPU_PM notifier") Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reported-by: NJames Morse <james.morse@arm.com> Suggested-by: NKevin Hilman <khilman@baylibre.com> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 21 3月, 2016 1 次提交
-
-
由 Will Deacon 提交于
Commit c6b90653 ("drivers/perf: arm_pmu: make info messages more verbose") breaks booting on systems where the PMU is probed without devicetree (e.g by inspecting the MIDR of the current CPU). In this case, pdev->dev.of_node is NULL and we shouldn't try to access its ->fullname field when printing probe error messages. This patch fixes the probing code to use of_node_full_name, which safely handles NULL nodes and removes the "Error %i" part of the string, since it's not terribly useful. Reported-by: NGuenter Roeck <private@roeck-us.net> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 26 2月, 2016 1 次提交
-
-
由 Lorenzo Pieralisi 提交于
When a CPU is suspended (either through suspend-to-RAM or CPUidle), its PMU registers content can be lost, which means that counters registers values that were initialized on power down entry have to be reprogrammed on power-up to make sure the counters set-up is preserved (ie on power-up registers take the reset values on Cold or Warm reset, which can be architecturally UNKNOWN). To guarantee seamless profiling conditions across a core power down this patch adds a CPU PM notifier to ARM pmus, that upon CPU PM entry/exit from low-power states saves/restores the pmu registers set-up (by using the ARM perf API), so that the power-down/up cycle does not affect the perf behaviour (apart from a black-out period between power-up/down CPU PM notifications that is unavoidable). Cc: Will Deacon <will.deacon@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: NAshwin Chaugule <ashwin.chaugule@linaro.org> Acked-by: NKevin Hilman <khilman@baylibre.com> Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 09 2月, 2016 1 次提交
-
-
由 Dirk Behme 提交于
On a big.LITTLE system e.g. with Cortex A57 and A53 in case not all cores are online at PMU probe time we might get hw perfevents: failed to probe PMU! hw perfevents: failed to register PMU devices! making it unclear which cores failed, here. Add the device tree full name which failed and the error value resulting in a more verbose and helpful message like hw perfevents: /soc/pmu_a53: failed to probe PMU! Error -6 hw perfevents: /soc/pmu_a53: failed to register PMU devices! Error -6 Signed-off-by: NDirk Behme <dirk.behme@de.bosch.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 26 1月, 2016 1 次提交
-
-
由 Martin Fuzzey 提交于
ARMv7 counters other than the CPU cycle counter only work if the Secure Debug Enable Register (SDER) SUNIDEN bit is set. Since access to the SDER is only possible in secure state, it will only be done if the device tree property "secure-reg-access" is set. Without this: Performance counter stats for 'sleep 1': 14606094 cycles # 0.000 GHz 0 instructions # 0.00 insns per cycle After applying: Performance counter stats for 'sleep 1': 5843809 cycles 2566484 instructions # 0.44 insns per cycle 1.020144000 seconds time elapsed Some platforms (eg i.MX53) may also need additional platform specific setup. Acked-by: NRob Herring <robh@kernel.org> Signed-off-by: NMartin Fuzzey <mfuzzey@parkeon.com> Signed-off-by: NPooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com> Signed-off-by: NGeorge G. Davis <george_davis@mentor.com> [will: add warning if property is found on arm64] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 16 11月, 2015 1 次提交
-
-
由 Mark Rutland 提交于
Nothing outside of drivers/perf/arm_pmu.c should call armpmu_register any more, so it no longer needs to be in include/linux/perf/arm_pmu.h. Additionally, by folding it in to arm_pmu_device_probe we can allow drivers to override struct pmu fields without getting blatted by the armpmu code. This patch folds armpmu_register into arm_pmu_device_probe. The logging to the console is moved to after the PMU is successfully registered with the core perf code. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Suggested-by: NWill Deacon <will.deacon@arm.com> Cc: Drew Richardson <drew.richardson@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 15 10月, 2015 1 次提交
-
-
由 Will Deacon 提交于
of_cpu_device_node_get increments the reference count on the CPU device_node, so we must take care to of_node_put once we've finished with it. This patch fixes the perf IRQ probing code to avoid the leak. Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 31 7月, 2015 4 次提交
-
-
由 Mark Rutland 提交于
To enable sharing of the arm_pmu code with arm64, this patch factors it out to drivers/perf/. A new drivers/perf directory is added for performance monitor drivers to live under. MAINTAINERS is updated accordingly. Files added previously without a corresponsing MAINTAINERS update (perf_regs.c, perf_callchain.c, and perf_event.h) are also added. Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> [will: augmented Kconfig help slightly] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Sudeep Holla 提交于
arch_find_n_match_cpu_physical_id parses the device tree to get the device node for a given logical cpu index. However, since ARM PMUs get probed after the CPU device nodes are stashed while registering the cpus, we can use of_cpu_device_node_get to avoid another DT parse. This patch replaces arch_find_n_match_cpu_physical_id with of_cpu_device_node_get to reuse the stashed value directly instead. Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NSudeep Holla <sudeep.holla@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Will Deacon 提交于
On systems containing multiple, heterogeneous clusters we need a way to associate a PMU "device" with the CPU(s) on which it exists. For PMUs that signal overflow with SPIs, this relationship is determined via the "interrupt-affinity" property, which contains a list of phandles to CPU nodes for the PMU. For PMUs using PPIs, the per-cpu nature of the interrupt isn't enough to determine the set of CPUs which actually contain the device. This patch allows the interrupt-affinity property to be specified on a PMU node irrespective of the interrupt type. For PPIs, it identifies the set of CPUs signalling the PPI in question. Tested-by: Stephen Boyd <sboyd@codeaurora.org> # Krait PMU Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Stephen Boyd 提交于
For PPI based PMUs, we bail out early in of_pmu_irq_cfg() without setting the PMU's supported_cpus bitmap. This causes the smp_call_function_any() in armv7_probe_num_events() to fail. Set the bitmap to be all CPUs so that we properly probe PMUs that use PPIs. Fixes: cc88116d ("arm: perf: treat PMUs as CPU affine") Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 17 7月, 2015 1 次提交
-
-
由 Stephen Boyd 提交于
It's possible, albeit unlikely, that using the of_node here will reference freed memory. Call of_node_put() after printing the name to be safe. Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 10 7月, 2015 1 次提交
-
-
由 Stephen Boyd 提交于
For PPI based PMUs, we bail out early in of_pmu_irq_cfg() without setting the PMU's supported_cpus bitmap. This causes the smp_call_function_any() in armv7_probe_num_events() to fail. Set the bitmap to be all CPUs so that we properly probe PMUs that use PPIs. Fixes: cc88116d ("arm: perf: treat PMUs as CPU affine") Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 29 5月, 2015 1 次提交
-
-
由 Mark Rutland 提交于
Now that the arm_pmu framework is only used for CPU PMUs, there's no reason to keep the pseudo-generic and CPU-specific framework portions separate. This patch folds the two into perf_event.c. Signed-off-by: NMark Rutland <mark.rutland@arm.com> [will: fixed up irq cfg to match upstream] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 28 5月, 2015 1 次提交
-
-
由 Mark Rutland 提交于
Currently the arm perf code has platdata callbacks for runtime PM and irq handling, but no platform implements the hooks for the former. Kill these off. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 27 5月, 2015 2 次提交
-
-
由 Mark Rutland 提交于
Different CPU microarchitectures implement different PMU events, and thus events which can be scheduled on one microarchitecture cannot be scheduled on another, and vice-versa. Some archicted events behave differently across microarchitectures, and thus cannot be meaningfully summed. Due to this, we reject the scheduling of an event on a CPU of a different microarchitecture to that the event targets. When the core perf code is scheduling events and encounters an event which cannot be scheduled, it stops attempting to schedule events. As the perf core periodically rotates the list of events, for some proportion of the time events which are unschedulable will block events which are schedulable, resulting in low utilisation of the hardware counters. This patch implements a pmu::filter_match callback such that we can detect and skip such events while scheduling early, before they can block the schedulable events. This prevents the low HW counter utilisation issue. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
In multi-cluster systems, the PMUs can be different across clusters, and so our logical PMU may not be able to schedule events on all CPUs. This patch adds a cpumask to encode which CPUs a PMU driver supports controlling events for, and limits the driver to scheduling events on those CPUs, and enabling and disabling the physical PMUs on those CPUs. The cpumask is built based on the interrupt-affinity property, and in the absence of such a property a homogenous system is assumed. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 20 3月, 2015 1 次提交
-
-
由 Suzuki K. Poulose 提交于
The perf core implicitly rejects events spanning multiple HW PMUs, as in these cases the event->ctx will differ. However this validation is performed after pmu::event_init() is called in perf_init_event(), and thus pmu::event_init() may be called with a group leader from a different HW PMU. The ARM PMU driver does not take this fact into account, and when validating groups assumes that it can call to_arm_pmu(event->pmu) for any HW event. When the event in question is from another HW PMU this is wrong, and results in dereferencing garbage. This patch updates the ARM PMU driver to first test for and reject events from other PMUs, moving the to_arm_pmu and related logic after this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with a CCI PMU present: --- CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57 Hardware name: ARM-Versatile Express task: bd8484c0 ti: be676000 task.ti: be676000 PC is at 0xbf1bbc90 LR is at validate_event+0x34/0x5c pc : [<bf1bbc90>] lr : [<80016060>] psr: 00000013 ... [<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90) [<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218) [<80016398>] (armpmu_event_init) from [<800882e4>] (perf_try_init_event+0x30/0x48) [<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4) [<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c) [<8008f8a8>] (perf_event_alloc) from [<8009015c>] (SyS_perf_event_open+0x498/0xa70) [<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34) Code: bf1be000 bf1bb380 802a2664 00000000 (00000002) ---[ end trace 01aff0ff00926a0a ]--- Also cleans up the code to use the arm_pmu only when we know that we are dealing with an arm pmu event. Cc: Will Deacon <will.deacon@arm.com> Acked-by: NMark Rutland <mark.rutland@arm.com> Acked-by: NPeter Ziljstra (Intel) <peterz@infradead.org> Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 13 1月, 2015 1 次提交
-
-
由 Daniel Thompson 提交于
If the overflow threshold for a counter is set above or near the 0xffffffff boundary then the kernel may lose track of the overflow causing only events that occur *after* the overflow to be recorded. Specifically the problem occurs when the value of the performance counter overtakes its original programmed value due to wrap around. Typical solutions to this problem are either to avoid programming in values likely to be overtaken or to treat the overflow bit as the 33rd bit of the counter. Its somewhat fiddly to refactor the code to correctly handle the 33rd bit during irqsave sections (context switches for example) so instead we take the simpler approach of avoiding values likely to be overtaken. We set the limit to half of max_period because this matches the limit imposed in __hw_perf_event_init(). This causes a doubling of the interrupt rate for large threshold values, however even with a very fast counter ticking at 4GHz the interrupt rate would only be ~1Hz. Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 13 12月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
After commit b2b49ccb (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere in the code under arch/arm/ (the defconfig files will be modified later). Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: NNishanth Menon <nm@ti.com> Acked-by: NSekhar Nori <nsekhar@ti.com> Acked-by: NSantosh Shilimkar <ssantosh@kernel.org>
-
- 30 10月, 2014 5 次提交
-
-
由 Mark Rutland 提交于
Currently the percpu_pmu pointers used as percpu_irq dev_id values are defined separately from the other per-cpu accounting data, which make dynamically allocating the data (as will be required for systems with heterogeneous CPUs) difficult. This patch moves the percpu_pmu pointers into pmu_hw_events (which is itself allocated per cpu), which will allow for easier dynamic allocation. Both percpu and regular irqs are requested using percpu_pmu pointers as tokens, freeing us from having to know whether an irq is percpu within the handler, and thus avoiding a radix tree lookup on the handler path. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NWill Deacon <will.deacon@arm.com> Reviewed-by: NStephen Boyd <sboyd@codeaurora.org> Tested-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Now that the arm pmu code is limited to CPU PMUs the get_hw_events() function is superfluous, as we'll always have a set of per-cpu pmu_hw_events structures. This patch removes the get_hw_events() function, replacing it with a percpu hw_events pointer. Uses of get_hw_events are updated to use this_cpu_ptr. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NWill Deacon <will.deacon@arm.com> Reviewed-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
Commit 3fc2c830 (ARM: perf: remove event limit from pmu_hw_events) got rid of the upper limit on the number of events an arm_pmu could handle, but introduced additional complexity and places a burden on each PMU driver to allocate accounting data somehow. So far this has not generally been useful as the only users of arm_pmu are the CPU backend and the CCI driver. Now that the CCI driver plugs into the perf subsystem directly, we can remove some of the complexities that get in the way of supporting heterogeneous CPU PMUs. This patch restores the original limits on pmu_hw_events fields such that the pmu_hw_events data can be allocated as a contiguous block. This will simplify dynamic pmu_hw_events allocation in later patches. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NWill Deacon <will.deacon@arm.com> Reviewed-by: NStephen Boyd <sboyd@codeaurora.org> Tested-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
For systems with heterogeneous CPUs (e.g. big.LITTLE systems) the PMUs can be different in each cluster, and not all events can be migrated between clusters. To allow userspace to deal with this, it must be possible to address each PMU independently. This patch changes PMUs to be registered with dynamic (IDR) types, allowing them to be targeted individually. Each PMU's type can be found in ${SYSFS_ROOT}/bus/event_source/devices/${PMU_NAME}/type. From userspace, raw events can be targeted at a specific PMU: $ perf stat -e ${PMU_NAME}/config=V,config1=V1,.../ Doing this does not break existing tools which use existing perf types: when perf core can't find a PMU of matching type (in perf_init_event) it'll iterate over the set of all PMUs. If a compatible PMU exists, it'll be found eventually. If more than one compatible PMU exists, the event will be handled by whichever PMU happens to be earlier in the pmus list (which currently will be the last compatible PMU registered). Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Mark Rutland 提交于
The ARM callchain handling code is currently bundled with the ARM PMU management code, despite the two having no dependency on each other. This bundling has the unfortunate property of making callchain handling depend on CONFIG_HW_PERF_EVENTS, even though the callchain handling could be applied to software events in the absence of PMU hardware support. This patch separates the two, placing the callchain handling in perf_callchain.c and making it depend on CONFIG_PERF_EVENTS rather than CONFIG_HW_PERF_EVENTS, enabling callchain recording on kernels built without hardware perf event support. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reviewed-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 18 7月, 2014 1 次提交
-
-
由 Nikolay Borisov 提交于
Make the perf backend use the API so that it correctly references the FP when in THUMB2 mode Signed-off-by: NNikolay Borisov <Nikolay.Borisov@arm.com> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-
- 09 7月, 2014 2 次提交
-
-
由 Jean Pihet 提交于
Under perf, the fp unwinding scheme requires access to user space memory and can provoke a pagefault via call to __copy_from_user_inatomic from user_backtrace. This unwinding can take place in response to an interrupt (__perf_event_overflow). This is undesirable as we may already have mmap_sem held for write. One example being a process that calls mprotect just as a the PMU counters overflow. An example that can provoke this behaviour: perf record -e event:tocapture --call-graph fp ./application_to_test This patch addresses this issue by disabling pagefaults briefly in user_backtrace (as is done in the other architectures: ARM64, x86, Sparc etc.). Without the patch a deadlock occurs when __perf_event_overflow is called while reading the data from the user space: [ INFO: possible recursive locking detected ] 3.16.0-rc2-00038-g0ed7ff6 #46 Not tainted --------------------------------------------- stress/1634 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c001dc04>] do_page_fault+0xa8/0x428 but task is already holding lock: (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&mm->mmap_sem); lock(&mm->mmap_sem); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by stress/1634: #0: (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8 #1: (rcu_read_lock){......}, at: [<c00c29dc>] __perf_event_overflow+0x120/0x294 stack backtrace: CPU: 1 PID: 1634 Comm: stress Not tainted 3.16.0-rc2-00038-g0ed7ff6 #46 [<c0017c8c>] (unwind_backtrace) from [<c0012eec>] (show_stack+0x20/0x24) [<c0012eec>] (show_stack) from [<c04de914>] (dump_stack+0x7c/0x98) [<c04de914>] (dump_stack) from [<c006a360>] (__lock_acquire+0x1484/0x1cf0) [<c006a360>] (__lock_acquire) from [<c006b14c>] (lock_acquire+0xa4/0x11c) [<c006b14c>] (lock_acquire) from [<c04e3880>] (down_read+0x40/0x7c) [<c04e3880>] (down_read) from [<c001dc04>] (do_page_fault+0xa8/0x428) [<c001dc04>] (do_page_fault) from [<c00084ec>] (do_DataAbort+0x44/0xa8) [<c00084ec>] (do_DataAbort) from [<c0013a1c>] (__dabt_svc+0x3c/0x60) Exception stack(0xed7c5ae0 to 0xed7c5b28) 5ae0: ed7c5b5c b6dadff4 ffffffec 00000000 b6dadff4 ebc08000 00000000 ebc08000 5b00: 0000007e 00000000 ed7c4000 ed7c5b94 00000014 ed7c5b2c c001a438 c0236c60 5b20: 00000013 ffffffff [<c0013a1c>] (__dabt_svc) from [<c0236c60>] (__copy_from_user+0xa4/0x3a4) Acked-by: NSteve Capper <steve.capper@linaro.org> Signed-off-by: NJean Pihet <jean.pihet@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Jean Pihet 提交于
An event may occur when an mm is already released. As per commit 20afc60f 'x86, perf: Check that current->mm is alive before getting user callchain' Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NJean Pihet <jean.pihet@linaro.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 05 6月, 2014 1 次提交
-
-
由 Vince Weaver 提交于
Make the ARM perf code use the new common PMU interrupt disabled code. This allows perf to work on ARM machines without a working PMU interrupt (for example, raspberry pi). Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NVince Weaver <vincent.weaver@maine.edu> [peterz: applied changes suggested by Will] Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: devicetree@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161712190.11099@vincent-weaver-1.umelst.maine.edu [ Small readability tweaks to the code. ] Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 2月, 2014 3 次提交
-
-
由 Will Deacon 提交于
Since we indirect all of our PMU IRQ handling through a dispatcher, it's trivial to hook up perf_sample_event_took to prevent applications such as oprofile from generating interrupt storms due to an unrealisticly low sample period. Reported-by: NRobert Richter <rric@kernel.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Stephen Boyd 提交于
On Krait processors we have a many-to-one relationship between raw CPU events and the event programmed into the PMNx counter. Two raw CPU events could map to the same value programmed in the PMNx counter. To avoid this problem, we check for collisions during the get_event_idx() callback by setting a bit in a bitmap whenever a certain event is used in a PMNx counter (see the next patch). Unfortunately, we don't have a hook to clear this bit in the bitmap when the event is deleted so let's add an optional clear_event_idx() callback for this purpose. Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Stephen Boyd 提交于
Some CPU PMUs are wired up with one PPI for all the CPUs instead of with a different SPI for each CPU. Add support for these devices. Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 17 12月, 2013 1 次提交
-
-
由 Will Deacon 提交于
This reverts commit 3581fe0e. Fixes to the handling of PERF_EVENT_IOC_PERIOD in the core code mean we no longer have to play this horrible game. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1385560479-11014-2-git-send-email-will.deacon@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 09 10月, 2013 1 次提交
-
-
由 Will Deacon 提交于
Since software events can always be scheduled, perf allows software and hardware events to be mixed together in the same event group. There are two ways in which this can come about: (1) A SW event is added to a HW group. This validates using the HW PMU of the group leader. (2) A HW event is added to a SW group. This inserts the SW events and the new HW event into a HW context, but the SW event remains the group leader. When validating the latter case, we would ideally compare the PMU of each event in the group with the relevant HW PMU. The problem is, in the face of potentially multiple HW PMUs, we don't have a handle on the relevant structure. Commit 7b9f72c6 ("ARM: perf: clean up event group validation") attempting to resolve this issue, but actually made things *worse* by comparing with the leader PMU. If the leader is a SW event, then we automatically `pass' all the HW events during validation! This patch removes the check against the leader PMU. Whilst this will allow events from multiple HW PMUs to be grouped together, that should probably be dealt with in perf core as the result of a later patch. Acked-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 14 8月, 2013 2 次提交
-
-
由 Stephen Boyd 提交于
Fix constraint check in armpmu_map_hw_event(). Reported-and-tested-by: NVince Weaver <vincent.weaver@maine.edu> Cc: <stable@kernel.org> Signed-off-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Boyd 提交于
Vince Weaver reports an oops in the ARM perf event code while running his perf_fuzzer tool on a pandaboard running v3.11-rc4. Unable to handle kernel paging request at virtual address 73fd14cc pgd = eca6c000 [73fd14cc] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_omap_hdmi omapdss snd_soc_omap_abe_twl6040 snd_soc_twl6040 snd_soc_omap snd_soc_omap_hdmi_card snd_soc_omap_mcpdm snd_soc_omap_mcbsp snd_soc_core snd_compress regmap_spi snd_pcm snd_page_alloc snd_timer snd soundcore CPU: 1 PID: 2790 Comm: perf_fuzzer Not tainted 3.11.0-rc4 #6 task: eddcab80 ti: ed892000 task.ti: ed892000 PC is at armpmu_map_event+0x20/0x88 LR is at armpmu_event_init+0x38/0x280 pc : [<c001c3e4>] lr : [<c001c17c>] psr: 60000013 sp : ed893e40 ip : ecececec fp : edfaec00 r10: 00000000 r9 : 00000000 r8 : ed8c3ac0 r7 : ed8c3b5c r6 : edfaec00 r5 : 00000000 r4 : 00000000 r3 : 000000ff r2 : c0496144 r1 : c049611c r0 : edfaec00 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: aca6c04a DAC: 00000015 Process perf_fuzzer (pid: 2790, stack limit = 0xed892240) Stack: (0xed893e40 to 0xed894000) 3e40: 00000800 c001c17c 00000002 c008a748 00000001 00000000 00000000 c00bf078 3e60: 00000000 edfaee50 00000000 00000000 00000000 edfaec00 ed8c3ac0 edfaec00 3e80: 00000000 c073ffac ed893f20 c00bf180 00000001 00000000 c00bf078 ed893f20 3ea0: 00000000 ed8c3ac0 00000000 00000000 00000000 c0cb0818 eddcab80 c00bf440 3ec0: ed893f20 00000000 eddcab80 eca76800 00000000 eca76800 00000000 00000000 3ee0: 00000000 ec984c80 eddcab80 c00bfe68 00000000 00000000 00000000 00000080 3f00: 00000000 ed892000 00000000 ed892030 00000004 ecc7e3c8 ecc7e3c8 00000000 3f20: 00000000 00000048 ecececec 00000000 00000000 00000000 00000000 00000000 3f40: 00000000 00000000 00297810 00000000 00000000 00000000 00000000 00000000 3f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 3f80: 00000002 00000002 000103a4 00000002 0000016c c00128e8 ed892000 00000000 3fa0: 00090998 c0012700 00000002 000103a4 00090ab8 00000000 00000000 0000000f 3fc0: 00000002 000103a4 00000002 0000016c 00090ab0 00090ab8 000107a0 00090998 3fe0: bed92be0 bed92bd0 0000b785 b6e8f6d0 40000010 00090ab8 00000000 00000000 [<c001c3e4>] (armpmu_map_event+0x20/0x88) from [<c001c17c>] (armpmu_event_init+0x38/0x280) [<c001c17c>] (armpmu_event_init+0x38/0x280) from [<c00bf180>] (perf_init_event+0x108/0x180) [<c00bf180>] (perf_init_event+0x108/0x180) from [<c00bf440>] (perf_event_alloc+0x248/0x40c) [<c00bf440>] (perf_event_alloc+0x248/0x40c) from [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) from [<c0012700>] (ret_fast_syscall+0x0/0x48) Code: 0a000005 e3540004 0a000016 e3540000 (0791010c) This is because event->attr.config in armpmu_event_init() contains a very large number copied directly from userspace and is never checked against the size of the array indexed in armpmu_map_hw_event(). Fix the problem by checking the value of config before indexing the array and rejecting invalid config values. Reported-by: NVince Weaver <vincent.weaver@maine.edu> Tested-by: NVince Weaver <vincent.weaver@maine.edu> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NStephen Boyd <sboyd@codeaurora.org> Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
-