1. 03 September 2016 (2 commits)
  2. 10 August 2016 (2 commits)
    • drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property · 7f1d642f
      Committed by Marc Zyngier
      Patch 19a469a5 ("drivers/perf: arm-pmu: Handle per-interrupt
      affinity mask") added support for partitioned PPI setups, but
      inadvertently broke setups using SPIs without the "interrupt-affinity"
      property (which is the case for UP platforms).
      
      This patch restores the lost functionality by testing whether the
      interrupt is percpu or not, instead of relying on the using_spi flag,
      which really means "SPI *and* interrupt-affinity property".
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Fixes: 19a469a5 ("drivers/perf: arm-pmu: Handle per-interrupt affinity mask")
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      7f1d642f
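      The shape of the fix, as a hedged sketch (not the verbatim patch: irq_is_percpu(), request_percpu_irq() and request_irq() are the kernel helpers involved, while the function name and the hw_events variable are abbreviations assumed to be in scope):

      ```c
      /* Sketch: request the PMU interrupt based on what the interrupt *is*,
       * not on the using_spi flag, which conflates "SPI" with "SPI plus an
       * interrupt-affinity property" and so broke UP platforms. */
      static int pmu_request_irq(struct arm_pmu *armpmu, int irq)
      {
      	if (irq_is_percpu(irq))
      		/* PPI: a single per-cpu request covers all CPUs */
      		return request_percpu_irq(irq, armpmu->handle_irq,
      					  "arm-pmu",
      					  &hw_events->percpu_pmu);

      	/* SPI, with or without "interrupt-affinity": a normal request */
      	return request_irq(irq, armpmu->handle_irq,
      			   IRQF_NOBALANCING | IRQF_NO_THREAD, "arm-pmu",
      			   per_cpu_ptr(&hw_events->percpu_pmu, 0));
      }
      ```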
    • drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock · a026bb12
      Committed by Sudeep Holla
      arm_pmu_mutex is never held for long, and we must not sleep while the
      lock is held, since it is taken in the context of hotplug notifiers.
      So it can be converted to a simple spinlock instead.
      
      Without this patch we get the following warning:
      
      BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
      in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/2
      no locks held by swapper/2/0.
      irq event stamp: 381314
      hardirqs last  enabled at (381313): _raw_spin_unlock_irqrestore+0x7c/0x88
      hardirqs last disabled at (381314): cpu_die+0x28/0x48
      softirqs last  enabled at (381294): _local_bh_enable+0x28/0x50
      softirqs last disabled at (381293): irq_enter+0x58/0x78
      CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.7.0 #12
      Call trace:
       dump_backtrace+0x0/0x220
       show_stack+0x24/0x30
       dump_stack+0xb4/0xf0
       ___might_sleep+0x1d8/0x1f0
       __might_sleep+0x5c/0x98
       mutex_lock_nested+0x54/0x400
       arm_perf_starting_cpu+0x34/0xb0
       cpuhp_invoke_callback+0x88/0x3d8
       notify_cpu_starting+0x78/0x98
       secondary_start_kernel+0x108/0x1a8
      
      This patch converts the mutex to a spinlock to eliminate the above
      warning. This constrains pmu->reset to be a non-blocking call, which is
      the case for all the ARM PMU backends.
      
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Fixes: 37b502f1 ("arm/perf: Fix hotplug state machine conversion")
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      a026bb12
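      The conversion pattern, modelled in userspace with POSIX spinlocks (a hedged analogue, not the kernel patch: the kernel change swaps DEFINE_MUTEX/mutex_lock for DEFINE_SPINLOCK/spin_lock around the short, non-sleeping arm_pmu list manipulation):

      ```c
      #include <pthread.h>
      #include <stdio.h>

      /* Userspace analogue: a short critical section that must never
       * sleep is protected by a spinlock rather than a mutex, which
       * mirrors why arm_pmu_mutex could become a spinlock. */
      static pthread_spinlock_t lock;
      static int counter;

      static void *bump(void *arg)
      {
      	(void)arg;
      	for (int i = 0; i < 100000; i++) {
      		pthread_spin_lock(&lock);
      		counter++;	/* short, non-blocking critical section */
      		pthread_spin_unlock(&lock);
      	}
      	return NULL;
      }

      int main(void)
      {
      	pthread_t a, b;

      	pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
      	pthread_create(&a, NULL, bump, NULL);
      	pthread_create(&b, NULL, bump, NULL);
      	pthread_join(a, NULL);
      	pthread_join(b, NULL);
      	printf("%d\n", counter);
      	pthread_spin_destroy(&lock);
      	return 0;
      }
      ```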
  3. 20 July 2016 (1 commit)
    • arm/perf: Fix hotplug state machine conversion · 37b502f1
      Committed by Sebastian Andrzej Siewior
      Mark Rutland pointed out that this commit is incomplete:
      
        7d88eb69 ("arm/perf: Convert to hotplug state machine")
      
      The problem is that:
      
       > We may have multiple PMUs (e.g. two in big.LITTLE systems), and
       > __oprofile_cpu_pmu only contains one of these. So this conversion is not
       > correct.
       >
       > We were relying on the notifier list implicitly containing a list of
       > those PMUs. It seems like we need an explicit list here.
       >
       > We keep __oprofile_cpu_pmu around for legacy 32-bit users of OProfile
       > (on non-heterogeneous systems), and that's all that the variable should
       > be used for.
      
      Introduce arm_pmu_list to correctly handle multiple PMUs in the system.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-tip-commits@vger.kernel.org
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160719111733.GA22911@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      37b502f1
  4. 15 July 2016 (1 commit)
  5. 09 July 2016 (1 commit)
  6. 15 June 2016 (1 commit)
  7. 03 June 2016 (3 commits)
  8. 05 May 2016 (1 commit)
    • perf/arm: Special-case heterogeneous CPUs · 5101ef20
      Committed by Mark Rutland
      Commit:
      
        26657848 ("perf/core: Verify we have a single perf_hw_context PMU")
      
      forcefully prevents multiple PMUs from sharing perf_hw_context, as this
      generally doesn't make sense. It is a common bug for uncore PMUs to
      use perf_hw_context rather than perf_invalid_context, which this detects.
      
      However, systems exist with heterogeneous CPUs (and hence heterogeneous
      HW PMUs), for which sharing perf_hw_context is necessary, and possible
      in some limited cases.
      
      To make this work we have to perform some gymnastics, as we did in these
      commits:
      
        66eb579e ("perf: allow for PMU-specific event filtering")
        c904e32a ("arm: perf: filter unschedulable events")
      
      To allow those systems to work, we must allow PMUs for heterogeneous
      CPUs to share perf_hw_context, though we must still disallow sharing
      otherwise to detect the common misuse of perf_hw_context.
      
      This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates
      the core logic to account for this, and makes use of it in the arm_pmu
      code that is used for systems with heterogeneous CPUs. Comments are
      added to make the rationale clear and hopefully avoid accidental abuse.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5101ef20
  9. 21 April 2016 (1 commit)
    • drivers/perf: arm-pmu: fix RCU usage on pmu resume from low-power · cbcc72e0
      Committed by Lorenzo Pieralisi
      Commit da4e4f18 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
      added code in the arm perf infrastructure that allows the kernel to
      save/restore perf counters whenever the CPU enters a low-power
      state. The kernel saves/restores the counters for each active event
      through the armpmu_{stop/start} ARM pmu API, so that the low-power state
      enter/exit cycle is emulated through pmu start/stop operations for each
      event in use.
      
      However, calling armpmu_start() for each active event on power up
      executes code that requires RCU locking (perf_event_update_userpage())
      to be functional. Given that the core may call the CPU_PM notifiers
      while running the idle thread in a quiescent RCU state, this is not
      allowed, as detected through the following splat when the kernel is run
      with CONFIG_PROVE_LOCKING enabled:
      
      [   49.293286]
      [   49.294761] ===============================
      [   49.298895] [ INFO: suspicious RCU usage. ]
      [   49.303031] 4.6.0-rc3+ #421 Not tainted
      [   49.306821] -------------------------------
      [   49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
      illegally while idle!
      [   49.318530]
      [   49.318530] other info that might help us debug this:
      [   49.318530]
      [   49.326451]
      [   49.326451] RCU used illegally from idle CPU!
      [   49.326451] rcu_scheduler_active = 1, debug_locks = 0
      [   49.337209] RCU used illegally from extended quiescent state!
      [   49.342892] 2 locks held by swapper/2/0:
      [   49.346768]  #0:  (cpu_pm_notifier_lock){......}, at:
      [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
      [   49.355492]  #1:  (rcu_read_lock){......}, at: [<ffffff800816dc38>]
      perf_event_update_userpage+0x0/0x260
      
      This patch wraps the armpmu_start() call (which indirectly calls
      perf_event_update_userpage()) on CPU_PM notifier power state exit (or
      failed entry) within the RCU_NONIDLE() macro, so that the RCU subsystem
      is made aware that the calling CPU is not idle, from an RCU
      perspective, for the duration of the armpmu_start() call, therefore
      fixing the issue.
      
      Fixes: da4e4f18 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reported-by: James Morse <james.morse@arm.com>
      Suggested-by: Kevin Hilman <khilman@baylibre.com>
      Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
      Cc: Kevin Hilman <khilman@baylibre.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      cbcc72e0
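      The essence of the fix, as a hedged sketch (the helper name cpu_pm_pmu_setup and the surrounding notifier are abbreviated from the commit message, not the verbatim patch; RCU_NONIDLE() is the real kernel macro):

      ```c
      /* Sketch: on CPU_PM exit (or failed entry) the PMU restart path ends
       * up in perf_event_update_userpage() and thus rcu_read_lock(), so it
       * must run with RCU watching; RCU_NONIDLE() arranges exactly that
       * for the duration of the wrapped call. */
      static int cpu_pm_pmu_notify(struct notifier_block *b,
      			     unsigned long cmd, void *v)
      {
      	switch (cmd) {
      	case CPU_PM_ENTER:
      		cpu_pm_pmu_setup(armpmu, cmd);	/* stop active events */
      		break;
      	case CPU_PM_EXIT:
      	case CPU_PM_ENTER_FAILED:
      		/* Tell RCU this CPU is not idle while restarting events */
      		RCU_NONIDLE(cpu_pm_pmu_setup(armpmu, cmd));
      		break;
      	}
      	return NOTIFY_OK;
      }
      ```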
  10. 21 March 2016 (1 commit)
  11. 26 February 2016 (1 commit)
    • drivers/perf: arm_pmu: implement CPU_PM notifier · da4e4f18
      Committed by Lorenzo Pieralisi
      When a CPU is suspended (either through suspend-to-RAM or CPUidle),
      the content of its PMU registers can be lost, which means that counter
      register values that were initialized on power-down entry have to be
      reprogrammed on power-up to make sure the counter set-up is preserved
      (i.e. on power-up the registers take the reset values of a Cold or
      Warm reset, which can be architecturally UNKNOWN).
      
      To guarantee seamless profiling conditions across a core power down,
      this patch adds a CPU PM notifier to ARM PMUs that, upon CPU PM
      entry/exit from low-power states, saves/restores the PMU register
      set-up (using the ARM perf API), so that the power-down/up cycle does
      not affect the perf behaviour (apart from an unavoidable black-out
      period between power-up/down CPU PM notifications).
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
      Acked-by: Kevin Hilman <khilman@baylibre.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      da4e4f18
  12. 09 February 2016 (1 commit)
    • drivers/perf: arm_pmu: make info messages more verbose · c6b90653
      Committed by Dirk Behme
      On a big.LITTLE system, e.g. one with Cortex-A57 and Cortex-A53, if
      not all cores are online at PMU probe time we might get
      
      hw perfevents: failed to probe PMU!
      hw perfevents: failed to register PMU devices!
      
      making it unclear which PMU failed here.
      
      Add the full device tree name of the failing node and the error value,
      resulting in a more verbose and helpful message like
      
      hw perfevents: /soc/pmu_a53: failed to probe PMU! Error -6
      hw perfevents: /soc/pmu_a53: failed to register PMU devices! Error -6
      Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c6b90653
  13. 26 January 2016 (1 commit)
  14. 16 November 2015 (1 commit)
    • drivers/perf: kill armpmu_register · b916b785
      Committed by Mark Rutland
      Nothing outside of drivers/perf/arm_pmu.c should call armpmu_register
      any more, so it no longer needs to be in include/linux/perf/arm_pmu.h.
      Additionally, by folding it into arm_pmu_device_probe we can allow
      drivers to override struct pmu fields without them being clobbered by
      the armpmu code.
      
      This patch folds armpmu_register into arm_pmu_device_probe. The logging
      to the console is moved to after the PMU is successfully registered with
      the core perf code.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Cc: Drew Richardson <drew.richardson@arm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b916b785
  15. 15 October 2015 (1 commit)
  16. 31 July 2015 (4 commits)
    • arm: perf: factor arm_pmu core out to drivers · fa8ad788
      Committed by Mark Rutland
      To enable sharing of the arm_pmu code with arm64, this patch factors it
      out to drivers/perf/. A new drivers/perf directory is added for
      performance monitor drivers to live under.
      
      MAINTAINERS is updated accordingly. Files added previously without a
      corresponding MAINTAINERS update (perf_regs.c, perf_callchain.c, and
      perf_event.h) are also added.
      
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      [will: augmented Kconfig help slightly]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fa8ad788
    • ARM: perf: replace arch_find_n_match_cpu_physical_id with of_cpu_device_node_get · bc1e3c46
      Committed by Sudeep Holla
      arch_find_n_match_cpu_physical_id parses the device tree to get the
      device node for a given logical CPU index. However, since ARM PMUs get
      probed after the CPU device nodes are stashed while registering the
      CPUs, we can use of_cpu_device_node_get to avoid another DT parse.
      
      This patch replaces arch_find_n_match_cpu_physical_id with
      of_cpu_device_node_get to reuse the stashed value directly instead.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      bc1e3c46
    • ARM: perf: extend interrupt-affinity property for PPIs · b6c084d7
      Committed by Will Deacon
      On systems containing multiple, heterogeneous clusters we need a way to
      associate a PMU "device" with the CPU(s) on which it exists. For PMUs
      that signal overflow with SPIs, this relationship is determined via the
      "interrupt-affinity" property, which contains a list of phandles to CPU
      nodes for the PMU. For PMUs using PPIs, the per-cpu nature of the
      interrupt isn't enough to determine the set of CPUs which actually
      contain the device.
      
      This patch allows the interrupt-affinity property to be specified on a
      PMU node irrespective of the interrupt type. For PPIs, it identifies
      the set of CPUs signalling the PPI in question.
      
      Tested-by: Stephen Boyd <sboyd@codeaurora.org> # Krait PMU
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b6c084d7
    • arm: perf: Set affinity for PPI based PMUs · 8ae81c25
      Committed by Stephen Boyd
      For PPI based PMUs, we bail out early in of_pmu_irq_cfg() without
      setting the PMU's supported_cpus bitmap. This causes the
      smp_call_function_any() in armv7_probe_num_events() to fail. Set
      the bitmap to be all CPUs so that we properly probe PMUs that use
      PPIs.
      
      Fixes: cc88116d ("arm: perf: treat PMUs as CPU affine")
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8ae81c25
  17. 17 July 2015 (1 commit)
  18. 10 July 2015 (1 commit)
  19. 29 May 2015 (1 commit)
  20. 28 May 2015 (1 commit)
  21. 27 May 2015 (2 commits)
    • arm: perf: filter unschedulable events · c904e32a
      Committed by Mark Rutland
      Different CPU microarchitectures implement different PMU events, and
      thus events which can be scheduled on one microarchitecture cannot be
      scheduled on another, and vice versa. Some architected events behave
      differently across microarchitectures, and thus cannot be meaningfully
      summed. Due to this, we reject the scheduling of an event on a CPU of
      a different microarchitecture from the one the event targets.
      
      When the core perf code is scheduling events and encounters an event
      which cannot be scheduled, it stops attempting to schedule events. As
      the perf core periodically rotates the list of events, for some
      proportion of the time events which are unschedulable will block events
      which are schedulable, resulting in low utilisation of the hardware
      counters.
      
      This patch implements a pmu::filter_match callback so that we can
      detect and skip such events early during scheduling, before they can
      block the schedulable events. This prevents the low HW counter
      utilisation issue.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c904e32a
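      The effect of the callback can be modelled in plain C (a hedged userspace model, not the kernel implementation: the supported_cpus bitmask stands in for the arm_pmu cpumask, and filter_match() stands in for the pmu::filter_match callback):

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Model: each PMU advertises the set of CPUs it can schedule events
       * on. filter_match() lets the scheduler skip, rather than block on,
       * events whose PMU does not cover the CPU currently being scheduled. */
      struct model_pmu {
      	uint32_t supported_cpus;	/* bit n set => CPU n supported */
      };

      static int filter_match(const struct model_pmu *pmu, int cpu)
      {
      	return (pmu->supported_cpus >> cpu) & 1;
      }

      int main(void)
      {
      	struct model_pmu little = { .supported_cpus = 0x0F }; /* CPUs 0-3 */
      	struct model_pmu big    = { .supported_cpus = 0x30 }; /* CPUs 4-5 */

      	/* Scheduling on CPU 2: little-cluster events match; big-cluster
      	 * events are skipped instead of stalling the event rotation. */
      	assert(filter_match(&little, 2) == 1);
      	assert(filter_match(&big, 2) == 0);
      	printf("ok\n");
      	return 0;
      }
      ```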
    • arm: perf: treat PMUs as CPU affine · cc88116d
      Committed by Mark Rutland
      In multi-cluster systems, the PMUs can be different across clusters, and
      so our logical PMU may not be able to schedule events on all CPUs.
      
      This patch adds a cpumask to encode which CPUs a PMU driver supports
      controlling events for, limiting the driver to scheduling events on
      those CPUs and to enabling and disabling the physical PMUs on those
      CPUs. The cpumask is built from the interrupt-affinity property; in
      the absence of such a property, a homogeneous system is assumed.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      cc88116d
  22. 20 March 2015 (1 commit)
    • ARM: perf: reject groups spanning multiple hardware PMUs · e429817b
      Committed by Suzuki K. Poulose
      The perf core implicitly rejects events spanning multiple HW PMUs, as in
      these cases the event->ctx will differ. However this validation is
      performed after pmu::event_init() is called in perf_init_event(), and
      thus pmu::event_init() may be called with a group leader from a
      different HW PMU.
      
      The ARM PMU driver does not take this fact into account, and when
      validating groups assumes that it can call to_arm_pmu(event->pmu) for
      any HW event. When the event in question is from another HW PMU this is
      wrong, and results in dereferencing garbage.
      
      This patch updates the ARM PMU driver to first test for and reject
      events from other PMUs, moving the to_arm_pmu and related logic after
      this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
      a CCI PMU present:
      
       ---
      CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57
      Hardware name: ARM-Versatile Express
      task: bd8484c0 ti: be676000 task.ti: be676000
      PC is at 0xbf1bbc90
      LR is at validate_event+0x34/0x5c
      pc : [<bf1bbc90>]    lr : [<80016060>]    psr: 00000013
      ...
      [<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90)
      [<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218)
      [<80016398>] (armpmu_event_init) from [<800882e4>] (perf_try_init_event+0x30/0x48)
      [<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4)
      [<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c)
      [<8008f8a8>] (perf_event_alloc) from [<8009015c>] (SyS_perf_event_open+0x498/0xa70)
      [<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34)
      Code: bf1be000 bf1bb380 802a2664 00000000 (00000002)
      ---[ end trace 01aff0ff00926a0a ]---
      
      This also cleans up the code to use the arm_pmu only when we know that
      we are dealing with an ARM PMU event.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      e429817b
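      The ordering the fix establishes can be modelled in a few lines of standalone C (a hedged model with illustrative names, not the kernel's structures): check that the event belongs to our struct pmu before any PMU-specific downcast.

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Model of the fix's ordering: before treating a group member as an
       * ARM PMU event (the to_arm_pmu() downcast in the kernel), first
       * check that it belongs to the same struct pmu at all. The previous
       * code downcast unconditionally and dereferenced garbage. */
      struct pmu { const char *name; };
      struct event { struct pmu *pmu; };

      static int validate_event(struct pmu *ours, struct event *ev)
      {
      	/* Reject events owned by another HW PMU *before* any
      	 * PMU-specific dereference. */
      	if (ev->pmu != ours)
      		return 0;
      	/* ...only now is ARM-specific validation safe... */
      	return 1;
      }

      int main(void)
      {
      	struct pmu arm_pmu = { "cpu-pmu" };
      	struct pmu cci_pmu = { "cci-pmu" };
      	struct event cpu_ev = { &arm_pmu };
      	struct event cci_ev = { &cci_pmu };

      	assert(validate_event(&arm_pmu, &cpu_ev) == 1);
      	assert(validate_event(&arm_pmu, &cci_ev) == 0);
      	printf("ok\n");
      	return 0;
      }
      ```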
  23. 13 January 2015 (1 commit)
    • ARM: 8255/1: perf: Prevent wraparound during overflow · 2d9ed740
      Committed by Daniel Thompson
      If the overflow threshold for a counter is set above or near the
      0xffffffff boundary then the kernel may lose track of the overflow
      causing only events that occur *after* the overflow to be recorded.
      Specifically the problem occurs when the value of the performance counter
      overtakes its original programmed value due to wrap around.
      
      Typical solutions to this problem are either to avoid programming in
      values likely to be overtaken or to treat the overflow bit as the 33rd
      bit of the counter.
      
      It's somewhat fiddly to refactor the code to correctly handle the 33rd
      bit during irqsave sections (context switches, for example), so instead
      we take the simpler approach of avoiding values likely to be overtaken.
      
      We set the limit to half of max_period because this matches the limit
      imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
      rate for large threshold values, however even with a very fast counter
      ticking at 4GHz the interrupt rate would only be ~1Hz.
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      2d9ed740
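      The arithmetic of the fix in a standalone sketch (clamp_period is an assumed helper name; the kernel applies the cap in its period-programming path):

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Cap the programmed period at half of max_period, so the running
       * counter can never overtake its programmed start value before the
       * overflow interrupt is handled. Worst case this doubles the
       * interrupt rate, which is still only a couple of interrupts per
       * second even for a 4GHz counter. */
      static uint64_t clamp_period(uint64_t left, uint64_t max_period)
      {
      	return left > max_period / 2 ? max_period / 2 : left;
      }

      int main(void)
      {
      	const uint64_t max_period = 0xFFFFFFFFULL;	/* 32-bit counter */

      	/* A threshold near the 32-bit boundary is pulled back... */
      	assert(clamp_period(0xFFFFFFF0ULL, max_period) == 0x7FFFFFFFULL);
      	/* ...while ordinary thresholds pass through unchanged. */
      	assert(clamp_period(100000, max_period) == 100000);
      	printf("ok\n");
      	return 0;
      }
      ```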
  24. 13 December 2014 (1 commit)
  25. 30 October 2014 (5 commits)
    • arm: perf: fold percpu_pmu into pmu_hw_events · 5ebd9200
      Committed by Mark Rutland
      Currently the percpu_pmu pointers used as percpu_irq dev_id values are
      defined separately from the other per-cpu accounting data, which makes
      dynamically allocating the data (as will be required for systems with
      heterogeneous CPUs) difficult.
      
      This patch moves the percpu_pmu pointers into pmu_hw_events (which is
      itself allocated per cpu), which will allow for easier dynamic
      allocation. Both percpu and regular irqs are requested using percpu_pmu
      pointers as tokens, freeing us from having to know whether an irq is
      percpu within the handler, and thus avoiding a radix tree lookup on the
      handler path.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Tested-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      5ebd9200
    • arm: perf: kill get_hw_events() · 11679250
      Committed by Mark Rutland
      Now that the arm pmu code is limited to CPU PMUs the get_hw_events()
      function is superfluous, as we'll always have a set of per-cpu
      pmu_hw_events structures.
      
      This patch removes the get_hw_events() function, replacing it with
      a percpu hw_events pointer. Uses of get_hw_events are updated to use
      this_cpu_ptr.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      11679250
    • arm: perf: limit size of accounting data · a4560846
      Committed by Mark Rutland
      Commit 3fc2c830 (ARM: perf: remove event limit from pmu_hw_events) got
      rid of the upper limit on the number of events an arm_pmu could handle,
      but introduced additional complexity and places a burden on each PMU
      driver to allocate accounting data somehow. So far this has not
      generally been useful as the only users of arm_pmu are the CPU backend
      and the CCI driver.
      
      Now that the CCI driver plugs into the perf subsystem directly, we can
      remove some of the complexities that get in the way of supporting
      heterogeneous CPU PMUs.
      
      This patch restores the original limits on pmu_hw_events fields such
      that the pmu_hw_events data can be allocated as a contiguous block. This
      will simplify dynamic pmu_hw_events allocation in later patches.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Tested-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      a4560846
    • arm: perf: use IDR types for CPU PMUs · 67b4305a
      Committed by Mark Rutland
      For systems with heterogeneous CPUs (e.g. big.LITTLE systems) the PMUs
      can be different in each cluster, and not all events can be migrated
      between clusters. To allow userspace to deal with this, it must be
      possible to address each PMU independently.
      
      This patch changes PMUs to be registered with dynamic (IDR) types,
      allowing them to be targeted individually. Each PMU's type can be found
      in ${SYSFS_ROOT}/bus/event_source/devices/${PMU_NAME}/type.
      
      From userspace, raw events can be targeted at a specific PMU:
      $ perf stat -e ${PMU_NAME}/config=V,config1=V1,.../
      
      Doing this does not break existing tools which use existing perf types:
      when perf core can't find a PMU of matching type (in perf_init_event)
      it'll iterate over the set of all PMUs. If a compatible PMU exists,
      it'll be found eventually. If more than one compatible PMU exists, the
      event will be handled by whichever PMU happens to be earlier in the pmus
      list (which currently will be the last compatible PMU registered).
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      67b4305a
    • arm: perf: factor out callchain code · d39976f0
      Committed by Mark Rutland
      The ARM callchain handling code is currently bundled with the ARM PMU
      management code, despite the two having no dependency on each other.
      This bundling has the unfortunate property of making callchain handling
      depend on CONFIG_HW_PERF_EVENTS, even though the callchain handling
      could be applied to software events in the absence of PMU hardware
      support.
      
      This patch separates the two, placing the callchain handling in
      perf_callchain.c and making it depend on CONFIG_PERF_EVENTS rather than
      CONFIG_HW_PERF_EVENTS, enabling callchain recording on kernels built
      without hardware perf event support.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d39976f0
  26. 18 July 2014 (1 commit)
  27. 09 July 2014 (2 commits)
    • ARM: perf: disable the pagefault handler when reading from user space · 4b2974fa
      Committed by Jean Pihet
      Under perf, the fp unwinding scheme requires access to user space
      memory and can provoke a pagefault via a call to
      __copy_from_user_inatomic from user_backtrace. This unwinding can take
      place in response to an interrupt (__perf_event_overflow). This is
      undesirable as we may already have mmap_sem held for write. One
      example is a process that calls mprotect just as the PMU counters
      overflow.
      
      An example that can provoke this behaviour:
      perf record -e event:tocapture --call-graph fp ./application_to_test
      
      This patch addresses this issue by disabling pagefaults briefly in
      user_backtrace (as is done in the other architectures: ARM64, x86, Sparc etc.).
      
      Without the patch a deadlock occurs when __perf_event_overflow is called
      while reading the data from the user space:
      
       [ INFO: possible recursive locking detected ]
       3.16.0-rc2-00038-g0ed7ff6 #46 Not tainted
       ---------------------------------------------
       stress/1634 is trying to acquire lock:
       (&mm->mmap_sem){++++++}, at: [<c001dc04>] do_page_fault+0xa8/0x428
      
       but task is already holding lock:
       (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8
      
       other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&mm->mmap_sem);
        lock(&mm->mmap_sem);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
       2 locks held by stress/1634:
       #0:  (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8
       #1:  (rcu_read_lock){......}, at: [<c00c29dc>] __perf_event_overflow+0x120/0x294
      
       stack backtrace:
       CPU: 1 PID: 1634 Comm: stress Not tainted 3.16.0-rc2-00038-g0ed7ff6 #46
       [<c0017c8c>] (unwind_backtrace) from [<c0012eec>] (show_stack+0x20/0x24)
       [<c0012eec>] (show_stack) from [<c04de914>] (dump_stack+0x7c/0x98)
       [<c04de914>] (dump_stack) from [<c006a360>] (__lock_acquire+0x1484/0x1cf0)
       [<c006a360>] (__lock_acquire) from [<c006b14c>] (lock_acquire+0xa4/0x11c)
       [<c006b14c>] (lock_acquire) from [<c04e3880>] (down_read+0x40/0x7c)
       [<c04e3880>] (down_read) from [<c001dc04>] (do_page_fault+0xa8/0x428)
       [<c001dc04>] (do_page_fault) from [<c00084ec>] (do_DataAbort+0x44/0xa8)
       [<c00084ec>] (do_DataAbort) from [<c0013a1c>] (__dabt_svc+0x3c/0x60)
       Exception stack(0xed7c5ae0 to 0xed7c5b28)
       5ae0: ed7c5b5c b6dadff4 ffffffec 00000000 b6dadff4 ebc08000 00000000 ebc08000
       5b00: 0000007e 00000000 ed7c4000 ed7c5b94 00000014 ed7c5b2c c001a438 c0236c60
       5b20: 00000013 ffffffff
       [<c0013a1c>] (__dabt_svc) from [<c0236c60>] (__copy_from_user+0xa4/0x3a4)
      Acked-by: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      4b2974fa
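      The pattern behind the fix, as a hedged kernel-side sketch (abbreviated; the real site is user_backtrace() in the ARM perf callchain code, and pagefault_disable()/pagefault_enable() are the real kernel helpers):

      ```c
      /* Sketch: disable the pagefault handler around the inatomic user
       * copy, so a fault taken in interrupt context fails fast instead of
       * re-entering do_page_fault() and trying to retake a possibly-held
       * mmap_sem. */
      static int read_user_frame(void __user *src, void *dst, size_t n)
      {
      	unsigned long err;

      	pagefault_disable();
      	err = __copy_from_user_inatomic(dst, src, n);
      	pagefault_enable();

      	return err ? -EFAULT : 0;
      }
      ```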
    • ARM: perf: Check that current->mm is alive before getting user callchain · a7cc9100
      Committed by Jean Pihet
      An event may occur when an mm is already released.
      
      As per commit 20afc60f
       'x86, perf: Check that current->mm is alive before getting user callchain'
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      a7cc9100