Commits · 8d1a0ae724ad74ef7946a45e3b2d3e01f39df02b · openeuler / Kernel

26 Jan, 2016 1 commit

ARM: perf: Set ARMv7 SDER SUNIDEN bit · 8d1a0ae7

Authored by Martin Fuzzey on Jan 13, 2016

ARMv7 counters other than the CPU cycle counter only work if the Secure
Debug Enable Register (SDER) SUNIDEN bit is set.

Since access to the SDER is only possible in secure state, it will
only be done if the device tree property "secure-reg-access" is set.

Without this:

 Performance counter stats for 'sleep 1':

          14606094 cycles                    #    0.000 GHz
                 0 instructions              #    0.00  insns per cycle

After applying:

 Performance counter stats for 'sleep 1':

           5843809 cycles
           2566484 instructions              #    0.44  insns per cycle

       1.020144000 seconds time elapsed

Some platforms (e.g. i.MX53) may also need additional platform-specific
setup.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Pooya Keshavarzi <Pooya.Keshavarzi@de.bosch.com>
Signed-off-by: George G. Davis <george_davis@mentor.com>
[will: add warning if property is found on arm64]
Signed-off-by: Will Deacon <will.deacon@arm.com>

16 Nov, 2015 1 commit

drivers/perf: kill armpmu_register · b916b785

Authored by Mark Rutland on Oct 28, 2015

Nothing outside of drivers/perf/arm_pmu.c should call armpmu_register
any more, so it no longer needs to be in include/linux/perf/arm_pmu.h.
Additionally, by folding it in to arm_pmu_device_probe we can allow
drivers to override struct pmu fields without getting blatted by the
armpmu code.

This patch folds armpmu_register into arm_pmu_device_probe. The logging
to the console is moved to after the PMU is successfully registered with
the core perf code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Cc: Drew Richardson <drew.richardson@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

15 Oct, 2015 1 commit

drivers/perf: arm_pmu: avoid CPU device_node reference leak · fb659882

Authored by Will Deacon on Oct 12, 2015

of_cpu_device_node_get increments the reference count on the CPU
device_node, so we must take care to of_node_put once we've finished
with it.

This patch fixes the perf IRQ probing code to avoid the leak.

Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

31 Jul, 2015 4 commits

arm: perf: factor arm_pmu core out to drivers · fa8ad788

Authored by Mark Rutland on Jul 06, 2015

To enable sharing of the arm_pmu code with arm64, this patch factors it
out to drivers/perf/. A new drivers/perf directory is added for
performance monitor drivers to live under.

MAINTAINERS is updated accordingly. Files added previously without a
corresponding MAINTAINERS update (perf_regs.c, perf_callchain.c, and
perf_event.h) are also added.

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: augmented Kconfig help slightly]
Signed-off-by: Will Deacon <will.deacon@arm.com>

ARM: perf: replace arch_find_n_match_cpu_physical_id with of_cpu_device_node_get · bc1e3c46

Authored by Sudeep Holla on Jun 30, 2015

arch_find_n_match_cpu_physical_id parses the device tree to get the
device node for a given logical cpu index. However, since ARM PMUs get
probed after the CPU device nodes are stashed while registering the
cpus, we can use of_cpu_device_node_get to avoid another DT parse.

This patch replaces arch_find_n_match_cpu_physical_id with
of_cpu_device_node_get to reuse the stashed value directly instead.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ARM: perf: extend interrupt-affinity property for PPIs · b6c084d7

Authored by Will Deacon on Jun 29, 2015

On systems containing multiple, heterogeneous clusters we need a way to
associate a PMU "device" with the CPU(s) on which it exists. For PMUs
that signal overflow with SPIs, this relationship is determined via the
"interrupt-affinity" property, which contains a list of phandles to CPU
nodes for the PMU. For PMUs using PPIs, the per-cpu nature of the
interrupt isn't enough to determine the set of CPUs which actually
contain the device.

This patch allows the interrupt-affinity property to be specified on a
PMU node irrespective of the interrupt type. For PPIs, it identifies
the set of CPUs signalling the PPI in question.

Tested-by: Stephen Boyd <sboyd@codeaurora.org> # Krait PMU
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: Set affinity for PPI based PMUs · 8ae81c25

Authored by Stephen Boyd on Jun 29, 2015

For PPI based PMUs, we bail out early in of_pmu_irq_cfg() without
setting the PMU's supported_cpus bitmap. This causes the
smp_call_function_any() in armv7_probe_num_events() to fail. Set
the bitmap to be all CPUs so that we properly probe PMUs that use
PPIs.

Fixes: cc88116d ("arm: perf: treat PMUs as CPU affine")
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

17 Jul, 2015 1 commit

ARM: 8402/1: perf: Don't use of_node after putting it · 8e0c34b0

Authored by Stephen Boyd on Jul 07, 2015

It's possible, albeit unlikely, that using the of_node here will
reference freed memory. Call of_node_put() after printing the
name to be safe.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

10 Jul, 2015 1 commit

ARM: 8401/1: perf: Set affinity for PPI based PMUs · 8ded1e1a

Authored by Stephen Boyd on Jul 07, 2015

For PPI based PMUs, we bail out early in of_pmu_irq_cfg() without
setting the PMU's supported_cpus bitmap. This causes the
smp_call_function_any() in armv7_probe_num_events() to fail. Set
the bitmap to be all CPUs so that we properly probe PMUs that use
PPIs.

Fixes: cc88116d ("arm: perf: treat PMUs as CPU affine")
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

29 May, 2015 1 commit

arm: perf: unify perf_event{,_cpu}.c · 74cf0bc7

Authored by Mark Rutland on May 26, 2015

Now that the arm_pmu framework is only used for CPU PMUs, there's no
reason to keep the pseudo-generic and CPU-specific framework portions
separate.

This patch folds the two into perf_event.c.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: fixed up irq cfg to match upstream]
Signed-off-by: Will Deacon <will.deacon@arm.com>

28 May, 2015 1 commit

arm: perf: kill off unused pm callbacks · ed61f985

Authored by Mark Rutland on May 26, 2015

Currently the arm perf code has platdata callbacks for runtime PM and
irq handling, but no platform implements the hooks for the former. Kill
these off.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

27 May, 2015 2 commits

arm: perf: filter unschedulable events · c904e32a

Authored by Mark Rutland on May 13, 2015

Different CPU microarchitectures implement different PMU events, and
thus events which can be scheduled on one microarchitecture cannot be
scheduled on another, and vice-versa. Some architected events behave
differently across microarchitectures, and thus cannot be meaningfully
summed. Due to this, we reject the scheduling of an event on a CPU of a
different microarchitecture to that the event targets.

When the core perf code is scheduling events and encounters an event
which cannot be scheduled, it stops attempting to schedule events. As
the perf core periodically rotates the list of events, for some
proportion of the time events which are unschedulable will block events
which are schedulable, resulting in low utilisation of the hardware
counters.

This patch implements a pmu::filter_match callback such that we can
detect and skip such events while scheduling early, before they can
block the schedulable events. This prevents the low HW counter
utilisation issue.
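The blocking behaviour described above can be modelled in a few lines. This is a simplified Python sketch, not the kernel's C code: `Event`, `schedule`, `num_counters` and the CPU numbering are hypothetical names chosen for illustration, and the only assumption carried over from the text is that scheduling stops at the first event that cannot be placed.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    supported_cpus: frozenset  # CPUs whose PMU can count this event


def filter_match(event, cpu):
    """Model of a pmu::filter_match callback: can this event run on this CPU?"""
    return cpu in event.supported_cpus


def schedule(events, cpu, num_counters, use_filter):
    """Return the names of the events placed on counters of `cpu`."""
    scheduled = []
    for event in events:
        if use_filter and not filter_match(event, cpu):
            continue  # skipped early, so it cannot block later events
        if cpu not in event.supported_cpus or len(scheduled) >= num_counters:
            break  # scheduling stops at the first event that cannot be placed
        scheduled.append(event.name)
    return scheduled


# Hypothetical two-cluster system: CPUs 0-1 are "little", CPUs 2-3 are "big".
events = [
    Event("big_only", frozenset({2, 3})),
    Event("little_a", frozenset({0, 1})),
    Event("little_b", frozenset({0, 1})),
]
```

In this model, calling `schedule(events, 0, 4, use_filter=False)` places nothing on CPU 0 because `big_only` sits at the head of the list, whereas the filtered variant skips it and places both little-cluster events.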

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: treat PMUs as CPU affine · cc88116d

Authored by Mark Rutland on May 13, 2015

In multi-cluster systems, the PMUs can be different across clusters, and
so our logical PMU may not be able to schedule events on all CPUs.

This patch adds a cpumask to encode which CPUs a PMU driver supports
controlling events for, and limits the driver to scheduling events on
those CPUs, and enabling and disabling the physical PMUs on those CPUs.
The cpumask is built based on the interrupt-affinity property, and in
the absence of such a property a homogenous system is assumed.
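As a rough illustration of how such a mask could be derived, the Python sketch below models the rule stated above. It is not the driver's code: representing the interrupt-affinity property as a list of CPU numbers (or `None` when the property is absent) is an assumption made for this example, as are the function names.

```python
def supported_cpus(interrupt_affinity, possible_cpus):
    """Model of building the PMU's cpumask.

    interrupt_affinity: list of CPU numbers named by the property, or None
    when the property is absent (hypothetical representation).
    """
    if interrupt_affinity is None:
        # No property: assume a homogeneous system, so every CPU is supported.
        return set(possible_cpus)
    return set(interrupt_affinity) & set(possible_cpus)


def can_schedule(cpu, mask):
    """The driver only schedules events on CPUs contained in the mask."""
    return cpu in mask
```

For example, with four possible CPUs and an affinity list of `[2, 3]`, the model refuses to schedule on CPU 0 but accepts CPU 2.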

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

20 Mar, 2015 1 commit

ARM: perf: reject groups spanning multiple hardware PMUs · e429817b

Authored by Suzuki K. Poulose on Mar 17, 2015

The perf core implicitly rejects events spanning multiple HW PMUs, as in
these cases the event->ctx will differ. However this validation is
performed after pmu::event_init() is called in perf_init_event(), and
thus pmu::event_init() may be called with a group leader from a
different HW PMU.

The ARM PMU driver does not take this fact into account, and when
validating groups assumes that it can call to_arm_pmu(event->pmu) for
any HW event. When the event in question is from another HW PMU this is
wrong, and results in dereferencing garbage.

This patch updates the ARM PMU driver to first test for and reject
events from other PMUs, moving the to_arm_pmu and related logic after
this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with
a CCI PMU present:

 ---
CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57
Hardware name: ARM-Versatile Express
task: bd8484c0 ti: be676000 task.ti: be676000
PC is at 0xbf1bbc90
LR is at validate_event+0x34/0x5c
pc : [<bf1bbc90>]    lr : [<80016060>]    psr: 00000013
...
[<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90)
[<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218)
[<80016398>] (armpmu_event_init) from [<800882e4>] (perf_try_init_event+0x30/0x48)
[<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4)
[<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c)
[<8008f8a8>] (perf_event_alloc) from [<8009015c>] (SyS_perf_event_open+0x498/0xa70)
[<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34)
Code: bf1be000 bf1bb380 802a2664 00000000 (00000002)
---[ end trace 01aff0ff00926a0a ]---

Also cleans up the code to use the arm_pmu only when we know that
we are dealing with an arm pmu event.
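The ordering of the checks can be sketched as follows. This is a Python model under stated assumptions, not the driver's C code: the `Pmu` and `Event` classes are hypothetical, and the software-event shortcut is taken from the older commit c95eb318 further down this log rather than from this patch.

```python
from dataclasses import dataclass


@dataclass
class Pmu:
    name: str
    is_software: bool = False


@dataclass
class Event:
    pmu: Pmu


def validate_event(pmu, event):
    """Return True if `event` may be part of a group validated for `pmu`."""
    if event.pmu.is_software:
        return True  # software events can always be scheduled
    if event.pmu is not pmu:
        return False  # reject groups spanning multiple hardware PMUs
    # Only past this point is it safe to treat event.pmu as this driver's
    # PMU (where the C code would call to_arm_pmu()).
    return True


def validate_group(pmu, events):
    """A group is valid only if every member passes validate_event."""
    return all(validate_event(pmu, event) for event in events)
```

In this model a group mixing a CPU PMU event with a software event validates, while a group mixing a CPU PMU event with an event from a second hardware PMU (such as a CCI PMU) is rejected before any driver-specific data is touched.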

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

13 Jan, 2015 1 commit

ARM: 8255/1: perf: Prevent wraparound during overflow · 2d9ed740

Authored by Daniel Thompson on Jan 05, 2015

If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values, however even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.
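The clamp and the resulting interrupt rate can be checked with a little arithmetic. The Python sketch below is a model only: it assumes a 32-bit up-counting counter that is programmed with the two's complement of the remaining period, and the function names are invented for this example.

```python
MAX_PERIOD = 0xFFFFFFFF  # 32-bit counter, per the 0xffffffff boundary above


def clamp_period(left, max_period=MAX_PERIOD):
    """Limit the remaining period to half of max_period."""
    return min(left, max_period >> 1)


def start_value(left, max_period=MAX_PERIOD):
    """Counter value to program so that it overflows after the clamped period.

    Assumes the counter counts up and wraps at max_period.
    """
    return (-clamp_period(left, max_period)) & max_period


def interrupt_rate_hz(counter_hz, max_period=MAX_PERIOD):
    """Worst-case overflow interrupt rate once the clamp is in place."""
    return counter_hz / (max_period >> 1)
```

With these assumptions, a requested period of 0xffffffff is clamped to 0x7fffffff, leaving half of the counter range as headroom before the counter could overtake its programmed value. For a 4 GHz counter, `interrupt_rate_hz(4e9)` evaluates to about 1.9 interrupts per second, the same order of magnitude as the figure quoted above.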

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

13 Dec, 2014 1 commit

ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM · bf7c5449

Authored by Rafael J. Wysocki on Dec 13, 2014

After commit b2b49ccb (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME may now be changed to depend on
CONFIG_PM.

Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere in the code under
arch/arm/ (the defconfig files will be modified later).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Nishanth Menon <nm@ti.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>

30 Oct, 2014 5 commits

arm: perf: fold percpu_pmu into pmu_hw_events · 5ebd9200

Authored by Mark Rutland on May 13, 2014

Currently the percpu_pmu pointers used as percpu_irq dev_id values are
defined separately from the other per-cpu accounting data, which make
dynamically allocating the data (as will be required for systems with
heterogeneous CPUs) difficult.

This patch moves the percpu_pmu pointers into pmu_hw_events (which is
itself allocated per cpu), which will allow for easier dynamic
allocation. Both percpu and regular irqs are requested using percpu_pmu
pointers as tokens, freeing us from having to know whether an irq is
percpu within the handler, and thus avoiding a radix tree lookup on the
handler path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: kill get_hw_events() · 11679250

Authored by Mark Rutland on May 13, 2014

Now that the arm pmu code is limited to CPU PMUs the get_hw_events()
function is superfluous, as we'll always have a set of per-cpu
pmu_hw_events structures.

This patch removes the get_hw_events() function, replacing it with
a percpu hw_events pointer. Uses of get_hw_events are updated to use
this_cpu_ptr.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: limit size of accounting data · a4560846

Authored by Mark Rutland on May 13, 2014

Commit 3fc2c830 (ARM: perf: remove event limit from pmu_hw_events) got
rid of the upper limit on the number of events an arm_pmu could handle,
but introduced additional complexity and places a burden on each PMU
driver to allocate accounting data somehow. So far this has not
generally been useful as the only users of arm_pmu are the CPU backend
and the CCI driver.

Now that the CCI driver plugs into the perf subsystem directly, we can
remove some of the complexities that get in the way of supporting
heterogeneous CPU PMUs.

This patch restores the original limits on pmu_hw_events fields such
that the pmu_hw_events data can be allocated as a contiguous block. This
will simplify dynamic pmu_hw_events allocation in later patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: use IDR types for CPU PMUs · 67b4305a

Authored by Mark Rutland on Sep 12, 2012

For systems with heterogeneous CPUs (e.g. big.LITTLE systems) the PMUs
can be different in each cluster, and not all events can be migrated
between clusters. To allow userspace to deal with this, it must be
possible to address each PMU independently.

This patch changes PMUs to be registered with dynamic (IDR) types,
allowing them to be targeted individually. Each PMU's type can be found
in ${SYSFS_ROOT}/bus/event_source/devices/${PMU_NAME}/type.

From userspace, raw events can be targeted at a specific PMU:
$ perf stat -e ${PMU_NAME}/config=V,config1=V1,.../

Doing this does not break existing tools which use existing perf types:
when perf core can't find a PMU of matching type (in perf_init_event)
it'll iterate over the set of all PMUs. If a compatible PMU exists,
it'll be found eventually. If more than one compatible PMU exists, the
event will be handled by whichever PMU happens to be earlier in the pmus
list (which currently will be the last compatible PMU registered).
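A userspace tool could look up these dynamic type values along the following lines. This is a minimal Python sketch, assuming that ${SYSFS_ROOT} is mounted at /sys (the usual location, but an assumption here); the helper names `pmu_type` and `list_pmus` are invented for this example.

```python
from pathlib import Path


def pmu_type(pmu_name, sysfs_root="/sys"):
    """Return the dynamic perf type id of a PMU as an int."""
    path = Path(sysfs_root) / "bus" / "event_source" / "devices" / pmu_name / "type"
    return int(path.read_text().strip())


def list_pmus(sysfs_root="/sys"):
    """Return {pmu_name: type} for every PMU that exposes a type file."""
    devices = Path(sysfs_root) / "bus" / "event_source" / "devices"
    return {
        entry.name: pmu_type(entry.name, sysfs_root)
        for entry in sorted(devices.iterdir())
        if (entry / "type").is_file()
    }
```

The value returned for a given PMU is what a tool would place in the `type` field of the event attributes to target that PMU specifically, instead of relying on the fallback iteration described above.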

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm: perf: factor out callchain code · d39976f0

Authored by Mark Rutland on Sep 29, 2014

The ARM callchain handling code is currently bundled with the ARM PMU
management code, despite the two having no dependency on each other.
This bundling has the unfortunate property of making callchain handling
depend on CONFIG_HW_PERF_EVENTS, even though the callchain handling
could be applied to software events in the absence of PMU hardware
support.

This patch separates the two, placing the callchain handling in
perf_callchain.c and making it depend on CONFIG_PERF_EVENTS rather than
CONFIG_HW_PERF_EVENTS, enabling callchain recording on kernels built
without hardware perf event support.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

18 Jul, 2014 1 commit

ARM: 8071/1: perf: Make perf use arm_get_current_stackframe · 6888e32a

Authored by Nikolay Borisov on Jun 03, 2014

Make the perf backend use the API so that it correctly references the FP
when in THUMB2 mode.

Signed-off-by: Nikolay Borisov <Nikolay.Borisov@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

09 Jul, 2014 2 commits

ARM: perf: disable the pagefault handler when reading from user space · 4b2974fa

Authored by Jean Pihet on Jul 07, 2014

Under perf, the fp unwinding scheme requires access to user space memory
and can provoke a pagefault via call to __copy_from_user_inatomic from
user_backtrace. This unwinding can take place in response to an interrupt
(__perf_event_overflow). This is undesirable as we may already have
mmap_sem held for write. One example is a process that calls mprotect
just as the PMU counters overflow.

An example that can provoke this behaviour:
perf record -e event:tocapture --call-graph fp ./application_to_test

This patch addresses this issue by disabling pagefaults briefly in
user_backtrace (as is done in the other architectures: ARM64, x86, Sparc etc.).

Without the patch a deadlock occurs when __perf_event_overflow is called
while reading the data from the user space:

 [ INFO: possible recursive locking detected ]
 3.16.0-rc2-00038-g0ed7ff6 #46 Not tainted
 ---------------------------------------------
 stress/1634 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c001dc04>] do_page_fault+0xa8/0x428

 but task is already holding lock:
 (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8

 other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&mm->mmap_sem);
  lock(&mm->mmap_sem);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

 2 locks held by stress/1634:
 #0:  (&mm->mmap_sem){++++++}, at: [<c00f4098>] SyS_mprotect+0xa8/0x1c8
 #1:  (rcu_read_lock){......}, at: [<c00c29dc>] __perf_event_overflow+0x120/0x294

 stack backtrace:
 CPU: 1 PID: 1634 Comm: stress Not tainted 3.16.0-rc2-00038-g0ed7ff6 #46
 [<c0017c8c>] (unwind_backtrace) from [<c0012eec>] (show_stack+0x20/0x24)
 [<c0012eec>] (show_stack) from [<c04de914>] (dump_stack+0x7c/0x98)
 [<c04de914>] (dump_stack) from [<c006a360>] (__lock_acquire+0x1484/0x1cf0)
 [<c006a360>] (__lock_acquire) from [<c006b14c>] (lock_acquire+0xa4/0x11c)
 [<c006b14c>] (lock_acquire) from [<c04e3880>] (down_read+0x40/0x7c)
 [<c04e3880>] (down_read) from [<c001dc04>] (do_page_fault+0xa8/0x428)
 [<c001dc04>] (do_page_fault) from [<c00084ec>] (do_DataAbort+0x44/0xa8)
 [<c00084ec>] (do_DataAbort) from [<c0013a1c>] (__dabt_svc+0x3c/0x60)
 Exception stack(0xed7c5ae0 to 0xed7c5b28)
 5ae0: ed7c5b5c b6dadff4 ffffffec 00000000 b6dadff4 ebc08000 00000000 ebc08000
 5b00: 0000007e 00000000 ed7c4000 ed7c5b94 00000014 ed7c5b2c c001a438 c0236c60
 5b20: 00000013 ffffffff
 [<c0013a1c>] (__dabt_svc) from [<c0236c60>] (__copy_from_user+0xa4/0x3a4)

Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ARM: perf: Check that current->mm is alive before getting user callchain · a7cc9100

Authored by Jean Pihet on Jul 07, 2014

An event may occur when an mm is already released.

As per commit 20afc60f
 'x86, perf: Check that current->mm is alive before getting user callchain'

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

05 Jun, 2014 1 commit

perf/ARM: Use common PMU interrupt disabled code · edcb4d3c

Authored by Vince Weaver on May 16, 2014

Make the ARM perf code use the new common PMU interrupt disabled code.

This allows perf to work on ARM machines without a working PMU
interrupt (for example, Raspberry Pi).

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
[peterz: applied changes suggested by Will]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: devicetree@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1405161712190.11099@vincent-weaver-1.umelst.maine.edu
[ Small readability tweaks to the code. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

21 Feb, 2014 3 commits

ARM: perf: hook up perf_sample_event_took around pmu irq handling · 5f5092e7

Authored by Will Deacon on Feb 11, 2014

Since we indirect all of our PMU IRQ handling through a dispatcher, it's
trivial to hook up perf_sample_event_took to prevent applications such
as oprofile from generating interrupt storms due to an unrealistically
low sample period.

Reported-by: Robert Richter <rric@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ARM: perf: add hook for event index clearing · eab443ef

Authored by Stephen Boyd on Feb 07, 2014

On Krait processors we have a many-to-one relationship between
raw CPU events and the event programmed into the PMNx counter.
Two raw CPU events could map to the same value programmed in the
PMNx counter. To avoid this problem, we check for collisions
during the get_event_idx() callback by setting a bit in a bitmap
whenever a certain event is used in a PMNx counter (see the next
patch). Unfortunately, we don't have a hook to clear this bit in
the bitmap when the event is deleted so let's add an optional
clear_event_idx() callback for this purpose.
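The collision bookkeeping described above can be modelled with a plain integer used as a bitmap. This is a Python sketch, not the Krait driver's code: the `EventBitmap` class, the `to_programmed` mapping function and the use of `None` to signal a collision are all assumptions made for illustration.

```python
class EventBitmap:
    """Tracks which programmed counter values are currently in use."""

    def __init__(self):
        self.used = 0  # one bit per programmed value

    def get_event_idx(self, raw_event, to_programmed):
        """Claim the programmed value for raw_event.

        Returns the claimed value, or None if another raw event that maps
        to the same programmed value already holds it.
        """
        bit = to_programmed(raw_event)
        if self.used & (1 << bit):
            return None  # collision: the value is already in use
        self.used |= 1 << bit
        return bit

    def clear_event_idx(self, raw_event, to_programmed):
        """Release the bit when the event is deleted."""
        self.used &= ~(1 << to_programmed(raw_event))
```

Without the clearing hook the bit would stay set after the first event is deleted, so a later event mapping to the same programmed value would be rejected indefinitely; that is the gap the optional callback closes.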

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ARM: perf: support percpu irqs for the CPU PMU · bbd64559

Authored by Stephen Boyd on Feb 07, 2014

Some CPU PMUs are wired up with one PPI for all the CPUs instead
of with a different SPI for each CPU. Add support for these
devices.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

17 Dec, 2013 1 commit

Revert "ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD" · 9450d14f

Authored by Will Deacon on Nov 27, 2013

This reverts commit 3581fe0e.

Fixes to the handling of PERF_EVENT_IOC_PERIOD in the core code mean
we no longer have to play this horrible game.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1385560479-11014-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

09 Oct, 2013 1 commit

ARM: perf: fix group validation for mixed software and hardware groups · 2dfcb802

Authored by Will Deacon on Oct 09, 2013

Since software events can always be scheduled, perf allows software and
hardware events to be mixed together in the same event group. There are
two ways in which this can come about:

  (1) A SW event is added to a HW group. This validates using the HW PMU
      of the group leader.

  (2) A HW event is added to a SW group. This inserts the SW events and
      the new HW event into a HW context, but the SW event remains the
      group leader.

When validating the latter case, we would ideally compare the PMU of
each event in the group with the relevant HW PMU. The problem is, in the
face of potentially multiple HW PMUs, we don't have a handle on the
relevant structure. Commit 7b9f72c6 ("ARM: perf: clean up event
group validation") attempted to resolve this issue, but actually made
things *worse* by comparing with the leader PMU. If the leader is a SW
event, then we automatically `pass' all the HW events during validation!

This patch removes the check against the leader PMU. Whilst this will
allow events from multiple HW PMUs to be grouped together, that should
probably be dealt with in perf core as the result of a later patch.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

14 Aug, 2013 3 commits

perf/arm: Fix armpmu_map_hw_event() · b88a2595

Authored by Stephen Boyd on Aug 07, 2013

Fix constraint check in armpmu_map_hw_event().

Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ARM: 7810/1: perf: Fix array out of bounds access in armpmu_map_hw_event() · d9f96635

Authored by Stephen Boyd on Aug 08, 2013

Vince Weaver reports an oops in the ARM perf event code while
running his perf_fuzzer tool on a pandaboard running v3.11-rc4.

Unable to handle kernel paging request at virtual address 73fd14cc
pgd = eca6c000
[73fd14cc] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in: snd_soc_omap_hdmi omapdss snd_soc_omap_abe_twl6040 snd_soc_twl6040 snd_soc_omap snd_soc_omap_hdmi_card snd_soc_omap_mcpdm snd_soc_omap_mcbsp snd_soc_core snd_compress regmap_spi snd_pcm snd_page_alloc snd_timer snd soundcore
CPU: 1 PID: 2790 Comm: perf_fuzzer Not tainted 3.11.0-rc4 #6
task: eddcab80 ti: ed892000 task.ti: ed892000
PC is at armpmu_map_event+0x20/0x88
LR is at armpmu_event_init+0x38/0x280
pc : [<c001c3e4>]    lr : [<c001c17c>]    psr: 60000013
sp : ed893e40  ip : ecececec  fp : edfaec00
r10: 00000000  r9 : 00000000  r8 : ed8c3ac0
r7 : ed8c3b5c  r6 : edfaec00  r5 : 00000000  r4 : 00000000
r3 : 000000ff  r2 : c0496144  r1 : c049611c  r0 : edfaec00
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: aca6c04a  DAC: 00000015
Process perf_fuzzer (pid: 2790, stack limit = 0xed892240)
Stack: (0xed893e40 to 0xed894000)
3e40: 00000800 c001c17c 00000002 c008a748 00000001 00000000 00000000 c00bf078
3e60: 00000000 edfaee50 00000000 00000000 00000000 edfaec00 ed8c3ac0 edfaec00
3e80: 00000000 c073ffac ed893f20 c00bf180 00000001 00000000 c00bf078 ed893f20
3ea0: 00000000 ed8c3ac0 00000000 00000000 00000000 c0cb0818 eddcab80 c00bf440
3ec0: ed893f20 00000000 eddcab80 eca76800 00000000 eca76800 00000000 00000000
3ee0: 00000000 ec984c80 eddcab80 c00bfe68 00000000 00000000 00000000 00000080
3f00: 00000000 ed892000 00000000 ed892030 00000004 ecc7e3c8 ecc7e3c8 00000000
3f20: 00000000 00000048 ecececec 00000000 00000000 00000000 00000000 00000000
3f40: 00000000 00000000 00297810 00000000 00000000 00000000 00000000 00000000
3f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3f80: 00000002 00000002 000103a4 00000002 0000016c c00128e8 ed892000 00000000
3fa0: 00090998 c0012700 00000002 000103a4 00090ab8 00000000 00000000 0000000f
3fc0: 00000002 000103a4 00000002 0000016c 00090ab0 00090ab8 000107a0 00090998
3fe0: bed92be0 bed92bd0 0000b785 b6e8f6d0 40000010 00090ab8 00000000 00000000
[<c001c3e4>] (armpmu_map_event+0x20/0x88) from [<c001c17c>] (armpmu_event_init+0x38/0x280)
[<c001c17c>] (armpmu_event_init+0x38/0x280) from [<c00bf180>] (perf_init_event+0x108/0x180)
[<c00bf180>] (perf_init_event+0x108/0x180) from [<c00bf440>] (perf_event_alloc+0x248/0x40c)
[<c00bf440>] (perf_event_alloc+0x248/0x40c) from [<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc)
[<c00bfe68>] (SyS_perf_event_open+0x4f4/0x8fc) from [<c0012700>] (ret_fast_syscall+0x0/0x48)
Code: 0a000005 e3540004 0a000016 e3540000 (0791010c)

This is because event->attr.config in armpmu_event_init()
contains a very large number copied directly from userspace and
is never checked against the size of the array indexed in
armpmu_map_hw_event(). Fix the problem by checking the value of
config before indexing the array and rejecting invalid config
values.
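The shape of the fix can be sketched as a bounds check placed ahead of the table lookup. The Python model below is illustrative only and is not the kernel's C code: the table contents, the `HW_OP_UNSUPPORTED` sentinel value and the function name `map_hw_event` are assumptions for this example.

```python
import errno

HW_OP_UNSUPPORTED = 0xFFFF  # sentinel for "no mapping" (value assumed here)


def map_hw_event(event_map, config):
    """Translate a generic hardware event id into a PMU event number.

    Returns a negative errno value on failure, as kernel code
    conventionally does.
    """
    if not 0 <= config < len(event_map):
        return -errno.EINVAL  # reject before indexing the table
    mapping = event_map[config]
    return -errno.ENOENT if mapping == HW_OP_UNSUPPORTED else mapping
```

With a three-entry table, a fuzzer-style config such as 0x73fd14cc is rejected with `-EINVAL` instead of being used as an index, while in-range entries marked unsupported still yield `-ENOENT`.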

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7809/1: perf: fix event validation for software group leaders · c95eb318

Authored by Will Deacon on Aug 07, 2013

It is possible to construct an event group with a software event as a
group leader and then subsequently add a hardware event to the group.
This results in the event group being validated by adding all members
of the group to a fake PMU and attempting to allocate each event on
their respective PMU.

Unfortunately, for software events without a corresponding arm_pmu, this
results in a kernel crash attempting to dereference the ->get_event_idx
function pointer.

This patch fixes the problem by checking explicitly for software events
and ignoring those in event validation (since they can always be
scheduled). We will probably want to revisit this for 3.12, since the
validation checks don't appear to work correctly when dealing with
multiple hardware PMUs anyway.

Cc: <stable@vger.kernel.org>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

24 Jun, 2013 1 commit

ARM: 7765/1: perf: Record the user-mode PC in the call chain. · c5f927a6

Authored by Jed Davis on Jun 20, 2013

With this change, we no longer lose the innermost entry in the user-mode
part of the call chain.  See also the x86 port, which includes the ip.

It's possible to partially work around this problem by post-processing
the data to use the PERF_SAMPLE_IP value, but this works only if the CPU
wasn't in the kernel when the sample was taken.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jed Davis <jld@mozilla.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

c5f927a6
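The effect of c5f927a6 above can be pictured with a toy unwinder that stores the sampled PC before walking the saved frames. The `callchain` and `frame` structures and the function names are hypothetical simplifications, not the kernel's perf callchain API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 8

/* Hypothetical call chain buffer. */
struct callchain {
    size_t nr;
    uintptr_t ip[MAX_DEPTH];
};

/* Toy frame record: a return address plus a link to the caller's frame. */
struct frame {
    uintptr_t return_addr;
    const struct frame *caller;
};

static void store(struct callchain *c, uintptr_t ip)
{
    if (c->nr < MAX_DEPTH)
        c->ip[c->nr++] = ip;
}

static void user_callchain(struct callchain *c, uintptr_t pc,
                           const struct frame *fp)
{
    /* Innermost entry: the instruction that was executing at sample time. */
    store(c, pc);

    /* Then one entry per saved frame, moving outwards through the callers. */
    for (; fp != NULL; fp = fp->caller)
        store(c, fp->return_addr);
}
```

If the first `store()` call is dropped, the chain starts at the caller of the sampled function, which is the lost innermost entry the message refers to.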

17 Apr, 2013 1 commit

ARM: 7698/1: perf: fix group validation when using enable_on_exec · cb2d8b34

Committed by Will Deacon on Apr 12, 2013

Events may be created with attr->disabled == 1 and attr->enable_on_exec
== 1, which confuses the group validation code: events in the
PERF_EVENT_STATE_OFF state are not considered candidates for scheduling,
so an oversized group passes validation and then fails at group
scheduling time.

This patch fixes the validation check for ARM, so that events in the
OFF state are still considered when enable_on_exec is true.

Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Reported-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

cb2d8b34
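The validation rule from cb2d8b34 above can be modelled as a simple counter budget. The state enum, `fake_event`, and both function names are hypothetical; only the rule itself (an OFF event still counts when enable_on_exec is set) comes from the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event states; the kernel's PERF_EVENT_STATE_* set is larger. */
enum event_state { STATE_OFF, STATE_INACTIVE, STATE_ACTIVE };

struct fake_event {
    enum event_state state;
    bool enable_on_exec;
};

/* Does this event have to be given a counter during group validation? */
static bool needs_counter(const struct fake_event *ev)
{
    /* An OFF event is skipped only if nothing will switch it on at exec. */
    return !(ev->state == STATE_OFF && !ev->enable_on_exec);
}

/* The group is accepted only if all counted events fit at the same time. */
static bool group_fits(const struct fake_event *evs, size_t n, size_t counters)
{
    size_t needed = 0;

    for (size_t i = 0; i < n; i++)
        if (needs_counter(&evs[i]))
            needed++;
    return needed <= counters;
}
```

With this rule a group of enable_on_exec events that is too large is rejected when it is created, instead of failing later when exec enables it.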

07 Mar, 2013 1 commit

ARM: 7667/1: perf: Fix section mismatch on armpmu_init() · 44d6b1fc

Committed by Stephen Boyd on Mar 05, 2013

WARNING: vmlinux.o(.text+0xfb80): Section mismatch in reference
from the function armpmu_register() to the function
.init.text:armpmu_init()
The function armpmu_register() references
the function __init armpmu_init().
This is often because armpmu_register lacks a __init
annotation or the annotation of armpmu_init is wrong.

Just drop the __init marking on armpmu_init() because
armpmu_register() no longer has an __init marking.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

44d6b1fc

04 Mar, 2013 1 commit

ARM: 7664/1: perf: remove erroneous semicolon from event initialisation · e595ede6

Committed by Chen Gang on Feb 28, 2013

Commit 9dcbf466 ("ARM: perf: simplify __hw_perf_event_init err
handling") tidied up the error handling code for perf event
initialisation on ARM, but a copy-and-paste error left a dangling
semicolon at the end of an if statement.

This patch removes the broken semicolon, restoring the old group
validation semantics.

Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

e595ede6
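To see why the stray semicolon fixed in e595ede6 above changes behaviour, it helps to write the statement the way the compiler parses it. The two functions below are a hypothetical reduction with invented names, not the kernel's __hw_perf_event_init().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * As written in the buggy code (shape only):
 *
 *     if (!group_is_valid);
 *         return -EINVAL;
 *
 * The ';' is an empty statement that forms the whole body of the if,
 * so the compiler reads it as the function below.
 */
static int init_as_parsed(bool group_is_valid)
{
    if (!group_is_valid)
        ;                  /* empty body: the test has no effect */
    return -EINVAL;        /* runs unconditionally */
}

/* With the semicolon removed, the return is guarded by the test again. */
static int init_fixed(bool group_is_valid)
{
    if (!group_is_valid)
        return -EINVAL;
    return 0;
}
```

In the parsed form every call fails with -EINVAL, valid group or not, which is why removing the semicolon restores the old validation semantics.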

19 Jan, 2013 1 commit

ARM: perf: simplify __hw_perf_event_init err handling · 9dcbf466

Committed by Mark Rutland on Jan 18, 2013

Currently __hw_perf_event_init has an err variable that's ignored right
until the end, where it's initialised, conditionally set, and then used
as a boolean flag deciding whether to return another error code.

This patch removes the err variable and simplifies the associated error
handling logic.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
9dcbf466

18 Jan, 2013 1 commit

ARM: perf: remove unnecessary checks for idx < 0 · 8f3b90b5

Committed by Mark Rutland on Jan 18, 2013

We currently check for hwc->idx < 0 in armpmu_read and armpmu_del
unnecessarily. The only case where hwc->idx < 0 is when armpmu_add
fails, in which case the event's state is set to
PERF_EVENT_STATE_INACTIVE.

The perf core will not attempt to read from an event in
PERF_EVENT_STATE_INACTIVE, and so the check in armpmu_read is
unnecessary. Similarly, if perf core cannot add an event it will not
attempt to delete it, so the WARN_ON in armpmu_del is unnecessary.

This patch removes these two redundant checks.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

8f3b90b5

09 Nov, 2012 1 commit

ARM: PMU: fix runtime PM enable · 2ac29a14

Committed by Jon Hunter on Oct 25, 2012

Commit 7be2958e (ARM: PMU: Add runtime PM Support) updated the ARM PMU code to
use runtime PM, which was prototyped and validated on the OMAP devices. That
commit contains no call to pm_runtime_enable(); for OMAP devices,
pm_runtime_enable() is currently called from the OMAP PMU code when the
PMU device is created. However, there are two problems with this:

1. For any other ARM device wishing to use runtime PM for PMU they will need
to call pm_runtime_enable() for runtime PM to work.
2. When booting with device-tree and using device-tree to create the PMU
device, pm_runtime_enable() needs to be called from within the ARM PERF
driver as we are no longer calling any device specific code to create the
device. Hence, PMU does not work on OMAP devices that use the runtime PM
callbacks when using device-tree to create the PMU device.

Therefore, call pm_runtime_enable() directly from the ARM PMU driver when
registering the device. For platforms that do not use runtime PM,
pm_runtime_enable() does nothing and for platforms that do use runtime PM but
may not require it specifically for PMU, this will just add a little overhead
when initialising and uninitialising the PMU device.

Tested with PERF on OMAP2420, OMAP3430 and OMAP4460.

Acked-by: Kevin Hilman <khilman@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

2ac29a14
