Commits · d0cb4260f899d07462d49fc67e29f2438dbaca2f · openeuler / raspberrypi-kernel

16 Mar, 2010 · 3 commits

perf probe: Use original address instead of CU-based address · d0cb4260

Authored by Masami Hiramatsu on Mar 15, 2010

Pass the original address, rather than the CU-based address, to
dwarf_getlocation_addr() when looking up the location of variables.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100315170235.31852.91195.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf probe: Fix offset to allow signed value · 67c7ff7c

Authored by Masami Hiramatsu on Mar 15, 2010

Change the dereference offset from uintmax_t to intmax_t, because
it can be negative (for example, a local variable's offset
from the frame pointer).
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100315170228.31852.71946.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
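
To illustrate why the offset has to be signed, the following Python sketch stores the same value the way an unsigned and a signed 64-bit integer would. It is not the perf code: the frame-pointer offset of -8 is a made-up example, and `ctypes.c_uint64` / `ctypes.c_int64` merely stand in for `uintmax_t` / `intmax_t` under the assumption that both are 64 bits wide.

```python
import ctypes


def as_unsigned(offset: int) -> int:
    """Store the offset the way a 64-bit unsigned type would (wraps around)."""
    return ctypes.c_uint64(offset).value


def as_signed(offset: int) -> int:
    """Store the offset the way a 64-bit signed type would."""
    return ctypes.c_int64(offset).value


fp_offset = -8  # hypothetical: a local variable 8 bytes below the frame pointer
print(as_unsigned(fp_offset))  # a huge positive number, not a usable offset
print(as_signed(fp_offset))    # the intended negative offset
```

With the unsigned type the negative offset turns into a very large positive value, which is the kind of misinterpretation the commit avoids.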

perf top: Improve the autosizing of column lengths · b63be8d7

Authored by Arnaldo Carvalho de Melo on Mar 15, 2010

When profiling C++ workloads the symbol name length can be
really big, so cap it before it garbles the result.

This builds upon the autosizing already present, where we choose
to use the short basename of a DSO instead of its long, full
pathname.
Reported-by: Pavel Krauz <krauz@cngroup.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268676230-9261-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
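
The idea of capping an autosized column can be sketched in Python as follows. This is only an illustration: the function names, the cap of 20 characters and the sample symbol names are all assumptions, not values taken from perf top.

```python
def column_width(symbol_names, cap):
    """Autosize to the longest name, but never exceed the cap."""
    longest = max((len(name) for name in symbol_names), default=0)
    return min(longest, cap)


def format_symbol(name, width):
    """Truncate or pad a name so that it occupies exactly `width` characters."""
    return name[:width].ljust(width)


# Hypothetical symbols: one short C name and one long C++ name.
symbols = ["schedule", "std::vector<std::basic_string<char>>::push_back"]
width = column_width(symbols, cap=20)
rows = [format_symbol(name, width) for name in symbols]
```

Every row ends up with the same width, so a single very long C++ symbol can no longer push the other columns off the screen.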

13 Mar, 2010 · 2 commits

perf probe: Fix need_dwarf flag if lazy matching is used · fc6ceea0

Authored by Masami Hiramatsu on Mar 12, 2010

Set need_dwarf if a lazy matching pattern is specified, because
lazy matching requires the real source path, for which we must use
debuginfo.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20100312232224.2017.54550.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf probe: Fix probe_point buffer overrun · 594087a0

Authored by Masami Hiramatsu on Mar 12, 2010

Fix probe_point array-size overrun problem. In some cases (e.g.
inline function), one user-specified probe-point can be
translated to many probe addresses, which overruns the pre-defined
array size. This also removes the redundant MAX_PROBES macro
definition.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: <stable@kernel.org>
LKML-Reference: <20100312232217.2017.45017.stgit@localhost6.localdomain6>
[ Note that only root can create new probes. Eventually we should remove
  the MAX_PROBES limit, but that is a larger patch not eligible for
  perf/urgent treatment. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 Mar, 2010 · 1 commit

perf record: Don't try to find buildids in a zero sized file · 9f591fd7

Authored by Arnaldo Carvalho de Melo on Mar 11, 2010

Fixing this symptom:

 [acme@mica linux-2.6-tip]$ perf record -a -f
   Fatal: Permission error - are you root?

 Bus error
 [acme@mica linux-2.6-tip]$

I.e. if for some reason no data is collected (in this case because a
non-root user tried to do system-wide profiling), we end up trying to
mmap a zero-sized file and access the file header, b00m.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org>
LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
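
The guard described above, checking the file size before mapping it, can be sketched in Python. This is a minimal illustration and not the perf implementation: `read_header`, `HEADER_SIZE` and the file names are hypothetical.

```python
import mmap
import os
import tempfile

HEADER_SIZE = 8  # hypothetical header length, for illustration only


def read_header(path):
    """Return the first HEADER_SIZE bytes via mmap, or None for an empty file."""
    if os.path.getsize(path) == 0:
        return None  # nothing was recorded, so there is no header to look at
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:HEADER_SIZE]


# Demo with two temporary files: one empty, one with some bytes in it.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.data")
    open(empty_path, "wb").close()
    full_path = os.path.join(tmp, "full.data")
    with open(full_path, "wb") as f:
        f.write(b"HEADER01payload")
    empty_header = read_header(empty_path)
    full_header = read_header(full_path)
```

The empty file is rejected before any mapping is attempted, so the header access never touches memory that is not backed by file data.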

11 Mar, 2010 · 7 commits

perf: export perf_trace_regs and perf_arch_fetch_caller_regs · 639fe4b1

Authored by Xiao Guangrong on Mar 11, 2010

Export perf_trace_regs and perf_arch_fetch_caller_regs since modules will
use these.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
[ use EXPORT_PER_CPU_SYMBOL_GPL() ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Fix hw_perf_enable() event assignment · 45e16a68

Authored by Peter Zijlstra on Mar 11, 2010

What happens is that we schedule badly like:

<...>-1987  [019]   280.252808: x86_pmu_start: event-46/1300c0: idx: 0
<...>-1987  [019]   280.252811: x86_pmu_start: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252812: x86_pmu_start: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252813: x86_pmu_start: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252814: x86_pmu_start: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
<...>-1987  [019]   280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252834: x86_pmu_start: event-47/1300c0: idx: 1
<...>-1987  [019]   280.252834: x86_pmu_start: event-48/1300c0: idx: 2
<...>-1987  [019]   280.252835: x86_pmu_start: event-49/1300c0: idx: 3
<...>-1987  [019]   280.252836: x86_pmu_start: event-50/1300c0: idx: 32
<...>-1987  [019]   280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*

This happens because we only iterate the n_running events in the first
pass, and reset their index to -1 if they don't match to force a
re-assignment.

Now, in our RR example, n_running == 0 because we fully unscheduled, so
event-50 will retain its idx==32, even though in scheduling it will have
gotten idx=0, and we don't trigger the re-assign path.

The easiest way to fix this is the below patch, which simply validates
the full assignment in the second pass.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268311069.5037.31.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
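
The two-pass logic can be modelled with a small Python simulation. It is a simplified sketch, not the kernel code: `apply_assignment` and its parameters are hypothetical, and the scheduler result `[1, 2, 3, 0, 32]` is an assumption inferred from the trace and the explanation above (event-50 still carries idx 32 but was scheduled onto idx 0, and the new event-51 was scheduled onto idx 32).

```python
def apply_assignment(prev_idx, assign, n_running, validate_all):
    """Return the counter index each event ends up programmed on.

    prev_idx: index each event remembers from before (-1 means none).
    assign:   index the scheduler picked for each event this time.
    """
    idx = list(prev_idx)
    # First pass: only events that were still running get a stale index reset.
    for i in range(n_running):
        if idx[i] != assign[i]:
            idx[i] = -1
    # Second pass: program events that need a (re)assignment.
    for i in range(len(idx)):
        if validate_all:
            needs_assign = idx[i] != assign[i]   # validate the full assignment
        else:
            needs_assign = idx[i] == -1          # trust whatever index is left
        if needs_assign:
            idx[i] = assign[i]
    return idx


prev_idx = [1, 2, 3, 32, -1]   # events 47..50 keep old indices, 51 is new
assign = [1, 2, 3, 0, 32]      # assumed scheduler output
before_fix = apply_assignment(prev_idx, assign, n_running=0, validate_all=False)
after_fix = apply_assignment(prev_idx, assign, n_running=0, validate_all=True)
```

Without the validation, two events end up on index 32, which corresponds to the failing start in the trace; with it, every event lands on the index the scheduler chose.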

perf, ppc: Fix compile error due to new cpu notifiers · 85cfabbc

Authored by Peter Zijlstra on Mar 11, 2010

Fix:

arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'

Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Make the install relative to DESTDIR if specified · 7ae5f213

Authored by John Kacur on Mar 11, 2010

Without this change, the install path is relative to
prefix/DESTDIR where prefix is automatically set to $HOME.

This can produce unexpected results. For example:

  make -C tools/perf DESTDIR=/home/jkacur/tmp install-man

creates the directory:		/home/jkacur/home/jkacur/tmp/share/...
instead of the expected:	/home/jkacur/tmp/share/...
Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Kyle McMartin <kyle@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

kprobes: Calculate the index correctly when freeing the out-of-line execution slot · 83ff56f4

Authored by Masami Hiramatsu on Mar 09, 2010

From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

When freeing the instruction slot, the arithmetic to calculate
the index of the slot in the page needs to account for the total
size of the instruction on the various architectures.

Calculate the index correctly when freeing the out-of-line
execution slot.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <4B9667AB.9050507@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf tools: Fix sparse CPU numbering related bugs · a12b51c4

Authored by Paul Mackerras on Mar 10, 2010

At present, the perf subcommands that do system-wide monitoring
(perf stat, perf record and perf top) don't work properly unless
the online cpus are numbered 0, 1, ..., N-1.  These tools ask
for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
and then try to create events for cpus 0, 1, ..., N-1.

This creates problems for systems where the online cpus are
numbered sparsely.  For example, a POWER6 system in
single-threaded mode (i.e. only running 1 hardware thread per
core) will have only even-numbered cpus online.

This fixes the problem by reading the /sys/devices/system/cpu/online
file to find out which cpus are online.  The code that does that is in
tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
function that sets up a cpumap[] array and returns the number of
online cpus.  If /sys/devices/system/cpu/online can't be read or
can't be parsed successfully, it falls back to using sysconf to
ask how many cpus are online and sets up an identity map in cpumap[].

The perf record, perf stat and perf top code then calls
read_cpu_map() in the system-wide monitoring case (instead of
sysconf) and uses cpumap[] to get the cpu numbers to pass to
perf_event_open.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
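
The approach described above can be sketched in Python. This is an illustration rather than the code in tools/perf/util/cpumap.[ch]: the helper `parse_cpu_list` is hypothetical, this `read_cpu_map` returns a list instead of filling a `cpumap[]` array, and `os.cpu_count()` stands in for the sysconf fallback. The sketch assumes the file holds a comma-separated list of cpu numbers and ranges such as `0,2,4-6`.

```python
import os


def parse_cpu_list(text):
    """Parse a list such as '0,2,4-6' into [0, 2, 4, 5, 6]."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_cpu_map(path="/sys/devices/system/cpu/online"):
    """Return the online cpu numbers, falling back to an identity map."""
    try:
        with open(path) as f:
            cpus = parse_cpu_list(f.read())
        if cpus:
            return cpus
    except (OSError, ValueError):
        pass  # unreadable or unparsable: use the fallback below
    # Fallback: assume cpus are numbered 0..N-1 (stand-in for sysconf).
    return list(range(os.cpu_count() or 1))


# A system with only even-numbered cpus online would look like this:
sparse = parse_cpu_list("0,2,4,6\n")
```

A caller would then open one event per entry of the returned list instead of looping over 0..N-1.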

perf_event: Fix oops triggered by cpu offline/online · 220b140b

Authored by Paul Mackerras on Mar 10, 2010

Anton Blanchard found that he could reliably make the kernel hit a
BUG_ON in the slab allocator by taking a cpu offline and then online
while a system-wide perf record session was running.

The reason is that when the cpu comes up, we completely reinitialize
the ctx field of the struct perf_cpu_context for the cpu.  If there is
a system-wide perf record session running, then there will be a struct
perf_event that has a reference to the context, so its refcount will
be 2.  (The perf_event has been removed from the context's group_entry
and event_entry lists by perf_event_exit_cpu(), but that doesn't
remove the perf_event's reference to the context and doesn't decrement
the context's refcount.)

When the cpu comes up, perf_event_init_cpu() gets called, and it calls
__perf_event_init_context() on the cpu's context.  That resets the
refcount to 1.  Then when the perf record session finishes and the
perf_event is closed, the refcount gets decremented to 0 and the
context gets kfreed after an RCU grace period.  Since the context
wasn't kmalloced -- it's part of a per-cpu variable -- bad things
happen.

In fact we don't need to completely reinitialize the context when the
cpu comes up.  It's sufficient to initialize the context once at boot,
but we need to do it for all possible cpus.

This moves the context initialization to happen at boot time.  With
this, we don't trash the refcount and the context never gets kfreed,
and we don't hit the BUG_ON.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Anton Blanchard <anton@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
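
The refcount sequence described above can be replayed with a toy Python model. It is only a sketch of the bookkeeping: the `Context` class and `hotplug_cycle` function are hypothetical and model nothing beyond the counter.

```python
class Context:
    """Toy stand-in for a per-cpu context that must never be freed."""

    def __init__(self):
        self.init()

    def init(self):
        self.refcount = 1  # what (re)initialization sets the count to

    def get(self):
        self.refcount += 1

    def put(self):
        """Drop a reference; return True if the context would now be freed."""
        self.refcount -= 1
        return self.refcount == 0


def hotplug_cycle(reinit_on_online):
    """Replay: session starts, cpu goes offline and online, session ends."""
    ctx = Context()          # initialized once, refcount == 1
    ctx.get()                # the session's event holds a reference: 2
    if reinit_on_online:
        ctx.init()           # old behaviour: the count is reset to 1
    return ctx.put()         # the event is closed


freed_with_reinit = hotplug_cycle(reinit_on_online=True)
freed_without_reinit = hotplug_cycle(reinit_on_online=False)
```

Only the variant that reinitializes on cpu online drives the count to zero, which in the kernel means freeing memory that was never allocated from the slab.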

10 Mar, 2010 · 27 commits

perf: Drop the obsolete profile naming for trace events · 97d5a220

Authored by Frederic Weisbecker on Mar 05, 2010

Drop the obsolete "profile" naming used by perf for trace events.
Perf can now do more than simple event counting, so generalize
the API naming.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>

perf: Take a hot regs snapshot for trace events · c530665c

Authored by Frederic Weisbecker on Mar 03, 2010

We are taking a wrong regs snapshot when a trace event triggers.
Either we use get_irq_regs(), which gives us the interrupted
registers if we are in an interrupt, or we use task_pt_regs(),
which gives us the state before we entered the kernel, assuming
we are lucky enough not to be a kernel thread, in which case
task_pt_regs() returns the initial set of regs when the kernel
thread was started.

What we want is different. We need a hot snapshot of the regs,
so that we can get the instruction pointer to record in the
sample, the frame pointer for the callchain, and some other
things.

Let's use the new perf_fetch_caller_regs() for that.

Comparison with perf record -e lock: -R -a -f -g
Before:

        perf  [kernel]                   [k] __do_softirq
               |
               --- __do_softirq
                  |
                  |--55.16%-- __open
                  |
                   --44.84%-- __write_nocancel

After:

            perf  [kernel]           [k] perf_tp_event
               |
               --- perf_tp_event
                  |
                  |--41.07%-- lock_acquire
                  |          |
                  |          |--39.36%-- _raw_spin_lock
                  |          |          |
                  |          |          |--7.81%-- hrtimer_interrupt
                  |          |          |          smp_apic_timer_interrupt
                  |          |          |          apic_timer_interrupt

The old case was producing unreliable callchains. Now that we have
the right frame and instruction pointers, we have the trace we
want.

Also, syscalls and kprobe events already have the right regs,
so let's use them instead of wasting a retrieval.

v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>

perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot · 5331d7b8

Authored by Frederic Weisbecker on Mar 04, 2010

Events that trigger overflows by interrupting a context can
use get_irq_regs() or task_pt_regs() to retrieve the state
when the event triggered. But this is not the case for some
other classes of events like trace events, as tracepoints are
executed in the same context as the code that triggered
the event.

It means we need a different API to capture the regs there;
namely, we need a hot snapshot to get the most important
information for perf: the instruction pointer to get the
event origin, the frame pointer for the callchain, the code
segment for user_mode() tests (we always use __KERNEL_CS as
trace events always occur from the kernel) and the eflags
for further purposes.

v2: rename perf_save_regs to perf_fetch_caller_regs as per
Masami's suggestion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>

perf/x86-64: Use frame pointer to walk on irq and process stacks · 61e67fb9

Authored by Frederic Weisbecker on Mar 03, 2010

We were using the frame pointer based stack walker in every
context on x86-32, but not on x86-64, where we only use the
seven-league boots on the exception stacks.

Use it also on irq and process stacks. This greatly accelerates
the captures.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>

lockdep: Move lock events under lockdep recursion protection · db2c4c77

Authored by Frederic Weisbecker on Feb 02, 2010

There are rcu locked read side areas in the path where we submit
a trace event. And these rcu_read_(un)lock() trigger lock events,
which create recursive events.

One pair in do_perf_sw_event:

__lock_acquire
      |
      |--96.11%-- lock_acquire
      |          |
      |          |--27.21%-- do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--49.62%-- ftrace_profile_lock_release
      |          |          |          lock_release
      |          |          |          |
      |          |          |          |--33.85%-- _raw_spin_unlock

Another pair in perf_output_begin/end:

__lock_acquire
      |--23.40%-- perf_output_begin
      |          |          __perf_event_overflow
      |          |          perf_swevent_overflow
      |          |          perf_swevent_add
      |          |          perf_swevent_ctx_event
      |          |          do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--55.37%-- ftrace_profile_lock_acquire
      |          |          |          lock_acquire
      |          |          |          |
      |          |          |          |--37.31%-- _raw_spin_lock

The problem is not so much the trace recursion itself, as we have a
recursion protection already (though it's always wasteful to recurse).
But the trace events are outside the lockdep recursion protection, so
each lockdep event triggers a lock trace, which will trigger two
other lockdep events. Here the recursive lock trace event won't
be taken because of the trace recursion, so the recursion stops there,
but lockdep will still analyse these new events.

To sum up, for each lockdep event we have:

        lock_*()
             |
             trace lock_acquire
                  |
                  ----- rcu_read_lock()
                  |          |
                  |          lock_acquire()
                  |          |
                  |          trace_lock_acquire() (stopped)
                  |          |
                  |          lockdep analyze
                  |
                  ----- rcu_read_unlock()
                             |
                             lock_release
                             |
                             trace_lock_release() (stopped)
                             |
                             lockdep analyze

And you can repeat the above two times as we have two rcu read side
sections when we submit an event.

This is fixed in this patch by moving the lock trace event under
the lockdep recursion protection.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
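
The effect of moving the trace event under the recursion protection can be modelled with a toy Python class. This is a deliberately simplified sketch, not the lockdep code: the `Lockdep` class, its flags and its counter are hypothetical, and only one RCU read-side section (one lock and one unlock event) is modelled instead of the two mentioned above.

```python
class Lockdep:
    """Toy model counting how often lockdep analyses a lock event."""

    def __init__(self, trace_inside_guard):
        self.trace_inside_guard = trace_inside_guard
        self.in_lockdep = False  # lockdep recursion flag
        self.in_trace = False    # trace recursion flag
        self.analyzed = 0

    def trace_event(self):
        if self.in_trace:
            return  # recursive trace event: stopped
        self.in_trace = True
        # Submitting the event enters an RCU read-side section, and the
        # lock/unlock pair generates lock events of its own.
        self.lock_event()  # rcu_read_lock()
        self.lock_event()  # rcu_read_unlock()
        self.in_trace = False

    def lock_event(self):
        if not self.trace_inside_guard:
            self.trace_event()   # old placement: outside the protection
        if self.in_lockdep:
            return               # nested event: ignored entirely
        self.in_lockdep = True
        if self.trace_inside_guard:
            self.trace_event()   # new placement: under the protection
        self.analyzed += 1       # lockdep analysis
        self.in_lockdep = False


old = Lockdep(trace_inside_guard=False)
old.lock_event()
new = Lockdep(trace_inside_guard=True)
new.lock_event()
```

In this model one lock event costs three analyses with the old placement and a single analysis with the new one, because the nested events are cut off by the recursion flag before any work is done.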

perf report: Print the map table just after samples for which no map was found · 65f2ed2b

Authored by Arnaldo Carvalho de Melo on Mar 09, 2010

If -vv is used, just the map table will be printed; -vvv will
print the symbol table too. With it we can see that we have a
bug where some samples are not being resolved to a map when we
get them in the perf.data stream, but after we have it all
processed we can find the right map, so some reordering is
probably happening.

Upcoming patches will provide ways to ask for most PERF_SAMPLE_
conditional samples to be taken for !PERF_RECORD_SAMPLE events
too, then we'll be able to ask for PERF_SAMPLE_TIME and
PERF_SAMPLE_CPU to help diagnose this.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1268161097-17761-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf report: Add multiple event support · cbbc79a5

Authored by Eric B Munson on Mar 05, 2010

Perf report does not handle multiple events being reported, even
though perf record stores them properly on disk.  This patch
addresses that issue by adding the logic to perf report to use
the event stream id that is saved by record and the new data
structures to separate the event streams and report them
individually.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-6-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
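
Separating samples by event stream id amounts to keeping one histogram per id. The following Python sketch shows the idea only: `build_histograms`, the ids and the symbol names are hypothetical and unrelated to the actual perf data structures.

```python
from collections import Counter, defaultdict


def build_histograms(samples):
    """Group (event_id, symbol) samples into one symbol histogram per id."""
    histograms = defaultdict(Counter)
    for event_id, symbol in samples:
        histograms[event_id][symbol] += 1
    return histograms


# Hypothetical stream mixing two event types (ids 1 and 2).
samples = [
    (1, "schedule"),
    (2, "memcpy"),
    (1, "schedule"),
    (1, "do_page_fault"),
    (2, "memcpy"),
]
histograms = build_histograms(samples)
```

Each id can then be post-processed and printed on its own, instead of all samples being merged into a single histogram.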

perf session: Change perf_session post processing functions to take histogram tree · eefc465c

Authored by Eric B Munson on Mar 05, 2010

Now that report can store histograms for multiple events we
need to be able to do the post processing work for each
histogram. This patch changes the post processing functions so
that they can be called individually for each event's histogram.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
[ Guarantee bisectability by fixing up builtin-report.c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-5-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf session: Add storage for separating event types in report · cb8f0939

Authored by Eric B Munson on Mar 05, 2010

This patch adds the structures necessary to count each event
type independently in perf report.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-4-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf session: Change add_hist_entry to take the tree root instead of session · d403d0ac

Authored by Eric B Munson on Mar 05, 2010

In order to minimize the impact of storing multiple events in a
report, this function will now take the root of the histogram
tree so that the logic for selecting the proper tree can be
inserted before the call.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-3-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf record: Add ID and to recorded event data when recording multiple events · 8907fd60

Authored by Eric B Munson on Mar 05, 2010

Currently perf record does not write the ID or the to disk for
events. This doesn't allow report to tell if an event stream
contains one or more types of events.  This patch adds this
entry to the list of data that record will write to disk if more
than one event was requested.
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-2-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf probe: Add missing variable initialization · accd3cc4

Authored by Arnaldo Carvalho de Melo on Mar 05, 2010

cc1: warnings being treated as errors
util/probe-finder.c: In function 'find_line_range':
util/probe-finder.c:172: warning: 'src' may be used uninitialized in this function
make: *** [util/probe-finder.o] Error 1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267804269-22660-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf tools: Don't throw away old map slices not overlapped by new maps · 12245509

Authored by Arnaldo Carvalho de Melo on Mar 05, 2010

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1267800842-22324-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Provide better condition for event rotation · d4944a06

Authored by Peter Zijlstra on Mar 08, 2010

Try to avoid useless rotation and PMU disables.

[ Could be improved by keeping a nr_runnable count to better account
  for the < PERF_STAT_INACTIVE counters ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Fix double enable calls · f3d46b2e

Authored by Peter Zijlstra on Mar 06, 2010

hw_perf_enable() would enable already enabled events.

This causes problems with code that assumes that ->enable/->disable calls
are balanced (like the LBR code does).

What happens is that events that were already running and left in place
would get enabled again.

Avoid this by only enabling new events that match their previous
assignment.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Fix double disable calls · 19925ce7

Authored by Peter Zijlstra on Mar 06, 2010

hw_perf_enable() would disable events that were not yet enabled.

This causes problems with code that assumes that ->enable/->disable calls
are balanced (like the LBR code does).

What happens is that we disable newly added counters that match their
previous assignment, even though they are not yet programmed on the
hardware.

Avoid this by only doing the first pass over the existing events.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Properly account n_added · 356e1f2e

Authored by Peter Zijlstra on Mar 06, 2010

Make sure n_added is properly accounted so that we can rely on the value
to reflect the number of added counters. This is needed if it's going to
be used for more than a boolean check.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE) · 71e2d282

Authored by Peter Zijlstra on Mar 08, 2010

Calling ioctl(PERF_EVENT_IOC_DISABLE) on a throttled counter would result
in a double disable. Cure this by using x86_pmu_{start,stop} for
throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
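
The role of the ->active_mask check can be shown with a toy Python model. This is a sketch under assumptions, not the x86 PMU code: the `Pmu` class is hypothetical and represents the mask as a set of counter indices.

```python
class Pmu:
    """Toy model in which stop() is a no-op for counters that are not active."""

    def __init__(self):
        self.active_mask = set()
        self.hw_disables = 0

    def start(self, idx):
        self.active_mask.add(idx)

    def stop(self, idx):
        if idx not in self.active_mask:
            return  # already stopped: do not disable the hardware twice
        self.active_mask.discard(idx)
        self.hw_disables += 1


pmu = Pmu()
pmu.start(0)
pmu.stop(0)  # throttling stops the counter
pmu.stop(0)  # a later disable request finds it already stopped
```

Because both the throttle path and the disable request go through the same `stop()`, the second call is harmless and the hardware sees exactly one disable.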

perf, x86: Fix x86_pmu_start · c08053e6

Authored by Peter Zijlstra on Mar 06, 2010

pmu::start should undo pmu::stop, make it so.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Use unlocked bitops · 34538ee7

Authored by Peter Zijlstra on Mar 02, 2010

There is no concurrency on these variables, so don't use LOCK'ed ops.

As to the intel_pmu_handle_irq() status bit clean, nobody uses that so
remove it altogether.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100304140100.240023029@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Change x86_pmu.{enable,disable} calling convention · aff3d91a

Authored by Peter Zijlstra on Mar 02, 2010

Pass the full perf_event into the x86_pmu functions so that those may
make use of more than the hw_perf_event, and while doing this, remove the
superfluous second argument.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100304140100.165166129@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Remove superfluous arguments to x86_perf_event_update() · cc2ad4ba

Authored by Peter Zijlstra on Mar 02, 2010

The second and third arguments to x86_perf_event_update() are superfluous
since they are simple expressions of the first argument. Hence remove
them.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100304140100.089468871@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86: Remove superfluous arguments to x86_perf_event_set_period() · 07088edb

Authored by Peter Zijlstra on Mar 02, 2010

The second and third arguments to x86_perf_event_set_period() are
superfluous since they are simple expressions of the first argument.
Hence remove them.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100304140100.006500906@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf, x86, Do not use perf_disable from NMI context · 3fb2b8dd

Authored by Peter Zijlstra on Mar 08, 2010

Explicitly use intel_pmu_{disable,enable}_all() in intel_pmu_handle_irq()
to avoid the NMI race conditions in perf_{disable,enable}.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Optimize perf_disable · 32975a4f

Authored by Peter Zijlstra on Mar 06, 2010

Currently we always call hw_perf_disable(), even if it's already disabled.
This seems superfluous, esp. since it cannot be made NMI safe (see further
patches).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
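
One way to skip the redundant hardware call is to key it off a nesting counter. The Python sketch below assumes such a counter exists; the `PerfCpu` class and its fields are hypothetical, and the commit message itself does not spell out the mechanism.

```python
class PerfCpu:
    """Toy model: the hardware is only touched on the outermost disable/enable."""

    def __init__(self):
        self.disable_count = 0
        self.hw_disable_calls = 0
        self.hw_enable_calls = 0

    def perf_disable(self):
        if self.disable_count == 0:
            self.hw_disable_calls += 1  # first disable: really stop the PMU
        self.disable_count += 1

    def perf_enable(self):
        self.disable_count -= 1
        if self.disable_count == 0:
            self.hw_enable_calls += 1   # last enable: really restart the PMU


cpu = PerfCpu()
cpu.perf_disable()
cpu.perf_disable()  # nested: already disabled, no hardware call
cpu.perf_enable()
cpu.perf_enable()
```

Two nested disable/enable pairs result in a single hardware disable and a single hardware enable.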

perf: Rework and fix the arch CPU-hotplug hooks · 3f6da390

Authored by Peter Zijlstra on Mar 05, 2010

Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug
notifiers. This has the advantage of reducing the static weak interface
as well as exposing all hotplug actions to the PMU.

Use this to fix x86 hotplug usage where we did things in ONLINE which
should have been done in UP_PREPARE or STARTING.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100305154128.736225361@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf: Provide generic perf_sample_data initialization · dc1d628a

Authored by Peter Zijlstra on Mar 03, 2010

This makes it easier to extend perf_sample_data and fixes a bug on arm
and sparc, which failed to set ->raw to NULL, which can cause crashes
when combined with PERF_SAMPLE_RAW.

It also optimizes PowerPC and tracepoint, because the struct
initialization is forced to zero out the whole structure.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jean Pihet <jpihet@mvista.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@kernel.org
LKML-Reference: <20100304140100.315416040@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>