提交 · ba74f0640d963ccc914ac533cb0ba133ee07bcf2 · openeuler / Kernel

09 12月, 2010 8 次提交

perf session: Split out user event processing · ba74f064

由 Thomas Gleixner 提交于 12月 07, 2010

Simplify further.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124551.110956235@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

ba74f064

perf session: Split out sample preprocessing · 3dfc2c0a

由 Thomas Gleixner 提交于 12月 07, 2010

Simplify the code a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124551.014649793@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

3dfc2c0a

perf session: Move dump code to event delivery path · 532e7269

由 Thomas Gleixner 提交于 12月 07, 2010

Preparatory patch for ordered perf report -D
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.918655066@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

532e7269

perf session: Add file_offset to event delivery function · f74725dc

由 Thomas Gleixner 提交于 12月 07, 2010

Preparatory patch for ordered output of perf report -D
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.818568607@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

f74725dc

perf session: Store file offset in sample_queue · e4c2df13

由 Thomas Gleixner 提交于 12月 07, 2010

Preparatory patch for ordered output of perf report -D.
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.725128545@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

e4c2df13

perf session: Consolidate the dump code · 9aefcab0

由 Thomas Gleixner 提交于 12月 07, 2010

The dump code used by perf report -D is scattered all over the place.
Move it to separate functions.
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.625434869@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

9aefcab0

perf session: Dont queue events w/o timestamps · 79a14c1f

由 Thomas Gleixner 提交于 12月 07, 2010

If the event has no timestamp assigned then the parse code sets it to
~0ULL which causes the ordering code to enqueue it at the end.

Process it right away.
Reported-by: NIan Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.528788441@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

79a14c1f

perf event: Prevent unbound event__name array access · 3835bc00

由 Thomas Gleixner 提交于 12月 07, 2010

event__name[] is missing an entry for PERF_RECORD_FINISHED_ROUND, but we
happily access the array from the dump code.

Make event__name[] static and provide an accessor function, fix up all
callers and add the missing string.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.432593943@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

3835bc00

07 12月, 2010 1 次提交

perf session: Sort all events if ordered_samples=true · cbf41645

由 Thomas Gleixner 提交于 12月 05, 2010

Now that we have timestamps on FORK, EXIT, COMM, MMAP events we can
sort everything in time order. This fixes the following observed
problem:

mmap(file1) -> pagefault() -> munmap(file1)
mmap(file2) -> pagefault() -> munmap(file2)

Resulted in decoding both pagefaults in file2 because the file1 map
was already replaced by the file2 map when the map address was
identical.

With all events sorted we decode both pagefaults correctly.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <alpine.LFD.2.00.1012051220450.2653@localhost6.localdomain6>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

cbf41645

05 12月, 2010 2 次提交

perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events · 9c90a61c

由 Arnaldo Carvalho de Melo 提交于 12月 02, 2010

So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:

  $ perf record -aT
  $ perf report -D | grep PERF_RECORD_
  <SNIP>
   3   5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
   3   5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
   3   5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
   3   5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
   3   5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
   3   5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
   3   5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
   3   5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
   3   5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
   3   5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
   3   5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
   3   5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
   3   5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
   3   5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
  <SNIP>

First column is the cpu and the second the timestamp.

That way we can investigate problems in the event stream.

If the new perf binary is run on an older kernel, it will disable this feature
automatically.
Tested-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

9c90a61c

perf session: Parse sample earlier · 640c03ce

由 Arnaldo Carvalho de Melo 提交于 12月 02, 2010

At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.

This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.

Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.

There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.
Tested-by: NThomas Gleixner <tglx@linutronix.de>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NIan Munsie <imunsie@au1.ibm.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

640c03ce

01 12月, 2010 9 次提交

perf session: Allocate chunks of sample objects · 5c891f38

由 Thomas Gleixner 提交于 11月 30, 2010

The ordered sample code allocates singular reference objects struct
sample_queue which have 48byte size on 64bit and 20 bytes on 32bit. That's
silly. Allocate ~64k sized chunks and hand them out.

Performance gain: ~ 15%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.398713983@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

5c891f38

perf session: Cache sample objects · 020bb75a

由 Thomas Gleixner 提交于 11月 30, 2010

When the sample queue is flushed we free the sample reference objects. Though
we need to malloc new objects when we process further. Stop the malloc/free
orgy and cache the already allocated object for resuage. Only allocate when
the cache is empty.

Performance gain: ~ 10%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.338488630@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

020bb75a

perf session: Keep file mmaped instead of malloc/memcpy · fe174207

由 Thomas Gleixner 提交于 11月 30, 2010

Profiling perf with perf revealed that a large part of the processing time is
spent in malloc/memcpy/free in the sample ordering code. That code copies the
data from the mmap into malloc'ed memory. That's silly. We can keep the mmap
and just store the pointer in the queuing data structure. For 64 bit this is
not a problem as we map the whole file anyway. On 32bit we keep 8 maps around
and unmap the oldest before mmaping the next chunk of the file.

Performance gain: 2.95s -> 1.23s (Faktor 2.4)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.278787719@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

fe174207

perf session: Use sensible mmap size · 55b44629

由 Thomas Gleixner 提交于 11月 30, 2010

On 64bit we can map the whole file in one go, on 32bit we can at least map
32MB and not map/unmap tiny chunks of the file.

Base the progress bar on 1/16 of the data size.

Preparatory patch to get rid of the malloc/memcpy/free of trace data.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.213687773@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

55b44629

perf session: Simplify termination checks · d6513281

由 Thomas Gleixner 提交于 11月 30, 2010

No need to check twice.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.152886642@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d6513281

perf session: Move ui_progress_update in __perf_session__process_events() · 85b99952

由 Thomas Gleixner 提交于 11月 30, 2010

The progress bar is changed when the file offset changes. This happens only
when the next mmap is done. No need to call ui_progress_update() for every
event.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.094836523@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

85b99952

perf session: Cleanup __perf_session__process_events() · 0331ee0c

由 Thomas Gleixner 提交于 11月 30, 2010

Replace the pseudo C++ self argument with session and give the mmap related
variables a sensible name. shift is a complete misnomer - it took me several
rounds of cursing to figure out that it's not a shift value.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.029687218@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

0331ee0c

perf session: Use appropriate pointer type instead of silly typecasting · 28990f75

由 Thomas Gleixner 提交于 11月 30, 2010

There is no reason to use a struct sample_event pointer in struct sample_queue
and type cast it when flushing the queue.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.969462809@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

28990f75

perf session: Fix list sort algorithm · a1225dec

由 Thomas Gleixner 提交于 11月 30, 2010

The homebrewn sort algorithm fails to sort in time order. One of the problem
spots is that it fails to deal with equal timestamps correctly.

My first gut reaction was to replace the fancy list with an rbtree, but the
performance is 3 times worse.

Rewrite it so it works.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.908482530@linutronix.de>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

a1225dec

27 11月, 2010 2 次提交

perf tools: Fix lost and unknown events handling · 068ffaa8

由 Arnaldo Carvalho de Melo 提交于 11月 27, 2010

Fix it by explaining what can be happening and giving the number of processed
and lost events.

Also holler if unknown events were found, that can be due to processing a
perf.data file collected using a newer tool where newer events got added on
reporting using an older perf tool, that or a bug, so ask for a report to be
made.

Works on both --tui and --stdio.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

068ffaa8

perf events: Default to using event__process_lost · 37982ba0

由 Arnaldo Carvalho de Melo 提交于 11月 26, 2010

Tool developers have to fill in a 'perf_event_ops' method table to
specify how to handle each event, so far the ones that were not
explicitely especified would get a stub that would just discard the
event.

Change that so that tool developers can get the lost event details and
the total number of such events at the end of 'perf report -D' output.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

37982ba0

03 8月, 2010 2 次提交

perf session: Invalidate last_match when removing threads from rb_tree · 70597f21

由 Arnaldo Carvalho de Melo 提交于 8月 02, 2010

If we receive two PERF_RECORD_EXIT for the same thread, we can end up
reusing session->last_match and trying to remove the thread twice from
the rb_tree, causing a segfault, so invalidade last_match in
perf_session__remove_thread.

Receiving two PERF_RECORD_EXIT for the same thread is a bug, but its a
harmless one if we make the tool more robust, like this patch does.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

70597f21

perf session: Free the ref_reloc_sym memory at the right place · 076c6e45

由 Arnaldo Carvalho de Melo 提交于 8月 02, 2010

Which is at perf_session__destroy_kernel_maps, counterpart to the
perf_session__create_kernel_maps where the kmap structure is located, just
after the vmlinux_maps.

Make it also check if the kernel maps were actually created, which may not
be the case if, for instance, perf_session__new can't complete due to
permission problems in, for instance, a 'perf report' case, when a
segfault will take place, that is how this was noticed.

The problem was introduced in d65a458b, thus post .35.

This also adds code to release guest machines as them are also created
in perf_session__create_kernel_maps, so should be deleted on this newly
introduced counterpart, perf_session__destroy_kernel_maps.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

076c6e45

31 7月, 2010 1 次提交

perf tools: Release session and symbol resources on exit · d65a458b

由 Arnaldo Carvalho de Melo 提交于 7月 30, 2010

So that we reduce the noise when looking for leaks using tools such as
valgrind.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d65a458b

27 7月, 2010 1 次提交

perf tools: Remove unneeded code for tracking the cwd in perf sessions · 88ca895d

由 Dave Martin 提交于 7月 27, 2010

Tidy-up patch to remove some code and struct perf_session data members
which are no longer needed due to the previous patch: "perf tools: Don't
abbreviate file paths relative to the cwd".

LKML-Reference: <new-submission>
Signed-off-by: NDave Martin <dave.martin@linaro.org>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

88ca895d

18 6月, 2010 1 次提交

perf session: fix error message on failure to open perf.data · 0f2c3de2

由 Andy Isaacson 提交于 6月 11, 2010

If we cannot open our data file, print strerror(errno) for a more
comprehensible error message; and only suggest 'perf record' on ENOENT.

In particular, this fixes the nonsensical advice when:

    % sudo perf record sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.009 MB perf.data (~381 samples) ]
    % perf trace
    failed to open file: perf.data  (try 'perf record' first)
    %

Cc: Ingo Molnar <mingo@elte.hu>
LPU-Reference: <20100612033615.GA24731@hexapodia.org>
Signed-off-by: NAndy Isaacson <adi@hexapodia.org>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

0f2c3de2

17 6月, 2010 1 次提交

perf session: Remove threads from tree on PERF_RECORD_EXIT · 720a3aeb

由 Arnaldo Carvalho de Melo 提交于 6月 17, 2010

Move them to a session->dead_threads list just like we do with maps that
are replaced, because we may have hist_entries pointing to them.

This fixes a bug when inserting maps for a new thread that reused the
TID, mixing maps for two different threads, causing an endless loop.

The code for insering maps should be made more robust but for .35 this
is the minimalistic patch.
Reported-by: NIngo Molnar <mingo@elte.hu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

720a3aeb

20 5月, 2010 1 次提交

perf session: Make read_build_id routines look at the host_machine too · f869097e

由 Arnaldo Carvalho de Melo 提交于 5月 19, 2010

The changes made to support host and guest machines in a session, that
started when the 'perf kvm' tool was introduced ended up introducing a
bug where the host_machine was not having its DSOs traversed for
build-id processing.

Fix it by moving some methods to the right classes and considering the
host_machine when processing build-ids.
Reported-by: NTom Zanussi <tzanussi@gmail.com>
Reported-by: NStephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

f869097e

19 5月, 2010 1 次提交

perf tools: Remove some unused functions · a41794cd

由 Arnaldo Carvalho de Melo 提交于 5月 18, 2010

Without the bloated cplus_demangle from binutils, i.e building with:

$ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install

Before:

   text	   data	    bss	    dec	    hex	filename
 471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf

After:

[acme@doppio linux-2.6-tip]$ size ~/bin/perf
   text	   data	    bss	    dec	    hex	filename
 446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf

So its a 5.3% size reduction in code, but the interesting part is in the git
diff --stat output:

 19 files changed, 20 insertions(+), 1909 deletions(-)

If we ever need some of the things we got from git but weren't using, we just
have to go to the git repo and get fresh, uptodate source code bits.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

a41794cd

15 5月, 2010 1 次提交

perf hist: Clarify events_stats fields usage · cee75ac7

由 Arnaldo Carvalho de Melo 提交于 5月 14, 2010

The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.

Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.

Looking at the users, builtin-sched.c can make use of these fields and
stop doing it again.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

cee75ac7

14 5月, 2010 1 次提交

perf hist: Make event__totals per hists · c8446b9b

由 Arnaldo Carvalho de Melo 提交于 5月 14, 2010

This is one more thing that started global but are more useful per hist
or per session.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

c8446b9b

11 5月, 2010 1 次提交

perf hist: Introduce hists class and move lots of methods to it · 1c02c4d2

由 Arnaldo Carvalho de Melo 提交于 5月 10, 2010

In cbbc79a5 we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.

While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.

Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.

The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.

Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

1c02c4d2

10 5月, 2010 2 次提交

perf session: create_kernel_maps should use ->host_machine · d118f8ba

由 Arnaldo Carvalho de Melo 提交于 5月 10, 2010

Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create
another machine instance for the host machine, and since 1f626bc3 we have
it out of the machines rb_tree.

Fix it by using machine__create_kernel_maps(&self->host_machine)
directly.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d118f8ba

perf session: Embed the host machine data on perf_session · 1f626bc3

由 Arnaldo Carvalho de Melo 提交于 5月 09, 2010

We have just one host on a given session, and that is the most common
setup right now, so embed a ->host_machine struct machine instance
directly in the perf_session class, check if we're looking for it before
going to the rb_tree.

This also fixes a problem found when we try to process old perf.data
files where we didn't have MMAP events for the kernel and modules and
thus don't create the kernel maps, do it in event__preprocess_sample if
it wasn't already.
Reported-by: NIngo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

1f626bc3

09 5月, 2010 2 次提交

perf/live-mode: Handle payload-less events · 794e43b5

由 Tom Zanussi 提交于 5月 05, 2010

Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
only an event header and no data.  In this case, a 0-length payload
will be read, and the 0 return value will be wrongly interpreted as an
'unexpected end of event stream'.

This patch allows for proper handling of data-less events by skipping
0-length reads.
Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1273038527.6383.51.camel@tropicana>
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>

794e43b5

perf: Provide a new deterministic events reordering algorithm · d6b17beb

由 Frederic Weisbecker 提交于 5月 03, 2010

The current events reordering algorithm is based on a heuristic that
gets broken once we deal with a very fast flow of events.

Indeed the time period based flushing is not suitable anymore
in the following case, assuming we have a flush period of two
seconds.

    CPU 0           |        CPU 1
                    |
  cnt1 timestamps   |      cnt1 timestamps
                    |
    0               |         0
    1               |         1
    2               |         2
    3               |         3
    [...]           |        [...]
    4 seconds later

If we spend too much time to read the buffers (case of a lot of
events to record in each buffers or when we have a lot of CPU buffers
to read), in the next pass the CPU 0 buffer could contain a slice
of several seconds of events. We'll read them all and notice we've
reached the period to flush. In the above example we flush the first
half of the CPU 0 buffer, then we read the CPU 1 buffer where we
have events that were on the flush slice and then the reordering
fails.

It's simple to reproduce with:

	perf lock record perf bench sched messaging

To solve this, we use a new solution that doesn't rely on an
heuristical time slice period anymore but on a deterministic basis
based on how perf record does its job.

perf record saves the buffers through passes. A pass is a tour
on every buffers from every CPUs. This is made in order: for
each CPU we read the buffers of every counters. So the more
buffers we visit, the later will be the timstamps of their events.

When perf record finishes a pass it records a
PERF_RECORD_FINISHED_ROUND pseudo event.
We record the max timestamp t found in the pass n. Assuming these
timestamps are monotonic across cpus, we know that if a buffer
still has events with timestamps below t, they will be all available
and then read in the pass n + 1.
Hence when we start to read the pass n + 2, we can safely flush every
events with timestamps below t.

      ============ PASS n =================
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            1          |         2
            2          |         3
            -          |         4  <--- max recorded

      ============ PASS n + 1 ==============
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            3          |         5
            4          |         6
            5          |         7 <---- max recorded

        Flush every events below timestamp 4

      ============ PASS n + 2 ==============
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            6          |         8
            7          |         9
            -          |         10

        Flush every events below timestamp 7
        etc...

It also works on perf.data versions that don't have
PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that
the events will be only flushed in the end of the perf.data
processing. It will then consume more memory and scale less with
large perf.data files.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>

d6b17beb

03 5月, 2010 1 次提交

perf: add perf-inject builtin · 454c407e

由 Tom Zanussi 提交于 5月 01, 2010

Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit.  Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched.  Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

454c407e

28 4月, 2010 2 次提交

perf machine: Adopt some map_groups functions · d28c6223

由 Arnaldo Carvalho de Melo 提交于 4月 27, 2010

Those functions operated on members now grouped in 'struct machine', so
move those methods to this new class.

The changes made to 'perf probe' shows that using this abstraction
inserting probes on guests almost got supported for free.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d28c6223

perf tools: Rename "kernel_info" to "machine" · 23346f21

由 Arnaldo Carvalho de Melo 提交于 4月 27, 2010

struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.

There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

23346f21

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功