- 20 6月, 2009 1 次提交
-
-
由 Paul Mackerras 提交于
On 64-bit powerpc, __u64 is defined to be unsigned long rather than unsigned long long. This causes compiler warnings every time we print a __u64 value with %Lx. Rather than changing __u64, we define our own u64 to be unsigned long long on all architectures, and similarly s64 as signed long long. For consistency we also define u32, s32, u16, s16, u8 and s8. These definitions are put in a new header, types.h, because these definitions are needed in util/string.h and util/symbol.h. The main change here is the mechanical change of __[us]{64,32,16,8} to remove the "__". The other changes are: * Create types.h * Include types.h in perf.h, util/string.h and util/symbol.h * Add types.h to the LIB_H definition in Makefile * Added (u64) casts in process_overflow_event() and print_sym_table() to kill two remaining warnings. Signed-off-by: NPaul Mackerras <paulus@samba.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: benh@kernel.crashing.org LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 6月, 2009 2 次提交
-
-
由 Peter Zijlstra 提交于
Add a data file header so we can transfer data between record and report. LKML-Reference: <new-submission> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Update the tools to reflect the new callchain sampling format. LKML-Reference: <new-submission> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 6月, 2009 8 次提交
-
-
由 Ingo Molnar 提交于
Make it easier to use parent filtering - default to a filtered output. Also add the parent column so that we get collapsing but dont display it by default. add --no-exclude-other to override this. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Make use of the new ->data_tail mechanism to tell kernel-space about user-space draining the data stream. Emit lost events (and display them) if they happen. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Paul Mackerras 提交于
On 64-bit powerpc, perf needs to be built as a 64-bit executable. This arranges to add the -m64 flag to CFLAGS if we are running on a 64-bit machine, indicated by the result of uname -m ending in "64". This means that we'll use -m64 on x86_64 machines as well. Signed-off-by: NPaul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linuxppc-dev@ozlabs.org Cc: benh@kernel.crashing.org LKML-Reference: <19000.55666.866148.559620@cargo.ozlabs.ibm.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Introduce isprint() to print out raw event dumps to ASCII, etc. (This is an extension to upstream Git's ctype.c.) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> [ removed openssl.h inclusion from util.h - it leaked ctype.h ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Add boundary checks for call-chain events. In case of corrupted entries we could crash otherwise. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Instead of the ambigious 'call' naming use the much more specific 'parent' naming: - rename --call <regex> to --parent <regex> - rename --sort call to --sort parent - rename [unmatched] to [other] - to signal that this is not an error but the inverse set Also add pagefaults to the default parent-symbol pattern too, as it's a 'syscall overhead category' in a sense. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The Git utils came with a ctype replacement that doesn't provide isprint(). Add a replacement. Solves a build bug on certain distros. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Implement sorting by callchain symbols, --sort <call>. It will create a new column which will show a match to --call $regex or "[unmatched]". Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 6月, 2009 4 次提交
-
-
由 Ingo Molnar 提交于
Yong Wang reported the following compiler warning: builtin-report.c: In function 'process_overflow_event': builtin-report.c:984: error: cast to pointer from integer of different size Which happens because we try to print ->ips[] out with a limited format, losing the high 32 bits. Print it out using %016Lx instead. Reported-by: NYong Wang <yong.y.wang@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Take advantage of call-graph percounter sampling/recording to display a non-trivial histogram: the true, collapsed/summarized cost measurement, on a per system call total overhead basis: aldebaran:~/linux/linux/tools/perf> ./perf record -g -a -f ~/hackbench 10 aldebaran:~/linux/linux/tools/perf> ./perf report -s symbol --syscalls | head -10 # # (3536 samples) # # Overhead Symbol # ........ ...... # 40.75% [k] sys_write 40.21% [k] sys_read 4.44% [k] do_nmi ... This is done by accounting each (reliable) call-chain that chains back to a given system call to that system call function. [ So in the above example we can see that hackbench spends about 40% of its total time somewhere in sys_write() and 40% somewhere in sys_read(), the rest of the time is spent in user-space. The time is not spent in sys_write() _itself_ but in one of its many child functions. ] Or, a recording of a (source files are already in the page-cache) kernel build: $ perf record -g -m 512 -f -- make -j32 kernel $ perf report -s s --syscalls | grep '\[k\]' | grep -v nmi 4.14% [k] do_page_fault 1.20% [k] sys_write 1.10% [k] sys_open 0.63% [k] sys_exit_group 0.48% [k] smp_apic_timer_interrupt 0.37% [k] sys_read 0.37% [k] sys_execve 0.20% [k] sys_mmap 0.18% [k] sys_close 0.14% [k] sys_munmap 0.13% [k] sys_poll 0.09% [k] sys_newstat 0.07% [k] sys_clone 0.06% [k] sys_newfstat 0.05% [k] sys_access 0.05% [k] schedule Shows the true total cost of each syscall variant that gets used during a kernel build. This profile reveals it that pagefaults are the costliest, followed by read()/write(). An interesting detail: timer interrupts cost 0.5% - or 0.5 seconds per 100 seconds of kernel build-time. (this was done with HZ=1000) The summary is done in 'perf report', i.e. in the post-processing stage - so once we have a good call-graph recording, this type of non-trivial high-level analysis becomes possible. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Recording with -a (or with -p) can race with tasks going away: couldn't open /proc/8440/maps Causing an early exit() and no recording done. Do not abort the recording session - instead just skip that task. Also, only print the warnings under -v. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Add the first steps of call-graph profiling: - add the -c (--call-graph) option to perf record - parse the call-graph record and printout out under -D (--dump-trace) The call-graph data is not put into the histogram yet, but it can be seen that it's being processed correctly: 0x3ce0 [0x38]: event: 35 . . ... raw event: size 56 bytes . 0000: 23 00 00 00 05 00 38 00 d4 df 0e 81 ff ff ff ff #.....8........ . 0010: 60 0b 00 00 60 0b 00 00 03 00 00 00 01 00 02 00 `...`.......... . 0020: d4 df 0e 81 ff ff ff ff a0 61 ed 41 36 00 00 00 .........a.A6.. . 0030: 04 92 e6 41 36 00 00 00 .a.A6.. . 0x3ce0 [0x38]: PERF_EVENT (IP, 5): 2912: 0xffffffff810edfd4 period: 1 ... chain: u:2, k:1, nr:3 ..... 0: 0xffffffff810edfd4 ..... 1: 0x3641ed61a0 ..... 2: 0x3641e69204 ... thread: perf:2912 ...... dso: [kernel] This shows a 3-entry call-graph: with 1 kernel-space and two user-space entries Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 14 6月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Print out events in hexa dump format, when -D is specified: 0x4868 [0x48]: event: 1 . . ... raw event: size 72 bytes . 0000: 01 00 00 00 00 00 48 00 d4 72 00 00 d4 72 00 00 ......H..r...r. . 0010: 00 00 40 f2 3e 00 00 00 00 30 01 00 00 00 00 00 ..@.>....0..... . 0020: 00 00 00 00 00 00 00 00 2f 75 73 72 2f 6c 69 62 ......../usr/li . 0030: 36 34 2f 6c 69 62 65 6c 66 2d 30 2e 31 34 31 2e 64/libelf-0.141 . 0040: 73 6f 00 00 00 00 00 00 f-0.141 . 0x4868 [0x48]: PERF_EVENT_MMAP 29396: [0x3ef2400000(0x13000) @ (nil)]: /usr/lib64/libelf-0.141.so This helps the debugging of mis-parsing of data files, and helps the addition of new sample/trace formats. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 6月, 2009 7 次提交
-
-
由 Frederic Weisbecker 提交于
- fix addr2line on userspace binary: don't only check kernel image. - fix string allocation size for path: missing ending null char room - fix overflow in symbol extra info Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1244907563-7820-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
If -vv (very verbose) is specified, print out raw data in the following format: $ perf stat -vv -r 3 ./loop_1b_instructions [ perf stat: executing run #1 ... ] [ perf stat: executing run #2 ... ] [ perf stat: executing run #3 ... ] debug: runtime[0]: 235871872 debug: walltime[0]: 236646752 debug: runtime_cycles[0]: 755150182 debug: counter/0[0]: 235871872 debug: counter/1[0]: 235871872 debug: counter/2[0]: 235871872 debug: scaled[0]: 0 debug: counter/0[1]: 2 debug: counter/1[1]: 235870662 debug: counter/2[1]: 235870662 debug: scaled[1]: 0 debug: counter/0[2]: 1 debug: counter/1[2]: 235870437 debug: counter/2[2]: 235870437 debug: scaled[2]: 0 debug: counter/0[3]: 140 debug: counter/1[3]: 235870298 debug: counter/2[3]: 235870298 debug: scaled[3]: 0 debug: counter/0[4]: 755150182 debug: counter/1[4]: 235870145 debug: counter/2[4]: 235870145 debug: scaled[4]: 0 debug: counter/0[5]: 1001411258 debug: counter/1[5]: 235868838 debug: counter/2[5]: 235868838 debug: scaled[5]: 0 debug: counter/0[6]: 27897 debug: counter/1[6]: 235868560 debug: counter/2[6]: 235868560 debug: scaled[6]: 0 debug: counter/0[7]: 2910 debug: counter/1[7]: 235868151 debug: counter/2[7]: 235868151 debug: scaled[7]: 0 debug: runtime[0]: 235980257 debug: walltime[0]: 236770942 debug: runtime_cycles[0]: 755114546 debug: counter/0[0]: 235980257 debug: counter/1[0]: 235980257 debug: counter/2[0]: 235980257 debug: scaled[0]: 0 debug: counter/0[1]: 3 debug: counter/1[1]: 235980049 debug: counter/2[1]: 235980049 debug: scaled[1]: 0 debug: counter/0[2]: 1 debug: counter/1[2]: 235979907 debug: counter/2[2]: 235979907 debug: scaled[2]: 0 debug: counter/0[3]: 135 debug: counter/1[3]: 235979780 debug: counter/2[3]: 235979780 debug: scaled[3]: 0 debug: counter/0[4]: 755114546 debug: counter/1[4]: 235979652 debug: counter/2[4]: 235979652 debug: scaled[4]: 0 debug: counter/0[5]: 1001439771 debug: counter/1[5]: 235979304 debug: counter/2[5]: 235979304 debug: scaled[5]: 0 debug: counter/0[6]: 23723 debug: counter/1[6]: 235979050 debug: counter/2[6]: 235979050 debug: scaled[6]: 0 debug: counter/0[7]: 2213 debug: counter/1[7]: 235978820 debug: counter/2[7]: 235978820 debug: scaled[7]: 0 debug: runtime[0]: 235888002 debug: walltime[0]: 236700533 debug: runtime_cycles[0]: 754881504 debug: counter/0[0]: 235888002 debug: counter/1[0]: 235888002 debug: counter/2[0]: 235888002 debug: scaled[0]: 0 debug: counter/0[1]: 2 debug: counter/1[1]: 235887793 debug: counter/2[1]: 235887793 debug: scaled[1]: 0 debug: counter/0[2]: 1 debug: counter/1[2]: 235887645 debug: counter/2[2]: 235887645 debug: scaled[2]: 0 debug: counter/0[3]: 135 debug: counter/1[3]: 235887499 debug: counter/2[3]: 235887499 debug: scaled[3]: 0 debug: counter/0[4]: 754881504 debug: counter/1[4]: 235887368 debug: counter/2[4]: 235887368 debug: scaled[4]: 0 debug: counter/0[5]: 1001401731 debug: counter/1[5]: 235887024 debug: counter/2[5]: 235887024 debug: scaled[5]: 0 debug: counter/0[6]: 24212 debug: counter/1[6]: 235886786 debug: counter/2[6]: 235886786 debug: scaled[6]: 0 debug: counter/0[7]: 1824 debug: counter/1[7]: 235886560 debug: counter/2[7]: 235886560 debug: scaled[7]: 0 Performance counter stats for '/home/mingo/loop_1b_instructions' (3 runs): 235.913377 task-clock-msecs # 0.997 CPUs ( +- 0.011% ) 2 context-switches # 0.000 M/sec ( +- 0.000% ) 1 CPU-migrations # 0.000 M/sec ( +- 0.000% ) 136 page-faults # 0.001 M/sec ( +- 0.730% ) 755048744 cycles # 3200.534 M/sec ( +- 0.009% ) 1001417586 instructions # 1.326 IPC ( +- 0.001% ) 25277 cache-references # 0.107 M/sec ( +- 3.988% ) 2315 cache-misses # 0.010 M/sec ( +- 9.845% ) 0.236706075 seconds time elapsed. This allows the summary stats to be validated. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Add the --repeat <n> feature to perf stat, which repeats a given command up to a 100 times, collects the stats and calculates an average and a stddev. For example, the following oneliner 'perf stat' command runs hackbench 5 times and prints a tabulated result of all metrics, with averages and noise levels (in percentage) printed: aldebaran:~/linux/linux/tools/perf> ./perf stat --repeat 5 ~/hackbench 10 Time: 0.117 Time: 0.108 Time: 0.089 Time: 0.088 Time: 0.100 Performance counter stats for '/home/mingo/hackbench 10' (5 runs): 1243.989586 task-clock-msecs # 10.460 CPUs ( +- 4.720% ) 47706 context-switches # 0.038 M/sec ( +- 19.706% ) 387 CPU-migrations # 0.000 M/sec ( +- 3.608% ) 17793 page-faults # 0.014 M/sec ( +- 0.354% ) 3770941606 cycles # 3031.329 M/sec ( +- 4.621% ) 1566372416 instructions # 0.415 IPC ( +- 2.703% ) 16783421 cache-references # 13.492 M/sec ( +- 5.202% ) 7128590 cache-misses # 5.730 M/sec ( +- 7.420% ) 0.118924455 seconds time elapsed. The goal of this feature is to allow the reliance on these accurate statistics and to know how many times a command has to be repeated for the noise to go down to an acceptable level. (The -v option can be used to see a line printed out as each run progresses.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
- use IPC for the instruction normalization output - CPUs for the CPU utilization factor value. - print out time elapsed like the other rows - tidy up the task-clocks/cpu-clocks printout Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
It's can be very annoying to scroll down perf annotated output until we find relevant overhead. Using the -l option, you can now have a small summary sorted per overhead in the beginning of the output. Example: ./perf annotate -l -k ../../vmlinux -s __lock_acquire Sorted summary for file ../../vmlinux ---------------------------------------------- 12.04 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653 4.61 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740 3.77 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1775 3.56 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653 2.93 /home/fweisbec/linux/linux-2.6-tip/arch/x86/include/asm/irqflags.h:15 2.83 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2545 2.30 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2594 2.20 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2388 2.20 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730 2.09 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730 2.09 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:138 1.88 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2548 1.47 /home/fweisbec/linux/linux-2.6-tip/arch/x86/include/asm/irqflags.h:15 1.36 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2594 1.36 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:730 1.26 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1654 1.26 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1653 1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:2592 1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740 1.15 /home/fweisbec/linux/linux-2.6-tip/kernel/lockdep.c:1740 [...] Only overhead over 0.5% are summarized. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1244844682-12928-2-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
When we have a colored line in perf annotate, ie a middle/high overhead one, it's sometimes useful to get the matching line and filename from the source file, especially this path prepares to another subsequent one which will print a sorted summary of midle/high overhead lines in the beginning of the output. Filename:Lines have the same color than the concerned ip lines. It can be slow because it relies on addr2line. We could also use objdump with -l but that implies we would have to bufferize objdump output and parse it to filter the relevant lines since we want to print a sorted summary in the beginning. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1244844682-12928-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Mike Frysinger 提交于
Help out arch porters who want to support perf counters by listing some basic requirements. Signed-off-by: NMike Frysinger <vapier@gentoo.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1244827063-24046-1-git-send-email-vapier@gentoo.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 6月, 2009 3 次提交
-
-
由 Peter Zijlstra 提交于
Provide for means of extending the perf_counter_attr in a 'natural' way. We allow growing the structure by appending fields at the end by specifying the full structure size inside it. When a new kernel sees a smaller (old) structure, it will 0 pad the tail. When an old kernel sees a larger (new) structure, it will verify the tail consists of 0s, otherwise fail. If we fail due to a size-mismatch, we return -E2BIG and write the kernel's native attribe size back into the provided structure. Furthermore, add some attribute verification, so that we'll fail counter creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in the __reserved fields). (This ABI detail is introduced while keeping the existing syscall ABI.) Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Up until now record has worked on the assumption that type=0, config=0 was a suitable configuration - which it is. Lets make this a little more explicit and more readable via the use of proper symbols. [ Impact: cleanup ] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Yong Wang 提交于
Otherwise all L1-instruction aliases will be recognized as L1-data by strcasestr() when calling function parse_aliases. Signed-off-by: NYong Wang <yong.y.wang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090612031706.GA22126@ywang-moblin2.bj.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 6月, 2009 3 次提交
-
-
由 Peter Zijlstra 提交于
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
A build error slipped in: builtin-report.c: In function ‘hist_entry__fprintf’: builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’ Because we got a bit sloppy with those types. uint64_t really sucks, because there's no printf format for it. So standardize on __u64 instead - for all types that go to or come from the ABI (which is __u64), or for values that need to be large enough even on 32-bit. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
When we use variable period sampling, add the period to the sample data and use that to normalize the samples. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 6月, 2009 2 次提交
-
-
由 Peter Zijlstra 提交于
Currently report and stat catch SIGINT (and others) without altering their exit state. This means that things like: while :; do perf stat ./foo ; done Loops become hard-to-interrupt, because bash never sees perf terminate due to interruption. Fix this. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Create the counter in a disabled state and only enable it after we mmap() the buffer, this allows us to see the first few samples (and observe the frequency ramp). Furthermore, print the period in the verbose report. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 6月, 2009 2 次提交
-
-
由 Ingo Molnar 提交于
The rule is: - high overhead: red - mid overhead: green - low overhead: normal (white/black) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
This patch adds support for profiling JIT generated code to 'perf report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map" symbols map that is parsed when looking and displaying symbols. Thanks to Peter Zijlstra for his help with this patch! Example "perf report" output with the Jato JIT: # # (40311 samples) # # Overhead Command Shared Object Symbol # ........ ................ ......................... ...... # 97.80% jato /tmp/perf-11915.map [.] Fibonacci.fib(I)I 0.56% jato 00000000b7fa023b 0x000000b7fa023b 0.45% jato /tmp/perf-11915.map [.] Fibonacci.main([Ljava/lang/String;)V 0.38% jato [kernel] [k] get_page_from_freelist 0.06% jato [kernel] [k] kunmap_atomic 0.05% jato ./jato [.] utf8Hash 0.04% jato ./jato [.] executeJava 0.04% jato ./jato [.] defineClass Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: a.p.zijlstra@chello.nl Cc: acme@redhat.com LKML-Reference: <Pine.LNX.4.64.0906082111590.12407@melkki.cs.Helsinki.FI> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 6月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Before: 7549326754 cycles # 3201.811 M/sec 10007594937 instructions # 4244.408 M/sec After: 7542051194 cycles # 3201.996 M/sec 10007743852 instructions # 4248.811 M/sec # 1.327 per cycle Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 6月, 2009 6 次提交
-
-
由 Ingo Molnar 提交于
Before: $ perf report failed to open file: No such file or directory After: $ perf report failed to open file: perf.data (try 'perf record' first) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
If perf is run on a !CONFIG_PERF_COUNTER kernel right now it bails out with no messages or with confusing messages. Standardize this case some more and explain the situation. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
On architectures/CPUs without PMU support but with perfcounters enabled 'perf record' currently fails because it cannot create a cycle based hw-perfcounter. Fall back to the cpu-clock-tick sw-perfcounter in this case, which is hrtimer based and will always work (as long as perfcounters are enabled). Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
On architectures/CPUs without PMU support but with perfcounters enabled 'perf top' currently fails because it cannot create a cycle based hw-perfcounter. Fall back to the cpu-clock-tick sw-perfcounter in this case, which is hrtimer based and will always work (as long as perfcounters is enabled). Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Before: $ perf stat ~/hackbench 5 error: syscall returned with -1 (No such device) After: $ perf stat ~/hackbench 5 Time: 1.640 Performance counter stats for '/home/mingo/hackbench 5': 6524.570382 task-clock-ticks # 3.838 CPU utilization factor 35704 context-switches # 0.005 M/sec 191 CPU-migrations # 0.000 M/sec 8958 page-faults # 0.001 M/sec <not counted> cycles <not counted> instructions <not counted> cache-references <not counted> cache-misses Wall-clock time elapsed: 1699.999995 msecs Also add -v (--verbose) option to allow the printing of failed counter opens. Plus dont print 'inf' if wall-time is zero (due to jiffies granularity), instead skip the printing of the CPU utilization factor. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
The first snapshot reading often occur before any events have been read in the mapped perfcounter files. Just wait until we have at least one event before starting the snapshot, or the delay before the first set of entries to be displayed may be long in case of low refresh rate. Note: we could also use a semaphore to wait before "print_entries" number of eveents is reached, but again this value is tunable and we can't ensure we will even reach it. Also we could base on a default mimimum set of entries for the first refresh, say 15, but again, the minimal sample is tunable, and we could end up displaying nothing until we have a minimal default set of events, which can take some time in case of high samples filters. Hence this simple solution which partially covers the default case. [ Impact: fix display artifacts in perf top ] Signed-off-by: NFrederic Weisbecker <fweisbeec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1244322643-6447-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-