- 24 11月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Now that we can check the buildid to see if it really matches, this can be done safely: vmlinux /boot/vmlinux /boot/vmlinux-<uts.release> /lib/modules/<uts.release>/build/vmlinux /usr/lib/debug/lib/modules/%s/vmlinux More can be added - if you know about distros that put the vmlinux somewhere else please let us know. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259001550-8194-1-git-send-email-acme@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 11月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Before we were storing this in the DSO, but in fact this is a property of the 'symbol' class, not something that will vary among DSOs, so move it to a global variable and initialize it using the existing symbol__init routine. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1256927305-4628-2-git-send-email-acme@infradead.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 10月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We were using eprintf in some places, that looks at a global 'verbose' level, and at other places passing a 'v' parameter to specify the verbosity level, unify it by introducing pr_{err,warning,debug,etc}, just like in the kernel. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1256153646-10097-1-git-send-email-acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 10月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
Use the kernel bitmap library for internal perf tools uses. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1255792354-11304-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 17 10月, 2009 1 次提交
-
-
由 Julia Lawall 提交于
In each case, if the NULL test on thread is needed, then the dereference should be after the NULL test. A simplified version of the semantic match that detects this problem is as follows (http://coccinelle.lip6.fr/): // <smpl> @match exists@ expression x, E; identifier fld; @@ * x->fld ... when != \(x = E\|&x\) * x == NULL // </smpl> Signed-off-by: NJulia Lawall <julia@diku.dk> LKML-Reference: <Pine.LNX.4.64.0910170842500.9213@ask.diku.dk> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 10月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
This was just being copy'n'pasted all over. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091013141629.GD21809@ghostprotocols.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 10月, 2009 2 次提交
-
-
由 Mike Galbraith 提交于
To refresh, trying to sched record only one CPU results in bogus latencies as below. I fixed^Wmade it stop doing the bad thing today, by following task migration events properly. Before: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- Xorg:4943 | 1.290 ms | 1 | avg: 1670.132 ms | max: 1670.132 ms | hald-addon-stor:3569 | 0.091 ms | 3 | avg: 658.609 ms | max: 1975.797 ms | hald-addon-stor:3573 | 0.209 ms | 4 | avg: 499.138 ms | max: 1990.565 ms | audispd:4270 | 0.012 ms | 1 | avg: 0.015 ms | max: 0.015 ms | .... marge:/root/tmp # perf sched trace|grep 'Xorg:4943' swapper-0 [000] 401.184013288: sched_stat_runtime: task: Xorg:4943 runtime: 1233188 [ns], vruntime: 19105169779 [ns] rt2870TimerQHan-4947 [000] 402.854140127: sched_stat_wait: task: Xorg:4943 wait: 580073 [ns] rt2870TimerQHan-4947 [000] 402.854141770: sched_migrate_task: task Xorg:4943 [140] from: 1 to: 0 rt2870TimerQHan-4947 [000] 402.854143854: sched_stat_wait: task: Xorg:4943 wait: 0 [ns] rt2870TimerQHan-4947 [000] 402.854145397: sched_switch: task rt2870TimerQHan:4947 [140] (D) ==> Xorg:4943 [140] Xorg-4943 [000] 402.854193133: sched_stat_runtime: task: Xorg:4943 runtime: 56546 [ns], vruntime: 11766332500 [ns] Xorg-4943 [000] 402.854196842: sched_switch: task Xorg:4943 [140] (S) ==> swapper:0 [140] After: marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10 marge:/root/tmp # perf sched lat ----------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------------- amarokapp:11150 | 271.297 ms | 878 | avg: 0.130 ms | max: 1.057 ms | konsole:5965 | 1.370 ms | 12 | avg: 0.092 ms | max: 0.855 ms | Xorg:4943 | 179.980 ms | 1109 | avg: 0.087 ms | max: 1.206 ms | hald-addon-stor:3574 | 0.212 ms | 9 | avg: 0.040 ms | max: 0.169 ms | hald-addon-stor:3570 | 0.223 ms | 9 | avg: 0.037 ms | max: 0.223 ms | klauncher:5864 | 0.550 ms | 8 | avg: 0.032 ms | max: 0.048 ms | The 'Maximum delay ms' results are now sane. Signed-off-by: NMike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Randy Dunlap 提交于
The following perf build warnings/errors in function argument types: builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type ... trigger because older GCC is not able to prove that sort_dimension__add() does not change the string. Some goes for test_type_token(). Fix this by improving type consistency. Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com> [ Also remove ugly type cast now unnecessary. ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 10月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
This reverts commit 9a92b479 ("perf tools: Improve thread comm resolution in perf sched") and fixes the real bug. The bug was elsewhere: We are failing to resolve thread names in perf sched because the table of threads we are building, on top of comm events, has a per process granularity. But perf sched, unlike the other perf tools, needs a per thread granularity as we are profiling every tasks individually. So fix it by building our threads table using the tid instead of the pid as the thread identifier. v2: Revert the previous fix - it is not really needed Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 10月, 2009 2 次提交
-
-
由 Frederic Weisbecker 提交于
When we get sched traces that involve a task that was already created before opening the event, we won't have the comm event for it. So if we can't find the comm event for a given thread, we look at the traces that may contain these informations. Before: ata/1:371 | 0.000 ms | 1 | avg: 3988.693 ms | max: 3988.693 ms | kondemand/1:421 | 0.096 ms | 3 | avg: 345.346 ms | max: 1035.989 ms | kondemand/0:420 | 0.025 ms | 3 | avg: 421.332 ms | max: 964.014 ms | :5124:5124 | 0.103 ms | 5 | avg: 74.082 ms | max: 277.194 ms | :6244:6244 | 0.691 ms | 9 | avg: 125.655 ms | max: 271.306 ms | firefox:5080 | 0.924 ms | 5 | avg: 53.833 ms | max: 257.828 ms | npviewer.bin:6225 | 21.871 ms | 53 | avg: 22.462 ms | max: 220.835 ms | :6245:6245 | 9.631 ms | 21 | avg: 41.864 ms | max: 213.349 ms | After: ata/1:371 | 0.000 ms | 1 | avg: 3988.693 ms | max: 3988.693 ms | kondemand/1:421 | 0.096 ms | 3 | avg: 345.346 ms | max: 1035.989 ms | kondemand/0:420 | 0.025 ms | 3 | avg: 421.332 ms | max: 964.014 ms | firefox:5124 | 0.103 ms | 5 | avg: 74.082 ms | max: 277.194 ms | npviewer.bin:6244 | 0.691 ms | 9 | avg: 125.655 ms | max: 271.306 ms | firefox:5080 | 0.924 ms | 5 | avg: 53.833 ms | max: 257.828 ms | npviewer.bin:6225 | 21.871 ms | 53 | avg: 22.462 ms | max: 220.835 ms | npviewer.bin:6245 | 9.631 ms | 21 | avg: 41.864 ms | max: 213.349 ms | Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
This librarizes the perf.data file mapping and handling in various perf tools, roughly reducing the amount of code and fixing the places that mmap from beginning of the file whereas we want to mmap from the beginning of the data, leading to page fault because the mmap window is too small since the trace info are written in the file too. TODO: - convert perf timechart too Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091007104729.GD5043@nowhere> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 10月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
This drops the trace.info file and move its contents into the common perf.data file. This is done by creating a new trace_info section into this file. A user of perf headers needs to call perf_header__set_trace_info() to save the trace meta informations into the perf.data file. A file created by perf after his patch is unsupported by previous version because the size of the headers have increased. That said, it's two new fields that have been added in the end of the headers, and those could be ignored by previous versions if they just handled the dynamic header size and then ignore the unknow part. The offsets guarantee the compatibility. We'll do a -stable fix for that. But current previous versions handle the header size using its static size, not dynamic, then it's not backward compatible with trace records. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20091006213643.GA5343@nowhere> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 9月, 2009 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Several variables are not used at all, cut'n'paste leftovers. Also check if the sample_type is RAW earlier, to avoid needless searches. Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 9月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: NStephane Eranian <eranian@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul Mackerras <paulus@samba.org> Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 9月, 2009 2 次提交
-
-
由 Mike Galbraith 提交于
perf sched record passes unparsed args on to perf record, so specifying an output file via perf sched record -o FILE (cmd) just works. Ergo, provide an option to specify input file as well. Also add the missing 'map' command to help. Signed-off-by: NMike Galbraith <efault@gmx.de> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253254944.20589.11.camel@marge.simson.net> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
For 'perf sched map' output, determine max_cpu automatically, instead of the static default of 15. [ v2: use sysconf() pointed out by Arjan van de Ven <arjan@infradead.org> ] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 9月, 2009 4 次提交
-
-
由 Ingo Molnar 提交于
This prints a textual context-switching outline of workload captured via perf sched record. For example, on a 16 CPU box it outputs: N1 O1 . . . S1 . . . B0 . *I0 C1 . M1 . 23002.773423 secs N1 O1 . *Q0 . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs N1 O1 . Q0 . S1 . . . B0 . *R1 C1 . M1 . 23002.773485 secs N1 O1 . Q0 . S1 . *S0 . B0 . R1 C1 . M1 . 23002.773478 secs *L0 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773523 secs L0 O1 . *. . S1 . S0 . B0 . R1 C1 . M1 . 23002.773531 secs L0 O1 . . . S1 . S0 . B0 . R1 C1 *T1 M1 . 23002.773547 secs T1 => irqbalance:2089 L0 O1 . . . S1 . S0 . *P0 . R1 C1 T1 M1 . 23002.773549 secs *N1 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773566 secs N1 O1 . . . *J0 . S0 . P0 . R1 C1 T1 M1 . 23002.773571 secs N1 O1 . . . J0 . S0 *B0 P0 . R1 C1 T1 M1 . 23002.773592 secs N1 O1 . . . J0 . *U0 B0 P0 . R1 C1 T1 M1 . 23002.773582 secs N1 O1 . . . *S1 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773604 secs N1 O1 . . . S1 . U0 B0 *. . R1 C1 T1 M1 . 23002.773615 secs N1 O1 . . . S1 . U0 B0 . . *K0 C1 T1 M1 . 23002.773631 secs N1 O1 . *M0 . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773624 secs N1 O1 . M0 . S1 . U0 *. . . K0 C1 T1 M1 . 23002.773644 secs N1 O1 . M0 . S1 . U0 . . . *R1 C1 T1 M1 . 23002.773662 secs N1 O1 . M0 . S1 . *. . . . R1 C1 T1 M1 . 23002.773648 secs N1 O1 . *. . S1 . . . . . R1 C1 T1 M1 . 23002.773680 secs N1 O1 . . . *L0 . . . . . R1 C1 T1 M1 . 23002.773717 secs *N0 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773709 secs *N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773747 secs Columns stand for individual CPUs, from CPU0 to CPU15, and the two-letter shortcuts stand for tasks that are running on a CPU. '*' denotes the CPU that had the event. A dot signals an idle CPU. New tasks are assigned new two-letter shortcuts - when they occur first they are printed. In the above example 'T1' stood for irqbalance: T1 => irqbalance:2089 Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Peter noticed that we have 3 ways of referring to the idle thread: [idle]:0 swapper:0 swapper-0 Standardize on 'swapper:0'. Reported-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Use 'perf sched latency' to track the current task based on context-switch events, and flag the cases where there's some impossible transition: such as a PID being switched out that was not switched in. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Output such lost event and state machine weirdness stats: TOTAL: | 14974.910 ms | 46384 | --------------------------------------------------- INFO: 8.865% lost events (19132 out of 215819, in 8 chunks) INFO: 0.198% state machine bugs (49 out of 24708) (due to lost events?) And increase buffering to -m 1024 (4 MB) by default. Since we use output multiplexing that kind of space is needed. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 9月, 2009 3 次提交
-
-
由 mingo 提交于
This allows more precise 'perf sched latency' output: --------------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | --------------------------------------------------------------------------------------- ksoftirqd/0-4 | 0.010 ms | 2 | avg: 2.476 ms | max: 2.977 ms | perf-12328 | 15.844 ms | 66 | avg: 1.118 ms | max: 9.979 ms | bdi-default-235 | 0.009 ms | 1 | avg: 0.998 ms | max: 0.998 ms | events/1-8 | 0.020 ms | 2 | avg: 0.998 ms | max: 0.998 ms | events/0-7 | 0.018 ms | 2 | avg: 0.992 ms | max: 0.996 ms | sleep-12329 | 0.742 ms | 3 | avg: 0.906 ms | max: 2.289 ms | sshd-12122 | 0.163 ms | 2 | avg: 0.283 ms | max: 0.562 ms | loop-getpid-lon-12322 | 1023.636 ms | 69 | avg: 0.208 ms | max: 5.996 ms | loop-getpid-lon-12321 | 1038.638 ms | 5 | avg: 0.073 ms | max: 0.171 ms | migration/1-5 | 0.000 ms | 1 | avg: 0.006 ms | max: 0.006 ms | --------------------------------------------------------------------------------------- TOTAL: | 2079.078 ms | 153 | ------------------------------------------------- Also, streamline the code a bit more, add asserts for various state machine failures (they should be debugged if they occur) and fix a few odd ends. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 mingo 提交于
Often it's useful to know the PID of the task as well - print it out too. ( While at it, reformat the output to be a bit more paste-into-commit-logs friendly. ) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Before: ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- perf |4853313.251 ms | 10 | avg: 0.046 ms | max: 0.337 ms | flush-8:0 |2426659.202 ms | 5 | avg: 0.015 ms | max: 0.016 ms | sleep |485331.966 ms | 1 | avg: 0.012 ms | max: 0.012 ms | ksoftirqd/1 |485331.320 ms | 1 | avg: 0.005 ms | max: 0.005 ms | ----------------------------------------------------------------------------------- TOTAL: |8250635.739 ms | 17 | --------------------------------------------- After: ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- perf | 0.206 ms | 10 | avg: 0.046 ms | max: 0.337 ms | flush-8:0 | 2.680 ms | 5 | avg: 0.015 ms | max: 0.016 ms | sleep | 0.662 ms | 1 | avg: 0.012 ms | max: 0.012 ms | ksoftirqd/1 | 0.015 ms | 1 | avg: 0.005 ms | max: 0.005 ms | ----------------------------------------------------------------------------------- TOTAL: | 3.563 ms | 17 | --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 14 9月, 2009 3 次提交
-
-
由 Ingo Molnar 提交于
Finish the -M/--multiplex option implementation: - separate it out from group_fd - correctly set it via the ioctl and dont mmap counters that are multiplexed - modify the perf record event loop to deal with buffer-less counters. - remove the -g option from perf sched record - account for unordered events in perf sched latency - (add -f to perf sched record to ease measurements) - skip idle threads (pid==0) in latency output The result is better latency output by 'perf sched latency': ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- ksoftirqd/8 | 0.071 ms | 2 | avg: 0.458 ms | max: 0.913 ms | at-spi-registry | 0.609 ms | 19 | avg: 0.013 ms | max: 0.023 ms | perf | 3.316 ms | 16 | avg: 0.013 ms | max: 0.054 ms | Xorg | 0.392 ms | 19 | avg: 0.011 ms | max: 0.018 ms | sleep | 0.537 ms | 2 | avg: 0.009 ms | max: 0.009 ms | ----------------------------------------------------------------------------------- TOTAL: | 4.925 ms | 58 | --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Currently it's possible to meet such too high latency results with 'perf sched latency'. ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- xfce4-panel | 0.222 ms | 2 | avg: 4718.345 ms | max: 9436.493 ms | scsi_eh_3 | 3.962 ms | 36 | avg: 55.957 ms | max: 1977.829 ms | The origin is on traces that are sometimes badly serialized across cpus. For example the raw traces that raised such results for xfce4-panel: (1) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (2) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (3) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] The traces are processed in the order they arrive. Then in (2), xfce4-panel sleeps, it is first waken up in (3) and eventually scheduled in (5). The latency reported is then 1504 - 1495 = 9 secs, as reported by perf sched. But this is wrong, we are confident in the fact the traces are nicely serialized while we should actually more trust the timestamps. If we reorder by timestamps we get: (1) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (2) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (3) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] Now the trace make more sense, xfce4-panel is sleeping. Then it is woken up in (1), scheduled in (2) It goes to sleep in (3), woken up in (4) and scheduled in (5). Now, latency captured between (1) and (2) is of 39 us. And between (4) and (5) it is 2.1 ms. Such pattern of bad serializing is the origin of the high latencies reported by perf sched. Basically, we need to check whether wake up time is higher than schedule out time. If it's not the case, we need to tag the current work atom as invalid. Beside that, we may need to work later on a better ordering of the traces given by the kernel. After this patch: xfce4-session | 0.221 ms | 1 | avg: 0.538 ms | max: 0.538 ms | Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Add an option to multiplex counters output in the channel of the group leader, ie: the first counter opened: -M --multiplex The effect is better serialized samples. This is especially useful for tracepoint samples that need to be well serialized for their post-processing. Also make use of this option in 'perf sched'. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 9月, 2009 14 次提交
-
-
由 Ingo Molnar 提交于
Alias 'perf sched trace' to 'perf trace', for workflow completeness. Add a bit of documentation for perf sched. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Implement the 'perf sched record' subcommand that adds a default list of events, turns on raw sampling and system-wide tracing and passes off the rest of the command to perf record. This is more convenient than having to specify the events all the time. Before: $ perf record -a -R -e sched:sched_switch:r -e sched:sched_stat_wait:r -e sched:sched_stat_sleep:r -e sched:sched_stat_iowait:r -e sched:sched_process_exit:r -e sched:sched_process_fork:r -e sched:sched_wakeup:r -e sched:sched_migrate_task:r -c 1 sleep 1 After: $ perf sched record -f sleep 1 Also fix an assumption in the event string parser that assumed that strings passed in can be modified. (In this case they wont be as they come from a readonly constant section.) Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Use a sort list for thread atoms insertion as well - instead of hardcoded for PID. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
- Rename 'latency' field/variable names to the better 'atom' ones - Reduce the number of #include lines and consolidate them - Gather file scope variables at the top of the file - Remove unused bits No change in functionality. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Separate the option parsing cleanly and add two variants: - 'perf sched latency' (can be abbreviated via 'perf sched lat') - 'perf sched replay' (can be abbreviated via 'perf sched rep') Also add a repeat count option to replay and add a separation set of options for replay. Do the sorting setup only in the latency sub-command. Display separate help screens for 'perf sched' and 'perf sched replay -h' - i.e. further separation of the sub-commands. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Implement multidimensional sorting on perf sched so that you can sort either by number of switches, latency average, latency maximum, runtime. perf sched -l -s avg,max (this is the default) ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- gnome-power-man | 0.113 ms | 1 | avg: 4998.531 ms | max: 4998.531 ms | xfdesktop | 1.190 ms | 7 | avg: 136.475 ms | max: 940.933 ms | xfce-mcs-manage | 2.194 ms | 22 | avg: 38.534 ms | max: 735.174 ms | notification-da | 2.749 ms | 31 | avg: 27.436 ms | max: 731.791 ms | xfce4-session | 3.343 ms | 28 | avg: 26.796 ms | max: 734.891 ms | xfwm4 | 3.159 ms | 22 | avg: 12.406 ms | max: 241.333 ms | xchat | 42.789 ms | 214 | avg: 11.886 ms | max: 100.349 ms | xfce4-terminal | 5.386 ms | 22 | avg: 11.414 ms | max: 241.611 ms | firefox | 151.992 ms | 123 | avg: 9.543 ms | max: 153.717 ms | xfce4-panel | 24.324 ms | 47 | avg: 8.189 ms | max: 242.352 ms | :5090 | 6.932 ms | 111 | avg: 8.131 ms | max: 102.665 ms | events/0 | 0.758 ms | 12 | avg: 1.964 ms | max: 21.879 ms | Xorg | 280.558 ms | 340 | avg: 1.864 ms | max: 99.526 ms | geany | 63.391 ms | 295 | avg: 1.099 ms | max: 9.334 ms | reiserfs/0 | 0.039 ms | 2 | avg: 0.854 ms | max: 1.487 ms | kondemand/0 | 8.251 ms | 245 | avg: 0.691 ms | max: 34.372 ms | Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
We are dividing a time in ns by 1e9. This is a nsec to sec conversion. What we want is msecs. Fix it by dividing by 1e6. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Add a field in the thread atom list that keeps track of the total and max latencies and also the total runtime. This makes a faster output and also prepares for sorting. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
Currently in perf sched, we are measuring the scheduler wakeup latencies. Now we also want measure the time a task wait to be scheduled after it gets preempted. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
To measures the latencies, we capture the sched atoms data into a specific structure named struct lat_snapshot. As this structure can be used for other purposes of scheduler profiling and mirrors what happens in a thread work atom, lets rename it to struct work_atom and propagate this renaming in other functions and structures names to keep it coherent. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
After: ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- make | 0.678 ms | 13 | avg: 0.018 ms | max: 0.050 ms | gcc | 0.014 ms | 2 | avg: 0.320 ms | max: 0.627 ms | gcc | 0.000 ms | 2 | avg: 0.185 ms | max: 0.369 ms | ... ----------------------------------------------------------------------------------- TOTAL: | 21.316 ms | 63 | --------------------------------------------- Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Extend the latency tracking structure with scheduling atom runtime info - and sum it up during per task display. (Also clean up a few details.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
After: ----------------------------------------------------------------------------------- Task | runtime ms | switches | average delay ms | maximum delay ms | ----------------------------------------------------------------------------------- migration/0 | 0.000 ms | 1 | avg: 0.047 ms | max: 0.047 ms | ksoftirqd/0 | 0.000 ms | 1 | avg: 0.039 ms | max: 0.039 ms | migration/1 | 0.000 ms | 3 | avg: 0.013 ms | max: 0.016 ms | migration/3 | 0.000 ms | 2 | avg: 0.003 ms | max: 0.004 ms | migration/4 | 0.000 ms | 1 | avg: 0.022 ms | max: 0.022 ms | distccd | 0.000 ms | 1 | avg: 0.004 ms | max: 0.004 ms | distccd | 0.000 ms | 1 | avg: 0.014 ms | max: 0.014 ms | distccd | 0.000 ms | 2 | avg: 0.000 ms | max: 0.000 ms | distccd | 0.000 ms | 2 | avg: 0.012 ms | max: 0.019 ms | distccd | 0.000 ms | 1 | avg: 0.002 ms | max: 0.002 ms | as | 0.000 ms | 2 | avg: 0.019 ms | max: 0.019 ms | as | 0.000 ms | 3 | avg: 0.015 ms | max: 0.017 ms | as | 0.000 ms | 1 | avg: 0.009 ms | max: 0.009 ms | perf | 0.000 ms | 1 | avg: 0.001 ms | max: 0.001 ms | gcc | 0.000 ms | 1 | avg: 0.021 ms | max: 0.021 ms | run-mozilla.sh | 0.000 ms | 2 | avg: 0.010 ms | max: 0.017 ms | mozilla-plugin- | 0.000 ms | 1 | avg: 0.006 ms | max: 0.006 ms | gcc | 0.000 ms | 2 | avg: 0.013 ms | max: 0.013 ms | ----------------------------------------------------------------------------------- (The runtime ms column is not filled in yet.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
- Separate the latency and the replay commands more cleanly - Use consistent naming - Display help page on 'perf sched' outlining comments, instead of aborting Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-