perf_counter tools: optionally scale counter values in perfstat mode
Impact: new functionality This adds add an option to the perfstat mode of kerneltop to scale the reported counter values according to the fraction of time that each counter gets to count. This is invoked with the -l option (I used 'l' because s, c, a and e were all taken already.) This uses the new PERF_RECORD_TOTAL_TIME_{ENABLED,RUNNING} read format options. With this, we get output like this: $ ./perfstat -l -e 0:0,0:1,0:2,0:3,0:4,0:5 ./spin Performance counter stats for './spin': 4016072055 CPU cycles (events) (scaled from 66.53%) 2005887318 instructions (events) (scaled from 66.53%) 1762849 cache references (events) (scaled from 66.69%) 165229 cache misses (events) (scaled from 66.85%) 1001298009 branches (events) (scaled from 66.78%) 41566 branch misses (events) (scaled from 66.61%) Wall-clock time elapsed: 2438.227446 msecs This also lets us detect when a counter is zero because the counter never got to go on the CPU at all. In that case we print <not counted> rather than 0. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Orig-LKML-Reference: <20090330171023.871484899@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
Showing
想要评论请 注册 或 登录