1. 10 Feb, 2011 1 commit
    • A
      perf tools: Fix thread_map event synthesizing in top and record · 401b8e13
      Arnaldo Carvalho de Melo committed
      Jeff Moyer reported these messages:
      
        Warning:  ... trying to fall back to cpu-clock-ticks
      
      couldn't open /proc/-1/status
      couldn't open /proc/-1/maps
      [ls output]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]
      
      That led me and David Ahern to see that something was fishy on the thread
      synthesizing routines, at least for the case where the workload is started
      from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
      parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
      PERF_RECORD_COMM events for the thread -1, a bug.
      
      So I investigated this and noticed that when we introduced support for
      recording a process and its threads using --pid some bugs were introduced and
      that the way to fix it was to instead of passing the target_tid to the event
      synthesizing routines we should better pass the thread_map that has the list of
      threads for a --pid or just the single thread for a --tid.
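
A minimal sketch of the idea, not the actual patch: the synthesizing code walks a list of tids rather than receiving a single target_tid, so a default of -1 is never treated as a real thread. The names thread_map_sketch, synth_fn, synthesize_thread_map and count_tid are hypothetical and only illustrate the shape of the change.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified thread map: a counted list of tids. */
struct thread_map_sketch {
	int nr;
	const pid_t *tids;
};

/* Hypothetical callback: synthesize the COMM/MMAP events for one tid. */
typedef int (*synth_fn)(pid_t tid, void *ctx);

/*
 * Visit every tid in the map instead of trusting one target_tid.
 * Negative tids are skipped, so "/proc/-1/..." is never opened.
 * Returns the number of tids processed, or -1 on the first failure.
 */
static int synthesize_thread_map(const struct thread_map_sketch *threads,
				 synth_fn synth, void *ctx)
{
	int i, done = 0;

	assert(threads != NULL && synth != NULL);
	for (i = 0; i < threads->nr; i++) {
		if (threads->tids[i] < 0)
			continue;
		if (synth(threads->tids[i], ctx) != 0)
			return -1;
		done++;
	}
	return done;
}

/* Example callback that only counts how many tids it was handed. */
static int count_tid(pid_t tid, void *ctx)
{
	(void)tid;
	(*(int *)ctx)++;
	return 0;
}
```

With --pid the map would hold every thread of the process, with --tid exactly one entry; the caller does not need to know which case it is in.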
      
      Checked in the following ways:
      
      On a 8-way machine run cyclictest:
      
      [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
      policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798
      
      T: 0 (28791) P:99 I:100 C:  25072 Min:      4 Act:    5 Avg:    6 Max:     122
      T: 1 (28792) P:98 I:150 C:  16715 Min:      4 Act:    6 Avg:    5 Max:      27
      T: 2 (28793) P:97 I:200 C:  12534 Min:      4 Act:    5 Avg:    4 Max:       8
      T: 3 (28794) P:96 I:250 C:  10028 Min:      4 Act:    5 Avg:    5 Max:      96
      T: 4 (28795) P:95 I:300 C:   8357 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 5 (28796) P:94 I:350 C:   7163 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 6 (28797) P:93 I:400 C:   6267 Min:      4 Act:    5 Avg:    5 Max:       9
      T: 7 (28798) P:92 I:450 C:   5571 Min:      4 Act:    5 Avg:    5 Max:       9
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]
      
      [root@emilia ~]#
      
      This will create one extra thread per CPU:
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28825   OTHER     0     0xff      2169          671      cyclictest
        28832   FIFO    93        6     52338            1      cyclictest
        28833   FIFO    92        7     46524            1      cyclictest
        28826   FIFO    99        0    209360            1      cyclictest
        28827   FIFO    98        1    139577            1      cyclictest
        28828   FIFO    97        2    104686            0      cyclictest
        28829   FIFO    96        3     83751            1      cyclictest
        28830   FIFO    95        4     69794            1      cyclictest
        28831   FIFO    94        5     59825            1      cyclictest
      [root@emilia ~]#
      
      So we should expect only samples for the above 9 threads when using the
      --dump-raw-trace|-D perf report switch to look at the column with the tid:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
          629 28825
          110 28826
          491 28827
          308 28828
          198 28829
          621 28830
          225 28831
          203 28832
           89 28833
      [root@emilia ~]#
      
      So it seems to work for workloads started by 'perf record'. Now for existing
      workloads, just run cyclictest first, without 'perf record':
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28859   OTHER     0     0xff       594          200      cyclictest
        28864   FIFO    95        4     16587            1      cyclictest
        28865   FIFO    94        5     14219            1      cyclictest
        28866   FIFO    93        6     12443            0      cyclictest
        28867   FIFO    92        7     11062            1      cyclictest
        28860   FIFO    99        0     49779            1      cyclictest
        28861   FIFO    98        1     33190            1      cyclictest
        28862   FIFO    97        2     24895            1      cyclictest
        28863   FIFO    96        3     19918            1      cyclictest
      [root@emilia ~]#
      
      and then later did:
      
      [root@emilia ~]# perf record --pid 28859 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
      [root@emilia ~]#
      
      To collect 3 seconds worth of samples for pid 28859 and its children:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
           15 28859
           33 28860
           19 28861
           13 28862
           13 28863
           10 28864
           11 28865
            9 28866
          255 28867
      [root@emilia ~]#
      
      Works, last thing is to check if looking at just one of those threads also works:
      
      [root@emilia ~]# perf record --tid 28866 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
            3 28866
      [root@emilia ~]#
      
      Works too.
      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      401b8e13
  2. 23 Jan, 2011 1 commit
    • A
      perf tools: Fix 64 bit integer format strings · 9486aa38
      Arnaldo Carvalho de Melo committed
      Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
      by making our 64 bit integers typedefs of stdint.h types and using
      PRI[ux]64 like, for instance, git does.
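
As a hedged illustration of the approach described above (not code from the patch), a 64-bit value typed as uint64_t can be printed with the PRIu64 macro from inttypes.h, which expands to the correct length modifier on each ABI. The helper name format_period is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit sample period portably. */
static int format_period(char *buf, size_t size, uint64_t period)
{
	assert(buf != NULL && size > 0);
	/* "%" PRIu64 replaces the non-portable "%Lu". */
	return snprintf(buf, size, "period: %" PRIu64, period);
}
```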
      
      Reported by Denis Kirjanov, who provided a patch for one case; I went
      and changed all cases.
      Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
      Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
      LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
      Cc: Denis Kirjanov <dkirjanov@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingtian Han <phan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9486aa38
  3. 23 Dec, 2010 1 commit
    • A
      perf symbols: Improve kallsyms symbol end addr calculation · 3b01a413
      Arnaldo Carvalho de Melo committed
      For kallsyms we don't have the symbol address end, so we do an extra pass and
      set the symbol end addr as being the start of the next minus one.
      
      But this was being done just after we filtered the symbols of a
      particular type (functions, variables), so the symbol end was sometimes
      after what it really is.
      
      Fixing up symbol end also was falling apart when we have symbol aliases,
      then the end address of all but the last alias was being set to be
      before its start.
      
      Fix it up by checking for symbol aliases and making the kallsyms__parse
      routine use the next symbol, whatever its type, as the limit for the
      previous symbol, passing that end address to the callback.
      
      This was detected by the 'perf test' synthetic paranoid regression
      tests, fix it up so that even that case doesn't mislead us.
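
The fix-up described above can be sketched roughly as follows. This is an assumption-laden illustration, not perf's code: sym_sketch and fixup_symbol_ends are hypothetical names, and the array is assumed to be sorted by start address and to contain symbols of every type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol record. */
struct sym_sketch {
	uint64_t start;
	uint64_t end;
};

/*
 * Each symbol ends one byte before the next symbol that has a
 * different start address. Aliases (same start) therefore share one
 * end, and no alias gets an end that lies before its own start.
 * Symbols without such a successor keep the end they already had.
 */
static void fixup_symbol_ends(struct sym_sketch *syms, size_t nr)
{
	size_t i, j;

	assert(syms != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		for (j = i + 1; j < nr && syms[j].start == syms[i].start; j++)
			;	/* skip aliases of syms[i] */
		if (j < nr)
			syms[i].end = syms[j].start - 1;
	}
}
```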
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3b01a413
  4. 09 Dec, 2010 1 commit
  5. 05 Dec, 2010 2 commits
    • A
      perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events · 9c90a61c
      Arnaldo Carvalho de Melo committed
      So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:
      
        $ perf record -aT
        $ perf report -D | grep PERF_RECORD_
        <SNIP>
         3   5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
         3   5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
         3   5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
         3   5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
         3   5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
         3   5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
         3   5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
         3   5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
         3   5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
         3   5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
         3   5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
         3   5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
         3   5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
         3   5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
        <SNIP>
      
      First column is the cpu and the second the timestamp.
      
      That way we can investigate problems in the event stream.
      
      If the new perf binary is run on an older kernel, it will disable this feature
      automatically.
      Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9c90a61c
    • A
      perf session: Parse sample earlier · 640c03ce
      Arnaldo Carvalho de Melo committed
      At perf_session__process_event, so that we reduce the number of lines in each
      tool sample processing routine that now receives a sample_data pointer already
      parsed.
      
      This will also be useful in the next patch, where we'll allow sampling the
      identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
      timestamp) just before every event.
      
      Also validate callchains in perf_session__process_event, i.e. as early as
      possible, and keep a counter of the number of events discarded due to invalid
      callchains, warning the user about it if it happens.
      
      There is an assumption that was kept that all events have the same sample_type,
      that will be dealt with in the future, when this preexisting limitation will be
      removed.
      Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      640c03ce
  6. 27 Nov, 2010 1 commit
  7. 04 Aug, 2010 2 commits
    • S
      perf: expose event__process function · b83f920e
      Srikar Dronamraju committed
      The event__process function is useful in processing /proc/<pid>/maps.  All of
      the functions that are called from event__process are defined in util/event.c.
      Though it's defined in builtin-top.c, it could be reused for perf probe for
      uprobes. Hence moving it to util/event.c and exporting the function.
      
      LKML-Reference: <20100802123851.GD22812@linux.vnet.ibm.com>
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b83f920e
    • D
      perf events: Fix mmap offset determination · b5a63254
      Dave Martin committed
      Fix buggy-looking code which unnecessarily adjusts the file offset
      fields read from /proc/*/maps.
      
      This may have gone unnoticed since the offset is usually 0 (and the
      logic in util/symbol.c may work incorrectly for other offset values).
      
      Committer note:
      
      This fixes a bug introduced in 4af8b35d: there is no need to shift pgoff
      twice, as the show_map_vma routine in fs/proc/task_mmu.c already converts
      it from the number of pages to the size in bytes, and that is what
      appears in /proc/PID/maps.
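
To illustrate the point (a sketch under stated assumptions, not the patched routine), a maps line can be parsed with sscanf and the offset field used exactly as printed, with no further page shift. The struct maps_entry_sketch and function parse_maps_line are hypothetical; anonymous mappings without a path are simply rejected here to keep the example short.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the fields of one /proc/<pid>/maps line. */
struct maps_entry_sketch {
	uint64_t start;
	uint64_t end;
	uint64_t pgoff;		/* file offset in bytes, exactly as printed */
	char perms[5];
	char path[256];
};

/* Returns 0 on success, -1 if any of the expected fields is missing. */
static int parse_maps_line(const char *line, struct maps_entry_sketch *e)
{
	int n;

	assert(line != NULL && e != NULL);
	n = sscanf(line,
		   "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*x:%*x %*u %255s",
		   &e->start, &e->end, e->perms, &e->pgoff, e->path);
	/* No extra shift: the kernel already printed the offset in bytes. */
	return n == 5 ? 0 : -1;
}
```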
      
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Will Deacon <Will.Deacon@arm.com>
      LKML-Reference: <1280836116-6654-2-git-send-email-dave.martin@linaro.org>
      Signed-off-by: Dave Martin <dave.martin@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b5a63254
  8. 31 Jul, 2010 1 commit
  9. 30 Jul, 2010 1 commit
  10. 27 Jul, 2010 1 commit
  11. 23 Jul, 2010 1 commit
  12. 17 Jun, 2010 1 commit
    • A
      perf session: Remove threads from tree on PERF_RECORD_EXIT · 720a3aeb
      Arnaldo Carvalho de Melo committed
      Move them to a session->dead_threads list just like we do with maps that
      are replaced, because we may have hist_entries pointing to them.
      
      This fixes a bug when inserting maps for a new thread that reused the
      TID, mixing maps for two different threads, causing an endless loop.
      
      The code for inserting maps should be made more robust but for .35 this
      is the minimalistic patch.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      720a3aeb
  13. 05 Jun, 2010 3 commits
    • A
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma committed
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: Arun Sharma <aruns@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo committed
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
    • S
      perf report: Make -D print sampled CPU · 761844b9
      Stephane Eranian committed
      It is useful to know which CPU a sample was captured on.
      The information is captured with perf record -R but it was
      not printed out by perf report -D. This patch adds this.
      
      When -R is not used, cpu is set to -1 to indicate that
      the CPU is unknown (it is not captured).
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      761844b9
  14. 01 Jun, 2010 2 commits
    • F
      perf: Do the comm inheritance per thread in event__process_task · dd833d71
      Frederic Weisbecker committed
      event__process_task() doesn't propagate the comm copy on clone,
      but only on process fork. So we lose all the tid:comm resolution
      for tasks that aren't a main process thread.
      
      Propagate the per thread granularity to event__process_task for
      pid resolution.
      
      This fixes various unresolved pids in perf sched, especially when
      we trace multithread processes. The problem is quickly reproducible
      with the messaging benchmark using the multithread mode "-t" :
      
      	perf sched record perf bench sched messaging -t
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      dd833d71
    • F
      perf: Process comm events by tid · 13eb04fd
      Frederic Weisbecker committed
      When we synthesize the existing running tasks through procfs,
      we walk through every thread of a process, queuing one comm
      event per tid.
      
      But then at report time, event__process_comm() only creates and
      sets the comm on a per process granularity. This is the right
      thing for comm events that came from the kernel, as they are
      only created on exec. Sub-threads then inherit their comm
      from fork events. But that doesn't work with our synthesized
      comm events taken from procfs information, as the per thread
      granularity is done on comm events directly there.
      
      Hence we need event__process_comm() to work with the tid rather
      than the pid. It won't change anything for comm events coming
      from the kernel but this will fix the synthesized ones.
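
A rough sketch of what keying comm events on the tid means in practice. Everything here is hypothetical (thread_sketch, thread_table_sketch, find_thread, process_comm_sketch); the real code keeps threads in an rb_tree rather than a fixed array, and this only shows which field serves as the lookup key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

#define MAX_THREADS_SKETCH 64

/* Hypothetical per-thread record. */
struct thread_sketch {
	pid_t tid;
	char comm[16];
};

/* Hypothetical fixed-size table standing in for the thread tree. */
struct thread_table_sketch {
	struct thread_sketch slot[MAX_THREADS_SKETCH];
	size_t nr;
};

static struct thread_sketch *find_thread(struct thread_table_sketch *t,
					 pid_t tid)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (t->slot[i].tid == tid)
			return &t->slot[i];
	return NULL;
}

/* Record a comm event against the tid, creating the thread if needed. */
static int process_comm_sketch(struct thread_table_sketch *t,
			       pid_t pid, pid_t tid, const char *comm)
{
	struct thread_sketch *th;

	assert(t != NULL && comm != NULL);
	(void)pid;	/* the pid is deliberately not used as the key */
	th = find_thread(t, tid);
	if (th == NULL) {
		if (t->nr == MAX_THREADS_SKETCH)
			return -1;
		th = &t->slot[t->nr++];
		th->tid = tid;
	}
	snprintf(th->comm, sizeof(th->comm), "%s", comm);
	return 0;
}
```

Keyed this way, two synthesized events for the same process but different tids create two entries instead of overwriting one.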
      
      Before:
      
      	$ ./perf report -D | grep COMM | grep firefox
      
      	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5297
      
      After:
      	$ ./perf report -D | grep COMM | grep firefox
      
      	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
      	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5299
      	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5300
      	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5308
      	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5309
      	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5312
      
      This fixes various unresolved pids on perf sched.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      13eb04fd
  15. 15 May, 2010 1 commit
    • A
      perf hist: Clarify events_stats fields usage · cee75ac7
      Arnaldo Carvalho de Melo committed
      The events_stats.total field is too generic, rename it to .total_period,
      and also add a comment explaining that it is the sum of all the .period
      fields in samples, that is needed because we use auto-freq to avoid
      sampling artifacts.
      
      Ditto for events_stats.lost, that is the sum of all lost_event.lost
      fields, i.e. the number of events the kernel dropped.
      
      Looking at the users, builtin-sched.c can make use of these fields and
      stop doing it again.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cee75ac7
  16. 14 May, 2010 1 commit
  17. 11 May, 2010 1 commit
    • A
      perf hist: Introduce hists class and move lots of methods to it · 1c02c4d2
      Arnaldo Carvalho de Melo committed
      In cbbc79a5 we introduced support for multiple events by introducing a
      new "event_stat_id" struct and then made several perf_session methods
      receive a pointer to it instead of a pointer to perf_session, and kept the
      event_stats and hists rb_tree in perf_session.
      
      While working on the new newt based browser, I realised that it would be
      better to introduce a new class, "hists" (short for "histograms"),
      renaming the "event_stat_id" struct and the perf_session methods that
      were really "hists" methods, as they manipulate only struct hists
      members, not touching anything in the other perf_session members.
      
      Other optimizations, such as calculating the maximum length of a symbol
      name present in an hists instance will be possible as we add them,
      avoiding a re-traversal just for finding that information.
      
      The rationale for the name "hists" to replace "event_stat_id" is that we
      may have multiple sets of hists for the same event_stat id, as, for
      instance, the 'perf diff' tool has, so event stat id is not what
      characterizes what this struct and the functions that manipulate it do.
      
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1c02c4d2
  18. 10 May, 2010 3 commits
    • A
      perf session: Embed the host machine data on perf_session · 1f626bc3
      Arnaldo Carvalho de Melo committed
      We have just one host on a given session, and that is the most common
      setup right now, so embed a ->host_machine struct machine instance
      directly in the perf_session class, check if we're looking for it before
      going to the rb_tree.
      
      This also fixes a problem found when we try to process old perf.data
      files where we didn't have MMAP events for the kernel and modules and
      thus don't create the kernel maps, do it in event__preprocess_sample if
      it wasn't already.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1f626bc3
    • A
      perf symbols: Check if a struct machine instance was found · 4cc49458
      Arnaldo Carvalho de Melo committed
      Which can happen when processing old files that had no fake kernel MMAP
      events.
      
      That shouldn't result in perf_session__create_kernel_maps not being
      called, this will be fixed in a followup patch, for now do these checks
      to avoid segfaulting.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4cc49458
    • A
      perf symbols: Consider unresolved DSOs in the dso__col_widt calculation · 3ceb0d44
      Arnaldo Carvalho de Melo committed
      By using BITS_PER_LONG / 4, that is the number of chars that will be
      used in such cases as the DSO "name".
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ceb0d44
  19. 05 May, 2010 1 commit
  20. 04 May, 2010 1 commit
    • A
      perf: Fix performance issue with perf report · 02bf60aa
      Anton Blanchard committed
      On a large machine we spend a lot of time in perf_header__find_attr when
      running perf report.
      
      If we are parsing a file without PERF_SAMPLE_ID then for each sample we call
      perf_header__find_attr and loop through all counter IDs, never finding a match.
      As the machine gets larger there are more per cpu counters and we spend an
      awful lot of time in there.
      
      The patch below initialises each sample id to -1ULL and checks for this in
      perf_header__find_attr. We may need to do something more intelligent eventually
      (eg a hash lookup from counter id to attr) but this at least fixes the most
      common usage of perf report.
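
A hedged sketch of the fast path described above, not the patch itself: when the sample id carries the "unset" sentinel, the lookup returns immediately instead of scanning every counter id. The names attr_sketch, find_attr_sketch and SAMPLE_ID_UNSET_SKETCH are hypothetical, and returning NULL on the sentinel is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel for "file has no PERF_SAMPLE_ID", i.e. -1ULL. */
#define SAMPLE_ID_UNSET_SKETCH ((uint64_t)-1)

/* Hypothetical attr record: each attr owns a list of counter ids. */
struct attr_sketch {
	const uint64_t *ids;
	size_t nr_ids;
};

static const struct attr_sketch *
find_attr_sketch(const struct attr_sketch *attrs, size_t nr_attrs, uint64_t id)
{
	size_t i, j;

	assert(attrs != NULL || nr_attrs == 0);
	/* Fast path: no id was recorded, so no scan can ever match. */
	if (id == SAMPLE_ID_UNSET_SKETCH)
		return NULL;

	for (i = 0; i < nr_attrs; i++)
		for (j = 0; j < attrs[i].nr_ids; j++)
			if (attrs[i].ids[j] == id)
				return &attrs[i];
	return NULL;
}
```

The linear scan is still there for files that do carry ids; a hash from counter id to attr, as the message suggests, would replace it.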
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Acked-by: Eric B Munson <ebmunson@us.ibm.com>
      LKML-Reference: <20100504111915.GB14636@kryten>
      Signed-off-by: Anton Blanchard <anton@samba.org>
      --
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      02bf60aa
  21. 28 Apr, 2010 3 commits
    • A
      perf machine: Adopt some map_groups functions · d28c6223
      Arnaldo Carvalho de Melo committed
      Those functions operated on members now grouped in 'struct machine', so
      move those methods to this new class.
      
      The changes made to 'perf probe' show that, using this abstraction,
      inserting probes on guests almost got supported for free.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d28c6223
    • A
      perf machine: Pass buffer size to machine__mmap_name · 48ea8f54
      Arnaldo Carvalho de Melo committed
      Don't blindly assume that the size of the buffer is enough, use
      snprintf.
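
A small sketch of the pattern, assuming a hypothetical helper mmap_name_sketch: the caller passes the buffer size and snprintf guarantees the result is truncated and NUL-terminated rather than overflowing. The name formats and the convention that pid 0 means the host are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a kernel map name into buf.
 * guest_pid == 0 stands for the host in this sketch.
 */
static char *mmap_name_sketch(int guest_pid, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	if (guest_pid == 0)
		snprintf(buf, size, "[%s]", "kernel.kallsyms");
	else
		snprintf(buf, size, "[%s.%d]", "guest.kernel.kallsyms",
			 guest_pid);
	return buf;	/* always NUL-terminated, truncated if too small */
}
```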
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      48ea8f54
    • A
      perf tools: Rename "kernel_info" to "machine" · 23346f21
      Arnaldo Carvalho de Melo committed
      struct kernel_info and kerninfo__ are too vague, what they really
      describe are machines, virtual ones or hosts.
      
      There are more changes to introduce helpers to shorten function calls
      and to make more clear what is really being done, but I left that for
      subsequent patches.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      23346f21
  22. 19 Apr, 2010 1 commit
  23. 03 Apr, 2010 1 commit
  24. 26 Mar, 2010 2 commits
  25. 22 Feb, 2010 1 commit
  26. 09 Feb, 2010 1 commit
    • A
      perf: Fix hypervisor sample reporting · 7fbfc683
      Authored by Anton Blanchard
      cpumode bits are defined as such:
      
       #define PERF_RECORD_MISC_KERNEL                 (1 << 0)
       #define PERF_RECORD_MISC_USER                   (2 << 0)
       #define PERF_RECORD_MISC_HYPERVISOR             (3 << 0)
      
      We need to compare against the complete value of cpumode,
      otherwise hypervisor samples get incorrectly attributed as
      userspace.
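
      The difference can be sketched as follows. The three defines are
      copied from above; the mask CPUMODE_MASK_SKETCH and both helper
      functions are assumptions made for illustration, not the actual
      perf code.

      ```c
      #include <stdint.h>

      #define PERF_RECORD_MISC_KERNEL      (1 << 0)
      #define PERF_RECORD_MISC_USER        (2 << 0)
      #define PERF_RECORD_MISC_HYPERVISOR  (3 << 0)

      /* Assumption: a two-bit mask isolating the cpumode field. */
      #define CPUMODE_MASK_SKETCH          (3 << 0)

      /* Buggy: a bit test also matches HYPERVISOR, because 3 & 2 != 0. */
      static int is_user_bit_test(uint16_t misc)
      {
      	return (misc & PERF_RECORD_MISC_USER) != 0;
      }

      /* Fixed: compare against the complete cpumode value. */
      static int is_user_full_compare(uint16_t misc)
      {
      	return (misc & CPUMODE_MASK_SKETCH) == PERF_RECORD_MISC_USER;
      }
      ```

      The values are an enumeration packed into two bits rather than
      independent flags, which is why only the full comparison tells a
      hypervisor sample apart from a userspace one.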
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100209034304.GA3702@kryten>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7fbfc683
  27. 04 Feb, 2010 1 commit
    • A
      perf symbols: Remove perf_session usage in symbols layer · 9de89fe7
      Authored by Arnaldo Carvalho de Melo
      While writing the first test in 'perf regtest' I noticed that just
      to test the symbol handling routines one needs to create a perf
      session, which is a layer centered on a perf.data file, events,
      etc., so I untied these layers.
      
      This reduces complexity for users, since most of the symbols and
      session APIs now take fewer parameters. It does so without adding
      state to every map instance: the data needed to split the kernel
      maps (kallsyms and ELF symtab sections) and to do vmlinux
      relocation is kept only on the main kernel map.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9de89fe7
  28. 20 Jan, 2010 1 commit
  29. 16 Jan, 2010 2 commits
    • A
      perf symbols: Accept an alias when looking for "_text" · 881516eb
      Authored by Arnaldo Carvalho de Melo
      As it is in PARISC64:
      
      parisc:~# uname -a
      Linux parisc 2.6.33-rc4-tip+ #1 SMP Thu Jan 14 13:33:34 BRST 2010 parisc64 GNU/Linux
      parisc:~# grep -w _text /proc/kallsyms
      0000000040100000 A _text
      parisc:~# grep 0000000040100000 /proc/kallsyms
      0000000040100000 T stext
      0000000040100000 T _stext
      0000000040100000 A _text
      parisc:~#
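
      One way such a lookup could tolerate the alias is sketched below.
      This is an illustration under stated assumptions, not the actual
      perf change: the helper find_global_symbol is hypothetical, and it
      accepts any upper-case (global) symbol type, so an 'A' entry
      matches just as a 'T' entry would.

      ```c
      #include <ctype.h>
      #include <stdio.h>
      #include <string.h>

      /*
       * Hypothetical helper: scan kallsyms-style text ("addr type name"
       * per line) for `name` and store its address in *addr.
       * Returns 0 on success, -1 if no global symbol matches.
       */
      static int find_global_symbol(const char *kallsyms, const char *name,
      			      unsigned long long *addr)
      {
      	const char *line = kallsyms;

      	while (*line != '\0') {
      		unsigned long long start;
      		char type, sym[128];

      		/* Match on the name; do not insist on type 'T'. */
      		if (sscanf(line, "%llx %c %127s", &start, &type, sym) == 3 &&
      		    isupper((unsigned char)type) && strcmp(sym, name) == 0) {
      			*addr = start;
      			return 0;
      		}
      		line = strchr(line, '\n');
      		if (line == NULL)
      			break;
      		line++;	/* step past the newline to the next entry */
      	}
      	return -1;
      }
      ```

      Fed the three kallsyms lines shown above, a lookup of "_text"
      succeeds and yields 0x40100000 even though the entry has type 'A'.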
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263586107-1756-2-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      881516eb
    • A
      perf symbols: The synthesized kernel modules MMAP must use the pathnames · 460848fc
      Authored by Arnaldo Carvalho de Melo
      Since we use ->long_name in dsos__find now.
      
      Now 'perf buildid-list' no longer duplicates those entries and
      manages to show the proper build-ids for the DSOs with hits:
      
      [root@doppio linux-2.6-tip]# perf buildid-list -H
      74f9930ee94475b6b3238caf3725a50d59cb994b [kernel.kallsyms]
      9ffdcac0a7935922d1f04b6cc9029dfef0f066ef /lib/modules/2.6.33-rc4-tip+/kernel/arch/x86/crypto/aes-x86_64.ko
      3aaf89c32ebfc438ff546c93597d41788e3e65f3 /lib/modules/2.6.33-rc4-tip+/kernel/drivers/net/wireless/iwlwifi/iwl3945.ko
      19f46033f73e1ec612937189bb118c5daba5a0c8 /lib/modules/2.6.33-rc4-tip+/kernel/net/mac80211/mac80211.ko
      1772f014a7a7272859655acb0c64a20ab20b75ee /lib/modules/2.6.33-rc4-tip+/kernel/drivers/net/e1000e/e1000e.ko
      eb4ec8fa8b2a5eb18cad173c92f27ed8887ed1c1 /lib64/libc-2.10.2.so
      5c68f7afeb33309c78037e374b0deee84dd441f6 /lib64/libpthread-2.10.2.so
      e9c9ad5c138ef882e4507d2605645b597da43873 /bin/dbus-daemon
      bcda7d09eb6c9ee380dae0ed3d591d4311decc31 /lib64/libdbus-1.so.3.4.0
      7cc449a77f48b85d6088114000e970ced613bed8 /usr/lib64/libcrypto.so.0.9.8k
      fdd1ccd1ff7917ab020653147ab3bacf0a85b5b9 /lib64/libglib-2.0.so.0.2000.5
      e4417ebb8762e5f2eee93c8011a71115ff5edad8 /lib64/libgobject-2.0.so.0.2000.5
      931e49461f6df99104f0febcc52f6fed5e2efce6 /usr/sbin/sshd
      dab5f724c088f89fbd8304da553ed6cb30bbec96 /usr/lib64/libgdk-x11-2.0.so.0.1600.6
      f2037a091ef36b591187a858d75e203690ea9409 /usr/sbin/openvpn
      a8e4f743b40fb1fd8b85e2f9b88d93b661472b8f /bin/find
      81120aada06e68b1e85882925a0fc6d7345ef59a /home/acme/bin/perf
      [root@doppio linux-2.6-tip]#
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263568672-30323-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      460848fc