1. 06 9月, 2012 1 次提交
  2. 14 8月, 2012 1 次提交
    • C
      perf symbols: Remove unused 'end' arg in kallsyms parse cb · 82151520
      Cody P Schafer 提交于
      kallsyms__parse() takes a callback that is called on every discovered
      symbol. As /proc/kallsyms does not supply symbol sizes, the callback was
      simply called with end=start, faking the symbol size to 1.
      
      All of the callbacks (there are 2) used in calls to kallsyms__parse()
      are _only_ used as callbacks for kallsyms__parse().
      
      Given that kallsyms__parse() lacks real information about what
      end/length should be, don't make up a length in kallsyms__parse().
      Instead have the callbacks handle guessing the length.
      
      Also relocate a comment regarding symbol creation to the callback which
      does symbol creation (kallsyms__parse() is not in general used to create
      symbols).
      Signed-off-by: NCody P Schafer <cody@linux.vnet.ibm.com>
      Cc: David Hansen <dave@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Matt Hellsley <matthltc@us.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1344637382-22789-3-git-send-email-cody@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      82151520
  3. 01 3月, 2012 1 次提交
  4. 07 2月, 2012 1 次提交
    • J
      perf tools: Fix prefix matching for kernel maps · bf32c9eb
      Jiri Olsa 提交于
      In some perf ancient versions we used '[kernel.kallsyms._text]' as the
      name for the kernel map.
      
      This got changed with commit:
        perf: 'perf kvm' tool for monitoring guest performance from host
        commit a1645ce1
        Author: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      
      and we started to use following name '[kernel.kallsyms]_text'.
      
      This name change is important for the report code dealing with ancient
      perf data. When processing the kernel map event, we need to recognize
      the old naming (dont match the last ']') and initialize the kernel map
      correctly.
      
      The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the
      superfluous ']' to get correct symbol name.
      
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1328461865-6127-1-git-send-email-jolsa@redhat.comSigned-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bf32c9eb
  5. 24 12月, 2011 2 次提交
    • D
      perf tools: Look up thread names for system wide profiling · f5faf726
      David Ahern 提交于
      This handles multithreaded processes with named threads when doing
      system wide profiling: the comm for each thread is looked up allowing
      them to be different from the thread group leader.
      
      v2:
      - fixed sizeof arg to perf_event__get_comm_tgid
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1324578603-12762-3-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f5faf726
    • D
      perf tools: Fix comm for processes with named threads · defd8d38
      David Ahern 提交于
      perf does not properly handle monitoring of processes with named threads.
      For example:
      
      $ ps -C myapp -L
        PID   LWP TTY          TIME CMD
      25118 25118 ?        00:00:00 myapp
      25118 25119 ?        00:00:00 myapp:worker
      
      perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10
      perf report --stdio -i /tmp/perf.data
         100.00%  myapp:worker  [kernel.kallsyms]  [k] perf_event_task_sched_out
      
      The process name is set to the name of the last thread it finds for the
      process.
      
      The Problem:
      perf-top and perf-record both create a thread_map of threads to be
      monitored. That map is used in perf_event__synthesize_thread_map which
      loops over the entries in thread_map and calls __event__synthesize_thread
      to generate COMM and MMAP events.
      
      __event__synthesize_thread calls perf_event__synthesize_comm which opens
      /proc/pid/status, reads the name of the task and its thread group id.
      That's all fine. The problem is that it then reads /proc/pid/task and
      generates COMM events for each task it finds - but using the name found
      in /proc/pid/status where pid is the thread of interest.
      
      The end result (looping over thread_map + synthesizing comm events for
      each thread each time) means the name of the last thread processed sets
      the name for all threads in the process - which is not good for
      multithreaded processes with named threads.
      
      The Fix:
      perf_event__synthesize_comm has an input argument (full) that decides
      whether to process task entries for each pid it is passed. It currently
      never set to 0 (perf_event__synthesize_comm has a single caller and it
      always passes the value 1). Let's fix that.
      
      Add the full input argument to __event__synthesize_thread which passes
      it to perf_event__synthesize_comm. For thread/process monitoring set full
      to 0 which means COMM and MMAP events are only generated for the pid
      passed to it. For system wide monitoring set full to 1 so that COMM events
      are generated for all threads in a process.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1324578603-12762-2-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      defd8d38
  6. 20 12月, 2011 1 次提交
  7. 02 12月, 2011 1 次提交
  8. 28 11月, 2011 3 次提交
  9. 24 9月, 2011 1 次提交
  10. 03 6月, 2011 1 次提交
    • A
      perf evlist: Don't die if sample_{id_all|type} is invalid · 56722381
      Arnaldo Carvalho de Melo 提交于
      Fixes two more cases where the python binding would not load:
      
      . Not finding die(), which it shouldn't anyway, not good to just stop the
        world because some particular perf.data file is invalid, just propagate
        the error to the caller.
      
      . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
        where it belongs, as most cases are moving to operate on an evsel object.o
      
      One of the fixed problems:
      
      [root@emilia ~]# python
      >>> import perf
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
      >>>
      [root@emilia ~]#
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      56722381
  11. 02 6月, 2011 1 次提交
    • A
      perf evlist: Don't die if sample_{id_all|type} is invalid · c2a70653
      Arnaldo Carvalho de Melo 提交于
      Fixes two more cases where the python binding would not load:
      
      . Not finding die(), which it shouldn't anyway, not good to just stop the
        world because some particular perf.data file is invalid, just propagate
        the error to the caller.
      
      . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
        where it belongs, as most cases are moving to operate on an evsel object.o
      
      One of the fixed problems:
      
      [root@emilia ~]# python
      >>> import perf
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
      >>>
      [root@emilia ~]#
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c2a70653
  12. 26 5月, 2011 1 次提交
    • A
      perf symbols: Handle /proc/sys/kernel/kptr_restrict · ec80fde7
      Arnaldo Carvalho de Melo 提交于
      Perf uses /proc/modules to figure out where kernel modules are loaded.
      
      With the advent of kptr_restrict, non root users get zeroes for all module
      start addresses.
      
      So check if kptr_restrict is non zero and don't generate the syntethic
      PERF_RECORD_MMAP events for them.
      
      Warn the user about it in perf record and in perf report.
      
      In perf report the reference relocation symbol being zero means that
      kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
      use it to fixup symbol addresses when using a valid kallsyms (in the buildid
      cache) or vmlinux (in the vmlinux path) build-id located automatically or
      specified by the user.
      
      Provide an explanation about it in 'perf report' if kernel samples were taken,
      checking if a suitable vmlinux or kallsyms was found/specified.
      
      Restricted /proc/kallsyms don't go to the buildid cache anymore.
      
      Example:
      
       [acme@emilia ~]$ perf record -F 100000 sleep 1
      
       WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
       /proc/sys/kernel/kptr_restrict.
      
       Samples in kernel functions may not be resolved if a suitable vmlinux file is
       not found in the buildid cache or in the vmlinux path.
      
       Samples in kernel modules won't be resolved at all.
      
       If some relocation was applied (e.g. kexec) symbols may be misresolved even
       with a suitable vmlinux or kallsyms file.
      
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
       [acme@emilia ~]$
      
       [acme@emilia ~]$ perf report --stdio
       Kernel address maps (/proc/{kallsyms,modules}) were restricted,
       check /proc/sys/kernel/kptr_restrict before running 'perf record'.
      
       If some relocation was applied (e.g. kexec) symbols may be misresolved.
      
       Samples in kernel modules can't be resolved as well.
      
       # Events: 13  cycles
       #
       # Overhead  Command      Shared Object                 Symbol
       # ........  .......  .................  .....................
       #
          20.24%    sleep  [kernel.kallsyms]  [k] page_fault
          20.04%    sleep  [kernel.kallsyms]  [k] filemap_fault
          19.78%    sleep  [kernel.kallsyms]  [k] __lru_cache_add
          19.69%    sleep  ld-2.12.so         [.] memcpy
          14.71%    sleep  [kernel.kallsyms]  [k] dput
           4.70%    sleep  [kernel.kallsyms]  [k] flush_signal_handlers
           0.73%    sleep  [kernel.kallsyms]  [k] perf_event_comm
           0.11%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [acme@emilia ~]$
      
      This is because it found a suitable vmlinux (build-id checked) in
      /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
      file name).
      
      If we remove that file from the vmlinux path:
      
       [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
      		     /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
       [acme@emilia ~]$ perf report --stdio
       [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
       not found, continuing without symbols
      
       Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
       /proc/sys/kernel/kptr_restrict before running 'perf record'.
      
       As no suitable kallsyms nor vmlinux was found, kernel samples can't be
       resolved.
      
       Samples in kernel modules can't be resolved as well.
      
       # Events: 13  cycles
       #
       # Overhead  Command      Shared Object  Symbol
       # ........  .......  .................  ......
       #
          80.31%    sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
          19.69%    sleep  ld-2.12.so         [.] memcpy
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [acme@emilia ~]$
      Reported-by: NStephane Eranian <eranian@google.com>
      Suggested-by: NDavid Miller <davem@davemloft.net>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ec80fde7
  13. 24 5月, 2011 1 次提交
  14. 23 5月, 2011 1 次提交
  15. 22 5月, 2011 1 次提交
  16. 29 3月, 2011 1 次提交
    • A
      perf symbols: Fix vsyscall symbol lookup · 6c6804fb
      Andrew Lutomirski 提交于
      Perf can't currently trace into the vsyscall page.  It looks like it was
      meant to work.
      
      Tested on 2.6.38 and today's -git.
      
      The bug is easy to reproduce.  Compile this:
      
      int main()
      {
      	int i;
      	struct timespec t;
      	for(i = 0; i < 10000000; i++)
      		clock_gettime(CLOCK_MONOTONIC, &t);
      	return 0;
      }
      
      and run it through perf record; perf report.  The top entry shows
      "[unknown]" and you can't zoom in.
      
      It looks like there are two issues.  The first is a that a test for user
      mode executing in kernel space is backwards.  (That's the first hunk
      below).  The second (I think) is that something's wrong with the code
      that generates lots of little struct dso objects for different sections
      -- when it runs on vmlinux it results in bogus long_name values which
      cause objdump to fail.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LPU-Reference: <AANLkTikxSw5+wJZUWNz++nL7mgivCh_Zf=2Kq6=f9Ce_@mail.gmail.com>
      Signed-off-by: NAndy Lutomirski <luto@mit.edu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6c6804fb
  17. 06 3月, 2011 1 次提交
    • A
      perf hists: Remove needless global col lenght calcs · d7603d51
      Arnaldo Carvalho de Melo 提交于
      To support multiple events we need to do these calcs per 'struct hists'
      instance, and it turns out we already do that at:
      
      	__hists__add_entry
      		hists__inc_nr_entries
      			hists__calc_col_len
      
      for all the unfiltered hist_entry instances we stash in the rb tree, so
      trow away the dead code.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d7603d51
  18. 10 2月, 2011 1 次提交
    • A
      perf tools: Fix thread_map event synthesizing in top and record · 401b8e13
      Arnaldo Carvalho de Melo 提交于
      Jeff Moyer reported these messages:
      
        Warning:  ... trying to fall back to cpu-clock-ticks
      
      couldn't open /proc/-1/status
      couldn't open /proc/-1/maps
      [ls output]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]
      
      That lead me and David Ahern to see that something was fishy on the thread
      synthesizing routines, at least for the case where the workload is started
      from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
      parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
      PERF_RECORD_COMM events for the thread -1, a bug.
      
      So I investigated this and noticed that when we introduced support for
      recording a process and its threads using --pid some bugs were introduced and
      that the way to fix it was to instead of passing the target_tid to the event
      synthesizing routines we should better pass the thread_map that has the list of
      threads for a --pid or just the single thread for a --tid.
      
      Checked in the following ways:
      
      On a 8-way machine run cyclictest:
      
      [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
      policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798
      
      T: 0 (28791) P:99 I:100 C:  25072 Min:      4 Act:    5 Avg:    6 Max:     122
      T: 1 (28792) P:98 I:150 C:  16715 Min:      4 Act:    6 Avg:    5 Max:      27
      T: 2 (28793) P:97 I:200 C:  12534 Min:      4 Act:    5 Avg:    4 Max:       8
      T: 3 (28794) P:96 I:250 C:  10028 Min:      4 Act:    5 Avg:    5 Max:      96
      T: 4 (28795) P:95 I:300 C:   8357 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 5 (28796) P:94 I:350 C:   7163 Min:      5 Act:    6 Avg:    5 Max:      12
      T: 6 (28797) P:93 I:400 C:   6267 Min:      4 Act:    5 Avg:    5 Max:       9
      T: 7 (28798) P:92 I:450 C:   5571 Min:      4 Act:    5 Avg:    5 Max:       9
      ^C[ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]
      
      [root@emilia ~]#
      
      This will create one extra thread per CPU:
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28825   OTHER     0     0xff      2169          671      cyclictest
        28832   FIFO    93        6     52338            1      cyclictest
        28833   FIFO    92        7     46524            1      cyclictest
        28826   FIFO    99        0    209360            1      cyclictest
        28827   FIFO    98        1    139577            1      cyclictest
        28828   FIFO    97        2    104686            0      cyclictest
        28829   FIFO    96        3     83751            1      cyclictest
        28830   FIFO    95        4     69794            1      cyclictest
        28831   FIFO    94        5     59825            1      cyclictest
      [root@emilia ~]#
      
      So we should expect only samples for the above 9 threads when using the
      --dump-raw-trace|-D perf report switch to look at the column with the tid:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
          629 28825
          110 28826
          491 28827
          308 28828
          198 28829
          621 28830
          225 28831
          203 28832
           89 28833
      [root@emilia ~]#
      
      So for workloads started by 'perf record' seems to work, now for existing workloads,
      just run cyclictest first, without 'perf record':
      
      [root@emilia ~]# tuna -t cyclictest -CP
                            thread       ctxt_switches
          pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
       28859   OTHER     0     0xff       594          200      cyclictest
        28864   FIFO    95        4     16587            1      cyclictest
        28865   FIFO    94        5     14219            1      cyclictest
        28866   FIFO    93        6     12443            0      cyclictest
        28867   FIFO    92        7     11062            1      cyclictest
        28860   FIFO    99        0     49779            1      cyclictest
        28861   FIFO    98        1     33190            1      cyclictest
        28862   FIFO    97        2     24895            1      cyclictest
        28863   FIFO    96        3     19918            1      cyclictest
      [root@emilia ~]#
      
      and then later did:
      
      [root@emilia ~]# perf record --pid 28859 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
      [root@emilia ~]#
      
      To collect 3 seconds worth of samples for pid 28859 and its children:
      
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
           15 28859
           33 28860
           19 28861
           13 28862
           13 28863
           10 28864
           11 28865
            9 28866
          255 28867
      [root@emilia ~]#
      
      Works, last thing is to check if looking at just one of those threads also works:
      
      [root@emilia ~]# perf record --tid 28866 sleep 3
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
      [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
            3 28866
      [root@emilia ~]#
      
      Works too.
      Reported-by: NJeff Moyer <jmoyer@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      401b8e13
  19. 30 1月, 2011 3 次提交
    • A
      perf tools: Kill event_t typedef, use 'union perf_event' instead · 8115d60c
      Arnaldo Carvalho de Melo 提交于
      And move the event_t methods to the perf_event__ too.
      
      No code changes, just namespace consistency.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8115d60c
    • A
      perf tools: Rename 'struct sample_data' to 'struct perf_sample' · 8d50e5b4
      Arnaldo Carvalho de Melo 提交于
      Making the namespace more uniform.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8d50e5b4
    • A
      perf events: Account PERF_RECORD_LOST events in event__process · ef2bf6d0
      Arnaldo Carvalho de Melo 提交于
      Right now this function is only used by perf top, that uses PROT_READ
      only, i.e. overwrite mode, so no PERF_RECORD_LOST events are generated,
      but don't forget those events.
      
      The patch that moved this out of perf top was made so that this routine
      could be used by 'perf probe' in the uprobes patchset, so perhaps there
      they need to check for LOST events and warn the user, as will be done in
      the following patches that will switch 'perf top' to non overwrite mode
      (mmap with PROT_READ|PROT_WRITE).
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ef2bf6d0
  20. 24 1月, 2011 1 次提交
    • A
      perf tools: Move event__parse_sample to evsel.c · d0dd74e8
      Arnaldo Carvalho de Melo 提交于
      To avoid linking more stuff in the python binding I'm working on, future
      csets will make the sample type be taken from the evsel itself, but for
      that we need to first have one file per cpu and per sample_type, not a
      single perf.data file.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d0dd74e8
  21. 23 1月, 2011 1 次提交
    • A
      perf tools: Fix 64 bit integer format strings · 9486aa38
      Arnaldo Carvalho de Melo 提交于
      Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
      by making our 64 bit integers typedefs of stdint.h types and using
      PRI[ux]64 like, for instance, git does.
      
      Reported by Denis Kirjanov that provided a patch for one case, I went
      and changed all cases.
      Reported-by: NDenis Kirjanov <dkirjanov@kernel.org>
      Tested-by: NDenis Kirjanov <dkirjanov@kernel.org>
      LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
      Cc: Denis Kirjanov <dkirjanov@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingtian Han <phan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9486aa38
  22. 23 12月, 2010 1 次提交
    • A
      perf symbols: Improve kallsyms symbol end addr calculation · 3b01a413
      Arnaldo Carvalho de Melo 提交于
      For kallsyms we don't have the symbol address end, so we do an extra pass and
      set the symbol end addr as being the start of the next minus one.
      
      But this was being done just after we filtered the symbols of a
      particular type (functions, variables), so the symbol end was sometimes
      after what it really is.
      
      Fixing up symbol end also was falling apart when we have symbol aliases,
      then the end address of all but the last alias was being set to be
      before its start.
      
      Fix it up by checking for symbol aliases and making the kallsyms__parse
      routine use the next symbol, whatever its type, as the limit for the
      previous symbol, passing that end address to the callback.
      
      This was detected by the 'perf test' synthetic paranoid regression
      tests, fix it up so that even that case doesn't mislead us.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3b01a413
  23. 09 12月, 2010 1 次提交
  24. 05 12月, 2010 2 次提交
    • A
      perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events · 9c90a61c
      Arnaldo Carvalho de Melo 提交于
      So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:
      
        $ perf record -aT
        $ perf report -D | grep PERF_RECORD_
        <SNIP>
         3   5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
         3   5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
         3   5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
         3   5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
         3   5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
         3   5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
         3   5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
         3   5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
         3   5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
         3   5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
         3   5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
         3   5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
         3   5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
         3   5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
        <SNIP>
      
      First column is the cpu and the second the timestamp.
      
      That way we can investigate problems in the event stream.
      
      If the new perf binary is run on an older kernel, it will disable this feature
      automatically.
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9c90a61c
    • A
      perf session: Parse sample earlier · 640c03ce
      Arnaldo Carvalho de Melo 提交于
      At perf_session__process_event, so that we reduce the number of lines in eache
      tool sample processing routine that now receives a sample_data pointer already
      parsed.
      
      This will also be useful in the next patch, where we'll allow sample the
      identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
      timestamp) just after before every event.
      
      Also validate callchains in perf_session__process_event, i.e. as early as
      possible, and keep a counter of the number of events discarded due to invalid
      callchains, warning the user about it if it happens.
      
      There is an assumption that was kept that all events have the same sample_type,
      that will be dealt with in the future, when this preexisting limitation will be
      removed.
      Tested-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NIan Munsie <imunsie@au1.ibm.com>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      640c03ce
  25. 27 11月, 2010 1 次提交
  26. 04 8月, 2010 2 次提交
    • S
      perf: expose event__process function · b83f920e
      Srikar Dronamraju 提交于
      The event__process function is useful in processing /proc/<pid>/maps.  All of
      the functions that are called from event__process are defined in util/event.c.
      Though its defined in builtin-top.c, it could be reused for perf probe for
      uprobes. Hence moving it to util/event.c and exporting the function.
      
      LKML-Reference: <20100802123851.GD22812@linux.vnet.ibm.com>
      Signed-off-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b83f920e
    • D
      perf events: Fix mmap offset determination · b5a63254
      Dave Martin 提交于
      Fix buggy-looking code which unnecessarily adjusts the file offset
      fields read from /proc/*/maps.
      
      This may have gone unnoticed since the offset is usually 0 (and the
      logic in util/symbol.c may work incorrectly for other offset values).
      
      Commiter note:
      
      This fixes a bug introduced in 4af8b35d, there is no need to shift pgoff
      twice, the show_map_vma routine in fs/proc/task_mmu.c already converts
      it from the number of pages to the size in bytes, and that is what
      appears in /proc/PID/map.
      
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Will Deacon <Will.Deacon@arm.com>
      LKML-Reference: <1280836116-6654-2-git-send-email-dave.martin@linaro.org>
      Signed-off-by: NDave Martin <dave.martin@linaro.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b5a63254
  27. 31 7月, 2010 1 次提交
  28. 30 7月, 2010 1 次提交
  29. 27 7月, 2010 1 次提交
  30. 23 7月, 2010 1 次提交
  31. 17 6月, 2010 1 次提交
    • A
      perf session: Remove threads from tree on PERF_RECORD_EXIT · 720a3aeb
      Arnaldo Carvalho de Melo 提交于
      Move them to a session->dead_threads list just like we do with maps that
      are replaced, because we may have hist_entries pointing to them.
      
      This fixes a bug when inserting maps for a new thread that reused the
      TID, mixing maps for two different threads, causing an endless loop.
      
      The code for insering maps should be made more robust but for .35 this
      is the minimalistic patch.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      720a3aeb
  32. 05 6月, 2010 2 次提交
    • A
      perf report: Implement --sort cpu · f60f3593
      Arun Sharma 提交于
      In a shared multi-core environment, users want to analyze why their
      program was slow. In particular, if the code ran slower only on certain
      CPUs due to interference from other programs or kernel threads, the user
      should be able to notice that.
      
      Sample usage:
      
      perf record -f -a -- sleep 3
      perf report --sort cpu,comm
      
      Workload:
      
      program is running on 16 CPUs
      Experiencing interference from an antagonist only on 4 CPUs.
      
        Samples: 106218177676 cycles
      
        Overhead  CPU          Command
        ........  ...  ...............
      
           6.25%  2            program
           6.24%  6            program
           6.24%  11           program
           6.24%  5            program
           6.24%  9            program
           6.24%  10           program
           6.23%  15           program
           6.23%  7            program
           6.23%  3            program
           6.23%  14           program
           6.22%  1            program
           6.20%  13           program
           3.17%  12           program
           3.15%  8            program
           3.14%  0            program
           3.13%  4            program
           3.11%  4         antagonist
           3.11%  0         antagonist
           3.10%  8         antagonist
           3.07%  12        antagonist
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <20100505181612.GA5091@sharma-home.net>
      Signed-off-by: NArun Sharma <aruns@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f60f3593
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo 提交于
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20