1. 05 Jun, 2010 2 commits
    • perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Committed by Arnaldo Carvalho de Melo
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add the ability to specify list of cpus to monitor · c45c6ea2
      Committed by Stephane Eranian
      This patch adds a -C option to stat, record, top to designate a list of CPUs to
      monitor. CPUs can be specified as a comma-separated list or ranges, no space
      allowed.
      
      Examples:
      $ perf record -a -C0-1,4-7 sleep 1
      $ perf top -C0-4
      $ perf stat -a -C1,2,3,4 sleep 1
      
      With perf record in per-thread mode with inherit mode on, samples are collected
      only when the thread runs on the designated CPUs.
      
      The -C option does not turn on system-wide mode automatically.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
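The -C argument format described in this commit (comma-separated CPUs or ranges, no spaces) could be parsed along the following lines. This is only a sketch: `parse_cpu_list` is a hypothetical helper written for illustration, not the function the patch adds.

```c
#include <stdlib.h>

/*
 * Parse a CPU list such as "0-1,4-7" into cpus[].
 * Returns the number of CPUs stored, or -1 on malformed input
 * (spaces, reversed ranges, trailing commas, too many entries).
 * Hypothetical helper, not the code added by the patch.
 */
int parse_cpu_list(const char *s, int *cpus, int max)
{
	int n = 0;

	for (;;) {
		char *end;
		long first, last;

		/* Every item must start with a digit. */
		if (*s < '0' || *s > '9')
			return -1;
		first = strtol(s, &end, 10);
		last = first;
		s = end;

		/* Optional "-last" part of a range. */
		if (*s == '-') {
			s++;
			if (*s < '0' || *s > '9')
				return -1;
			last = strtol(s, &end, 10);
			s = end;
		}
		if (last < first)
			return -1;

		for (long c = first; c <= last; c++) {
			if (n == max)
				return -1;
			cpus[n++] = (int)c;
		}

		if (*s == '\0')
			return n;
		if (*s != ',')
			return -1;
		s++;
	}
}
```

With this sketch, "0-1,4-7" expands to CPUs 0, 1, 4, 5, 6 and 7, while "0, 1" is rejected because of the space.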
  2. 27 May, 2010 1 commit
    • perf symbols: Add the build id cache to the vmlinux path · 5ad90e4e
      Committed by Arnaldo Carvalho de Melo
      So that if the kernel DSO has a build id because record inserted it in
      the perf.data build id table in the header, or a BUILD_ID event was
      inserted in the stream, we first look at the build id cache
      ($HOME/.debug/).
      
      If we find it there, try to use it, allowing offline annotation in
      addition to 'perf report'.
      Reported-by: Stephane Eranian <eranian@google.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 18 May, 2010 2 commits
    • perf options: Type check all the remaining OPT_ variants · edb7c60e
      Committed by Arnaldo Carvalho de Melo
      OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
      tools is to set something that has an enum type, that is builtin
      compatible with unsigned int.
      
      Several string constifications were done to make OPT_STRING require a
      const char * type.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf options: Check v type in OPT_U?INTEGER · 1967936d
      Committed by Arnaldo Carvalho de Melo
      To avoid problems like the one fixed by Stephane Eranian in 3de29cab, now
      we'll get this instead:
      
      	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
      	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’
      
      Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
      hackers should be already used to this.
      
      With it in place, some problems were found and fixed by changing the
      affected variables to sensible types or by changing some OPT_INTEGER to
      OPT_UINTEGER.
      
      The next csets will convert each of the remaining OPT_ variants so that
      review can be made easier by grouping changes per type per patch.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
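The compile-time check described in this commit can be sketched as below. It relies on GNU C extensions (`__typeof__`, `__builtin_types_compatible_p`, a struct with only an unnamed bit-field); the cut-down `struct option` and the macro bodies are assumptions made for illustration and may differ from what the patch actually contains.

```c
/* Kernel-style helper: a negative bit-field width breaks the build,
 * otherwise the expression evaluates to 0. Relies on GNU C extensions. */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/* Evaluates to v when v has exactly the given type; fails to compile
 * with "negative width in bit-field" otherwise. */
#define check_vtype(v, type) \
	(BUILD_BUG_ON_ZERO(!__builtin_types_compatible_p(__typeof__(v), type)) + (v))

/* Cut-down option table entry, for illustration only. */
struct option {
	const char *name;
	void *value;
};

#define OPT_INTEGER(n, v)  { (n), check_vtype(v, int *) }
#define OPT_UINTEGER(n, v) { (n), check_vtype(v, unsigned int *) }
```

Passing the address of an `unsigned int` to `OPT_INTEGER` in this sketch stops the build with the cryptic bit-field error quoted in the commit message, while a correctly typed pointer is stored unchanged.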
  4. 07 May, 2010 1 commit
    • perf, x86: Improve the PEBS ABI · ab608344
      Committed by Peter Zijlstra
      Rename perf_event_attr::precise to perf_event_attr::precise_ip and
      widen it to 2 bits. This new field describes the required precision of
      the PERF_SAMPLE_IP field:
      
        0 - SAMPLE_IP can have arbitrary skid
        1 - SAMPLE_IP must have constant skid
        2 - SAMPLE_IP requested to have 0 skid
        3 - SAMPLE_IP must have 0 skid
      
      And modify the Intel PEBS code accordingly. The PEBS implementation
      now supports up to precise_ip == 2, where we perform the IP fixup.
      
      Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit
      should be set for each PERF_SAMPLE_IP field known to match the actual
      instruction triggering the event.
      
      This new scheme allows for a PEBS mode that uses the buffer for more
      than a single event.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
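As a small illustration of the 2-bit precise_ip field and the "supports up to precise_ip == 2" rule stated in this commit, consider the sketch below. `struct sample_attr`, `PEBS_MAX_PRECISE` and `check_precise_ip` are hypothetical names; the real field lives in `perf_event_attr` and the real check is in the kernel's x86 PEBS code.

```c
/* Illustrative stand-in for the relevant part of perf_event_attr. */
struct sample_attr {
	unsigned int precise_ip : 2;	/* 0..3, as listed in the commit message */
};

/* Highest level the PEBS implementation supports, per the commit message. */
enum { PEBS_MAX_PRECISE = 2 };

/* Returns 0 if the requested precision can be honoured, -1 otherwise. */
int check_precise_ip(const struct sample_attr *attr)
{
	return attr->precise_ip > PEBS_MAX_PRECISE ? -1 : 0;
}
```

Under these assumptions a request for level 3 ("must have 0 skid") is refused, while levels 0 to 2 are accepted.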
  5. 03 May, 2010 1 commit
    • perf: add perf-inject builtin · 454c407e
      Committed by Tom Zanussi
      Currently, perf 'live mode' writes build-ids at the end of the
      session, which isn't actually useful for processing live mode events.
      
      What would be better would be to have the build-ids sent before any of
      the samples that reference them, which can be done by processing the
      event stream and retrieving the build-ids on the first hit.  Doing
      that in perf-record itself, however, is off-limits.
      
      This patch introduces perf-inject, which does the same job while
      leaving perf-record untouched.  Normal mode perf still records the
      build-ids at the end of the session as it should, but for live mode,
      perf-inject can be injected in between the record and report steps
      e.g.:
      
      perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -
      
      perf-inject reads a perf-record event stream and repipes it to stdout.
      At any point the processing code can inject other events into the
      event stream - in this case build-ids (-b option) are read and
      injected as needed into the event stream.
      
      Build-ids are just the first user of perf-inject - potentially
      anything that needs userspace processing to augment the trace stream
      with additional information could make use of this facility.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
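The "inject the build-id on the first hit" idea from this commit can be modelled with a toy event stream. Everything here (the `event` struct, `EV_SAMPLE`, `EV_BUILD_ID`, `inject_build_ids`) is invented for illustration; the real perf-inject works on perf's binary event records read from a pipe.

```c
#include <stdbool.h>
#include <stddef.h>

enum ev_type { EV_SAMPLE, EV_BUILD_ID };

/* Toy event: a type plus the index of the DSO it refers to. */
struct event {
	enum ev_type type;
	int dso;
};

#define MAX_DSOS 8

/*
 * Copy in[] to out[], inserting an EV_BUILD_ID event before the first
 * sample that hits each DSO. Returns the number of events written,
 * or -1 if out[] is too small or a DSO index is out of range.
 */
int inject_build_ids(const struct event *in, size_t n_in,
		     struct event *out, size_t n_out)
{
	bool seen[MAX_DSOS] = { false };
	size_t n = 0;

	for (size_t i = 0; i < n_in; i++) {
		const struct event *ev = &in[i];

		if (ev->dso < 0 || ev->dso >= MAX_DSOS)
			return -1;

		if (ev->type == EV_BUILD_ID) {
			/* Already present in the stream: do not inject another. */
			seen[ev->dso] = true;
		} else if (!seen[ev->dso]) {
			if (n == n_out)
				return -1;
			out[n].type = EV_BUILD_ID;
			out[n].dso = ev->dso;
			n++;
			seen[ev->dso] = true;
		}

		/* Repipe the original event unchanged. */
		if (n == n_out)
			return -1;
		out[n++] = *ev;
	}
	return (int)n;
}
```

In this model a consumer reading `out[]` always sees a DSO's build-id before any sample that references it, which is the property live mode needs.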
  6. 28 Apr, 2010 2 commits
    • perf machines: Make the machines class adopt the dsos__fprintf methods · cbf69680
      Committed by Arnaldo Carvalho de Melo
      Now those methods don't operate on a global list of dsos, but on lists
      of machines, so make this clear by renaming the functions.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Rename "kernel_info" to "machine" · 23346f21
      Committed by Arnaldo Carvalho de Melo
      struct kernel_info and kerninfo__ are too vague, what they really
      describe are machines, virtual ones or hosts.
      
      There are more changes to introduce helpers to shorten function calls
      and to make more clear what is really being done, but I left that for
      subsequent patches.
      
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 19 Apr, 2010 1 commit
  8. 14 Apr, 2010 1 commit
    • perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR() · c0555642
      Committed by Ian Munsie
      Parsing an option from the command line with OPT_BOOLEAN on a
      bool data type would not work on a big-endian machine due to the
      manner in which the boolean was being cast into an int and
      incremented. For example, running 'perf probe --list' on a
      PowerPC machine would fail to properly set the list_events bool
      and would therefore print out the usage information and
      terminate.
      
      This patch makes OPT_BOOLEAN work as expected with a bool
      datatype. For cases where the original OPT_BOOLEAN was
      intentionally being used to increment an int each time it was
      passed in on the command line, this patch introduces OPT_INCR
      with the old behaviour of OPT_BOOLEAN (the verbose variable is
      currently the only such example of this).
      
      I have reviewed every use of OPT_BOOLEAN to verify that a true
      C99 bool was passed. Where integers were used, I verified that
      they were only being used for boolean logic and changed them to
      bools to ensure that they would not be mistakenly used as ints.
      The major exception was the verbose variable which now uses
      OPT_INCR instead of OPT_BOOLEAN.
      Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport
      Cc: Git development list <git@vger.kernel.org>
      Cc: Ian Munsie <imunsie@au1.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Valdis.Kletnieks@vt.edu
      Cc: WANG Cong <amwang@redhat.com>
      Cc: Thiago Farina <tfransosi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
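The split between OPT_BOOLEAN and OPT_INCR described in this commit can be sketched as follows. The enum, struct and `apply_option` function are simplified stand-ins, not perf's actual parse-options code. The point is that a flag is stored through a `bool *`, so the result does not depend on which byte of an `int` the increment would have landed in.

```c
#include <stdbool.h>

enum opt_type { OPTION_BOOLEAN, OPTION_INCR };

/* Simplified option descriptor, for illustration only. */
struct option {
	enum opt_type type;
	void *value;
};

/* Apply one occurrence of the option on the command line. */
void apply_option(const struct option *opt)
{
	switch (opt->type) {
	case OPTION_BOOLEAN:
		/* Store through a bool pointer: correct on any endianness. */
		*(bool *)opt->value = true;
		break;
	case OPTION_INCR:
		/* Old OPT_BOOLEAN behaviour: count occurrences in an int. */
		*(int *)opt->value += 1;
		break;
	}
}
```

A `verbose` counter would use `OPTION_INCR` so that repeating the flag raises the level, while something like `list_events` would be a true `bool` handled by `OPTION_BOOLEAN`.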
  9. 26 Mar, 2010 1 commit
  10. 18 Mar, 2010 1 commit
    • perf events: Change perf parameter --pid to process-wide collection instead of thread-wide · d6d901c2
      Committed by Zhang, Yanmin
      Parameter --pid (or -p) of perf currently means a thread-wide
      collection. For example, if a process whose id is 8888 has 10
      threads, 'perf top -p 8888' just collects the main thread
      statistics. That's misleading. Users are used to attaching to a whole
      process when debugging a process with gdb. To follow normal usage
      style, the patch changes --pid to process-wide collection and adds
      --tid (-t) to mean a thread-wide collection.
      
      Usage example is:
      
       # perf top -p 8888
       # perf record -p 8888 -f sleep 10
       # perf stat -p 8888 -f sleep 10
      
      Above commands collect the statistics of all threads of process
      8888.
      Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: zhiteng.huang@intel.com
      Cc: Zachary Amsden <zamsden@redhat.com>
      LKML-Reference: <1268922965-14774-3-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
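Process-wide collection needs the list of threads that belong to a pid. On Linux one way to obtain it is to list /proc/<pid>/task, as in the sketch below. `find_all_tids` is a hypothetical helper for illustration, and it is an assumption that the patch gathers thread ids in a similar way.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Collect the thread ids of process "pid" from /proc/<pid>/task.
 * Returns the number of thread ids stored, or -1 on error or if
 * tids[] is too small. Linux only.
 */
int find_all_tids(pid_t pid, pid_t *tids, int max)
{
	char path[64];
	struct dirent *de;
	DIR *dir;
	int n = 0;

	snprintf(path, sizeof(path), "/proc/%ld/task", (long)pid);
	dir = opendir(path);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		char *end;
		long tid = strtol(de->d_name, &end, 10);

		/* Skip ".", ".." and anything that is not a plain number. */
		if (end == de->d_name || *end != '\0')
			continue;
		if (n == max) {
			closedir(dir);
			return -1;
		}
		tids[n++] = (pid_t)tid;
	}
	closedir(dir);
	return n;
}
```

A tool would then open one counter per returned thread id for -p, and a single counter for the given id for -t.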
  11. 17 Mar, 2010 1 commit
  12. 16 Mar, 2010 1 commit
  13. 15 Mar, 2010 1 commit
    • perf top: Properly notify the user that vmlinux is missing · b0a9ab62
      Committed by Arnaldo Carvalho de Melo
      Before this patch this message would very briefly appear on the
      screen and then the screen would get updates only on the top,
      for number of interrupts received, etc, but no annotation would
      be performed:
      
       [root@doppio linux-2.6-tip]# perf top -s n_tty_write > /tmp/bla
       objdump: '[kernel.kallsyms]': No such file
      
      Now this is what the user gets:
      
       [root@doppio linux-2.6-tip]# perf top -s n_tty_write
       Can't annotate n_tty_write: No vmlinux file was found in the path:
       [0] vmlinux
       [1] /boot/vmlinux
       [2] /boot/vmlinux-2.6.33-rc5
       [3] /lib/modules/2.6.33-rc5/build/vmlinux
       [4] /usr/lib/debug/lib/modules/2.6.33-rc5/vmlinux
       [root@doppio linux-2.6-tip]#
      
      This bug was introduced when we added automatic search for
      vmlinux, before that time the user had to specify a vmlinux
      file.
      Reported-by: David S. Miller <davem@davemloft.net>
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268664418-28328-2-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 12 Mar, 2010 1 commit
  15. 11 Mar, 2010 1 commit
    • perf tools: Fix sparse CPU numbering related bugs · a12b51c4
      Committed by Paul Mackerras
      At present, the perf subcommands that do system-wide monitoring
      (perf stat, perf record and perf top) don't work properly unless
      the online cpus are numbered 0, 1, ..., N-1.  These tools ask
      for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
      and then try to create events for cpus 0, 1, ..., N-1.
      
      This creates problems for systems where the online cpus are
      numbered sparsely.  For example, a POWER6 system in
      single-threaded mode (i.e. only running 1 hardware thread per
      core) will have only even-numbered cpus online.
      
      This fixes the problem by reading the /sys/devices/system/cpu/online
      file to find out which cpus are online.  The code that does that is in
      tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
      function that sets up a cpumap[] array and returns the number of
      online cpus.  If /sys/devices/system/cpu/online can't be read or
      can't be parsed successfully, it falls back to using sysconf to
      ask how many cpus are online and sets up an identity map in cpumap[].
      
      The perf record, perf stat and perf top code then calls
      read_cpu_map() in the system-wide monitoring case (instead of
      sysconf) and uses cpumap[] to get the cpu numbers to pass to
      perf_event_open.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
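The approach described in this commit (read the sysfs "online" file, fall back to an identity map from sysconf) might look roughly like the sketch below. Unlike the `read_cpu_map()` named in the commit, this version takes the path and the output array as parameters so it can be exercised on any file; that signature and the helper `parse_online_file` are assumptions made for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Fill map[] with the CPU numbers listed in "path" (format "0,2,4-6\n").
 * Returns the number of CPUs, or -1 if the file cannot be read or parsed.
 */
static int parse_online_file(const char *path, int *map, int max)
{
	FILE *f = fopen(path, "r");
	int n = 0, prev = -1, ok = 0;

	if (!f)
		return -1;

	for (;;) {
		int cpu;
		char sep = '\n';
		int got = fscanf(f, "%d%c", &cpu, &sep);

		if (got < 1 || cpu < 0)
			break;
		/* Second half of "a-b": fill in a+1 .. b-1 first. */
		if (prev >= 0) {
			if (cpu <= prev)
				break;
			while (++prev < cpu && n < max)
				map[n++] = prev;
		}
		if (n == max)
			break;
		map[n++] = cpu;
		prev = (got == 2 && sep == '-') ? cpu : -1;
		if (got == 1 || sep == '\n') {
			ok = 1;
			break;
		}
		if (sep != ',' && sep != '-')
			break;
	}
	fclose(f);
	return ok ? n : -1;
}

/* Returns the number of online CPUs and fills map[] with their numbers. */
int read_cpu_map(const char *path, int *map, int max)
{
	int n = parse_online_file(path, map, max);
	long nr;

	if (n > 0)
		return n;

	/* Fallback: assume the online CPUs are numbered 0..N-1. */
	nr = sysconf(_SC_NPROCESSORS_ONLN);
	if (nr < 1)
		nr = 1;
	if (nr > max)
		nr = max;
	for (n = 0; n < nr; n++)
		map[n] = n;
	return n;
}
```

A caller doing system-wide monitoring would pass "/sys/devices/system/cpu/online" as the path and then open one event per entry in `map[]`.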
  16. 10 Mar, 2010 1 commit
  17. 25 Feb, 2010 1 commit
    • perf top: Use a macro instead of a constant variable · c7ad21af
      Committed by Arnaldo Carvalho de Melo
      To overcome a silly gcc warning:
      
       cc1: warnings being treated as errors
       builtin-top.c: In function ‘lookup_sym_source’:
       builtin-top.c:291: warning: not protecting local variables: variable length buffer
       make: *** [builtin-top.o] Error 1
       make: *** Waiting for unfinished jobs....
      
      That is emitted for this:
      
      	const size_t pattern_len = BITS_PER_LONG / 4 + 2;
      	char pattern[pattern_len + 1];
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1266866062-6287-1-git-send-email-acme@infradead.org>
      [ -v2: macroify the naming style ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
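In C, a `const size_t` variable is not a constant expression, so an array sized by it is a variable-length array, which is what triggers the warning quoted in this commit. A macro keeps the size a compile-time constant. The sketch below assumes the macro name `PATTERN_LEN` and assumes the buffer holds a zero-padded hex address followed by " <"; neither detail is confirmed by the commit message.

```c
#include <limits.h>
#include <stdio.h>

#define BITS_PER_LONG ((int)(sizeof(long) * CHAR_BIT))

/* Hex digits of a long plus two extra characters: a constant expression. */
#define PATTERN_LEN (BITS_PER_LONG / 4 + 2)

/*
 * Format "addr" as a zero-padded hex number followed by " <".
 * Returns the number of characters written, as snprintf does.
 */
int format_pattern(char *buf, size_t size, unsigned long addr)
{
	return snprintf(buf, size, "%0*lx <", BITS_PER_LONG / 4, addr);
}
```

A caller can then declare `char pattern[PATTERN_LEN + 1];` as an ordinary fixed-size array and pass it to `format_pattern()`.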
  18. 14 Feb, 2010 1 commit
    • perf top: Fix help text alignment · 1a72cfa6
      Committed by Kirill Smelkov
      Print this:
      
      Mapped keys:
              [d]     display refresh delay.                  (2)
              [e]     display entries (lines).                (46)
              [f]     profile display filter (count).         (5)
              [F]     annotate display filter (percent).      (5%)
              [s]     annotate symbol.                        (NULL)
              [S]     stop annotation.
              [K]     hide kernel_symbols symbols.            (no)
              [U]     hide user symbols.                      (no)
              [z]     toggle sample zeroing.                  (0)
              [qQ]    quit.
      
      instead of:
      
      Mapped keys:
              [d]     display refresh delay.                  (2)
              [e]     display entries (lines).                (46)
              [f]     profile display filter (count).         (5)
              [F]     annotate display filter (percent).      (5%)
              [s]     annotate symbol.                        (NULL)
              [S]     stop annotation.
              [K]     hide kernel_symbols symbols.                    (no)
              [U]     hide user symbols.                      (no)
              [z]     toggle sample zeroing.                  (0)
              [qQ]    quit.
      Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100212162059.GA30041@landau.phys.spbu.ru>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 08 Feb, 2010 2 commits
    • perf top: Use address pattern in lookup_sym_source · 5f485364
      Committed by Arnaldo Carvalho de Melo
      Because we may have aliases, like __GI___strcoll_l in
      /lib64/libc-2.10.2.so that appears in objdump as:
      
      $ objdump --start-address=0x0000003715a86420 \
                 --stop-address=0x0000003715a872dc -dS /lib64/libc-2.10.2.so
      
      0000003715a86420 <__strcoll_l>:
        3715a86420:	55                   	push   %rbp
        3715a86421:	48 89 e5             	mov    %rsp,%rbp
        3715a86424:	41 57                	push   %r15
      [root@doppio linux-2.6-tip]#
      
      So look for the address exactly at the start of the line instead
      so that annotation can work in these cases.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1265550376-12665-2-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
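Matching on the address at the start of the line, rather than on the symbol name, sidesteps the alias problem because objdump may print any of the names that share that address. A minimal sketch of such a check follows. `line_starts_symbol` is a hypothetical helper, and the fixed 16-digit width is an assumption that matches the 64-bit objdump output shown in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * True if an objdump line is the header for the symbol at "start",
 * i.e. it begins with the zero-padded address followed by " <".
 * The symbol name that follows is deliberately ignored.
 */
bool line_starts_symbol(const char *line, unsigned long long start)
{
	char pattern[32];
	int len = snprintf(pattern, sizeof(pattern), "%016llx <", start);

	if (len <= 0 || (size_t)len >= sizeof(pattern))
		return false;
	return strncmp(line, pattern, (size_t)len) == 0;
}
```

With this check the header line "0000003715a86420 <__strcoll_l>:" is found when looking up __GI___strcoll_l, since only the address is compared.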
    • perf top: Fix annotate for userspace · ee11b90b
      Committed by Kirill Smelkov
      First, for programs and prelinked libraries, annotate code was
      fooled by objdump output IPs (src->eip in the code) being
      wrongly converted to absolute IPs. In such cases no
      conversion was needed, but in
      
         src->eip = strtoull(src->line, NULL, 16);
         src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff
      
      we were reading absolute address from objdump (e.g. 8048604) and
      then almost doubling it, because eip & map->start are
      approximately close for small programs.
      
      Needless to say, later, in record_precise_ip(), nothing
      matched the real runtime IPs.
      
      And second, like with `perf annotate` the problem with
      non-prelinked *.so was that we were doing rip -> objdump address
      conversion wrong.
      
      Also, because unlike `perf annotate`, `perf top` code does
      annotation based on absolute IPs for performance reasons(*), a new
      helper for mapping objdump addresses to IPs is introduced.
      
      (*) we get samples info in absolute IPs, and since we do lots of
          hit-testing on absolute IPs at runtime in record_precise_ip(), it's
          better to convert objdump addresses to IPs once and do no conversion
          at runtime.
      
      I also had to fix how objdump output is parsed (with hardcoded
      8/16 characters format, which was inappropriate for ET_DYN dsos
      with small addresses like '4ac')
      
      Also note that not all objdump output lines have associated
      IPs, e.g. look at source lines here:
      
          000004ac <my_strlen>:
          extern "C"
          int my_strlen(const char *s)
           4ac:   55                      push   %ebp
           4ad:   89 e5                   mov    %esp,%ebp
           4af:   83 ec 10                sub    $0x10,%esp
          {
              int len = 0;
           4b2:   c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
           4b9:   eb 08                   jmp    4c3 <my_strlen+0x17>
      
              while (*s) {
                  ++len;
           4bb:   83 45 fc 01             addl   $0x1,-0x4(%ebp)
                  ++s;
           4bf:   83 45 08 01             addl   $0x1,0x8(%ebp)
      
      So we mark them with eip=0, and ignore such lines in annotate
      lookup code.
      Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      [ Note: one hunk of this patch was applied by Mike in 57d81889 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1265550376-12665-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
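Parsing objdump lines without assuming an 8 or 16 character address, and treating lines without an address as eip=0, could be done roughly as below. `objdump_line_addr` is a hypothetical helper, and the rule "hex digits immediately followed by ':'" is an assumption about what distinguishes instruction lines. It is a heuristic: a source line that happens to consist of hex digits followed by a colon would be misread.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Return the address of an objdump instruction line such as
 * " 4ac:   55   push %ebp", or 0 for lines that carry no address
 * (source lines, blank lines, "000004ac <my_strlen>:" headers).
 */
unsigned long long objdump_line_addr(const char *line)
{
	char *end;
	unsigned long long addr;

	while (*line == ' ' || *line == '\t')
		line++;
	if (!isxdigit((unsigned char)*line))
		return 0;

	addr = strtoull(line, &end, 16);
	/* An instruction line has ':' right after the hex digits. */
	return *end == ':' ? addr : 0;
}
```

Because the width of the address is not fixed, this handles both small ET_DYN addresses like 4ac and full-width absolute addresses.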
  20. 04 Feb, 2010 2 commits
    • perf annotate: Fix perf top module symbol annotation · 57d81889
      Committed by Mike Galbraith
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1265265106.6364.5.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf top: Teach it to autolocate vmlinux · 6cff0e8d
      Committed by Kirill Smelkov
      By relying on logic in dso__load_kernel_sym(), we can
      automatically load vmlinux.
      
      The only thing which needs to be adjusted is how the --sym-annotate
      option is handled - now we can't rely on vmlinux being loaded
      until a full successful pass of dso__load_vmlinux(), but that's
      not the case if we do the sym_filter_entry setup in
      symbol_filter().
      
      So move this step right after event__process_sample() where we
      know the whole dso__load_kernel_sym() pass is done.
      
      By the way, though conceptually similar `perf top` still can't
      annotate userspace - see next patches with fixes.
      Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1265223128-11786-9-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  21. 29 Jan, 2010 2 commits
  22. 27 Jan, 2010 2 commits
  23. 14 Jan, 2010 2 commits
    • perf top: Fix code typo in prompt_symbol() · 66aeb6d5
      Committed by Kirill Smelkov
      sym_filter is what was (if ever) passed with -s option. What was
      typed by user, and what we were looking for, is in buf.
      Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1263396139-4798-3-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Encode kernel module mappings in perf.data · b7cece76
      Committed by Arnaldo Carvalho de Melo
      We were always looking at the running machine /proc/modules,
      even when processing a perf.data file, which only makes sense
      when we're doing 'perf record' and 'perf report' on the same
      machine, and in close succession, or if we don't use modules at
      all, right Peter? ;-)
      
      Now, at 'perf record' time we read /proc/modules, find the long
      path for modules, and put them as PERF_MMAP events, just like we
      did to encode the reloc reference symbol for vmlinux. Talking
      about that now it is encoded in .pgoff, so that we can use
      .{start,len} to store the address boundaries for the kernel so
      that when we reconstruct the kmaps tree we can do lookups right
      away, without having to fixup the end of the kernel maps like we
      did in the past (and now only in perf record).
      
      One more step in the 'perf archive' direction when we'll finally
      be able to collect data in one machine and analyse in another.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 16 Dec, 2009 2 commits
    • perf symbols: Move symbol filtering to event__preprocess_sample() · c410a338
      Committed by Arnaldo Carvalho de Melo
      So that --dsos, --comm, --symbols can be used in more tools,
      like in perf diff:
      
      $ perf record -f find / > /dev/null
      $ perf record -f find / > /dev/null
      $ perf diff --dsos /lib64/libc-2.10.1.so | head -5
         1        +22392124     /lib64/libc-2.10.1.so   _IO_vfprintf_internal
         2         +6410655     /lib64/libc-2.10.1.so   __GI_memmove
         3    +1   +9192692     /lib64/libc-2.10.1.so   _int_malloc
         4    -1  -15158605     /lib64/libc-2.10.1.so   _int_free
         5           +45669     /lib64/libc-2.10.1.so   _IO_new_file_xsputn
      $
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260914682-29652-3-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf symbols: Make symbol_conf global · 75be6cf4
      Committed by Arnaldo Carvalho de Melo
      This simplifies a lot of functions, less stuff to be done by
      tool writers.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260914682-29652-1-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 15 Dec, 2009 1 commit
  26. 14 Dec, 2009 3 commits
    • perf session: Move kmaps to perf_session · 4aa65636
      Committed by Arnaldo Carvalho de Melo
      There is still some more work to do to disentangle map creation
      from DSO loading, but this happens only for the kernel, and for
      the early adopters of perf diff, where this disentanglement
      matters most, we'll be testing different kernels, so no problem
      here.
      
      Further clarification: right now we create the kernel maps for
      the various modules and discontiguous kernel text maps when
      loading the DSO, we should do it as a two step process, first
      creating the maps, for multiple mappings with the same DSO
      store, then doing the dso load just once, for the first hit on
      one of the maps sharing this DSO backing store.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf session: Move the global threads list to perf_session · b3165f41
      Committed by Arnaldo Carvalho de Melo
      So that we can process two perf.data files.
      
      We still need to add a O_MMAP mode for perf_session so that we
      can do all the mmap stuff in it.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-5-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf session: Pass the perf_session to the event handling operations · d8f66248
      Committed by Arnaldo Carvalho de Melo
      They will need it to get the right threads list, etc.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1260741029-4430-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d8f66248
  27. 28 Nov, 2009 3 commits
    • A
      perf tools: Consolidate symbol resolving across all tools · 1ed091c4
      Arnaldo Carvalho de Melo committed
      Now we have a very high level routine for simple tools to
      process IP sample events:
      
      	int event__preprocess_sample(const event_t *self,
      				     struct addr_location *al,
      				     symbol_filter_t filter)
      
      It receives the event itself and will insert new threads in the
      global threads list and resolve the map and symbol, filling all
      this info into the new addr_location struct, so that tools like
      annotate and report can further process the event by creating
      hist_entries in their specific way (with or without callgraphs,
      etc).
      
      It in turn uses the new next layer function:
      
      	void thread__find_addr_location(struct thread *self, u8 cpumode,
      					enum map_type type, u64 addr,
      					struct addr_location *al,
      					symbol_filter_t filter)
      
      Given a thread (userspace or the kernel kthread one), this one
      will find the given type (MAP__FUNCTION now, MAP__VARIABLE
      too in the near future) at the given cpumode, taking vdsos into
      account (userspace hit, but kernel symbol), and will fill all
      these details in the addr_location given.
      
      Tools that need a more compact API for plain function
      resolution, like 'kmem', can use this other one:
      
      	struct symbol *thread__find_function(struct thread *self, u64 addr,
      					     symbol_filter_t filter)
      
      So, to resolve a kernel symbol, which is all the 'kmem' tool
      needs, it's just a matter of calling:
      
      	sym = thread__find_function(kthread, addr, NULL);
      
      The 'filter' parameter is needed because we do lazy
      parsing/loading of ELF symtabs or /proc/kallsyms.
      
      With this we remove more code duplication all around, which is
      always good, huh? :-)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259346563-12568-12-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1ed091c4
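To illustrate why a lookup routine that loads symbols lazily has to carry the filter along, here is a minimal sketch. All names (`struct sym`, `struct toy_thread`, `toy_thread_find_function`, `skip_internal`) are hypothetical and the data structures are deliberately simplified; the real perf code uses maps and DSOs. The convention that a nonzero filter return drops the symbol is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; not the real perf structures. */
struct sym { unsigned long start, end; const char *name; };

/* Assumed convention: a nonzero return drops the symbol at load time. */
typedef int (*sym_filter_t)(const struct sym *sym);

#define MAX_SYMS 8

struct toy_thread {
	const struct sym *raw;     /* stands in for an unparsed symtab */
	size_t nr_raw;
	struct sym syms[MAX_SYMS]; /* symbols kept after loading */
	size_t nr_syms;
	bool loaded;
};

/* Parse the "symtab" once, applying the filter while doing so. */
static void toy_thread_load(struct toy_thread *t, sym_filter_t filter)
{
	for (size_t i = 0; i < t->nr_raw && t->nr_syms < MAX_SYMS; i++) {
		if (filter && filter(&t->raw[i]))
			continue;
		t->syms[t->nr_syms++] = t->raw[i];
	}
	t->loaded = true;
}

/* The load happens on first lookup, so the lookup must take the filter. */
static const struct sym *toy_thread_find_function(struct toy_thread *t,
						  unsigned long addr,
						  sym_filter_t filter)
{
	assert(t != NULL);
	if (!t->loaded)
		toy_thread_load(t, filter);
	for (size_t i = 0; i < t->nr_syms; i++)
		if (addr >= t->syms[i].start && addr < t->syms[i].end)
			return &t->syms[i];
	return NULL;
}

/* Example filter: drop symbols whose name starts with "__". */
static int skip_internal(const struct sym *sym)
{
	return strncmp(sym->name, "__", 2) == 0;
}
```

In this sketch the filter only takes effect on the first call, because that is when the symbols are loaded; later calls reuse whatever survived that first load.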
    • A
      perf tools: Reorganize event processing routines, lotsa dups killed · 62daacb5
      Arnaldo Carvalho de Melo committed
      While implementing event__preprocess_sample, which will do all of
      the symbol lookup in one convenient function, I noticed that
      util/process_event.[ch] were not being used at all, then started
      looking for other functions that could be shared
      and...
      
      None of those functions really needs to receive offset + head:
      the only thing they did with them was common to all of them, so
      do it in one place instead.
      
      Stats about the number of events of each type processed are now
      kept in a central place.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259346563-12568-11-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      62daacb5
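The idea of doing the common work and the per-type statistics in one place can be sketched as a small dispatcher. The types and names below (`toy_event`, `toy_ops`, `toy_process_events`) are invented for illustration and do not correspond to the actual perf event-processing code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types; the real perf event set is larger. */
enum toy_event_type { TOY_MMAP, TOY_COMM, TOY_SAMPLE, TOY_MAX };

struct toy_event { enum toy_event_type type; };

typedef int (*toy_handler_t)(const struct toy_event *ev);

struct toy_ops {
	toy_handler_t handler[TOY_MAX];   /* per-type callbacks, may be NULL */
	unsigned long nr_events[TOY_MAX]; /* stats kept by the dispatcher */
	unsigned long nr_unknown;
};

/*
 * Central loop: count every event by type, then hand it to the
 * tool-specific handler. Handlers never see offsets or buffer heads.
 */
static int toy_process_events(struct toy_ops *ops,
			      const struct toy_event *evs, size_t n)
{
	assert(ops != NULL);
	for (size_t i = 0; i < n; i++) {
		const struct toy_event *ev = &evs[i];

		if ((unsigned int)ev->type >= TOY_MAX) {
			ops->nr_unknown++;
			continue;
		}
		ops->nr_events[ev->type]++;
		if (ops->handler[ev->type]) {
			int err = ops->handler[ev->type](ev);

			if (err)
				return err; /* stop on the first failure */
		}
	}
	return 0;
}
```

A tool would fill in only the handlers it cares about and read the counters from `nr_events` afterwards, rather than counting in each handler.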
    • A
      perf symbols: Support multiple symtabs in struct thread · 95011c60
      Arnaldo Carvalho de Melo committed
      Making the routines that were so far specific to the kernel maps
      useful for all threads.
      
      This is done by making the kernel maps be contained in a kernel
      "thread".
      
      This gets the kernel specific routines closer to the userspace
      counterparts, which will help in reducing the boilerplate for
      resolving a symbol, as will be demonstrated in the next patches.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1259346563-12568-9-git-send-email-acme@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      95011c60