1. 15 3月, 2010 2 次提交
    • A
      perf top: Properly notify the user that vmlinux is missing · b0a9ab62
      Arnaldo Carvalho de Melo 提交于
      Before this patch this message would very briefly appear on the
      screen and then the screen would get updates only on the top,
      for number of interrupts received, etc, but no annotation would
      be performed:
      
       [root@doppio linux-2.6-tip]# perf top -s n_tty_write > /tmp/bla
       objdump: '[kernel.kallsyms]': No such file
      
      Now this is what the user gets:
      
       [root@doppio linux-2.6-tip]# perf top -s n_tty_write
       Can't annotate n_tty_write: No vmlinux file was found in the
       path: [0] vmlinux
       [1] /boot/vmlinux
       [2] /boot/vmlinux-2.6.33-rc5
       [3] /lib/modules/2.6.33-rc5/build/vmlinux
       [4] /usr/lib/debug/lib/modules/2.6.33-rc5/vmlinux
       [root@doppio linux-2.6-tip]#
      
      This bug was introduced when we added automatic search for
      vmlinux, before that time the user had to specify a vmlinux
      file.
      Reported-by: NDavid S. Miller <davem@davemloft.net>
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268664418-28328-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b0a9ab62
    • E
      perf record: Enable the enable_on_exec flag if record forks the target · bedbfdea
      Eric B Munson 提交于
      When forking its target, perf record can capture data from
      before the target application is started.  Perf stat uses the
      enable_on_exec flag in the event attributes to keep from
      displaying events from before the target program starts, this
      patch adds the same functionality to perf record when it is will
      fork the target process.
      Signed-off-by: NEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1268664418-28328-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bedbfdea
  2. 13 3月, 2010 4 次提交
    • A
      perf tools: Fix non-newt build · 567e5479
      Arnaldo Carvalho de Melo 提交于
      The use_browser needs to be in a file that is always built and
      also we need a browser__show_help stub in that case.
      Reported-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268438710-32697-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      567e5479
    • A
      perf hist: Don't fprintf the callgraph unconditionally · 3997d377
      Arnaldo Carvalho de Melo 提交于
      [root@doppio ~]# perf report -i newt.data | head -10
        # Samples: 11999679868
        #
        # Overhead  Command                  Shared Object  Symbol
        # ........  .......  .............................  ......
        #
            63.61%     perf  libslang.so.2.1.4              [.] SLsmg_write_chars
             6.30%     perf  perf                           [.] symbols__find
             2.19%     perf  libnewt.so.0.52.10             [.] newtListboxAppendEntry
             2.08%     perf  libslang.so.2.1.4              [.] SLsmg_write_chars@plt
             1.99%     perf  libc-2.10.2.so                 [.] _IO_vfprintf_internal
        [root@doppio ~]#
      
      Not good, the newt form for report works, but slang has to eat
      the cost of the additional callgraph lines everytime it prints a
      line, and the callgraph doesn't appear on the screen, so move
      the callgraph printing to a separate function and don't use it
      in newt.c.
      
      Newt tree widgets are being investigated to properly support
      callgraphs, but till that gets merged, lets remove this huge
      overhead and show at least the symbol overheads for a callgraph
      rich perf.data with good performance.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268408808-13595-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3997d377
    • A
      perf newt: Use newtGetScreenSize · cb7afb70
      Arnaldo Carvalho de Melo 提交于
      For consistency, use the newt API more fully.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268408808-13595-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cb7afb70
    • A
      perf newt: Add 'Q', 'q' and Ctrl+C as ways to exit from forms · 7081e087
      Arnaldo Carvalho de Melo 提交于
      These are keys people expect when pressed to exit the current
      widget, so have associate all of them to this semantic.
      Suggested-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268401692-9361-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7081e087
  3. 12 3月, 2010 7 次提交
    • A
      perf report: Implement initial UI using newt · f9224c5c
      Arnaldo Carvalho de Melo 提交于
      Newt has widespread availability and provides a rather simple
      API as can be seen by the size of this patch.
      
      The work needed to support it will benefit other frontends too.
      
      In this initial patch it just checks if the output is a tty, if
      not it falls back to the previous behaviour, also if
      newt-devel/libnewt-dev is not installed the previous behaviour
      is maintaned.
      
      Pressing enter on a symbol will annotate it, ESC in the
      annotation window will return to the report symbol list.
      
      More work will be done to remove the special casing in
      color_fprintf, stop using fmemopen/FILE in the printing of
      hist_entries, etc.
      
      Also the annotation doesn't need to be done via spawning "perf
      annotate" and then browsing its output, we can do better by
      calling directly the builtin-annotate.c functions, that would
      then be moved to tools/perf/util/annotate.c and shared with perf
      top, etc
      
      But lets go by baby steps, this patch already improves perf
      usability by allowing to quickly do annotations on symbols from
      the report screen and provides a first experimentation with
      libnewt/TUI integration of tools.
      
      Tested on RHEL5 and Fedora12 X86_64 and on Debian PARISC64 to
      browse a perf.data file collected on a Fedora12 x86_64 box.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268349164-5822-5-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f9224c5c
    • A
      perf tools: Add missing bytes printed in hist_entry__fprintf · dd2ee78d
      Arnaldo Carvalho de Melo 提交于
      We need those to properly size the browser widht in the newt
      TUI.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268349164-5822-4-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dd2ee78d
    • A
      perf tools: Use eprintf for pr_{err,warning,info} too · b4f5296f
      Arnaldo Carvalho de Melo 提交于
      Just like we do for pr_debug, so that we can have a single point
      where to redirect to the currently used output system, be it
      stdio or newt.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268349164-5822-3-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b4f5296f
    • A
      perf top: Export get_window_dimensions · 895f0edc
      Arnaldo Carvalho de Melo 提交于
      Will be used by the newt code too.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268349164-5822-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      895f0edc
    • A
      perf symbols: Bump plt synthesizing warning debug level · fe2197b8
      Arnaldo Carvalho de Melo 提交于
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268349164-5822-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      fe2197b8
    • A
      perf record: Mention paranoid sysctl when failing to create counter · 6230f2c7
      Arnaldo Carvalho de Melo 提交于
      [acme@mica linux-2.6-tip]$ perf record -a -f
         Fatal: Permission error - are you root?
       	 Consider tweaking /proc/sys/kernel/perf_event_paranoid.
      
       [acme@mica linux-2.6-tip]$
      Suggested-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268333592-30872-2-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      6230f2c7
    • A
      perf record: Don't try to find buildids in a zero sized file · 9f591fd7
      Arnaldo Carvalho de Melo 提交于
      Fixing this symptom:
      
       [acme@mica linux-2.6-tip]$ perf record -a -f
         Fatal: Permission error - are you root?
      
       Bus error
       [acme@mica linux-2.6-tip]$
      
      I.e. if for some reason no data is collected, in this case a non
      root user trying to do systemwide profiling, no data will be
      collected, and then we end up trying to mmap a zero sized file
      and access the file header, b00m.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f591fd7
  4. 11 3月, 2010 2 次提交
    • J
      perf: Make the install relative to DESTDIR if specified · 7ae5f213
      John Kacur 提交于
      Without this change, the install path is relative to
      prefix/DESTDIR where prefix is automatically set to $HOME.
      
      This can produce unexpected results. For example:
      
        make -C tools/perf DESTDIR=/home/jkacur/tmp install-man
      
      creates the directory:		/home/jkacur/home/jkacur/tmp/share/...
      instead of the expected:	/home/jkacur/tmp/share/...
      Signed-off-by: NJohn Kacur <jkacur@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Kyle McMartin <kyle@redhat.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7ae5f213
    • P
      perf tools: Fix sparse CPU numbering related bugs · a12b51c4
      Paul Mackerras 提交于
      At present, the perf subcommands that do system-wide monitoring
      (perf stat, perf record and perf top) don't work properly unless
      the online cpus are numbered 0, 1, ..., N-1.  These tools ask
      for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
      and then try to create events for cpus 0, 1, ..., N-1.
      
      This creates problems for systems where the online cpus are
      numbered sparsely.  For example, a POWER6 system in
      single-threaded mode (i.e. only running 1 hardware thread per
      core) will have only even-numbered cpus online.
      
      This fixes the problem by reading the /sys/devices/system/cpu/online
      file to find out which cpus are online.  The code that does that is in
      tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
      function that sets up a cpumap[] array and returns the number of
      online cpus.  If /sys/devices/system/cpu/online can't be read or
      can't be parsed successfully, it falls back to using sysconf to
      ask how many cpus are online and sets up an identity map in cpumap[].
      
      The perf record, perf stat and perf top code then calls
      read_cpu_map() in the system-wide monitoring case (instead of
      sysconf) and uses cpumap[] to get the cpu numbers to pass to
      perf_event_open.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      a12b51c4
  5. 10 3月, 2010 10 次提交
  6. 04 3月, 2010 4 次提交
    • T
      perf trace: Don't use pager if scripting · cf4fee50
      Tom Zanussi 提交于
      It's useful for paging through raw traces, but just gets in the
      way when scripting.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      LKML-Reference: <1267599873-8193-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      cf4fee50
    • T
      perf trace/scripting: Remove extraneous header read · 10c95f4f
      Tom Zanussi 提交于
      perf_header__read() is already done in perf_session__open(), so
      remove it from the script gen case.
      Signed-off-by: NTom Zanussi <tzanussi@gmail.com>
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      LKML-Reference: <1267599873-8193-2-git-send-email-tzanussi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      10c95f4f
    • W
      perf, ARM: Modify kuser rmb() call to compile for Thumb-2 · da7196e1
      Will Deacon 提交于
      The Thumb-2 instruction set does not provide an encoding
      for sub pc, r0, #95 as present in the rmb() definition used
      by perf. This results in compilation failure when using a
      compiler targetting an instruction set other than ARM.
      
      This patch redefines rmb() for ARM by casting the address
      of the kuser helper to a function pointer, therefore getting
      the compiler to take care of making the call.
      
      Patch taken against tip/master.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Jamie Iles <jamie.iles@picochip.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1267616878-2154-1-git-send-email-will.deacon@arm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      da7196e1
    • M
      perf probe: Correct probe syntax on command line help · 32cb0dd5
      Masami Hiramatsu 提交于
      Move @src right after FUNC in syntax according to syntax change
      on command line help.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      LKML-Reference: <20100304033843.3819.10087.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      32cb0dd5
  7. 03 3月, 2010 1 次提交
    • A
      perf archive: Don't try to collect files without a build-id · 66301254
      Arnaldo Carvalho de Melo 提交于
      To avoid these error:
      
         [root@doppio ~]# perf archive
         tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
         No such file or directory
         tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
         No such file or directory
         tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
         No such file or directory
         tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
         No such file or directory
         tar: Exiting with failure status due to previous errors
         [root@doppio ~]#
      
      More work is needed to support archiving symtabs for binaries
      without a build-id, perhaps creating a perf.data UUID + adding
      build-ids for the binaries copied into the cache and then have
      this perf.data session UUID be a directory with symlinks to the
      by now calculated build-id of the files inside it.
      
      Or just do an extra pass and insert the calculated build-ids in
      the perf.data header.
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      66301254
  8. 28 2月, 2010 2 次提交
    • F
      perf lock: Drop the buffers multiplexing dependency · b67577df
      Frederic Weisbecker 提交于
      We need to deal with time ordered events to build a correct
      state machine of lock events. This is why we multiplex the lock
      events buffers. But the ordering is done from the kernel, on
      the tracing fast path, leading to high contention between cpus.
      
      Without multiplexing, the events appears in a weak order.
      If we have four events, each split per cpu, perf record will
      read the events buffers in the following order:
      
      [ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....]
      
      To handle a post processing reordering, we could just read and sort
      the whole in memory, but it just doesn't scale with high amounts
      of events: lock events can fill huge amounts in few times.
      
      Basically we need to sort in memory and find a "grace period"
      point when we know that a given slice of previously sorted events
      can be committed for post-processing, so that we can unload the
      memory usage step by step and keep a scalable sorting list.
      
      There is no strong rules about how to define such "grace period".
      What does this patch is:
      
      We define a FLUSH_PERIOD value that defines a grace period in
      seconds.
      We want to have a slice of events covering 2 * FLUSH_PERIOD in our
      sorted list.
      If FLUSH_PERIOD is big enough, it ensures every events that occured
      in the first half of the timeslice have all been buffered and there
      are none remaining and there won't be further to put inside this
      first timeslice. Then once we reach the 2 * FLUSH_PERIOD
      timeslice, we flush the first half to be gentle with the memory
      (the second half can still get new events in the middle, so wait
      another period to flush it)
      
      FLUSH_PERIOD is defined to 5 seconds. Say the first event started on
      time t0. We can safely assume that at the time we are processing
      events of t0 + 10 seconds, ther won't be anymore events to read
      from perf.data that occured between t0 and t0 + 5 seconds. Hence
      we can safely flush the first half.
      
      To point out funky bugs, we have a guardian that checks a new event
      timestamp is not below the last event's timestamp flushed and that
      displays a warning in this case.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      b67577df
    • H
      perf lock: Fix and add misc documentally things · 84c6f88f
      Hitoshi Mitake 提交于
      I've forgot to add 'perf lock' line to command-list.txt,
      so users of perf could not find perf lock when they type 'perf'.
      
      Fixing command-list.txt requires document
      (tools/perf/Documentation/perf-lock.txt).
      But perf lock is too much "under construction" to write a
      stable document, so this is something like pseudo document for now.
      
      And I wrote description of perf lock at help section of
      CONFIG_LOCK_STAT, this will navigate users of lock trace events.
      Signed-off-by: NHitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      84c6f88f
  9. 26 2月, 2010 8 次提交
    • D
      perf tools: Flush maps on COMM events · 4385d580
      David S. Miller 提交于
      Even though we don't register the counters until the child is right about
      to exec(), we're still going to get at least a few events while the
      fork()'d child is still executing 'perf' and in particular we're going to
      get the MMAP events.
      
      We can't distinguish the ones in the newly executed process because the
      PID will be the same.
      
      One way to solve this would be to have a PERF_RECORD_EXEC event, and when
      this is seen 'perf' can flush it's map cache.  We can't use
      PERF_RECORD_COMM since that's generated by other things, not just exec().
      
      Actually, thinking about it some more, using PERF_RECORD_COMM might be a
      good enough approximation.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267196914-16238-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4385d580
    • A
      perf annotate: Handle samples not at objdump output addr boundaries · 48fb4fdd
      Arnaldo Carvalho de Melo 提交于
      Without this patch we get this for need_resched:
      
      [root@mica ~]# perf annotate need_resched
      
      ------------------------------------------------
       Percent |      Source code & Disassembly of vmlinux
      ------------------------------------------------
               :
               :
               :      Disassembly of section .text:
               :
               :      ffffffff810095ed <need_resched>:
               :              return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
               :      }
               :
               :      static inline int need_resched(void)
               :      {
          0.00 :      ffffffff810095ed:       55                      push   %rbp
               :              return unlikely(test_thread_flag(TIF_NEED_RESCHED));
          0.00 :      ffffffff810095ee:       be 03 00 00 00          mov    $0x3,%esi
               :
               :      static inline struct thread_info *current_thread_info(void)
               :      {
               :              struct thread_info *ti;
               :              ti = (void *)(percpu_read_stable(kernel_stack) +
          0.00 :      ffffffff810095f3:       65 48 8b 3c 25 48 b5    mov    %gs:0xb548,%rdi
          0.00 :      ffffffff810095fa:       00 00
               :              return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
               :      }
               :
               :      static inline int need_resched(void)
               :      {
          0.00 :      ffffffff810095fc:       48 89 e5                mov    %rsp,%rbp
               :              return unlikely(test_thread_flag(TIF_NEED_RESCHED));
          0.00 :      ffffffff810095ff:       48 81 ef d8 1f 00 00    sub    $0x1fd8,%rdi
          0.00 :      ffffffff81009606:       e8 9d ff ff ff          callq  ffffffff810095a8 <test_ti_thread_flag>
               :      }
          0.00 :      ffffffff8100960b:       c9                      leaveq
          0.00 :      ffffffff8100960c:       85 c0                   test   %eax,%eax
          0.00 :      ffffffff8100960e:       0f 95 c0                setne  %al
          0.00 :      ffffffff81009611:       0f b6 c0                movzbl %al,%eax
               :      Disassembly of section .vsyscall_0:
               :      Disassembly of section .vsyscall_fn:
               :      Disassembly of section .vsyscall_1:
               :      Disassembly of section .vsyscall_2:
               :      Disassembly of section .init.text:
               :      Disassembly of section .altinstr_replacement:
               :      Disassembly of section .exit.text:
      [root@mica ~]#
      
      But from the 'perf report' result we know that there are hits
      for need_resched on a 4 way machine mostly doing nothing, so
      after adding code to show what is in each hist offset and
      collapsing IP hits for what happens between objdump lines we
      get, for the same perf.data file:
      
      [root@mica ~]# perf annotate -v need_resched
      
      ------------------------------------------------
       Percent |      Source code & Disassembly of vmlinux
      ------------------------------------------------
               :
               :
               :      Disassembly of section .text:
               :
               :      ffffffff810095ed <need_resched>:
               :              return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
               :      }
               :
               :      static inline int need_resched(void)
               :      {
          0.00 :      ffffffff810095ed:       55                      push   %rbp
               :              return unlikely(test_thread_flag(TIF_NEED_RESCHED));
         52.78 :      ffffffff810095ee:       be 03 00 00 00          mov    $0x3,%esi
               :
               :      static inline struct thread_info *current_thread_info(void)
               :      {
               :              struct thread_info *ti;
               :              ti = (void *)(percpu_read_stable(kernel_stack) +
          0.00 :      ffffffff810095f3:       65 48 8b 3c 25 48 b5    mov    %gs:0xb548,%rdi
          0.00 :      ffffffff810095fa:       00 00
               :              return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
               :      }
               :
               :      static inline int need_resched(void)
               :      {
          0.00 :      ffffffff810095fc:       48 89 e5                mov    %rsp,%rbp
               :              return unlikely(test_thread_flag(TIF_NEED_RESCHED));
          9.72 :      ffffffff810095ff:       48 81 ef d8 1f 00 00    sub    $0x1fd8,%rdi
          0.00 :      ffffffff81009606:       e8 9d ff ff ff          callq  ffffffff810095a8 <test_ti_thread_flag>
               :      }
          0.00 :      ffffffff8100960b:       c9                      leaveq
          0.00 :      ffffffff8100960c:       85 c0                   test   %eax,%eax
         37.50 :      ffffffff8100960e:       0f 95 c0                setne  %al
          0.00 :      ffffffff81009611:       0f b6 c0                movzbl %al,%eax
               :      Disassembly of section .vsyscall_0:
               :      Disassembly of section .vsyscall_fn:
               :      Disassembly of section .vsyscall_1:
               :      Disassembly of section .vsyscall_2:
               :      Disassembly of section .init.text:
               :      Disassembly of section .altinstr_replacement:
               :      Disassembly of section .exit.text:
      [root@mica ~]#
      
      And now 'perf annotate -v', verbose mode, will show the hits per
      precise IP, so that one can make sense of the attribution to
      each objdumop line:
      
      [root@mica ~]# perf annotate -v need_resched
      Looking at the vmlinux_path (5 entries long)
      Using /lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux
      for symbols annotate_sym: filename=/lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux, sym=need_resched, start=0xffffffff810095ed, end=0xffffffff81009614
      
      ------------------------------------------------
       Percent |      Source code & Disassembly of vmlinux
      ------------------------------------------------
                      ffffffff810095f1: 152
                      ffffffff81009603: 28
                      ffffffff8100960f: 55
                      ffffffff81009610: 53
                                h->sum: 288
      <SNIP same annotation>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267194194-15670-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      48fb4fdd
    • M
      perf probe: Add lazy line matching support · 2a9c8c36
      Masami Hiramatsu 提交于
      Add lazy line matching support for specifying new probes.
      This also changes the syntax of perf probe a bit. Now
      perf probe accepts one of below probe event definitions.
      
      1) Define event based on function name
       [EVENT=]FUNC[@src][:RLN|+OFF|%return|;PTN] [ARG ...]
      
      2) Define event based on source file with line number
       [EVENT=]SRC:ALN [ARG ...]
      
      3) Define event based on source file with lazy pattern
       [EVENT=]SRC;PTN [ARG ...]
      
      - New lazy matching pattern(PTN) follows ';' (semicolon). And it
        must be put the end of the definition.
      - So, @src is no longer the part which must be put at the end
        of the definition.
      
      Note that ';' (semicolon) can be interpreted as the end of
      a command by the shell. This means that you need to quote it.
      (anyway you will need to quote the lazy pattern itself too,
      because it may contains other sensitive characters, like
      '[',']' etc.).
      
      Lazy matching
      -------------
      The lazy line matching is similar to glob matching except
      ignoring spaces in both of pattern and target.
      
      e.g.
      'a=*' can matches 'a=b', 'a = b', 'a == b' and so on.
      
      This provides some sort of flexibility and robustness to
      probe point definitions against minor code changes.
      (for example, actual 10th line of schedule() can be changed
       easily by modifying schedule(), but the same line matching
       'rq=cpu_rq*' may still exist.)
      
      Changes in v3:
       - Cast Dwarf_Addr to uintmax_t for printf-formats.
      
      Changes in v2:
       - Cast Dwarf_Addr to unsigned long long for printf-formats.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133611.6725.45078.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2a9c8c36
    • M
      perf probe: Show more lines after last line · 5c8d1cbb
      Masami Hiramatsu 提交于
      Show 2 more lines after the last probe-able line.
      This will clearly show the last closed-brace of
      inline functions.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133604.6725.76820.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      5c8d1cbb
    • M
      perf probe: Check function address range strictly in line finder · 161a26b0
      Masami Hiramatsu 提交于
      Check (inlined) function address range strictly for
      improving output of probe-able lines of inline functions.
      
      Without this change, perf probe --line <function> sometimes
      showed other inline function bodies too, because it didn't
      filter out inlined functions.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133557.6725.20697.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      161a26b0
    • M
      perf probe: Use libdw callback routines · e92b85e1
      Masami Hiramatsu 提交于
      Use libdw callback functions aggressively, and remove
      local tree-search API. This change simplifies the code.
      
      Changes in v3:
       - Cast Dwarf_Addr to uintmax_t for printf-formats.
      
      Changes in v2:
       - Cast Dwarf_Addr to unsigned long long for printf-formats.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133549.6725.81499.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      e92b85e1
    • M
      perf probe: Use elfutils-libdw for analyzing debuginfo · 804b3606
      Masami Hiramatsu 提交于
      Newer gcc introduces newer & richer debuginfo, and only libdw
      in elfutils project can support it. So perf probe moves onto
      elfutils-libdw from libdwarf.
      
      Changes in v3:
       - Cast Dwarf_Addr/Dwarf_Word to uintmax_t for printf-formats.
       - Recover a sign-prefix which was removed in v2 by mistake.
      
      Changes in v2:
       - Fix a type-casting bug in Makefile.
       - Cast Dwarf_Addr/Dwarf_Word to unsigned long long for printf-formats.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133542.6725.34724.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      804b3606
    • M
      perf probe: Rename probe finder functions · 81cb8aa3
      Masami Hiramatsu 提交于
      Rename *_probepoint to *_probe_point, for nothing
      but a cosmetic reason.
      Signed-off-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      LKML-Reference: <20100225133534.6725.52615.stgit@localhost6.localdomain6>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      81cb8aa3