1. 23 8月, 2010 1 次提交
    • F
      perf: Keep track of the max depth of a callchain · d2009c51
      Frederic Weisbecker 提交于
      In order to implement callchains collapsing, we need to keep
      track of the maximum depth in a histogram tree of callchains.
      This way we'll avoid allocating an arbitrary temporary buffer
      size on callchain merge time.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      d2009c51
  2. 08 7月, 2010 2 次提交
    • F
      perf: Sync callchains with period based hits · 108553e1
      Frederic Weisbecker 提交于
      Hists have their hits increased by the event period. And this
      period based counting is the foundation of all the stats in
      perf report.
      
      But callchains still use the raw number of hits, without taking
      the period into account. So when we compute the percentage,
      absolute based percentages are totally broken, and relative ones
      too in the first parent level. Because we pass the number of events
      muliplied by their period as the total number of hits to the
      callchain filtering, while callchains expect this number to be
      the number of raw hits.
      
      perf report -g graph was simply not working, showing no graph unless
      the min percent was zero. And even there the percentage of the
      branches was always 0. And may be fractal filtering was broken on
      the first branch level too.
      
      flat also was broken, but it was hidden because of other breakages.
      
      Anyway fix this by counting using periods on callchains.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      108553e1
    • F
      perf: Resurrect flat callchains · 97aa1052
      Frederic Weisbecker 提交于
      Initialize the callchain radix tree root correctly.
      
      When we walk through the parents, we must stop after the root, but
      since it wasn't well initialized, its parent pointer was random.
      
      Also the number of hits was random because uninitialized, hence it
      was part of the callchain while the root doesn't contain anything.
      
      This fixes segfaults and percentages followed by empty callchains
      while running:
      
      	perf report -g flat
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: 2.6.31.x-2.6.34.x <stable@kernel.org>
      97aa1052
  3. 05 6月, 2010 1 次提交
    • A
      perf tools: Make event__preprocess_sample parse the sample · 41a37e20
      Arnaldo Carvalho de Melo 提交于
      Simplifying the tools that were using both in sequence and allowing
      upcoming simplifications, such as Arun's patch to sort by cpus.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41a37e20
  4. 20 5月, 2010 1 次提交
    • A
      perf annotate: Use build-ids to find the right DSO · b36f19d5
      Arnaldo Carvalho de Melo 提交于
      We were still using the pathname found on the MMAP event, that could not
      be the one we used when recording, so use the build-id cache for that,
      only falling back to use the pathname in the MMAP event if no build-ids
      are available.
      
      With this we now also are able to do secure, seamless offline annotation.
      
      Example:
      
      [root@doppio linux-2.6-tip]# perf report -g none -v 2> /dev/null | head -10
           8.12%     Xorg  /usr/lib64/libpixman-1.so.0.14.0       0x0000000000026d02 B [.] pixman_rasterize_edges
           4.68%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005dbdba B [.] 0x000000005dbdba
           3.70%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
           2.96%     init  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
           2.73%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff8100a738 ! [k] mwait_idle_with_hints
      [root@doppio linux-2.6-tip]# perf annotate -v pixman_rasterize_edges 2>&1 | grep Executing
      Executing: objdump --start-address=0x000000371ce26670 --stop-address=0x000000371ce2709f -dS /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|grep -v /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|expand
      [root@doppio linux-2.6-tip]# perf buildid-list | grep libpixman-1.so.0.14.0
      bd6ac5199137aaeb279f864717d8d061477466c1 /usr/lib64/libpixman-1.so.0.14.0
      [root@doppio linux-2.6-tip]#
      Reported-by: NStephane Eranian <eranian@google.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b36f19d5
  5. 10 5月, 2010 2 次提交
  6. 26 3月, 2010 1 次提交
  7. 23 3月, 2010 1 次提交
    • F
      perf: Fix orphan callchain branches · 301fde27
      Frederic Weisbecker 提交于
      Callchains have markers inside their capture to tell we
      enter a context (kernel, user, ...).
      
      Those are not displayed in the callchains but they are
      incidentally an active part of the radix tree where
      callchains are stored, just like any other address.
      
      If we have the two following callchains:
      
      addr1 -> addr2 -> user context -> addr3
      addr1 -> addr2 -> user context -> addr4
      addr1 -> addr2 -> addr 5
      
      This is pretty common if addr1 and addr2 are part of an
      interrupt path, addr3 and addr4 are user addresses and
      addr5 is a kernel non interrupt path.
      
      This will be stored as follows in the tree:
      
                         addr1
                         addr2
                         /   \
                        /     addr5
                  user context
                     /    \
                   addr3  addr4
      
      But we ignore the context markers in the report, hence
      the addr3 and addr4 will appear as orphan branches:
      
          |--28.30%-- hrtimer_interrupt
          |          smp_apic_timer_interrupt
          |          apic_timer_interrupt
          |          |           <------------- here, no parent!
          |          |          |
          |          |          |--11.11%-- 0x7fae7bccb875
          |          |          |
          |          |          |--11.11%-- 0xffffffffff60013b
          |          |          |
          |          |          |--11.11%-- __pthread_mutex_lock_internal
          |          |          |
          |          |          |--11.11%-- __errno_location
      
      Fix this by removing the context markers when we process the
      callchains to the tree.
      Reported-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1269274173-20328-1-git-send-email-acme@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      301fde27
  8. 25 9月, 2009 1 次提交
  9. 12 8月, 2009 1 次提交
  10. 09 8月, 2009 2 次提交
    • F
      perf tools: callchain: Fix 'perf report' display to be callchain by default · b1a88349
      Frederic Weisbecker 提交于
      If we recorded with -g option to record the callchain, right now
      we require a -g option to perf report as well - and people reported
      this as unnecessary complication: the user already specified -g
      once, no need to require it a second time.
      
      So if the recording includes call-chains, display the callchain by
      default from perf report.
      
      ( The user can override this default using "-g none" option from
        perf report. )
      Reported-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b1a88349
    • F
      perf tools: Fix call-chain cumul hit based sub-total (fractal mode) · 1953287b
      Frederic Weisbecker 提交于
      The callchain fractal mode builds each new total hits in a new
      branch of profiling by using the parent's hits of the current
      branch plus the hits of the children.
      
      This is wrong, the total hits of a branch should be made of the
      sum of every children hits, we must ignore the parent hits in
      this scope.
      
      This patch also fixes another mistake with the hit counting.
      
      Now the rates are correct.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1953287b
  11. 05 7月, 2009 1 次提交
    • F
      perf report: Add "Fractal" mode output - support callchains with relative overhead rate · 805d127d
      Frederic Weisbecker 提交于
      The current callchain displays the overhead rates as absolute:
      relative to the total overhead.
      
      This patch provides relative overhead percentage, in which each
      branch of the callchain tree is a independant instrumentated object.
      
      This provides a 'fractal' view of the call-chain profile: each
      sub-graph looks like a profile in itself - relative to its parent.
      
      You can produce such output by using the "fractal" mode
      that you can abbreviate via f, fr, fra, frac, etc...
      
      ./perf report -s sym -c fractal
      
      Example:
      
           8.46%  [k] copy_user_generic_string
                      |
                      |--52.01%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--97.20%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --2.81%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--39.85%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--97.05%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --2.95%-- sys_write
                      |                     system_call_fastpath
                      |                     __write_nocancel
      [...]
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      805d127d
  12. 03 7月, 2009 2 次提交
    • F
      perf_counter tools: Set the minimum percent for callchains to be displayed · c20ab37e
      Frederic Weisbecker 提交于
      Callchains output may become a burden on a trace because even
      rarely hit site are exposed. This can be too much information.
      
      Let the user set a threshold as a minimum percent of hits using
      the new pattern for the -c option:
      
          -c mode,min_percent
      
      Example:
      
      $ perf report -s sym -c flat,4
      
           8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
           5.39%  [k] search_by_key
           4.63%  0x00000000009e0a
           2.36%  [k] memcpy_c
      [...]
      
      $ perf report -s sym -c graph,2
      
           8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |           --4.19%-- sys_pread64
                      |                     system_call_fastpath
                      |                     pread64
                      |
                       --3.24%-- generic_file_buffered_write
                                 __generic_file_aio_write_nolock
                                 generic_file_aio_write
                                 do_sync_write
                                 reiserfs_file_write
                                 vfs_write
                                 |
                                  --3.14%-- sys_pwrite64
                                            system_call_fastpath
                                            __pwrite64
      
           5.39%  [k] search_by_key
                      |
                       --2.23%-- reiserfs_update_sd_size
      
           4.63%  0x00000000009e0a
      
           2.36%  [k] memcpy_c
      [...]
      
      You can also omit it and it will default to 0.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c20ab37e
    • F
      perf report: Add support for callchain graph output · 4eb3e478
      Frederic Weisbecker 提交于
      Currently, the printing of callchains is done in a single
      vertical level, this is the "flat" mode:
      
      8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
      This patch introduces a new "graph" mode which provides a
      hierarchical output of factorized paths recursively sorted:
      
       8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--4.19%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --0.12%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--3.24%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--3.14%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --0.10%-- sys_write
      [...]
      
      The command line has then changed.
      
      By providing the -c option, the callchain will output in the
      flat mode by default.
      
      But you can override it:
      
          perf report -c graph
      
      or
      
          perf report -c flat
      
      You can also pass the abreviated mode:
      
          perf report -c g
      
      or
      
          perf report -c gra
      
      will both make use of the graph mode.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4eb3e478
  13. 02 7月, 2009 2 次提交
  14. 01 7月, 2009 2 次提交
    • I
      perf_counter tools: Add more warnings and fix/annotate them · f37a291c
      Ingo Molnar 提交于
      Enable -Wextra. This found a few real bugs plus a number
      of signed/unsigned type mismatches/uncleanlinesses. It
      also required a few annotations
      
      All things considered it was still worth it so lets try with
      this enabled for now.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f37a291c
    • F
      perf_counter tools: Resolve symbols in callchains · 4424961a
      Frederic Weisbecker 提交于
      This patch resolves the names, when possible, of each ip
      present in the callchains while using the -c option with perf
      report.
      
      Example:
      
      5.40%  [k] __d_lookup
                   5.37%
                      perf_callchain
                      perf_counter_overflow
                      intel_pmu_handle_irq
                      perf_counter_nmi_handler
                      notifier_call_chain
                      atomic_notifier_call_chain
                      notify_die
                      do_nmi
                      nmi
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      sys_faccessat
                      sys_access
                      system_call_fastpath
                      0x7fb609846f77
      
                   0.01%
                      perf_callchain
                      perf_counter_overflow
                      intel_pmu_handle_irq
                      perf_counter_nmi_handler
                      notifier_call_chain
                      atomic_notifier_call_chain
                      notify_die
                      do_nmi
                      nmi
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      sys_faccessat
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246419315-9968-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      4424961a
  15. 26 6月, 2009 1 次提交
    • F
      perf_counter tools: Prepare a small callchain framework · 8cb76d99
      Frederic Weisbecker 提交于
      We plan to display the callchains depending on some user-configurable
      parameters.
      
      To gather the callchains stats from the recorded stream in a fast way,
      this patch introduces an ad hoc radix tree adapted for callchains and also
      a rbtree to sort these callchains once we have gathered every events
      from the stream.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8cb76d99