1. 03 Jul, 2009 3 commits
    • perf_counter tools: Set the minimum percent for callchains to be displayed · c20ab37e
      Frederic Weisbecker committed
      Callchain output can become a burden on a trace because even
      rarely hit sites are exposed. This can be too much information.
      
      Let the user set a threshold as a minimum percent of hits using
      the new pattern for the -c option:
      
          -c mode,min_percent
      
      Example:
      
      $ perf report -s sym -c flat,4
      
           8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
           5.39%  [k] search_by_key
           4.63%  0x00000000009e0a
           2.36%  [k] memcpy_c
      [...]
      
      $ perf report -s sym -c graph,2
      
           8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |           --4.19%-- sys_pread64
                      |                     system_call_fastpath
                      |                     pread64
                      |
                       --3.24%-- generic_file_buffered_write
                                 __generic_file_aio_write_nolock
                                 generic_file_aio_write
                                 do_sync_write
                                 reiserfs_file_write
                                 vfs_write
                                 |
                                  --3.14%-- sys_pwrite64
                                            system_call_fastpath
                                            __pwrite64
      
           5.39%  [k] search_by_key
                      |
                       --2.23%-- reiserfs_update_sd_size
      
           4.63%  0x00000000009e0a
      
           2.36%  [k] memcpy_c
      [...]
      
      You can also omit it and it will default to 0.
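A minimal sketch of how a "mode,min_percent" argument could be parsed and applied, assuming full mode names only. The names chain_param, parse_chain_arg() and chain_is_visible() are hypothetical and are not taken from the perf sources:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration; not the actual perf definitions. */
enum chain_mode { CHAIN_FLAT, CHAIN_GRAPH };

struct chain_param {
	enum chain_mode mode;
	double min_percent;	/* defaults to 0 when omitted */
};

/* Parse "mode[,min_percent]". Returns 0 on success, -1 on a malformed argument. */
static int parse_chain_arg(const char *arg, struct chain_param *out)
{
	const char *comma = strchr(arg, ',');
	size_t len = comma ? (size_t)(comma - arg) : strlen(arg);
	char *end;

	if (len == 4 && !strncmp(arg, "flat", 4))
		out->mode = CHAIN_FLAT;
	else if (len == 5 && !strncmp(arg, "graph", 5))
		out->mode = CHAIN_GRAPH;
	else
		return -1;

	out->min_percent = 0.0;
	if (!comma)
		return 0;

	/* Reject an empty or partially numeric threshold. */
	out->min_percent = strtod(comma + 1, &end);
	if (end == comma + 1 || *end != '\0')
		return -1;
	return 0;
}

/* A chain is shown only if its share of the total hits reaches the threshold. */
static int chain_is_visible(unsigned long hits, unsigned long total,
			    double min_percent)
{
	if (total == 0)
		return 0;
	return 100.0 * (double)hits / (double)total >= min_percent;
}
```

With "flat,4", a chain holding 4.19% of the hits would pass the filter while one holding 0.12% would be dropped.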
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246558475-10624-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf report: Add support for callchain graph output · 4eb3e478
      Frederic Weisbecker committed
      Currently, callchains are printed on a single vertical
      level; this is the "flat" mode:
      
      8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
      This patch introduces a new "graph" mode which provides a
      hierarchical output of factorized paths recursively sorted:
      
       8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--4.19%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --0.12%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--3.24%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--3.14%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --0.10%-- sys_write
      [...]
      
      The command line has changed accordingly.
      
      When the -c option is given, callchains are output in
      flat mode by default.
      
      But you can override it:
      
          perf report -c graph
      
      or
      
          perf report -c flat
      
      You can also pass an abbreviated mode:
      
          perf report -c g
      
      or
      
          perf report -c gra
      
      will both make use of the graph mode.
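One way to accept abbreviated mode names is a prefix comparison limited to the length of the user's input. The sketch below assumes that approach; match_chain_mode() and the enum are hypothetical names, not the perf implementation:

```c
#include <string.h>

/* Hypothetical enum for illustration. */
enum chain_mode { CHAIN_NONE, CHAIN_FLAT, CHAIN_GRAPH };

/*
 * Match the user's input against the known mode names, comparing only
 * as many characters as the user typed so that "g" or "gra" select
 * "graph". A missing or empty argument falls back to flat mode.
 */
static enum chain_mode match_chain_mode(const char *arg)
{
	size_t len;

	if (!arg || !*arg)
		return CHAIN_FLAT;

	len = strlen(arg);
	if (!strncmp(arg, "graph", len))
		return CHAIN_GRAPH;
	if (!strncmp(arg, "flat", len))
		return CHAIN_FLAT;

	return CHAIN_NONE;	/* unknown mode, caller should report an error */
}
```

Because strncmp() stops at the terminating NUL of the shorter string, an over-long input such as "graphs" does not match.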
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Create new chain_for_each_child() iterator · 14f4654c
      Frederic Weisbecker committed
      Iterating through the children of a node in the callchain tree
      shows something that may be quite confusing at first glance.
      The head is the children field of the parent and the list nodes
      are in the brothers field of the children.
      
      This is because the children are linked to the parent as a list
      of "brothers" using the "children" list of the parent as a
      head:
      
        ---------------
       | Parent (head) |-------------------------------------
        ---------------                                      |
           |                                                 |
        children                                             |
           |                                                 |
        -----------               -----------                |
       | 1st child |---brother---| 2nd child |---brother-----
        -----------               -----------
      
      This makes the following odd pattern occur often:
      
       list_for_each_entry(child, &parent->children, brothers) {
              // do something with children
       }
      
      Abstract it to chain_for_each_child() to factorize and simplify
      this pattern.
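The wrapper described above could look roughly like the following. This is a self-contained sketch: the tiny list implementation stands in for the kernel-style list.h helpers, and the struct layout and the sum_children_hits() helper are assumptions for illustration (only the children and brothers field names come from the text above):

```c
#include <stddef.h>

/* Minimal stand-in for the kernel-style circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Assumed node layout; "hit" is a hypothetical field. */
struct callchain_node {
	struct list_head brothers;	/* links this node to its siblings */
	struct list_head children;	/* head of this node's child list */
	unsigned long hit;
};

/* Hide the head/member asymmetry behind a single iterator. */
#define chain_for_each_child(child, parent) \
	list_for_each_entry(child, &(parent)->children, brothers)

/* Example user: sum the hits of all direct children. */
static unsigned long sum_children_hits(struct callchain_node *parent)
{
	struct callchain_node *child;
	unsigned long total = 0;

	chain_for_each_child(child, parent)
		total += child->hit;

	return total;
}
```

Callers then only name the child cursor and the parent, without having to remember which field is the head and which is the link.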
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 01 Jul, 2009 4 commits
    • perf_counter tools: Add more warnings and fix/annotate them · f37a291c
      Ingo Molnar committed
      Enable -Wextra. This found a few real bugs plus a number
      of signed/unsigned type mismatches and other unclean spots.
      It also required a few annotations.
      
      All things considered, it was still worth it, so let's try with
      this enabled for now.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Various fixes for callchains · deac911c
      Frederic Weisbecker committed
      The symbol resolving has of course revealed some bugs in the
      callchain tree handling. This patch fixes some of them,
      including:
      
      - inherit the children from the parents while splitting a node
      - fix list range moving
      - fix indexes setting in callchains
      - create a child on the current node if the path doesn't match any of
        the existing children (was only done on the root)
      - compare using symbols when possible so that we can match a function
        using any ip inside by referring to its start address.
      
      The practical effects are:
      
      - remove double callchains
      - fix upside down or any random order of callchains
      - fix wrong paths
      - fix bad hits and percentage accounts
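The symbol-based comparison in the last fix can be pictured as below. This is a sketch under assumed types: struct symbol, struct chain_entry and entry_matches() are hypothetical names, and the fallback to raw addresses when a symbol is missing is an assumption about what "when possible" means:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical types for illustration. */
struct symbol {
	uint64_t start;		/* first address of the function */
	uint64_t end;
	const char *name;
};

struct chain_entry {
	uint64_t ip;			/* sampled instruction pointer */
	const struct symbol *sym;	/* NULL if the ip could not be resolved */
};

/*
 * Two entries match if they fall in the same function, identified by
 * the symbol's start address. Unresolved entries fall back to an exact
 * ip comparison.
 */
static int entry_matches(const struct chain_entry *a,
			 const struct chain_entry *b)
{
	if (a->sym && b->sym)
		return a->sym->start == b->sym->start;

	return a->ip == b->ip;
}
```

Comparing by start address means two samples taken at different offsets inside the same function collapse into one path element instead of producing duplicate chains.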
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246419315-9968-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Resolve symbols in callchains · 4424961a
      Frederic Weisbecker committed
      This patch resolves the names, when possible, of each ip
      present in the callchains while using the -c option with perf
      report.
      
      Example:
      
      5.40%  [k] __d_lookup
                   5.37%
                      perf_callchain
                      perf_counter_overflow
                      intel_pmu_handle_irq
                      perf_counter_nmi_handler
                      notifier_call_chain
                      atomic_notifier_call_chain
                      notify_die
                      do_nmi
                      nmi
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      sys_faccessat
                      sys_access
                      system_call_fastpath
                      0x7fb609846f77
      
                   0.01%
                      perf_callchain
                      perf_counter_overflow
                      intel_pmu_handle_irq
                      perf_counter_nmi_handler
                      notifier_call_chain
                      atomic_notifier_call_chain
                      notify_die
                      do_nmi
                      nmi
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      sys_faccessat
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246419315-9968-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Fix storage size allocation of callchain list · 9198aa77
      Frederic Weisbecker committed
      Fix a mix-up in the size passed when allocating a callchain
      list: we were using the wrong structure size.
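The message does not say which structures were confused, so the following only illustrates the general class of bug and a common way to avoid it: sizing the allocation from the pointer being assigned rather than from a type name. The struct layouts and alloc_chain_entry() are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structures; the real layouts may differ. */
struct callchain_list {
	uint64_t ip;
	struct callchain_list *next;
};

struct callchain_node {
	struct callchain_list *entries;
	unsigned int nr_entries;
	unsigned long hit;
};

/*
 * Allocate one list entry. Writing sizeof(*entry) ties the size to the
 * variable's own type, so it cannot silently drift to another struct
 * such as struct callchain_node.
 */
static struct callchain_list *alloc_chain_entry(uint64_t ip)
{
	struct callchain_list *entry = malloc(sizeof(*entry));

	if (!entry)
		return NULL;

	entry->ip = ip;
	entry->next = NULL;
	return entry;
}
```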
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246419315-9968-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 26 Jun, 2009 1 commit
    • perf_counter tools: Prepare a small callchain framework · 8cb76d99
      Frederic Weisbecker committed
      We plan to display the callchains depending on some user-configurable
      parameters.
      
      To gather the callchain stats from the recorded stream quickly,
      this patch introduces an ad hoc radix tree adapted for callchains and also
      an rbtree to sort these callchains once we have gathered every event
      from the stream.
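The gather-then-sort idea can be sketched in a simplified form. Note the substitutions: this uses a plain prefix tree with one ip per node rather than a compressed radix tree, and qsort() over an array instead of an rbtree. All names here are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per ip; chains sharing a prefix share the same nodes. */
struct chain_node {
	uint64_t ip;
	unsigned long hits;		/* chains ending exactly at this node */
	struct chain_node *child;	/* first child */
	struct chain_node *sibling;	/* next node under the same parent */
};

static struct chain_node *find_or_add_child(struct chain_node *parent,
					    uint64_t ip)
{
	struct chain_node *n;

	for (n = parent->child; n; n = n->sibling)
		if (n->ip == ip)
			return n;

	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	n->ip = ip;
	n->sibling = parent->child;
	parent->child = n;
	return n;
}

/* Gather phase: walk or extend the tree along one recorded chain. */
static int chain_insert(struct chain_node *root, const uint64_t *ips,
			size_t nr)
{
	struct chain_node *cur = root;
	size_t i;

	for (i = 0; i < nr; i++) {
		cur = find_or_add_child(cur, ips[i]);
		if (!cur)
			return -1;
	}
	cur->hits++;
	return 0;
}

/* Sort phase, step 1: collect every node where at least one chain ended. */
static size_t chain_collect(struct chain_node *n, struct chain_node **out,
			    size_t max, size_t used)
{
	for (; n; n = n->sibling) {
		if (n->hits && used < max)
			out[used++] = n;
		used = chain_collect(n->child, out, max, used);
	}
	return used;
}

/* Sort phase, step 2: comparator ordering by hit count, highest first. */
static int by_hits_desc(const void *a, const void *b)
{
	const struct chain_node *x = *(struct chain_node *const *)a;
	const struct chain_node *y = *(struct chain_node *const *)b;

	return (x->hits < y->hits) - (x->hits > y->hits);
}
```

A caller would start from a zeroed root node, call chain_insert() for each event, then pass root.child to chain_collect() and the resulting array to qsort() with by_hits_desc.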
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>