1. 03 Jul, 2009 3 commits
    • F
      perf report: Add support for callchain graph output · 4eb3e478
      Frederic Weisbecker committed
      Currently, callchains are printed on a single vertical
      level. This is the "flat" mode:
      
      8.25%  [k] copy_user_generic_string
                   4.19%
                      copy_user_generic_string
                      generic_file_aio_read
                      do_sync_read
                      vfs_read
                      sys_pread64
                      system_call_fastpath
                      pread64
      
      This patch introduces a new "graph" mode which provides a
      hierarchical output of factorized paths recursively sorted:
      
       8.25%  [k] copy_user_generic_string
                      |
                      |--4.31%-- generic_file_aio_read
                      |          do_sync_read
                      |          vfs_read
                      |          |
                      |          |--4.19%-- sys_pread64
                      |          |          system_call_fastpath
                      |          |          pread64
                      |          |
                      |           --0.12%-- sys_read
                      |                     system_call_fastpath
                      |                     __read
                      |
                      |--3.24%-- generic_file_buffered_write
                      |          __generic_file_aio_write_nolock
                      |          generic_file_aio_write
                      |          do_sync_write
                      |          reiserfs_file_write
                      |          vfs_write
                      |          |
                      |          |--3.14%-- sys_pwrite64
                      |          |          system_call_fastpath
                      |          |          __pwrite64
                      |          |
                      |           --0.10%-- sys_write
      [...]
      
      The command line changes accordingly.
      
      When the -c option is given without a value, the callchain
      is output in flat mode by default.
      
      But you can override it:
      
          perf report -c graph
      
      or
      
          perf report -c flat
      
      You can also pass an abbreviated mode:
      
          perf report -c g
      
      or
      
          perf report -c gra
      
      will both make use of the graph mode.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
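The abbreviated forms described in this commit message ("g", "gra") suggest prefix matching on the mode name. The following is a minimal sketch of that idea, not the code from the patch: the enum and the function name `parse_chain_mode` are hypothetical, and a NULL argument stands for "-c given without a value".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from the perf sources. */
enum chain_mode { CHAIN_FLAT, CHAIN_GRAPH, CHAIN_INVALID };

/*
 * Map the optional argument of -c to an output mode.
 * NULL means -c was given without a value, so flat mode is used.
 * Otherwise the argument must be a non-empty prefix of a mode name.
 */
enum chain_mode parse_chain_mode(const char *arg)
{
	size_t len;

	if (arg == NULL)
		return CHAIN_FLAT;

	len = strlen(arg);
	if (len == 0)
		return CHAIN_INVALID;

	/* Comparing only strlen(arg) bytes accepts "g", "gra", "graph". */
	if (strncmp(arg, "graph", len) == 0)
		return CHAIN_GRAPH;
	if (strncmp(arg, "flat", len) == 0)
		return CHAIN_FLAT;

	return CHAIN_INVALID;
}
```

Because the comparison length comes from the user's argument, an over-long string such as "graphs" is rejected: the sixth byte is compared against the terminating NUL of "graph".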
    • F
      perf_counter tools: Add new OPT_CALLBACK_DEFAULT option · 5a4b1817
      Frederic Weisbecker committed
      There is no predefined macro to create an option that can have
      a custom value or a default one if none is given.
      
      This patch provides a new helper, OPT_CALLBACK_DEFAULT(), which
      defines this kind of option.
      
      For example, considering an option -c, we want to get the
      default value in the following cases:
      
          perf command -c -d
          perf command -d -c
      
      And the foo value when it's given:
      
          perf command -c foo -d
          perf command -d -c foo
      
      That's also why PARSE_OPT_LASTARG_DEFAULT is extended here to
      support default values regardless of the option's position, not
      only at the end.
      
      Should it now be renamed to PARSE_OPT_ARG_DEFAULT?
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: git@vger.kernel.org
      LKML-Reference: <1246550301-8954-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
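The four command lines in this commit message can be modelled with a small stand-alone helper. This is only a sketch of the behaviour described, not the parse-options implementation: the function `opt_value_or_default` is hypothetical, and it assumes that any word starting with '-' is another option and not a value.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of parse-options.
 * Scans argv for `opt` and returns:
 *   NULL      if the option is absent,
 *   the next  word if it does not look like another option,
 *   `defval`  otherwise (option is last, or followed by another option).
 */
const char *opt_value_or_default(int argc, const char **argv,
				 const char *opt, const char *defval)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], opt) != 0)
			continue;
		/* Assumption: a leading '-' marks the next option. */
		if (i + 1 < argc && argv[i + 1][0] != '-')
			return argv[i + 1];
		return defval;
	}
	return NULL;
}
```

With a default of "flat", both `-c -d` and `-d -c` yield "flat", while `-c foo -d` and `-d -c foo` yield "foo", matching the cases listed in the message.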
    • F
      perf_counter tools: Create new chain_for_each_child() iterator · 14f4654c
      Frederic Weisbecker committed
      Iterating through the children of a node in the callchain tree
      shows something that may be quite confusing at first glance.
      The head is the children field of the parent and the list nodes
      are in the brothers field of the children.
      
      This is because the children are linked to the parent as a list
      of "brothers" using the "children" list of the parent as a
      head:
      
        ---------------
       | Parent (head) |-------------------------------------
        ---------------                                      |
           |                                                 |
        children                                             |
           |                                                 |
        -----------               -----------                |
       | 1st child |---brother---| 2nd child |---brother-----
        -----------               -----------
      
      This makes the following strange pattern occur often:
      
       list_for_each_entry(child, &parent->children, brothers) {
              // do something with children
       }
      
      Abstract it to chain_for_each_child() to factorize and simplify
      this pattern.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246550301-8954-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
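A self-contained sketch of how such an iterator can wrap list_for_each_entry() is shown below. The list helpers are minimal stand-ins written for this example rather than the kernel's list.h, the node structure is reduced to the two list fields named in the message plus a hypothetical `hits` payload, and `node_init`, `node_add_child` and `sum_child_hits` are illustrative names only.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel's list.h helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static inline void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Relies on the GNU __typeof__ extension (GCC and Clang). */
#define list_for_each_entry(pos, head, member)				\
	for (pos = list_entry((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

/* Simplified node; field names follow the commit message. */
struct callchain_node {
	struct list_head brothers;	/* link in the parent's child list */
	struct list_head children;	/* head of this node's child list */
	unsigned long hits;		/* hypothetical payload */
};

/* The parent's "children" head is walked through each child's "brothers". */
#define chain_for_each_child(child, parent) \
	list_for_each_entry(child, &(parent)->children, brothers)

static inline void node_init(struct callchain_node *node, unsigned long hits)
{
	INIT_LIST_HEAD(&node->brothers);
	INIT_LIST_HEAD(&node->children);
	node->hits = hits;
}

static inline void node_add_child(struct callchain_node *parent,
				  struct callchain_node *child)
{
	list_add_tail(&child->brothers, &parent->children);
}

/* Example user: add up the hits of all direct children. */
unsigned long sum_child_hits(struct callchain_node *parent)
{
	struct callchain_node *child;
	unsigned long total = 0;

	chain_for_each_child(child, parent)
		total += child->hits;
	return total;
}
```

The caller no longer has to remember which field is the head and which is the link; the macro hides the children/brothers pairing in one place.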
  2. 02 Jul, 2009 8 commits
    • M
      perf_counter tools: Enable kernel module symbol loading in tools · 42976487
      Mike Galbraith committed
      Add the -m/--modules option to perf report and perf annotate,
      which enables live module symbol/image loading. To be used
      with -k/--vmlinux.
      
      (Also give perf annotate a -P/--full-paths option.)
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246514986.13293.48.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • M
      perf_counter tools: Connect module support infrastructure to symbol loading infrastructure · 6cfcc53e
      Mike Galbraith committed
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246514916.13293.46.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • M
      perf_counter tools: Add infrastructure to support loading of kernel module symbols · 208b4b4a
      Mike Galbraith committed
      Add infrastructure for module path discovery and section load addresses.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246514830.13293.44.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • M
      perf_counter tools: Make symbol loading consistently return number of loaded symbols · 9974f496
      Mike Galbraith committed
      perf_counter tools: Make symbol loading consistently return number of loaded symbols.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246514758.13293.42.camel@marge.simson.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • F
      perf stat: Handle pipe read failures in perf stat · a92bef0f
      Frederic Weisbecker committed
      Building builtin-stat.c reports the following errors:
      
      cc1: warnings being treated as errors
      builtin-stat.c: In function ‘run_perf_stat’:
      builtin-stat.c:242: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
      builtin-stat.c:255: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result
      make: *** [builtin-stat.o] Erreur 1
      
      This patch handles the possible pipe read failures.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
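The warning quoted in this commit message is triggered whenever the result of read() is discarded. The patch itself is not shown here, so the following is only a sketch of one way to check a pipe read: the helper `wait_for_token` and its one-byte protocol are assumptions made for illustration.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for a one-byte token on a pipe.
 * Returns 0 if exactly one byte was read, -1 if read() failed
 * or the writer closed the pipe without sending anything.
 */
int wait_for_token(int fd)
{
	char token;
	ssize_t ret;

	/* Retry if a signal interrupted the call before any data arrived. */
	do {
		ret = read(fd, &token, 1);
	} while (ret == -1 && errno == EINTR);

	return ret == 1 ? 0 : -1;
}
```

Using the return value both silences warn_unused_result and lets the caller distinguish a delivered token from an error or an end-of-file on the pipe.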
    • F
      perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d
      Frederic Weisbecker committed
      Almost every callchain recorded with perf record includes the
      internal perfcounter nmi frames:
      
       perf_callchain
       perf_counter_overflow
       intel_pmu_handle_irq
       perf_counter_nmi_handler
       notifier_call_chain
       atomic_notifier_call_chain
       notify_die
       do_nmi
       nmi
      
      We want to ignore these frames, as they are not interesting for
      instrumentation. To do so, we simply ignore every frame
      from nmi context.
      
      New example of "perf report -s sym -c" after this patch:
      
      9.59%  [k] search_by_key
                   4.88%
                      search_by_key
                      reiserfs_read_locked_inode
                      reiserfs_iget
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      
                   3.19%
                      search_by_key
                      search_by_entry_key
                      reiserfs_find_entry
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      [...]
      
      For now, this patch only solves the problem on x86-64.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      perf_counter tools: Share list.h with the kernel · 5da50258
      Arnaldo Carvalho de Melo committed
      The copy we were using came from another copy I made for the dwarves
      (pahole) package, which came from the kernel years ago.
      
      The only function that is used by the perf tools and that isn't in the
      kernel is list_del_range, which I'm leaving in the perf tools only for
      now.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090701174608.GA5823@ghostprotocols.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      perf_counter tools: Share rbtree with the kernel · 43cbcd8a
      Arnaldo Carvalho de Melo committed
      The tools/perf/util/rbtree.c copy already drifted by three
      csets:
      
       4b324126
       4c601178
       16c047ad
      
      So remove the copy and use lib/rbtree.c directly, sharing
      the source code while still generating a separate object file,
      since tools/perf uses a far more aggressive -O6 switch.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20090701152837.GG15682@ghostprotocols.net>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 01 Jul, 2009 29 commits