1. 28 6月, 2012 1 次提交
  2. 20 6月, 2012 2 次提交
  3. 31 5月, 2012 1 次提交
  4. 30 5月, 2012 1 次提交
    • A
      perf report: Use the right symbol for annotation · 91557e84
      Arnaldo Carvalho de Melo 提交于
      In non symbolic views, i.e. --sort without "symbol", as in:
      
       perf report --sort comm
      
      We're segfaulting in the --tui because we're testing the symbol resolved
      and then trying to use the symbol on the histogram entry where we're
      coalescing all hits for a COMM, and the first hist_entry for a comm may
      have a NULL symbol, i.e. the RIP didn't resolve to any symbol.
      
      In this case we're segfaulting, fix it by testing against the symbol in
      the histogram entry.
      Reported-by: NWilliam Cohen <wcohen@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-8ylwubbcmu27ucc9ffrku3yv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      91557e84
  5. 29 5月, 2012 1 次提交
  6. 09 5月, 2012 1 次提交
  7. 03 5月, 2012 3 次提交
  8. 16 4月, 2012 1 次提交
  9. 08 4月, 2012 1 次提交
  10. 20 3月, 2012 1 次提交
    • P
      perf report: Add a simple GTK2-based 'perf report' browser · c31a9457
      Pekka Enberg 提交于
      This patch adds a simple GTK2-based browser to 'perf report' that's
      based on the TTY-based browser in builtin-report.c.
      
      To launch "perf report" using the new GTK interface just type:
      
        $ perf report --gtk
      
      The interface is somewhat limited in features at the moment:
      
        - No callgraph support
      
        - No KVM guest profiling support
      
        - No color coding for percentages
      
        - No sorting from the UI
      
        - ..and many, many more!
      
      That said, I think this patch a reasonable start to build future features on.
      Signed-off-by: NPekka Enberg <penberg@kernel.org>
      Cc: Colin Walters <walters@verbum.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202231952410.6689@tux.localdomain
      [ committer note: Added #pragma to make gtk no strict prototype problem go
        away as suggested by Colin Walters modulo avoiding push/pop ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c31a9457
  11. 17 3月, 2012 2 次提交
  12. 09 3月, 2012 3 次提交
    • S
      perf report: Enable TUI in branch view mode · a68c2c58
      Stephane Eranian 提交于
      This patch updates perf report to support TUI mode
      when the perf.data file contains samples with branch
      stacks.
      
      For each row in the report, it is possible to annotate
      either the source or target of each branch.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1331246868-19905-5-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      a68c2c58
    • S
      perf report: Auto-detect branch stack sampling mode · 993ac88d
      Stephane Eranian 提交于
      This patch enhances perf report to auto-detect when the
      perf.data file contains samples with branch stacks. That way it
      is not necessary to use the -b option.
      
      To force branch view mode to off, simply use --no-branch-stack.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: asharma@fb.com
      Cc: ravitillo@lbl.gov
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      993ac88d
    • R
      perf report: Add support for taken branch sampling · b50311dc
      Roberto Agostino Vitillo 提交于
      This patch adds support for taken branch sampling, i.e, the
      PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
      words, to display histograms based on taken branches rather
      than executed instructions addresses.
      
      The new option is called -b and it takes no argument. To
      generate meaningful output, the perf.data must have been
      obtained using perf record -b xxx ... where xxx is a branch
      filter option.
      
      The output shows symbols, modules, sorted by 'who branches
      where' the most often. The percentages reported in the first
      column refer to the total number of branches captured and
      not the usual number of samples.
      
      Here is a quick example.
      Here branchy is simple test program which looks as follows:
      
      void f2(void)
      {}
      void f3(void)
      {}
      void f1(unsigned long n)
      {
        if (n & 1UL)
          f2();
        else
          f3();
      }
      int main(void)
      {
        unsigned long i;
      
        for (i=0; i < N; i++)
         f1(i);
        return 0;
      }
      
      Here is the output captured on Nehalem, if we are
      only interested in user level function calls.
      
      $ perf record -b any_call,u -e cycles:u branchy
      
      $ perf report -b --sort=symbol
          52.34%  [.] main                   [.] f1
          24.04%  [.] f1                     [.] f3
          23.60%  [.] f1                     [.] f2
           0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
           0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
           0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
           0.01%  [k] __printf               [k] _IO_vfprintf_internal
           0.01%  [k] main                   [k] __printf
      
      About half (52%) of the call branches captured are from main()
      -> f1(). The second half (24%+23%) is split in two equal shares
      between f1() -> f2(), f1() ->f3(). The output is as expected
      given the code.
      
      It should be noted, that using -b in perf record does not
      eliminate information in the perf.data file. Consequently, a
      typical profile can also be obtained by perf report by simply
      not using its -b option.
      
      It is possible to sort on branch related columns:
      
         - dso_from, symbol_from
         - dso_to, symbol_to
         - mispredict
      Signed-off-by: NRoberto Agostino Vitillo <ravitillo@lbl.gov>
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: peterz@infradead.org
      Cc: acme@redhat.com
      Cc: robert.richter@amd.com
      Cc: ming.m.lin@intel.com
      Cc: andi@firstfloor.org
      Cc: asharma@fb.com
      Cc: vweaver1@eecs.utk.edu
      Cc: khandual@linux.vnet.ibm.com
      Cc: dsahern@gmail.com
      Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b50311dc
  13. 24 12月, 2011 1 次提交
    • R
      perf report: Accept fifos as input file · efad1415
      Robert Richter 提交于
      The default input file for perf report is not handled the same way as
      perf record does it for its output file. This leads to unexpected
      behavior of perf report, etc. E.g.:
      
       # perf record -a -e cpu-cycles sleep 2 | perf report | cat
       failed to open perf.data: No such file or directory  (try 'perf record' first)
      
      While perf record writes to a fifo, perf report expects perf.data to be
      read. This patch changes this to accept fifos as input file.
      
      Applies to the following commands:
      
       perf annotate
       perf buildid-list
       perf evlist
       perf kmem
       perf lock
       perf report
       perf sched
       perf script
       perf timechart
      
      Also fixes char const* -> const char* type declaration for filename
      strings.
      
      v2:
      * Prevent potential null pointer access to input_name in
        builtin-report.c. Needed due to removal of patch "perf report: Setup
        browser if stdout is a pipe"
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.comSigned-off-by: NRobert Richter <robert.richter@amd.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      efad1415
  14. 22 12月, 2011 1 次提交
  15. 20 12月, 2011 1 次提交
  16. 28 11月, 2011 7 次提交
  17. 08 10月, 2011 2 次提交
    • A
      perf tools: Make --no-asm-raw the default · 64c6f0c7
      Arnaldo Carvalho de Melo 提交于
      And add the annotation output knobs to all the tools that have
      integrated annotation (top, report).
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-gnlob67mke6sji2kf4nstp7m@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      64c6f0c7
    • S
      perf tools: Make perf.data more self-descriptive (v8) · fbe96f29
      Stephane Eranian 提交于
      The goal of this patch is to include more information about the host
      environment into the perf.data so it is more self-descriptive. Overtime,
      profiles are captured on various machines and it becomes hard to track
      what was recorded, on what machine and when.
      
      This patch provides a way to solve this by extending the perf.data file
      with basic information about the host machine. To add those extensions,
      we leverage the feature bits capabilities of the perf.data format.  The
      change is backward compatible with existing perf.data files.
      
      We define the following useful new extensions:
       - HEADER_HOSTNAME: the hostname
       - HEADER_OSRELEASE: the kernel release number
       - HEADER_ARCH: the hw architecture
       - HEADER_CPUDESC: generic CPU description
       - HEADER_NRCPUS: number of online/avail cpus
       - HEADER_CMDLINE: perf command line
       - HEADER_VERSION: perf version
       - HEADER_TOPOLOGY: cpu topology
       - HEADER_EVENT_DESC: full event description (attrs)
       - HEADER_CPUID: easy-to-parse low level CPU identication
      
      The small granularity for the entries is to make it easier to extend
      without breaking backward compatiblity. Many entries are provided as
      ASCII strings.
      
      Perf report/script have been modified to print the basic information as
      easy-to-parse ASCII strings. Extended information about CPU and NUMA
      topology may be requested with the -I option.
      
      Thanks to David Ahern for reviewing and testing the many versions of
      this patch.
      
       $ perf report --stdio
       # ========
       # captured on : Mon Sep 26 15:22:14 2011
       # hostname : quad
       # os release : 3.1.0-rc4-tip
       # perf version : 3.1.0-rc4
       # arch : x86_64
       # nrcpus online : 4
       # nrcpus avail : 4
       # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
       # cpuid : GenuineIntel,6,15,11
       # total memory : 8105360 kB
       # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
       # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
       # HEADER_CPU_TOPOLOGY info available, use -I to display
       # HEADER_NUMA_TOPOLOGY info available, use -I to display
       # ========
       #
       ...
      
       $ perf report --stdio -I
       # ========
       # captured on : Mon Sep 26 15:22:14 2011
       # hostname : quad
       # os release : 3.1.0-rc4-tip
       # perf version : 3.1.0-rc4
       # arch : x86_64
       # nrcpus online : 4
       # nrcpus avail : 4
       # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
       # cpuid : GenuineIntel,6,15,11
       # total memory : 8105360 kB
       # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
       # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
       # sibling cores   : 0-3
       # sibling threads : 0
       # sibling threads : 1
       # sibling threads : 2
       # sibling threads : 3
       # node0 meminfo  : total = 8320608 kB, free = 7571024 kB
       # node0 cpu list : 0-3
       # ========
       #
       ...
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Tested-by: NDavid Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20110930134040.GA5575@quadSigned-off-by: NStephane Eranian <eranian@google.com>
      [ committer notes: Use --show-info in the tools as was in the docs, rename
        perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
        conflict with f69b64f7 "perf: Support setting the disassembler style" ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fbe96f29
  18. 07 10月, 2011 3 次提交
  19. 30 9月, 2011 2 次提交
  20. 03 8月, 2011 1 次提交
  21. 05 7月, 2011 1 次提交
    • A
      perf report/annotate/script: Add option to specify a CPU range · 5d67be97
      Anton Blanchard 提交于
      Add an option to perf report/annotate/script to specify which
      CPUs to operate on. This enables us to take a single system wide
      profile and analyse each CPU (or group of CPUs) in isolation.
      
      This was useful when profiling a multiprocess workload where the
      bottleneck was on one CPU but this was hidden in the overall
      profile. Per process and per thread breakdowns didn't help
      because multiple processes were running on each CPU and no
      single process consumed an entire CPU.
      
      The patch converts the list of CPUs returned by cpu_map__new
      into a bitmap for fast lookup. I wanted to use -C to be
      consistent with perf top/record/stat, but unfortunately perf
      report already uses -C <comms>.
      
       v2: Incorporate suggestions from David Ahern:
      	- Added -c to perf script
      	- Check that SAMPLE_CPU is set when -c is used
      	- Update documentation
      
       v3: Create perf_session__cpu_bitmap()
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/20110704215750.11647eb9@krytenSigned-off-by: NIngo Molnar <mingo@elte.hu>
      5d67be97
  22. 30 6月, 2011 2 次提交
    • F
      perf tools: Only display parent field if explictly sorted · cb1955b8
      Frederic Weisbecker 提交于
      We don't need to display the parent field if the parent
      sorting machinery is only used for parent filtering
      (as in "-p foo").
      
      However if parent filtering is used in combination with
      explicit parent sorting ( -s parent), we want to
      display it.
      
      Result with:
      
        perf report -p kernel_thread -s parent
      
      Before:
      
       # Overhead  Parent symbol
       # ........  .............
       #
           0.07%
                  |
                  --- ioread8
                      ata_sff_check_status
                      ata_sff_tf_load
                      ata_sff_qc_issue
                      ata_bmdma_qc_issue
                      ata_qc_issue
                      ata_scsi_translate
                      ata_scsi_queuecmd
                      scsi_dispatch_cmd
                      scsi_request_fn
                      __blk_run_queue
                      __make_request
                      generic_make_request
                      submit_bio
                      submit_bh
                      journal_submit_commit_record
                      jbd2_journal_commit_transaction
                      kjournald2
                      kthread
                      kernel_thread_helpe
      
      After:
      
       # Overhead  Parent symbol
       # ........  .............
       #
           0.07%  kernel_thread_helper
                  |
                  --- ioread8
                      ata_sff_check_status
                      ata_sff_tf_load
                      ata_sff_qc_issue
                      ata_bmdma_qc_issue
                      ata_qc_issue
                      ata_scsi_translate
                      ata_scsi_queuecmd
                      scsi_dispatch_cmd
                      scsi_request_fn
                      __blk_run_queue
                      __make_request
                      generic_make_request
                      submit_bio
                      submit_bh
                      journal_submit_commit_record
                      jbd2_journal_commit_transaction
                      kjournald2
                      kthread
                      kernel_thread_helper
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Sam Liao <phyomh@gmail.com>
      cb1955b8
    • S
      perf tools: Add inverted call graph report support. · d797fdc5
      Sam Liao 提交于
      Add "caller/callee" option to support inverted butterfly report,
      in the inverted report (with caller option), the call graph start
      from the callee's ancestor. Users can use such view to catch system's
      performance bottleneck from a sysprof like view. Using this option
      with specified sort order like pid gives us high level view of call
      graph statistics.
      
      Also add "-G" alias for inverted call graph.
      Signed-off-by: NSam Liao <phyomh@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      d797fdc5
  23. 28 5月, 2011 1 次提交