1. 25 7月, 2012 1 次提交
  2. 31 5月, 2012 1 次提交
  3. 18 5月, 2012 1 次提交
    • J
      perf hists: Fix callchain ip printf format · a0187060
      Jiri Olsa 提交于
      The callchain address is stored as u64. Current code uses following
      format string to display callchain address:
      
        "%p\n", (void *)(long)chain->ip
      
      This way we lose upper 32 bits if we report 64 bit addresses in 32 bit
      environment. Fixing this to always display whole 64 bits.
      
      Note, running following to test perf endianity handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 2)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a0187060
  4. 06 4月, 2012 1 次提交
    • D
      perf hists: Catch and handle out-of-date hist entry maps. · 63fa471d
      David Miller 提交于
      When a process exec()'s, all the maps are retired, but we keep the hist
      entries around which hold references to those outdated maps.
      
      If the same library gets mapped in for which we have hist entries, a new
      map will be created.  But when we take a perf entry hit within that map,
      we'll find the existing hist entry with the older map.
      
      This causes symbol translations to be done incorrectly.  For example,
      the perf entry processing will lookup the correct uptodate map entry and
      use that to calculate the symbol and DSO relative address.  But later
      when we update the histogram we'll translate the address using the
      outdated map file instead leading to conditions such as out-of-range
      offsets in symbol__inc_addr_samples().
      
      Therefore, update the map of the hist_entry dynamically at lookup/
      creation time.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/20120327.031418.1220315351537060808.davem@davemloft.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      63fa471d
  5. 27 3月, 2012 1 次提交
    • F
      perf tools: Fix display of first level of callchains · 6d4818c5
      Frederic Weisbecker 提交于
      The callchain stdio mode display was written using a sorted by symbol
      report. In this mode we have only one callchain root per hist so we
      forgot to handle cases where we have multiple callchain root, as in per
      dso sorting for example.
      
      Fix this by handling these roots like any other branch, with the hist as
      the parent.
      
      Before:
      
           1.97%  libpthread-2.12.1.so
                  |
                  --- __libc_write
                      create_worker
                      bench_sched_messaging
                      cmd_bench
                      run_builtin
                      main
                      __libc_start_main
      
                  |
                  --- __libc_read
                      create_worker
                      bench_sched_messaging
                      cmd_bench
                      run_builtin
                      main
                      __libc_start_main
      
      After:
      
           1.97%  libpthread-2.12.1.so
                  |
                  |--36.97%-- __libc_write
                  |          create_worker
                  |          bench_sched_messaging
                  |          cmd_bench
                  |          run_builtin
                  |          main
                  |          __libc_start_main
                  |
                  |--31.47%-- __libc_read
                  |          create_worker
                  |          bench_sched_messaging
                  |          cmd_bench
                  |          run_builtin
                  |          main
                  |          __libc_start_main
                 ...
      
      Single roots keep their entry without percentage because they have
      the same overhead than the hist they refer to. ie: 100% in fractal
      mode and the percentage of the hist in graph mode:
      
           0.00%  [k] reschedule_interrupt
                  |
                  --- default_idle
                      amd_e400_idle
                      cpu_idle
                      start_secondary
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1332526010-15400-1-git-send-email-fweisbec@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6d4818c5
  6. 23 3月, 2012 1 次提交
  7. 17 3月, 2012 1 次提交
  8. 14 3月, 2012 1 次提交
  9. 09 3月, 2012 1 次提交
  10. 08 1月, 2012 1 次提交
    • N
      perf report: Fix --stdio output alignment when --showcpuutilization used · 0ed35abc
      Namhyung Kim 提交于
      Current perf report output is broken if --showcpuutilization is used.
      Combination with -n and/or --show-total-period make things worse.
      This patch fixes it as follows:
      
      before:
          48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
          34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
          15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
           0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
      after:
          48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
          34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
          15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
           0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1325957132-10600-8-git-send-email-namhyung@gmail.comSigned-off-by: NNamhyung Kim <namhyung@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0ed35abc
  11. 07 1月, 2012 2 次提交
  12. 16 11月, 2011 1 次提交
    • A
      perf python: Fix undefined symbol problem · 0e2a5f10
      Arnaldo Carvalho de Melo 提交于
      Recently we made perf_evsel__init call hists__init, which broke the perf
      python binding:
      
      [root@emilia linux]# ./tools/perf/python/twatch.py
      Traceback (most recent call last):
        File "./tools/perf/python/twatch.py", line 16, in <module>
          import perf
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: hists__init
      
      Fix it by moving the hists__init function to its only caller, evsel.c.
      
      This way we avoid dragging in other parts of tools/perf/util/ to the
      perf python binding.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-5nffmdt5mu6ozxgj54oi4qon@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0e2a5f10
  13. 27 10月, 2011 1 次提交
  14. 20 10月, 2011 2 次提交
  15. 19 10月, 2011 3 次提交
  16. 17 10月, 2011 1 次提交
  17. 13 10月, 2011 2 次提交
  18. 08 10月, 2011 2 次提交
    • S
      perf tools: Fix broken number of samples for perf report -n · e39622ce
      Stephane Eranian 提交于
      The perf report -n option was broken because it was not reporting the
      correct number of samples depending on the sorting mode. By default,
      samples are sorted by comm,dso,sym. That means that samples for the same
      command (binary) get collapsed.
      
      The hists__collapse_insert_entry() had a bug whereby it was aggregating
      the number of events observed (periods) but not the number of samples.
      Consequently, the number of samples reported could be below reality. The
      percentage remained correct because based on the periods.
      
      This patch fixes the problem by also aggregating the number of samples.
      Here is an example:
      
      $ perf report -n --stdio
          12.38%        842     pong  [kernel.kallsyms]     [k] __lock_acquire
      
      Here pong (a ctxsw stress test), is the only program running
      and thus it is the only one responsible for the lock_acquire samples.
      
      If we change the sorting mode:
      
      $ perf report -n --stdio --sort=sym
          12.38%       1732  [k] __lock_acquire
      
      The actual number of samples is shown.
      
      With the fix:
      
      $ perf report -n --stdio
          12.38%       1732     pong  [kernel.kallsyms]     [k] __lock_acquire
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20111003093815.GA6393@quadSigned-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e39622ce
    • A
      perf top: Reuse the 'report' hist_entry/hists classes · ab81f3fd
      Arnaldo Carvalho de Melo 提交于
      This actually fixes several problems we had in the old 'perf top':
      
      1. Unresolved symbols not show, limitation that came from the old
         "KernelTop" codebase, to solve it we would need to do changes
         that would make sym_entry have most of the hist_entry fields.
      2. It was using the number of samples, not the sum of sample->period.
      
      And brings the --sort code that allows us to have all the views in
      'perf report', for instance:
      
      [root@emilia ~]# perf top --sort dso
      PerfTop: 5903 irqs/sec kernel:77.5% exact: 0.0% [1000Hz cycles], (all, 8 CPUs)
      ------------------------------------------------------------------------------
      
          31.59%  libcrypto.so.1.0.0
          21.55%  [kernel]
          18.57%  libpython2.6.so.1.0
           7.04%  libc-2.12.so
           6.99%  _backend_agg.so
           4.72%  sshd
           1.48%  multiarray.so
           1.39%  libfreetype.so.6.3.22
           1.37%  perf
           0.71%  libgobject-2.0.so.0.2200.5
           0.53%  [tg3]
           0.48%  libglib-2.0.so.0.2200.5
           0.44%  libstdc++.so.6.0.13
           0.40%  libcairo.so.2.10800.8
           0.38%  libm-2.12.so
           0.34%  umath.so
           0.30%  libgdk-x11-2.0.so.0.1800.9
           0.22%  libpthread-2.12.so
           0.20%  libgtk-x11-2.0.so.0.1800.9
           0.20%  librt-2.12.so
           0.15%  _path.so
           0.13%  libpango-1.0.so.0.2800.1
           0.11%  libatlas.so.3.0
           0.09%  ft2font.so
           0.09%  libpangoft2-1.0.so.0.2800.1
           0.08%  libX11.so.6.3.0
           0.07%  [vdso]
           0.06%  cyclictest
      ^C
      
      All the filter lists can be used as well: --dsos, --comms, --symbols,
      etc.
      
      The 'perf report' TUI is also reused, being possible to apply all the
      zoom operations, do annotation, etc.
      
      This change will allow multiple simplifications in the symbol system as
      well, that will be detailed in upcoming changesets.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-xzaaldxq7zhqrrxdxjifk1mh@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ab81f3fd
  19. 07 10月, 2011 4 次提交
  20. 30 6月, 2011 2 次提交
    • F
      perf tools: Don't display ignored entries on stdio ui · e84d2122
      Frederic Weisbecker 提交于
      As for newt ui, don't display entries that have been marked
      as ignored.
      
      The practical current effect of this is to make parent
      filtering really working. Before, entries that were ignored
      were given a null parent but were still displayed. This
      resulted in some weird effects:
      
       # Overhead      Command      Shared Object        Symbol
       # ........  ...........  .................  ............
       #
      ^A
                         |
                         --- __lock_acquire
                            |
                            |--95.97%-- lock_acquire
                            |          |
                            |          |--30.75%-- _raw_spin_lock
      
      Discard these from the stdio display.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Sam Liao <phyomh@gmail.com>
      e84d2122
    • S
      perf tools: Add inverted call graph report support. · d797fdc5
      Sam Liao 提交于
      Add "caller/callee" option to support inverted butterfly report,
      in the inverted report (with caller option), the call graph start
      from the callee's ancestor. Users can use such view to catch system's
      performance bottleneck from a sysprof like view. Using this option
      with specified sort order like pid gives us high level view of call
      graph statistics.
      
      Also add "-G" alias for inverted call graph.
      Signed-off-by: NSam Liao <phyomh@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      d797fdc5
  21. 07 3月, 2011 1 次提交
    • A
      perf tools: Improve support for sessions with multiple events · e248de33
      Arnaldo Carvalho de Melo 提交于
      By creating an perf_evlist out of the attributes in the perf.data file
      header, so that we can use evlists and evsels when reading recorded
      sessions in addition to when we record sessions.
      
      More work is needed to allow tools to allow the user to select which
      events are wanted when browsing sessions, be it just one or a subset of
      them, aggregated or showed at the same time but with different
      indications on the UI to allow seeing workloads thru different views at
      the same time.
      
      But the overall goal/trend is to more uniformly use evsels and evlists.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e248de33
  22. 06 3月, 2011 1 次提交
    • A
      perf hists: Remove needless global col lenght calcs · d7603d51
      Arnaldo Carvalho de Melo 提交于
      To support multiple events we need to do these calcs per 'struct hists'
      instance, and it turns out we already do that at:
      
      	__hists__add_entry
      		hists__inc_nr_entries
      			hists__calc_col_len
      
      for all the unfiltered hist_entry instances we stash in the rb tree, so
      trow away the dead code.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d7603d51
  23. 25 2月, 2011 1 次提交
    • A
      perf hists: Print number of samples, not the period sum · 69cf0218
      Arnaldo Carvalho de Melo 提交于
      So that we match the header where we state the number of events with the
      "Samples" column when using 'perf report -n/--show-nr-samples':
      
       [root@emilia ~]# perf record -a sleep 1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ]
       [root@emilia ~]# perf report --stdio --show-nr-samples
       # Events: 11  cycles
       #
       # Overhead  Samples        Command       Shared Object                        Symbol
       # ........ ..........  ...........  ..................  ............................
       #
           16.65%          1        sleep  [kernel.kallsyms]   [k] unmap_vmas
           16.10%          1         perf  libpthread-2.12.so  [.] __pthread_cleanup_push_defer
           15.79%          2         perf  [kernel.kallsyms]   [k] format_decode
           12.88%          1  kworker/1:2  [kernel.kallsyms]   [k] cache_reap
           10.69%          1      swapper  [kernel.kallsyms]   [k] _raw_spin_lock
            7.55%          1        sleep  [kernel.kallsyms]   [k] prepare_exec_creds
            6.00%          1         perf  [jbd2]              [k] start_this_handle
            5.29%          1         perf  [kernel.kallsyms]   [k] seq_read
            4.75%          1         perf  [kernel.kallsyms]   [k] get_pid_task
            4.30%          1         perf  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [root@emilia ~]#
      Reported-by: NStephane Eranian <eranian@google.com>
      Reported-by: NCliff Wickman <cpw@sgi.com>
      Acked-by: NStephane Eranian <eranian@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      [ cherry-picked it from perf/core, as it has been reported by others as well. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      69cf0218
  24. 17 2月, 2011 1 次提交
    • A
      perf hists: Print number of samples, not the period sum · fec9cbd1
      Arnaldo Carvalho de Melo 提交于
      So that we match the header where we state the number of events with the
      "Samples" column when using 'perf report -n/--show-nr-samples':
      
       [root@emilia ~]# perf record -a sleep 1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ]
       [root@emilia ~]# perf report --stdio --show-nr-samples
       # Events: 11  cycles
       #
       # Overhead  Samples        Command       Shared Object                        Symbol
       # ........ ..........  ...........  ..................  ............................
       #
           16.65%          1        sleep  [kernel.kallsyms]   [k] unmap_vmas
           16.10%          1         perf  libpthread-2.12.so  [.] __pthread_cleanup_push_defer
           15.79%          2         perf  [kernel.kallsyms]   [k] format_decode
           12.88%          1  kworker/1:2  [kernel.kallsyms]   [k] cache_reap
           10.69%          1      swapper  [kernel.kallsyms]   [k] _raw_spin_lock
            7.55%          1        sleep  [kernel.kallsyms]   [k] prepare_exec_creds
            6.00%          1         perf  [jbd2]              [k] start_this_handle
            5.29%          1         perf  [kernel.kallsyms]   [k] seq_read
            4.75%          1         perf  [kernel.kallsyms]   [k] get_pid_task
            4.30%          1         perf  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore
      
       #
       # (For a higher level overview, try: perf report --sort comm,dso)
       #
       [root@emilia ~]#
      Reported-by: NStephane Eranian <eranian@google.com>
      Acked-by: NStephane Eranian <eranian@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fec9cbd1
  25. 09 2月, 2011 1 次提交
    • A
      perf annotate: Move locking to struct annotation · ce6f4fab
      Arnaldo Carvalho de Melo 提交于
      Since we'll need it when implementing the live annotate TUI browser.
      
      This also simplifies things a bit by having the list head for the source
      code to be in the dynamicly allocated part of struct annotation, that
      way we don't have to pass it around, it can be found from the struct
      symbol that is passed everywhere.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ce6f4fab
  26. 05 2月, 2011 2 次提交
    • A
      perf annotate: Support multiple histograms in annotation · 2f525d01
      Arnaldo Carvalho de Melo 提交于
      The perf annotate tool continues aggregating everything on just one
      histograms, but to support the top model add support for one histogram
      perf evsel in the evlist.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2f525d01
    • A
      perf annotate: Move annotate functions to util/ · 78f7defe
      Arnaldo Carvalho de Melo 提交于
      They will be used by perf top, so that we have just one set of routines
      to do annotation.
      
      Rename "struct sym_priv" to "struct annotation", etc, to clarify this
      code a bit.
      
      Rename "struct sym_ext" to "struct source_line", to give it a meaningful
      name, that clarifies that it is a the result of an addr2line call, that
      is sorted by percentage one particular source code line appeared in the
      annotation.
      
      And since we're moving things around also rename 'sym_hist->ip' to
      'sym_hist->addr' as we want to do data structure annotation at some
      point.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      78f7defe
  27. 30 1月, 2011 1 次提交
  28. 23 1月, 2011 2 次提交
    • F
      perf callchain: Rename cumul_hits into callchain_cumul_hits · f08c3154
      Frederic Weisbecker 提交于
      That makes the callchain API naming more consistent and
      reduce potential naming clashes.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1294977121-5700-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f08c3154
    • F
      perf callchain: Feed callchains into a cursor · 1b3a0e95
      Frederic Weisbecker 提交于
      The callchains are fed with an array of a fixed size.
      As a result we iterate over each callchains three times:
      
      - 1st to resolve symbols
      - 2nd to filter out context boundaries
      - 3rd for the insertion into the tree
      
      This also involves some pairs of memory allocation/deallocation
      everytime we insert a callchain, for the filtered out array of
      addresses and for the array of symbols that comes along.
      
      Instead, feed the callchains through a linked list with persistent
      allocations. It brings several pros like:
      
      - Merge the 1st and 2nd iterations in one. That was possible before
      but in a way that would involve allocating an array slightly taller
      than necessary because we don't know in advance the number of context
      boundaries to filter out.
      
      - Much lesser allocations/deallocations. The linked list keeps
      persistent empty entries for the next usages and is extendable at
      will.
      
      - Makes it easier for multiple sources of callchains to feed a
      stacktrace together. This is deemed to pave the way for cfi based
      callchains wherein traditional frame pointer based kernel
      stacktraces will precede cfi based user ones, producing an overall
      callchain which size is hardly predictable. This requirement
      makes the static array obsolete and makes a linked list based
      iterator a much more flexible fit.
      
      Basic testing on a big perf file containing callchains (~ 176 MB)
      has shown a throughput gain of about 11% with perf report.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1b3a0e95