1. 15 Nov, 2012 1 commit
  2. 09 Nov, 2012 3 commits
    • perf hists: Introduce hists__link · 494d70a1
      Committed by Arnaldo Carvalho de Melo
      Given two hists, this finds the hist_entries (buckets) in the second
      hists that correspond to the same bucket in the first and links them.
      It then looks for all buckets in the second that don't have a
      counterpart in the first and creates a dummy counterpart, which is
      then linked to the entry in the second.
      
      For multiple events this is done by pairing the leader with all the
      other events in the group, so that in the end the leader has all
      the buckets in all the hists in the group, dummy or not, while the
      other hists are left untouched.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-l9l9ieozqdhn9lieokd95okw@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
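The linking step described in this commit message can be sketched as follows. This is a minimal illustration, not the perf implementation: `struct bucket`, `struct histo`, `histo_find` and `histo_link` are hypothetical names, and a string key stands in for whatever sort keys identify a bucket.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified bucket; the real hist_entry has many more fields. */
struct bucket {
	const char *key;        /* stands in for the sort key of the bucket */
	unsigned long period;
	int dummy;              /* 1 if created only to pair with another bucket */
	struct bucket *pair;    /* counterpart in the other histogram */
	struct bucket *next;
};

struct histo {
	struct bucket *head;
};

static struct bucket *histo_find(struct histo *h, const char *key)
{
	for (struct bucket *b = h->head; b; b = b->next)
		if (strcmp(b->key, key) == 0)
			return b;
	return NULL;
}

/*
 * Link every bucket of "other" to a bucket of "leader" with the same key,
 * creating a zeroed dummy bucket in "leader" when there is no counterpart.
 * Only "leader" is modified. Returns 0 on success, -1 if allocation fails.
 */
static int histo_link(struct histo *leader, struct histo *other)
{
	for (struct bucket *o = other->head; o; o = o->next) {
		struct bucket *l = histo_find(leader, o->key);

		if (!l) {
			l = calloc(1, sizeof(*l));
			if (!l)
				return -1;
			l->key = o->key;
			l->dummy = 1;
			l->next = leader->head;
			leader->head = l;
		}
		l->pair = o;
	}
	return 0;
}
```

After `histo_link(&leader, &other)` the leader holds one bucket per key seen in either histogram, and the buckets of `other` are unchanged.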
    • perf diff: Move hists__match to the hists lib · 95529be4
      Committed by Arnaldo Carvalho de Melo
      It's not 'diff' specific and will be useful for other use cases, like
      bucketizing multiple events in a single session.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-o35urjgxfxxm70aw1wa81s4w@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Start moving to support matching more than two hists · b821c732
      Committed by Arnaldo Carvalho de Melo
      We want to match more than two hists, so that we can match more than two
      perf.data files and, moreover, match hist_entries (buckets) in multiple
      events in a group.
      
      So instead of a ->pair pointer, the "baseline"/"leader" will use a
      list_head that links to the pairs, and hists__match will use it.
      
      Following that, perf_evlist__link will link the hists in its evsel
      groups.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2kbmzepoi544ygj9godseqpv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
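To illustrate why a list replaces the single ->pair pointer, here is a minimal sketch of a leader entry that can hold any number of pairs. The list helpers are a small re-implementation modeled on the kernel's list_head idea; `list_init`, `list_append`, `entry_of`, `struct paired_entry` and the other names are assumptions made for this example, not the perf API.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

/* Insert "node" just before "head", i.e. at the tail of the list. */
static void list_append(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Recover the containing structure from a pointer to its list member. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical bucket: in the leader, "pairs" is the head of the list;
 * in a paired entry, it is the node that sits on the leader's list.
 */
struct paired_entry {
	const char *key;
	struct list_head pairs;
};

static void paired_entry_init(struct paired_entry *e, const char *key)
{
	e->key = key;
	list_init(&e->pairs);
}

static void paired_entry_add(struct paired_entry *leader, struct paired_entry *pair)
{
	list_append(&pair->pairs, &leader->pairs);
}

static int paired_entry_count(const struct paired_entry *leader)
{
	int n = 0;

	for (const struct list_head *p = leader->pairs.next;
	     p != &leader->pairs; p = p->next)
		n++;
	return n;
}
```

With a single pointer the leader could reference one counterpart only; with the list, a third perf.data file or a third event in a group is just one more `paired_entry_add` call.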
  3. 08 Nov, 2012 1 commit
  4. 05 Oct, 2012 4 commits
  5. 27 Sep, 2012 1 commit
    • perf hists: Add missing period_* fields when collapsing a hist entry · 9ec60972
      Committed by Namhyung Kim
      So that perf report won't lose the CPU utilization information.
      
      For example, if there are two processes that have the same name:
      
        $ perf report --stdio --showcpuutilization -s pid
        [SNIP]
        #   Overhead       sys        us  Command:  Pid
        #   ........  ........  ........  .............
        #
              55.12%     0.01%    55.10%  noploop:28781
              44.88%     0.06%    44.83%  noploop:28782
      
      Before:
        $ perf report --stdio --showcpuutilization -s comm
        [SNIP]
        #   Overhead       sys        us
        #   ........  ........  ........
        #
             100.00%     0.06%    44.83%
      
      After:
        $ perf report --stdio --showcpuutilization -s comm
        [SNIP]
        #   Overhead       sys        us
        #   ........  ........  ........
        #
             100.00%     0.07%    99.93%
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1348645663-25303-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
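The effect of the fix can be sketched as a merge that adds every period counter, not just the total. This is an illustration under assumptions: `struct entry_stat` and `entry_stat_merge` are hypothetical, and the field names `period_sys` and `period_us` are chosen to mirror the "period_*" wording of the commit title and the sys/us columns shown above.

```c
#include <stdint.h>

/* Hypothetical subset of the counters kept for each hist entry. */
struct entry_stat {
	uint64_t period;      /* total period of the entry */
	uint64_t period_sys;  /* part spent in the kernel ("sys" column) */
	uint64_t period_us;   /* part spent in user space ("us" column) */
};

/*
 * Fold "from" into "into" when two entries collapse into one bucket.
 * Adding only "period" would leave sys/us stuck at one entry's values,
 * which is the symptom shown in the "Before" output.
 */
static void entry_stat_merge(struct entry_stat *into, const struct entry_stat *from)
{
	into->period     += from->period;
	into->period_sys += from->period_sys;
	into->period_us  += from->period_us;
}
```

Applied to the two noploop rows above, the sys and us columns add up to the 0.07% and 99.93% shown in the "After" output.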
  6. 11 Sep, 2012 1 commit
    • perf tools: Use __maybe_used for unused variables · 1d037ca1
      Committed by Irina Tirdea
      perf defines both __used and __unused macros for marking unused
      variables. The __used macro is defined as
      __attribute__((__unused__)), which contradicts the kernel definition of
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is used everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
      Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
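A minimal sketch of the single-definition approach the message describes, assuming a compiler that supports `__attribute__((unused))` (GCC or Clang). The `#ifndef` guard and the `on_event` callback are hypothetical additions for this example.

```c
/* Assumes GCC or Clang; the guard avoids clashing with an existing definition. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical callback whose signature is fixed by its caller.
 * "ctx" is not needed here, so it is marked to silence
 * unused-parameter warnings without changing the signature.
 */
static int on_event(int value, void *ctx __maybe_unused)
{
	return value * 2;
}
```

The attribute only suppresses the warning; the parameter may still be used later without any further change.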
  7. 09 Sep, 2012 1 commit
  8. 20 Aug, 2012 1 commit
  9. 25 Jul, 2012 2 commits
  10. 31 May, 2012 1 commit
  11. 18 May, 2012 1 commit
    • perf hists: Fix callchain ip printf format · a0187060
      Committed by Jiri Olsa
      The callchain address is stored as u64. The current code uses the
      following format string to display the callchain address:
      
        "%p\n", (void *)(long)chain->ip
      
      This way we lose the upper 32 bits if we report 64-bit addresses in a
      32-bit environment. Fix this to always display the whole 64 bits.
      
      Note: the following was run to test perf endianness handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 2)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
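The truncation and one portable way to avoid it can be sketched as below. This is an assumption-laden illustration, not the patch itself: `format_ip` and `truncate_like_32bit_long` are hypothetical helpers, and the `PRIx64` format is simply one standard way to print a full 64-bit value regardless of pointer width.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Print all 64 bits of the address whatever the width of long or void *.
 * Returns what snprintf returns.
 */
static int format_ip(char *buf, size_t len, uint64_t ip)
{
	return snprintf(buf, len, "%" PRIx64, ip);
}

/*
 * Simulates what the (void *)(long) cast keeps on a system where long
 * is 32 bits wide: only the lower half of the address survives.
 */
static uint64_t truncate_like_32bit_long(uint64_t ip)
{
	return (uint32_t)ip;
}
```

For an address such as 0xffffffff81000000, the simulated 32-bit cast keeps only 0x81000000, while `format_ip` writes the complete value.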
  12. 06 Apr, 2012 1 commit
    • perf hists: Catch and handle out-of-date hist entry maps. · 63fa471d
      Committed by David Miller
      When a process exec()'s, all the maps are retired, but we keep the hist
      entries around which hold references to those outdated maps.
      
      If the same library gets mapped in for which we have hist entries, a new
      map will be created.  But when we take a perf entry hit within that map,
      we'll find the existing hist entry with the older map.
      
      This causes symbol translations to be done incorrectly.  For example,
      the perf entry processing will look up the correct up-to-date map entry
      and use that to calculate the symbol and DSO relative address.  But later
      when we update the histogram we'll translate the address using the
      outdated map file instead, leading to conditions such as out-of-range
      offsets in symbol__inc_addr_samples().
      
      Therefore, update the map of the hist_entry dynamically at lookup/
      creation time.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/20120327.031418.1220315351537060808.davem@davemloft.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
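The idea of refreshing the map at lookup time can be sketched as follows. All names here (`struct mapping`, `struct hent`, `hent_lookup`) are hypothetical, and a flat array keyed by symbol name stands in for the real lookup structure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical address range a symbol was resolved against. */
struct mapping {
	unsigned long start, end;
};

/* Hypothetical, simplified histogram entry. */
struct hent {
	const char *sym;
	struct mapping *map;   /* may go stale after the process exec()s */
	unsigned long hits;
};

/*
 * Find the entry for "sym". On a hit, replace a stale map reference with
 * "cur", the map the current sample was resolved against, so that later
 * address translation uses the same map as the sample processing did.
 * Returns NULL when no entry exists yet.
 */
static struct hent *hent_lookup(struct hent *tab, size_t n,
				const char *sym, struct mapping *cur)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].sym, sym) == 0) {
			if (tab[i].map != cur)
				tab[i].map = cur;
			tab[i].hits++;
			return &tab[i];
		}
	}
	return NULL;
}
```

The same assignment would be made when a new entry is created, which covers the "lookup/creation time" wording of the message.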
  13. 27 Mar, 2012 1 commit
    • perf tools: Fix display of first level of callchains · 6d4818c5
      Committed by Frederic Weisbecker
      The callchain stdio mode display was written using a sorted-by-symbol
      report. In this mode we have only one callchain root per hist, so we
      forgot to handle cases where we have multiple callchain roots, as in
      per-dso sorting for example.
      
      Fix this by handling these roots like any other branch, with the hist as
      the parent.
      
      Before:
      
           1.97%  libpthread-2.12.1.so
                  |
                  --- __libc_write
                      create_worker
                      bench_sched_messaging
                      cmd_bench
                      run_builtin
                      main
                      __libc_start_main
      
                  |
                  --- __libc_read
                      create_worker
                      bench_sched_messaging
                      cmd_bench
                      run_builtin
                      main
                      __libc_start_main
      
      After:
      
           1.97%  libpthread-2.12.1.so
                  |
                  |--36.97%-- __libc_write
                  |          create_worker
                  |          bench_sched_messaging
                  |          cmd_bench
                  |          run_builtin
                  |          main
                  |          __libc_start_main
                  |
                  |--31.47%-- __libc_read
                  |          create_worker
                  |          bench_sched_messaging
                  |          cmd_bench
                  |          run_builtin
                  |          main
                  |          __libc_start_main
                 ...
      
      Single roots keep their entry without a percentage because they have
      the same overhead as the hist they refer to, i.e. 100% in fractal
      mode and the percentage of the hist in graph mode:
      
           0.00%  [k] reschedule_interrupt
                  |
                  --- default_idle
                      amd_e400_idle
                      cpu_idle
                      start_secondary
      Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1332526010-15400-1-git-send-email-fweisbec@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 23 Mar, 2012 1 commit
  15. 17 Mar, 2012 1 commit
  16. 14 Mar, 2012 1 commit
  17. 09 Mar, 2012 1 commit
  18. 08 Jan, 2012 1 commit
    • perf report: Fix --stdio output alignment when --showcpuutilization used · 0ed35abc
      Committed by Namhyung Kim
      The current perf report output is broken if --showcpuutilization is used.
      Combining it with -n and/or --show-total-period makes things worse.
      This patch fixes it as follows:
      
      before:
          48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
          34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
          15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
           0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
      after:
          48.25%    48.25%     0.00%    sleep  [kernel.kallsyms]  [k] trace_hardirqs_off
          34.99%    34.99%     0.00%    sleep  [kernel.kallsyms]  [k] __find_get_block_slow
          15.99%    15.99%     0.00%    sleep  [kernel.kallsyms]  [k] lock_release_holdtime
           0.77%     0.77%     0.00%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1325957132-10600-8-git-send-email-namhyung@gmail.com
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  19. 07 Jan, 2012 2 commits
  20. 16 Nov, 2011 1 commit
    • perf python: Fix undefined symbol problem · 0e2a5f10
      Committed by Arnaldo Carvalho de Melo
      Recently we made perf_evsel__init call hists__init, which broke the perf
      python binding:
      
      [root@emilia linux]# ./tools/perf/python/twatch.py
      Traceback (most recent call last):
        File "./tools/perf/python/twatch.py", line 16, in <module>
          import perf
      ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: hists__init
      
      Fix it by moving the hists__init function to its only caller, evsel.c.
      
      This way we avoid dragging in other parts of tools/perf/util/ to the
      perf python binding.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-5nffmdt5mu6ozxgj54oi4qon@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 27 Oct, 2011 1 commit
  22. 20 Oct, 2011 2 commits
  23. 19 Oct, 2011 3 commits
  24. 17 Oct, 2011 1 commit
  25. 13 Oct, 2011 2 commits
  26. 08 Oct, 2011 2 commits
    • perf tools: Fix broken number of samples for perf report -n · e39622ce
      Committed by Stephane Eranian
      The perf report -n option was broken because it was not reporting the
      correct number of samples depending on the sorting mode. By default,
      samples are sorted by comm,dso,sym. That means that samples for the same
      command (binary) get collapsed.
      
      hists__collapse_insert_entry() had a bug whereby it was aggregating
      the number of events observed (periods) but not the number of samples.
      Consequently, the number of samples reported could be below reality. The
      percentage remained correct because it is based on the periods.
      
      This patch fixes the problem by also aggregating the number of samples.
      Here is an example:
      
      $ perf report -n --stdio
          12.38%        842     pong  [kernel.kallsyms]     [k] __lock_acquire
      
      Here pong (a ctxsw stress test), is the only program running
      and thus it is the only one responsible for the lock_acquire samples.
      
      If we change the sorting mode:
      
      $ perf report -n --stdio --sort=sym
          12.38%       1732  [k] __lock_acquire
      
      The actual number of samples is shown.
      
      With the fix:
      
      $ perf report -n --stdio
          12.38%       1732     pong  [kernel.kallsyms]     [k] __lock_acquire
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20111003093815.GA6393@quad
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
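A minimal sketch of why the percentage stayed correct while the sample count did not. `struct he_counts`, `collapse_into` and `he_percent` are hypothetical names for this example; only the distinction between periods and number of samples comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical per-entry counters. */
struct he_counts {
	uint64_t period;     /* sum of sample periods, drives the percentage */
	uint32_t nr_events;  /* number of samples, shown by "perf report -n" */
};

/* Fold "from" into "into" when both end up in the same bucket. */
static void collapse_into(struct he_counts *into, const struct he_counts *from)
{
	into->period    += from->period;
	into->nr_events += from->nr_events;  /* the step described as missing */
}

/* The overhead column depends on periods only, never on nr_events. */
static double he_percent(const struct he_counts *he, uint64_t total_period)
{
	if (total_period == 0)
		return 0.0;
	return 100.0 * (double)he->period / (double)total_period;
}
```

Because `he_percent` never reads `nr_events`, omitting the second addition changes the sample column but leaves the overhead column intact, which matches the behavior the message reports.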
    • perf top: Reuse the 'report' hist_entry/hists classes · ab81f3fd
      Committed by Arnaldo Carvalho de Melo
      This actually fixes several problems we had in the old 'perf top':
      
      1. Unresolved symbols were not shown, a limitation that came from the
         old "KernelTop" codebase; to solve it we would need to make changes
         that would give sym_entry most of the hist_entry fields.
      2. It was using the number of samples, not the sum of sample->period.
      
      And brings the --sort code that allows us to have all the views in
      'perf report', for instance:
      
      [root@emilia ~]# perf top --sort dso
      PerfTop: 5903 irqs/sec kernel:77.5% exact: 0.0% [1000Hz cycles], (all, 8 CPUs)
      ------------------------------------------------------------------------------
      
          31.59%  libcrypto.so.1.0.0
          21.55%  [kernel]
          18.57%  libpython2.6.so.1.0
           7.04%  libc-2.12.so
           6.99%  _backend_agg.so
           4.72%  sshd
           1.48%  multiarray.so
           1.39%  libfreetype.so.6.3.22
           1.37%  perf
           0.71%  libgobject-2.0.so.0.2200.5
           0.53%  [tg3]
           0.48%  libglib-2.0.so.0.2200.5
           0.44%  libstdc++.so.6.0.13
           0.40%  libcairo.so.2.10800.8
           0.38%  libm-2.12.so
           0.34%  umath.so
           0.30%  libgdk-x11-2.0.so.0.1800.9
           0.22%  libpthread-2.12.so
           0.20%  libgtk-x11-2.0.so.0.1800.9
           0.20%  librt-2.12.so
           0.15%  _path.so
           0.13%  libpango-1.0.so.0.2800.1
           0.11%  libatlas.so.3.0
           0.09%  ft2font.so
           0.09%  libpangoft2-1.0.so.0.2800.1
           0.08%  libX11.so.6.3.0
           0.07%  [vdso]
           0.06%  cyclictest
      ^C
      
      All the filter lists can be used as well: --dsos, --comms, --symbols,
      etc.
      
      The 'perf report' TUI is also reused, so it is possible to apply all
      the zoom operations, do annotation, etc.
      
      This change will allow multiple simplifications in the symbol system as
      well, that will be detailed in upcoming changesets.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-xzaaldxq7zhqrrxdxjifk1mh@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  27. 07 Oct, 2011 2 commits