1. 20 5月, 2016 1 次提交
  2. 17 5月, 2016 1 次提交
  3. 11 5月, 2016 1 次提交
    • C
      perf symbols: Add dso__insert_symbol function · ae93a6c7
      Chris Phlipot 提交于
      The current method for inserting symbols is to use the symbols__insert()
      function. However symbols__insert() does not update the dso symbol
      cache.  This causes problems in the following scenario:
      
      1. symbol not found at addr using dso__find_symbol
      
      2. symbol inserted at addr using the existing symbols__insert function
      
      3. symbol still not found at addr using dso__find_symbol() because cache isn't
         updated. This is undesired behavior.
      
      The undesired behavior in (3) is addressed by creating a new function,
      dso__insert_symbol() to both insert the symbol and update the symbol
      cache if necessary.
      
      If dso__insert_symbol() is used in (2) instead of symbols__insert(),
      then the undesired behavior in (3) is avoided.
      Signed-off-by: NChris Phlipot <cphlipot0@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ae93a6c7
  4. 06 5月, 2016 1 次提交
  5. 19 4月, 2016 1 次提交
  6. 15 4月, 2016 1 次提交
  7. 12 4月, 2016 1 次提交
    • A
      perf evsel: Allow unresolved symbol names to be printed as addresses · fd4be130
      Arnaldo Carvalho de Melo 提交于
      The fprintf_sym() and fprintf_callchain() methods now allow users to
      change the existing behaviour of showing "[unknown]" as the name of
      unresolved symbols to instead show "[0x123456]", i.e. its address.
      
      The current patch doesn't change tools to use this facility, the results
      from 'perf trace' and 'perf script' cotinue like:
      
      70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
                                         [unknown] (/usr/lib64/libc-2.22.so)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         [unknown] (/usr/lib64/libspice-server.so.1.10.0)
                                         start_thread+0xca (/usr/lib64/libpthread-2.22.so)
                                         __clone+0x6d (/usr/lib64/libc-2.22.so)
      
      The next patch will make 'perf trace' use the new formatting.
      Suggested-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fd4be130
  8. 24 3月, 2016 1 次提交
  9. 25 2月, 2016 1 次提交
  10. 07 1月, 2016 1 次提交
    • N
      perf report/top: Add --raw-trace option · 053a3989
      Namhyung Kim 提交于
      The --raw-trace option allows disabling pretty printing by the event's
      print_fmt or plugin.  Besides that, each dynamic sort key now can
      receive a 'raw' suffix separated by '/' to ask for the raw trace of a
      specific field.
      
        $ perf report -s comm,kmem:kmalloc.gfp_flags
        ...
        # Overhead  Command            gfp_flags
        # ........  .......  ...................
        #
            99.89%  perf       GFP_NOFS|GFP_ZERO
             0.06%  sleep             GFP_KERNEL
             0.03%  perf     GFP_KERNEL|GFP_ZERO
             0.01%  perf              GFP_KERNEL
      
      Now
      
        $ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace
      or
        $ perf report -s comm,kmem:kmalloc.gfp_flags/raw
        ...
        # Overhead  Command   gfp_flags
        # ........  .......  ..........
        #
            99.89%  perf          32848
             0.06%  sleep           208
             0.03%  perf          32976
             0.01%  perf            208
      Suggested-and-Acked-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-9-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      053a3989
  11. 27 11月, 2015 1 次提交
  12. 13 11月, 2015 1 次提交
  13. 14 9月, 2015 1 次提交
  14. 29 8月, 2015 1 次提交
  15. 13 8月, 2015 1 次提交
    • K
      perf report: Show call graph from reference events · 9e207ddf
      Kan Liang 提交于
      Introduce --show-ref-call-graph for perf report to print reference
      callgraph for no callgraph event.
      
      Here is an example.
      
       perf report --show-ref-call-graph --stdio
      
       # To display the perf.data header info, please use
       --header/--header-only options.
       #
       #
       # Total Lost Samples: 0
       #
       # Samples: 5  of event 'cpu/cpu-cycles,call-graph=fp/'
       # Event count (approx.): 144985
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          72.30%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--22.62%-- __GI___libc_nanosleep
                        --77.38%-- [...]
      
      ......
      
       # Samples: 6  of event 'cpu/instructions,call-graph=no/', show reference callgraph
       # Event count (approx.): 172780
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          73.16%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--31.44%-- __GI___libc_nanosleep
                        --68.56%-- [...]
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9e207ddf
  16. 13 7月, 2015 1 次提交
    • A
      perf symbols: Store if there is a filter in place · 0bc2f2f7
      Arnaldo Carvalho de Melo 提交于
      When setting yup the symbols library we setup several filter lists,
      for dsos, comms, symbols, etc, and there is code that, if there are
      filters, do certain operations, like recalculate the number of non
      filtered histogram entries in the top/report TUI.
      
      But they were considering just the "Zoom" filters, when they need to
      take into account as well the above mentioned filters (perf top --comms,
      --dsos, etc).
      
      So store in symbol_conf.has_filter true if any of those filters is in
      place.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-f5edfmhq69vfvs1kmikq1wep@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0bc2f2f7
  17. 06 5月, 2015 1 次提交
  18. 04 5月, 2015 3 次提交
  19. 25 3月, 2015 1 次提交
  20. 12 3月, 2015 1 次提交
  21. 21 1月, 2015 1 次提交
  22. 02 12月, 2014 1 次提交
    • A
      perf callchain: Support handling complete branch stacks as histograms · 8b7bad58
      Andi Kleen 提交于
      Currently branch stacks can be only shown as edge histograms for
      individual branches. I never found this display particularly useful.
      
      This implements an alternative mode that creates histograms over
      complete branch traces, instead of individual branches, similar to how
      normal callgraphs are handled. This is done by putting it in front of
      the normal callgraph and then using the normal callgraph histogram
      infrastructure to unify them.
      
      This way in complex functions we can understand the control flow that
      lead to a particular sample, and may even see some control flow in the
      caller for short functions.
      
      Example (simplified, of course for such simple code this is usually not
      needed), please run this after the whole patchkit is in, as at this
      point in the patch order there is no --branch-history, that will be
      added in a patch after this one:
      
      tcall.c:
      
      volatile a = 10000, b = 100000, c;
      
      __attribute__((noinline)) f2()
      {
      	c = a / b;
      }
      
      __attribute__((noinline)) f1()
      {
      	f2();
      	f2();
      }
      main()
      {
      	int i;
      	for (i = 0; i < 1000000; i++)
      		f1();
      }
      
      % perf record -b -g ./tsrc/tcall
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
      % perf report --no-children --branch-history
      ...
          54.91%  tcall.c:6  [.] f2                      tcall
                  |
                  |--65.53%-- f2 tcall.c:5
                  |          |
                  |          |--70.83%-- f1 tcall.c:11
                  |          |          f1 tcall.c:10
                  |          |          main tcall.c:18
                  |          |          main tcall.c:18
                  |          |          main tcall.c:17
                  |          |          main tcall.c:17
                  |          |          f1 tcall.c:13
                  |          |          f1 tcall.c:13
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:12
                  |          |          f1 tcall.c:12
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:11
                  |          |
                  |           --29.17%-- f1 tcall.c:12
                  |                     f1 tcall.c:12
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:11
                  |                     f1 tcall.c:10
                  |                     main tcall.c:18
                  |                     main tcall.c:18
                  |                     main tcall.c:17
                  |                     main tcall.c:17
                  |                     f1 tcall.c:13
                  |                     f1 tcall.c:13
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:12
      
      The default output is unchanged.
      
      This is only implemented in perf report, no change to record or anywhere
      else.
      
      This adds the basic code to report:
      
      - add a new "branch" option to the -g option parser to enable this mode
      - when the flag is set include the LBR into the callstack in machine.c.
      
      The rest of the history code is unchanged and doesn't know the
      difference between LBR entry and normal call entry.
      
      - detect overlaps with the callchain
      - remove small loop duplicates in the LBR
      
      Current limitations:
      
      - The LBR flags (mispredict etc.) are not shown in the history
      and LBR entries have no special marker.
      - It would be nice if annotate marked the LBR entries somehow
      (e.g. with arrows)
      
      v2: Various fixes.
      v3: Merge further patches into this one. Fix white space.
      v4: Improve manpage. Address review feedback.
      v5: Rename functions. Better error message without -g. Fix crash without
          -b.
      v6: Rebase
      v7: Rebase. Use NO_ENTRY in memset.
      v8: Port to latest tip. Move add_callchain_ip to separate
          patch. Skip initial entries in callchain. Minor cleanups.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8b7bad58
  23. 25 11月, 2014 1 次提交
    • A
      perf symbols: Move bfd_demangle stubbing to its only user · aaba4e12
      Arnaldo Carvalho de Melo 提交于
      We need to define bfd_demangle() to either a wrapper for
      cplus_demangle() or to a stub when NO_DEMANGLE is defined.
      
      That is at odds with using bfd.h for some other reason, as it defines
      bfd_demangle() and then if code that wants to use symbol.h, where the
      above stubbing/wrapping is done, and bfd.h for other reasons, we end up
      with a build error where bfd_demangle() is found to be redefined.
      
      Avoid that by moving the stubbing/wrapping to symbol-elf.c, that is the
      only user of such function. If we ever get to a point where there are
      more valid users, we can then introduce a header for that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-6wzjpe2fy9xtgchshulixlzw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aaba4e12
  24. 05 11月, 2014 1 次提交
    • N
      perf record: Do not save pathname in ./debug/.build-id directory for vmlinux · 00dc8657
      Namhyung Kim 提交于
      When perf record finishes a session, it pre-processes samples in order
      to write build-id info from DSOs that had samples.
      
      During this process it'll call map__load() for the kernel map, and it
      ends up calling dso__load_vmlinux_path() which replaces dso->long_name.
      
      But this function checks kernel's build-id before searching vmlinux path
      so it'll end up with a cryptic name, the pathname for the entry in the
      ~/.debug cache, which can be confusing to users.
      
      This patch adds a flag to skip the build-id check during record, so
      that it'll have the original vmlinux path for the kernel dso->long_name,
      not the entry in the ~/.debug cache.
      
      Before:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.196 MB perf.data (~8545 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /home/namhyung/.debug/.build-id/f0/6e17aa50adf4d00b88925e03775de107611551 for symbols
      
      After:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.193 MB perf.data (~8432 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/3.16.4-1-ARCH/build/vmlinux for symbols
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-7-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      00dc8657
  25. 15 10月, 2014 1 次提交
  26. 18 9月, 2014 1 次提交
  27. 14 8月, 2014 2 次提交
    • N
      perf symbols: Don't demangle parameters and such by default · e71e7945
      Namhyung Kim 提交于
      Some C++ symbols have very long name and they make column length longer.
      Most of them are about parameters including templates and we can ignore
      such info most of time IMHO.
      
      This patch passes DMGL_NO_OPTS by default when calling bfd_demangle().
      One can still see full symbols with -v/--verbose option.
      
      before:
        JS_CallFunctionValue(JSContext*, JSObject*, JS::Value, unsigned int, JS::Value*, JS::Value*)
      
      after:
        JS_CallFunctionValue
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1406785662-5534-9-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e71e7945
    • N
      perf tools: Check recorded kernel version when finding vmlinux · 0a7e6d1b
      Namhyung Kim 提交于
      Currently vmlinux_path__init() only tries to find vmlinux file from
      current directory, /boot and some canonical directories with version
      number of the running kernel.  This can be a problem when reporting old
      data recorded on a kernel version not running currently.
      
      We can use --symfs option for this but it's annoying for user to do it
      always.  As we already have the info in the perf.data file, it can be
      changed to use it for the search automatically.
      
      Before:
      
        $ perf report
        ...
        # Samples: 4K of event 'cpu-clock'
        # Event count (approx.): 1067250000
        #
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ..............................
            71.87%     swapper  [kernel.kallsyms]  [k] recover_probed_instruction
      
      After:
      
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ....................
            71.87%     swapper  [kernel.kallsyms]  [k] native_safe_halt
      
      This requires to change signature of symbol__init() to receive struct
      perf_session_env *.
      Reported-by: NMinchan Kim <minchan@kernel.org>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0a7e6d1b
  28. 31 7月, 2014 1 次提交
  29. 24 7月, 2014 1 次提交
  30. 17 7月, 2014 2 次提交
  31. 08 7月, 2014 1 次提交
  32. 01 6月, 2014 1 次提交
  33. 05 5月, 2014 1 次提交
  34. 16 4月, 2014 1 次提交
  35. 19 3月, 2014 1 次提交
  36. 18 2月, 2014 1 次提交
    • M
      perf probe: Allow to add events on the local functions · eb948e50
      Masami Hiramatsu 提交于
      Allow to add events on the local functions without debuginfo.
      (With the debuginfo, we can add events even on inlined functions)
      Currently, probing on local functions requires debuginfo to
      locate actual address. It is also possible without debuginfo since
      we have symbol maps.
      
      Without this change;
        ----
        # ./perf probe -a t_show
        Added new event:
          probe:t_show         (on t_show)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:t_show -aR sleep 1
      
        # ./perf probe -x perf -a identity__map_ip
        no symbols found in /kbuild/ksrc/linux-3/tools/perf/perf, maybe install a debug package?
        Failed to load map.
          Error: Failed to add events. (-22)
        ----
      As the above results, perf probe just put one event
      on the first found symbol for kprobe event. Moreover,
      for uprobe event, perf probe failed to find local
      functions.
      
      With this change;
        ----
        # ./perf probe -a t_show
        Added new events:
          probe:t_show         (on t_show)
          probe:t_show_1       (on t_show)
          probe:t_show_2       (on t_show)
          probe:t_show_3       (on t_show)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:t_show_3 -aR sleep 1
      
        # ./perf probe -x perf -a identity__map_ip
        Added new events:
          probe_perf:identity__map_ip (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_1 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_2 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_3 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe_perf:identity__map_ip_3 -aR sleep 1
        ----
      Now we succeed to put events on every given local functions
      for both kprobes and uprobes. :)
      
      Note that this also introduces some symbol rbtree
      iteration macros; symbols__for_each, dso__for_each_symbol,
      and map__for_each_symbol. These are for walking through
      the symbol list in a map.
      
      Changes from v2:
        - Fix add_exec_to_probe_trace_events() not to convert address
          to tp->symbol any more.
        - Fix to set kernel probes based on ref_reloc_sym.
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: yrl.pp-manager.tt@hitachi.com
      Link: http://lkml.kernel.org/r/20140206053225.29635.15026.stgit@kbuild-fedora.yrl.intra.hitachi.co.jpSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      eb948e50