1. 02 Dec, 2014 1 commit
    •
      perf callchain: Support handling complete branch stacks as histograms · 8b7bad58
      Andi Kleen committed
      Currently branch stacks can be only shown as edge histograms for
      individual branches. I never found this display particularly useful.
      
      This implements an alternative mode that creates histograms over
      complete branch traces, instead of individual branches, similar to how
      normal callgraphs are handled. This is done by putting it in front of
      the normal callgraph and then using the normal callgraph histogram
      infrastructure to unify them.
      
      This way, in complex functions, we can understand the control flow that
      led to a particular sample, and may even see some control flow in the
      caller for short functions.
      
      Example (simplified; such simple code would not normally need this).
      Run it only after the whole patchkit is applied: at this point in the
      patch order there is no --branch-history option, which is added by a
      later patch:
      
      tcall.c:
      
      volatile int a = 10000, b = 100000, c;
      
      __attribute__((noinline)) void f2(void)
      {
      	c = a / b;
      }
      
      __attribute__((noinline)) void f1(void)
      {
      	f2();
      	f2();
      }
      int main(void)
      {
      	int i;
      	for (i = 0; i < 1000000; i++)
      		f1();
      }
      
      % perf record -b -g ./tsrc/tcall
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
      % perf report --no-children --branch-history
      ...
          54.91%  tcall.c:6  [.] f2                      tcall
                  |
                  |--65.53%-- f2 tcall.c:5
                  |          |
                  |          |--70.83%-- f1 tcall.c:11
                  |          |          f1 tcall.c:10
                  |          |          main tcall.c:18
                  |          |          main tcall.c:18
                  |          |          main tcall.c:17
                  |          |          main tcall.c:17
                  |          |          f1 tcall.c:13
                  |          |          f1 tcall.c:13
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:12
                  |          |          f1 tcall.c:12
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:11
                  |          |
                  |           --29.17%-- f1 tcall.c:12
                  |                     f1 tcall.c:12
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:11
                  |                     f1 tcall.c:10
                  |                     main tcall.c:18
                  |                     main tcall.c:18
                  |                     main tcall.c:17
                  |                     main tcall.c:17
                  |                     f1 tcall.c:13
                  |                     f1 tcall.c:13
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:12
      
      The default output is unchanged.
      
      This is only implemented in perf report, no change to record or anywhere
      else.
      
      This adds the basic code to report:
      
      - add a new "branch" option to the -g option parser to enable this mode
      - when the flag is set include the LBR into the callstack in machine.c.
      
      The rest of the history code is unchanged and doesn't know the
      difference between LBR entry and normal call entry.
      
      - detect overlaps with the callchain
      - remove small loop duplicates in the LBR
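
The "remove small loop duplicates" step can be pictured with the sketch below. It is a minimal illustration under stated assumptions, not the algorithm used by the patch: `struct branch` and `remove_loop_duplicates()` are hypothetical names, only the branch source address is compared, and the search is deliberately quadratic for readability.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified branch record: only the source address. */
struct branch {
	unsigned long from;
};

/*
 * Collapse immediately repeated loop iterations in a branch list.
 * When the entries starting at i repeat the entries in [j, i), the
 * repeated copy is dropped. Returns the new number of entries.
 */
static int remove_loop_duplicates(struct branch *l, int nr)
{
	int i = 1;

	while (i < nr) {
		int j, removed = 0;

		/* Look backwards for an earlier entry with the same source. */
		for (j = i - 1; j >= 0; j--) {
			int period = i - j, k;

			if (l[j].from != l[i].from || i + period > nr)
				continue;
			/* Is l[i .. i+period) a repeat of l[j .. i)? */
			for (k = 0; k < period; k++)
				if (l[j + k].from != l[i + k].from)
					break;
			if (k == period) {
				memmove(l + i, l + i + period,
					(size_t)(nr - i - period) * sizeof(*l));
				nr -= period;
				removed = 1;
				break;
			}
		}
		if (!removed)
			i++;	/* re-check the same slot after a removal */
	}
	return nr;
}
```

With the source addresses 1, 2, 3, 2, 3, 4 the second "2, 3" iteration is dropped, leaving 1, 2, 3, 4.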
      
      Current limitations:
      
      - The LBR flags (mispredict etc.) are not shown in the history
      and LBR entries have no special marker.
      - It would be nice if annotate marked the LBR entries somehow
      (e.g. with arrows)
      
      v2: Various fixes.
      v3: Merge further patches into this one. Fix white space.
      v4: Improve manpage. Address review feedback.
      v5: Rename functions. Better error message without -g. Fix crash without
          -b.
      v6: Rebase
      v7: Rebase. Use NO_ENTRY in memset.
      v8: Port to latest tip. Move add_callchain_ip to separate
          patch. Skip initial entries in callchain. Minor cleanups.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 25 Nov, 2014 1 commit
    •
      perf symbols: Move bfd_demangle stubbing to its only user · aaba4e12
      Arnaldo Carvalho de Melo committed
      We need to define bfd_demangle() as either a wrapper for
      cplus_demangle() or a stub when NO_DEMANGLE is defined.
      
      That conflicts with including bfd.h for any other reason, since bfd.h
      declares bfd_demangle() itself: code that includes both symbol.h, where
      the stubbing/wrapping is done, and bfd.h fails to build because
      bfd_demangle() is redefined.
      
      Avoid that by moving the stubbing/wrapping to symbol-elf.c, which is the
      only user of that function. If more valid users ever appear, we can then
      introduce a header for it.
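
The stub-or-wrapper choice described in this message can be sketched as follows. This is an illustration only, not the code in symbol-elf.c: the parameter names and the use of `void *` for the bfd handle are assumptions, and the sketch defines `NO_DEMANGLE` itself so that it builds stand-alone (normally it would be a build flag).

```c
#include <stddef.h>

/* Defined here only so the sketch builds stand-alone; normally a build flag. */
#ifndef NO_DEMANGLE
#define NO_DEMANGLE 1
#endif

#ifdef NO_DEMANGLE
/* Stub: demangling is disabled, so callers always get NULL back. */
static inline char *bfd_demangle(void *abfd, const char *mangled, int options)
{
	(void)abfd;
	(void)mangled;
	(void)options;
	return NULL;
}
#else
/* Wrapper: forward to cplus_demangle(), ignoring the bfd handle. */
extern char *cplus_demangle(const char *mangled, int options);

static inline char *bfd_demangle(void *abfd, const char *mangled, int options)
{
	(void)abfd;
	return cplus_demangle(mangled, options);
}
#endif
```

Keeping these definitions in the one .c file that calls `bfd_demangle()` means no other file sees them, so including bfd.h elsewhere cannot clash with them.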
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-6wzjpe2fy9xtgchshulixlzw@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 05 Nov, 2014 1 commit
    •
      perf record: Do not save pathname in ./debug/.build-id directory for vmlinux · 00dc8657
      Namhyung Kim committed
      When perf record finishes a session, it pre-processes samples in order
      to write build-id info from DSOs that had samples.
      
      During this process it'll call map__load() for the kernel map, and it
      ends up calling dso__load_vmlinux_path() which replaces dso->long_name.
      
      But this function checks the kernel's build-id before searching the
      vmlinux path, so it ends up with a cryptic name, the pathname of the
      entry in the ~/.debug cache, which can be confusing to users.
      
      This patch adds a flag to skip the build-id check during record, so
      that it'll have the original vmlinux path for the kernel dso->long_name,
      not the entry in the ~/.debug cache.
      
      Before:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.196 MB perf.data (~8545 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /home/namhyung/.debug/.build-id/f0/6e17aa50adf4d00b88925e03775de107611551 for symbols
      
      After:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.193 MB perf.data (~8432 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/3.16.4-1-ARCH/build/vmlinux for symbols
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-7-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 15 Oct, 2014 1 commit
  5. 18 Sep, 2014 1 commit
  6. 14 Aug, 2014 2 commits
    •
      perf symbols: Don't demangle parameters and such by default · e71e7945
      Namhyung Kim committed
      Some C++ symbols have very long names that make the column wider.
      Most of the length comes from parameters, including templates, and we
      can ignore such info most of the time IMHO.
      
      This patch passes DMGL_NO_OPTS by default when calling bfd_demangle().
      One can still see full symbols with -v/--verbose option.
      
      before:
        JS_CallFunctionValue(JSContext*, JSObject*, JS::Value, unsigned int, JS::Value*, JS::Value*)
      
      after:
        JS_CallFunctionValue
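
The flag selection this message describes might look like the sketch below. It is hedged: `demangle_flags()` is a hypothetical helper, the `DMGL_*` values are repeated from libiberty's demangle.h only so the sketch is stand-alone, and the flags used in verbose mode (`DMGL_PARAMS | DMGL_ANSI`) are an assumption, since the message only says that full symbols remain available with -v.

```c
/* Values as in libiberty's demangle.h, repeated so the sketch is stand-alone. */
#ifndef DMGL_NO_OPTS
#define DMGL_NO_OPTS 0
#define DMGL_PARAMS  (1 << 0)	/* include function parameters */
#define DMGL_ANSI    (1 << 1)	/* include const, volatile, etc. */
#endif

/*
 * Hypothetical helper: pick the options to pass to bfd_demangle().
 * Short names by default, full signatures only when verbose is set.
 */
static int demangle_flags(int verbose)
{
	return verbose ? (DMGL_PARAMS | DMGL_ANSI) : DMGL_NO_OPTS;
}
```

The result would be passed as the third argument of `bfd_demangle()`.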
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1406785662-5534-9-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    •
      perf tools: Check recorded kernel version when finding vmlinux · 0a7e6d1b
      Namhyung Kim committed
      Currently vmlinux_path__init() only tries to find the vmlinux file in
      the current directory, /boot and some canonical directories with the
      version number of the running kernel.  This can be a problem when
      reporting old data recorded on a kernel version that is not currently
      running.
      
      We can use the --symfs option for this, but it is annoying for the user
      to do so every time.  As we already have the info in the perf.data file,
      the search can use it automatically.
      
      Before:
      
        $ perf report
        ...
        # Samples: 4K of event 'cpu-clock'
        # Event count (approx.): 1067250000
        #
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ..............................
            71.87%     swapper  [kernel.kallsyms]  [k] recover_probed_instruction
      
      After:
      
        # Overhead  Command     Shared Object      Symbol
        # ........  ..........  .................  ....................
            71.87%     swapper  [kernel.kallsyms]  [k] native_safe_halt
      
      This requires changing the signature of symbol__init() to receive a
      struct perf_session_env *.
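
The idea of searching with the recorded kernel release instead of the running one can be sketched like this. It is a minimal illustration, not perf's code: `vmlinux_candidates()`, the two macros and the short path list are hypothetical, and the real list of canonical directories is longer.

```c
#include <stdio.h>
#include <sys/utsname.h>

#define MAX_VMLINUX_PATHS 4	/* hypothetical limit for this sketch */
#define VMLINUX_PATH_LEN  256

/*
 * Hypothetical helper: fill `paths` with candidate vmlinux locations.
 * `recorded_release` is the kernel release stored with the recorded data;
 * pass NULL to fall back to the running kernel, as the old code always did.
 * Returns the number of candidates, or -1 on error.
 */
static int vmlinux_candidates(const char *recorded_release,
			      char paths[][VMLINUX_PATH_LEN], int max)
{
	struct utsname uts;
	const char *release = recorded_release;
	int n = 0;

	if (release == NULL) {
		if (uname(&uts) < 0)
			return -1;
		release = uts.release;
	}

	if (n < max)
		snprintf(paths[n++], VMLINUX_PATH_LEN, "vmlinux");
	if (n < max)
		snprintf(paths[n++], VMLINUX_PATH_LEN, "/boot/vmlinux");
	if (n < max)
		snprintf(paths[n++], VMLINUX_PATH_LEN, "/boot/vmlinux-%s", release);
	if (n < max)
		snprintf(paths[n++], VMLINUX_PATH_LEN,
			 "/lib/modules/%s/build/vmlinux", release);
	return n;
}
```

Calling it with a release string taken from the recorded data makes the versioned candidates point at the kernel that produced the samples, whatever kernel is running when the report is generated.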
      Reported-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1407825645-24586-14-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 31 Jul, 2014 1 commit
  8. 24 Jul, 2014 1 commit
  9. 17 Jul, 2014 2 commits
  10. 08 Jul, 2014 1 commit
  11. 01 Jun, 2014 1 commit
  12. 05 May, 2014 1 commit
  13. 16 Apr, 2014 1 commit
  14. 19 Mar, 2014 1 commit
  15. 18 Feb, 2014 2 commits
    •
      perf probe: Allow to add events on the local functions · eb948e50
      Masami Hiramatsu committed
      Allow adding events on local functions without debuginfo.
      (With debuginfo, we can add events even on inlined functions.)
      Currently, probing local functions requires debuginfo to
      locate the actual address. This is also possible without debuginfo,
      since we have symbol maps.
      
      Without this change:
        ----
        # ./perf probe -a t_show
        Added new event:
          probe:t_show         (on t_show)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:t_show -aR sleep 1
      
        # ./perf probe -x perf -a identity__map_ip
        no symbols found in /kbuild/ksrc/linux-3/tools/perf/perf, maybe install a debug package?
        Failed to load map.
          Error: Failed to add events. (-22)
        ----
      As the results above show, perf probe puts just one event
      on the first symbol found for a kprobe event. Moreover,
      for a uprobe event, perf probe fails to find local
      functions.
      
      With this change:
        ----
        # ./perf probe -a t_show
        Added new events:
          probe:t_show         (on t_show)
          probe:t_show_1       (on t_show)
          probe:t_show_2       (on t_show)
          probe:t_show_3       (on t_show)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:t_show_3 -aR sleep 1
      
        # ./perf probe -x perf -a identity__map_ip
        Added new events:
          probe_perf:identity__map_ip (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_1 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_2 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
          probe_perf:identity__map_ip_3 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe_perf:identity__map_ip_3 -aR sleep 1
        ----
      Now we succeed in putting events on every matching local function
      for both kprobes and uprobes. :)
      
      Note that this also introduces some symbol rbtree
      iteration macros: symbols__for_each, dso__for_each_symbol,
      and map__for_each_symbol. These are for walking through
      the symbol list in a map.
      
      Changes from v2:
        - Fix add_exec_to_probe_trace_events() not to convert address
          to tp->symbol any more.
        - Fix to set kernel probes based on ref_reloc_sym.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: yrl.pp-manager.tt@hitachi.com
      Link: http://lkml.kernel.org/r/20140206053225.29635.15026.stgit@kbuild-fedora.yrl.intra.hitachi.co.jp
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    •
      perf symbols: No need to export dso__first_symbol · c96626b1
      Arnaldo Carvalho de Melo committed
      There are no users outside the file that defines it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-sybihqycxrmssa4df9516jib@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 17 Jan, 2014 1 commit
  17. 20 Dec, 2013 1 commit
    •
      perf symbols: Add 'machine' member to struct addr_location · cc22e575
      Arnaldo Carvalho de Melo committed
      The addr_location struct should fully qualify an address, and to do that
      it should contain the machine where the thread was found.
      
      Thus functions that receive an addr_location no longer need to also
      receive a 'machine'; they just access al->machine instead, as they
      already do with the other parts of an address location: al->thread,
      al->map, etc.
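
The effect on function signatures can be sketched as below. All types here are heavily simplified stand-ins and the two helper functions are hypothetical; only the `machine`, `thread` and `map` member names of `struct addr_location` come from the message.

```c
#include <stddef.h>

/* Simplified stand-ins for the real perf types (assumptions). */
struct machine { const char *name; };
struct thread  { int tid; };
struct map     { unsigned long start, end; };

struct addr_location {
	struct machine *machine;	/* new member: where the thread was found */
	struct thread  *thread;
	struct map     *map;
	unsigned long   addr;
};

/* Before: callers had to pass the machine alongside the location. */
static const char *machine_name_old(const struct addr_location *al,
				    const struct machine *machine)
{
	(void)al;
	return machine->name;
}

/* After: the location is self-describing, so one argument is enough. */
static const char *machine_name_new(const struct addr_location *al)
{
	return al->machine ? al->machine->name : NULL;
}
```

Both helpers return the same name for a fully populated location; the second simply no longer needs the extra parameter.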
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-o51iiee7vyq4r3k362uvuylg@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 13 Dec, 2013 1 commit
  19. 11 Dec, 2013 1 commit
  20. 28 Nov, 2013 2 commits
  21. 14 Oct, 2013 2 commits
    •
      perf buildid-cache: Add ability to add kcore to the cache · fc1b691d
      Adrian Hunter committed
      kcore can be used to view the running kernel object code.  However,
      kcore changes as modules are loaded and unloaded, and when the kernel
      decides to modify its own code.  Consequently it is useful to create a
      copy of kcore at a particular time.  Unlike vmlinux, kcore is not unique
      for a given build-id.  And in addition, the kallsyms and modules files
      are also needed.  The tool therefore creates a directory:
      
      	~/.debug/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh>
      
      which contains: kcore, kallsyms and modules.
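
How such a directory name could be assembled is sketched below. This is an assumption-laden illustration, not the tool's code: `kcore_cache_dir()` is a hypothetical helper, the trailing `hh` is read as hundredths of a second, and local time is assumed for the timestamp.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: build
 *   <debug_dir>/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh>
 * from a time given as seconds plus nanoseconds.
 * Returns the snprintf() result, or -1 if the time cannot be formatted.
 * Uses localtime() for brevity, which is not thread-safe.
 */
static int kcore_cache_dir(char *buf, size_t len, const char *debug_dir,
			   const char *build_id, time_t secs, long nsecs)
{
	char stamp[32];
	struct tm *tm = localtime(&secs);

	if (tm == NULL)
		return -1;
	if (strftime(stamp, sizeof(stamp), "%Y%m%d%H%M%S", tm) == 0)
		return -1;

	/* nsecs / 10^7 yields hundredths of a second, 00..99. */
	return snprintf(buf, len, "%s/[kernel.kcore]/%s/%s%02ld",
			debug_dir, build_id, stamp, nsecs / 10000000L);
}
```

Because the timestamp is part of the path, two copies of kcore taken at different times for the same build-id end up in different directories.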
      
      Note that the copied kcore contains only code sections.  See the
      kcore_copy() function for how that is determined.
      
      The tool will not make additional copies of kcore if there is already
      one with the same modules at the same addresses.
      
      Currently, perf tools will not look for kcore in the cache.  That is
      addressed in another patch.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/525BF849.5030405@intel.com
      [ renamed 'index' to 'idx' to avoid shadowing string.h symbol in f12,
        use at least one member initializer when initializing a struct to
        zeros, also to fix the build on f12 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    •
      perf symbols: Workaround objdump difficulties with kcore · afba19d9
      Adrian Hunter committed
      The objdump tool fails to annotate module symbols when looking at kcore.
      
      Work around this by extracting object code from kcore and putting it in a
      temporary file for objdump to use instead.
      
      The temporary file is created to look like kcore but contains only the
      function being disassembled.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1381320078-16497-3-git-send-email-adrian.hunter@intel.com
      [ Renamed 'index' to 'idx' to avoid shadowing string.h's 'index' in Fedora 12,
        Replace local with variable length with malloc/free to fix build in Fedora 12 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  22. 11 Oct, 2013 1 commit
  23. 09 Oct, 2013 2 commits
  24. 08 Aug, 2013 1 commit
  25. 01 Apr, 2013 1 commit
  26. 27 Mar, 2013 1 commit
  27. 01 Feb, 2013 1 commit
  28. 25 Jan, 2013 1 commit
  29. 09 Dec, 2012 1 commit
  30. 15 Nov, 2012 1 commit
  31. 09 Nov, 2012 1 commit
  32. 29 Oct, 2012 3 commits