1. 02 12月, 2014 3 次提交
    • A
      perf callchain: Support handling complete branch stacks as histograms · 8b7bad58
      Andi Kleen 提交于
      Currently branch stacks can be only shown as edge histograms for
      individual branches. I never found this display particularly useful.
      
      This implements an alternative mode that creates histograms over
      complete branch traces, instead of individual branches, similar to how
      normal callgraphs are handled. This is done by putting it in front of
      the normal callgraph and then using the normal callgraph histogram
      infrastructure to unify them.
      
      This way in complex functions we can understand the control flow that
      lead to a particular sample, and may even see some control flow in the
      caller for short functions.
      
      Example (simplified, of course for such simple code this is usually not
      needed), please run this after the whole patchkit is in, as at this
      point in the patch order there is no --branch-history, that will be
      added in a patch after this one:
      
      tcall.c:
      
      volatile a = 10000, b = 100000, c;
      
      __attribute__((noinline)) f2()
      {
      	c = a / b;
      }
      
      __attribute__((noinline)) f1()
      {
      	f2();
      	f2();
      }
      main()
      {
      	int i;
      	for (i = 0; i < 1000000; i++)
      		f1();
      }
      
      % perf record -b -g ./tsrc/tcall
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
      % perf report --no-children --branch-history
      ...
          54.91%  tcall.c:6  [.] f2                      tcall
                  |
                  |--65.53%-- f2 tcall.c:5
                  |          |
                  |          |--70.83%-- f1 tcall.c:11
                  |          |          f1 tcall.c:10
                  |          |          main tcall.c:18
                  |          |          main tcall.c:18
                  |          |          main tcall.c:17
                  |          |          main tcall.c:17
                  |          |          f1 tcall.c:13
                  |          |          f1 tcall.c:13
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:12
                  |          |          f1 tcall.c:12
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:11
                  |          |
                  |           --29.17%-- f1 tcall.c:12
                  |                     f1 tcall.c:12
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:11
                  |                     f1 tcall.c:10
                  |                     main tcall.c:18
                  |                     main tcall.c:18
                  |                     main tcall.c:17
                  |                     main tcall.c:17
                  |                     f1 tcall.c:13
                  |                     f1 tcall.c:13
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:12
      
      The default output is unchanged.
      
      This is only implemented in perf report, no change to record or anywhere
      else.
      
      This adds the basic code to report:
      
      - add a new "branch" option to the -g option parser to enable this mode
      - when the flag is set include the LBR into the callstack in machine.c.
      
      The rest of the history code is unchanged and doesn't know the
      difference between LBR entry and normal call entry.
      
      - detect overlaps with the callchain
      - remove small loop duplicates in the LBR
      
      Current limitations:
      
      - The LBR flags (mispredict etc.) are not shown in the history
      and LBR entries have no special marker.
      - It would be nice if annotate marked the LBR entries somehow
      (e.g. with arrows)
      
      v2: Various fixes.
      v3: Merge further patches into this one. Fix white space.
      v4: Improve manpage. Address review feedback.
      v5: Rename functions. Better error message without -g. Fix crash without
          -b.
      v6: Rebase
      v7: Rebase. Use NO_ENTRY in memset.
      v8: Port to latest tip. Move add_callchain_ip to separate
          patch. Skip initial entries in callchain. Minor cleanups.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8b7bad58
    • J
      perf stat: Add support for per-pkg counters · 779d0b99
      Jiri Olsa 提交于
      The .per-pkg file indicates that all but one value per socket should be
      discarded. Adding the logic of skipping the rest of the socket once
      first value was read.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1416562275-12404-11-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      779d0b99
    • J
      perf tools: Remove perf_evsel__read interface · a5a7fd76
      Jiri Olsa 提交于
      Removing the perf_evsel__read interfaces because we replaced the only
      user in the stat command code.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1416562275-12404-8-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a5a7fd76
  2. 25 11月, 2014 10 次提交
  3. 19 11月, 2014 12 次提交
  4. 16 11月, 2014 1 次提交
  5. 07 11月, 2014 4 次提交
  6. 05 11月, 2014 6 次提交
    • N
      perf tools: Make vmlinux short name more like kallsyms short name · 96d78059
      Namhyung Kim 提交于
      The previous patch changed kernel dso name from '[kernel.kallsyms]' to
      vmlinux.  However it might add confusion to old users accustomed to the
      old name.  So change the short name to '[kernel.vmlinux]' to reduce such
      confusion.
      
      Before:
        # Overhead  Command         Shared Object            Symbol
        # ........  ..............  .......................  ...............................
        #
             9.83%  swapper         vmlinux                  [k] intel_idle
             4.10%  awk             libc-2.20.so             [.] __strcmp_sse2
             1.86%  sed             libc-2.20.so             [.] __strcmp_sse2
             1.78%  netctl-auto     libc-2.20.so             [.] __strcmp_sse2
             1.23%  netctl-auto     libc-2.20.so             [.] __mbrtowc
             1.21%  firefox         libxul.so                [.] 0x00000000024b62bd
             1.20%  swapper         vmlinux                  [k] cpuidle_enter_state
             1.03%  sleep           vmlinux                  [k] copy_user_generic_unrolled
      
      After:
        # Overhead  Command         Shared Object            Symbol
        # ........  ..............  .......................  ...............................
        #
             9.83%  swapper         [kernel.vmlinux]         [k] intel_idle
             4.10%  awk             libc-2.20.so             [.] __strcmp_sse2
             1.86%  sed             libc-2.20.so             [.] __strcmp_sse2
             1.78%  netctl-auto     libc-2.20.so             [.] __strcmp_sse2
             1.23%  netctl-auto     libc-2.20.so             [.] __mbrtowc
             1.21%  firefox         libxul.so                [.] 0x00000000024b62bd
             1.20%  swapper         [kernel.vmlinux]         [k] cpuidle_enter_state
             1.03%  sleep           [kernel.vmlinux]         [k] copy_user_generic_unrolled
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-9-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      96d78059
    • N
      perf tools: Fix build-id matching on vmlinux · b837a8bd
      Namhyung Kim 提交于
      There's a problem on finding correct kernel symbols when perf report
      runs on a different kernel.  Although a part of the problem was solved
      by the prior commit 0a7e6d1b ("perf tools: Check recorded kernel
      version when finding vmlinux"), there's a remaining problem still.
      
      When perf records samples, it synthesizes the kernel map using
      machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
      You can easily see it using 'perf report -D' command.
      
      After finishing record, it goes through the recorded events to find
      maps/dsos actually used.  And then record build-id info of them.
      
      During this process, it needs to load symbols in a dso and it'd call
      dso__load_vmlinux_path() since the default value of the symbol_conf.
      try_vmlinux_path is true.  However it changes dso->long_name to a real
      path of the vmlinux file (e.g. /lib/modules/3.16.4/build/vmlinux) if one
      is running on a custom kernel.
      
      It resulted in that perf report reads the build-id of the vmlinux, but
      cannot use it since it only knows about the [kernel.kallsyms] map.  It
      then falls back to possible vmlinux paths by using the recorded kernel
      version (in case of a recent version) or a running kernel silently.
      
      Even with the recent tools, this still has a possibility of breaking
      the result.  As the build directory is a symbolic link, if one built a
      new kernel in the same directory with different source/config, the old
      link to vmlinux will point the new file.  So it's absolutely needed to
      use build-id when finding a kernel image.
      
      In this patch, it's now changed to try to search a kernel dso in the
      existing dso list which was constructed during build-id table parsing
      so it'll always have a build-id.  If not found, search "[kernel.kallsyms]".
      
      Before:
      
        $ perf report
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ...............................
        #
            72.15%     0.00%  swapper  [kernel.kallsyms]  [k] set_curr_task_rt
            72.15%     0.00%  swapper  [kernel.kallsyms]  [k] native_calibrate_tsc
            72.15%     0.00%  swapper  [kernel.kallsyms]  [k] tsc_refine_calibration_work
            71.87%    71.87%  swapper  [kernel.kallsyms]  [k] module_finalize
         ...
      
      After (for the same perf.data):
      
            72.15%     0.00%  swapper  vmlinux  [k] cpu_startup_entry
            72.15%     0.00%  swapper  vmlinux  [k] arch_cpu_idle
            72.15%     0.00%  swapper  vmlinux  [k] default_idle
            71.87%    71.87%  swapper  vmlinux  [k] native_safe_halt
         ...
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Link: http://lkml.kernel.org/r/20140924073356.GB1962@gmail.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-8-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b837a8bd
    • N
      perf record: Do not save pathname in ./debug/.build-id directory for vmlinux · 00dc8657
      Namhyung Kim 提交于
      When perf record finishes a session, it pre-processes samples in order
      to write build-id info from DSOs that had samples.
      
      During this process it'll call map__load() for the kernel map, and it
      ends up calling dso__load_vmlinux_path() which replaces dso->long_name.
      
      But this function checks kernel's build-id before searching vmlinux path
      so it'll end up with a cryptic name, the pathname for the entry in the
      ~/.debug cache, which can be confusing to users.
      
      This patch adds a flag to skip the build-id check during record, so
      that it'll have the original vmlinux path for the kernel dso->long_name,
      not the entry in the ~/.debug cache.
      
      Before:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.196 MB perf.data (~8545 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /home/namhyung/.debug/.build-id/f0/6e17aa50adf4d00b88925e03775de107611551 for symbols
      
      After:
        # perf record -va sleep 3
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.193 MB perf.data (~8432 samples) ]
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/3.16.4-1-ARCH/build/vmlinux for symbols
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-7-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      00dc8657
    • N
      perf build-id: Move build-id related functions to util/build-id.c · e195fac8
      Namhyung Kim 提交于
      It'd be better managing those functions in a separate place as
      util/header.c file is already big.
      
      It now exports following 3 functions to others:
      
        bool perf_session__read_build_ids(struct perf_session *session, bool with_hits);
        int perf_session__write_buildid_table(struct perf_session *session, int fd);
        int perf_session__cache_build_ids(struct perf_session *session);
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/r/545733E7.6010105@intel.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e195fac8
    • N
      perf build-id: Rename dsos__write_buildid_table() · 714c9c4a
      Namhyung Kim 提交于
      The dsos__write_buildid_table() is not use struct dso and it mostly
      uses perf_session struct.
      
      So rename it to perf_session__write_buildid_ table() so that it
      corresponds to other related functions such as
      perf_session__read_build_ids() and perf_session__cache_build_ids().
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1415063674-17206-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      714c9c4a
    • N
      perf tools: Add gzip decompression support for kernel module · e92ce12e
      Namhyung Kim 提交于
      Now my Archlinux box shows module symbols correctly.
      
      Before:
        $ perf report --stdio
        Failed to open /tmp/perf-3477.map, continuing without symbols
        no symbols found in /usr/bin/date, maybe install a debug package?
        No kallsyms or vmlinux with build-id 7b4ea0a49ae2111925857099aaf05c3246ff33e0 was found
        [drm] with build id 7b4ea0a49ae2111925857099aaf05c3246ff33e0 not found, continuing without symbols
        No kallsyms or vmlinux with build-id edd931629094b660ca9dec09a1b635c8d87aa2ee was found
        [jbd2] with build id edd931629094b660ca9dec09a1b635c8d87aa2ee not found, continuing without symbols
        No kallsyms or vmlinux with build-id a7b1eada671c34933e5610bb920b2ca4945a82c3 was found
        [ext4] with build id a7b1eada671c34933e5610bb920b2ca4945a82c3 not found, continuing without symbols
        No kallsyms or vmlinux with build-id d69511fa3e5840e770336ef45b06c83fef8d74e3 was found
        [scsi_mod] with build id d69511fa3e5840e770336ef45b06c83fef8d74e3 not found, continuing without symbols
        No kallsyms or vmlinux with build-id af0430af13461af058770ee9b87afc07922c2e77 was found
        [libata] with build id af0430af13461af058770ee9b87afc07922c2e77 not found, continuing without symbols
        No kallsyms or vmlinux with build-id aaeedff8160ce631a5f0333591c6ff291201d29f was found
        [libahci] with build id aaeedff8160ce631a5f0333591c6ff291201d29f not found, continuing without symbols
        No kallsyms or vmlinux with build-id c57907712becaf662dc4981824bb372c0441d605 was found
        [mac80211] with build id c57907712becaf662dc4981824bb372c0441d605 not found, continuing without symbols
        No kallsyms or vmlinux with build-id e0589077cc0ec8c3e4c40eb9f2d9e69d236bee8f was found
        [iwldvm] with build id e0589077cc0ec8c3e4c40eb9f2d9e69d236bee8f not found, continuing without symbols
        No kallsyms or vmlinux with build-id 2d86086bf136bf374a2f029cf85a48194f9b950b was found
        [cfg80211] with build id 2d86086bf136bf374a2f029cf85a48194f9b950b not found, continuing without symbols
        No kallsyms or vmlinux with build-id 4493c48599bdb3d91d0f8db5150e0be33fdd9221 was found
        [iwlwifi] with build id 4493c48599bdb3d91d0f8db5150e0be33fdd9221 not found, continuing without symbols
        ...
        #
        # Overhead  Command          Shared Object            Symbol
        # ........  ...............  .......................  ........................................................
        #
             0.03%  swapper          [ext4]                   [k] 0x000000000000fe2e
             0.03%  swapper          [kernel.kallsyms]        [k] account_entity_enqueue
             0.03%  swapper          [ext4]                   [k] 0x000000000000fc2b
             0.03%  irq/50-iwlwifi   [iwlwifi]                [k] 0x000000000000200b
             0.03%  swapper          [kernel.kallsyms]        [k] ktime_add_safe
             0.03%  swapper          [kernel.kallsyms]        [k] elv_completed_request
             0.03%  swapper          [libata]                 [k] 0x0000000000003997
             0.03%  swapper          [libahci]                [k] 0x0000000000001f25
             0.03%  swapper          [kernel.kallsyms]        [k] rb_next
             0.03%  swapper          [kernel.kallsyms]        [k] blk_finish_request
             0.03%  swapper          [ext4]                   [k] 0x0000000000010248
             0.00%  perf             [kernel.kallsyms]        [k] native_write_msr_safe
      
      After:
        $ perf report --stdio
        Failed to open /tmp/perf-3477.map, continuing without symbols
        no symbols found in /usr/bin/tr, maybe install a debug package?
        ...
        #
        # Overhead  Command          Shared Object                Symbol
        # ........  ...............  ...........................  ......................................................
        #
      
             0.04%  kworker/u16:3    [ext4]                       [k] ext4_read_block_bitmap
             0.03%  kworker/u16:0    [mac80211]                   [k] ieee80211_sta_reset_beacon_monitor
             0.02%  irq/50-iwlwifi   [mac80211]                   [k] ieee80211_get_bssid
             0.02%  firefox          [e1000e]                     [k] __ew32_prepare
             0.02%  swapper          [libahci]                    [k] ahci_handle_port_interrupt
             0.02%  emacs            libglib-2.0.so.0.4000.0      [.] g_mutex_unlock
             0.02%  swapper          [e1000e]                     [k] e1000_clean_tx_irq
             0.02%  dwm              [kernel.kallsyms]            [k] __schedule
             0.02%  gnome-terminal-  [vdso]                       [.] __vdso_clock_gettime
             0.02%  swapper          [e1000e]                     [k] e1000_alloc_rx_buffers
             0.02%  irq/50-iwlwifi   [mac80211]                   [k] ieee80211_rx
             0.01%  firefox          [vdso]                       [.] __vdso_gettimeofday
             0.01%  irq/50-iwlwifi   [iwlwifi]                    [k] iwl_pcie_rxq_restock.part.13
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/87h9yexshi.fsf@sejong.aot.lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e92ce12e
  7. 04 11月, 2014 4 次提交