1. 27 3月, 2017 1 次提交
    • J
      perf report: Show inline stack for stdio mode · 0db64dd0
      Jin Yao 提交于
      If the address belongs to an inlined function, the source information
      back to the first non-inlined function will be printed.
      
      For example:
      
      1. Show inlined function name
         perf report --stdio -g function --inline
      
           0.69%     0.00%  inline   ld-2.23.so           [.] dl_main
                  |
                  ---dl_main
                     |
                      --0.56%--_dl_relocate_object
                                _dl_relocate_object (inline)
                                elf_dynamic_do_Rela (inline)
      
      2. Show the file/line information
         perf report --stdio -g address --inline
      
           0.69%     0.00%  inline   ld-2.23.so           [.] _dl_start_user
                  |
                  ---_dl_start_user .:0
                     _dl_start rtld.c:307
                     /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
                     _dl_sysdep_start dl-sysdep.c:250
                     |
                      --0.56%--dl_main rtld.c:2076
      
      Committer tests:
      
        # perf record --call-graph dwarf ~/bin/perf stat usleep 1
      
       Performance counter stats for 'usleep 1':
      
                0.443020      task-clock (msec)         #    0.449 CPUs utilized
                       1      context-switches          #    0.002 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      52      page-faults               #    0.117 M/sec
               1,049,423      cycles                    #    2.369 GHz
                 801,456      instructions              #    0.76  insn per cycle
                 155,609      branches                  #  351.246 M/sec
                   7,026      branch-misses             #    4.52% of all branches
      
             0.000987570 seconds time elapsed
      
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.553 MB perf.data (66 samples) ]
        # perf report --stdio --inline fs__get_mountpoint
        <SNIP>
           1.73%     0.00%  perf     perf           [.] fs__get_mountpoint
                  |
                  ---fs__get_mountpoint
                     fs__get_mountpoint (inline)
                     fs__check_mounts (inline)
                     __statfs
                     entry_SYSCALL_64
                     sys_statfs
                     SYSC_statfs
                     user_statfs
                     user_path_at_empty
                     filename_lookup
                     path_lookupat
                     link_path_walk
                     inode_permission
                     __inode_permission
                     kernfs_iop_permission
                     kernfs_refresh_inode
                     security_inode_notifysecctx
                     selinux_inode_notifysecctx
                     selinux_inode_setsecurity
                     security_context_to_sid
                     security_context_to_sid_core
                     string_to_context_struct
                     symcmp
      Signed-off-by: NYao Jin <yao.jin@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0db64dd0
  2. 15 11月, 2016 1 次提交
    • J
      perf report: Show branch info in callchain entry for stdio mode · 8577ae6b
      Jin Yao 提交于
      If the branch is 100% predicted then the "predicted" is hidden.
      Similarly, if there is no branch tsx abort, the "abort" is hidden.
      There is only cycles shown (cycle is supported on skylake platform,
      older platform would be 0).
      
      If no iterations, the "iterations" is hidden.
      
      For example:
      
      |--29.93%--main div.c:39 (predicted:50.6%, cycles:1, iterations:18)
      |          main div.c:44 (predicted:50.6%, cycles:1)
      |          |
      |           --22.69%--main div.c:42 (cycles:2, iterations:17)
      |                     compute_flag div.c:28 (cycles:2)
      |                     |
      |                      --10.52%--compute_flag div.c:27 (cycles:1)
      |                                rand rand.c:28 (cycles:1)
      |                                rand rand.c:28 (cycles:1)
      |                                __random random.c:298 (cycles:1)
      |                                __random random.c:297 (cycles:1)
      |                                __random random.c:295 (cycles:1)
      |                                __random random.c:295 (cycles:1)
      |                                __random random.c:295 (cycles:1)
      |                                __random random.c:295 (cycles:6)
      Signed-off-by: NYao Jin <yao.jin@linux.intel.com>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linux-kernel@vger.kernel.org
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/1477876794-30749-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8577ae6b
  3. 23 9月, 2016 3 次提交
  4. 21 9月, 2016 2 次提交
  5. 14 9月, 2016 2 次提交
  6. 24 8月, 2016 3 次提交
  7. 15 6月, 2016 7 次提交
  8. 07 4月, 2016 1 次提交
  9. 08 3月, 2016 3 次提交
  10. 27 2月, 2016 2 次提交
    • N
      perf report: Left align dynamic entries in hierarchy · cb1fab91
      Namhyung Kim 提交于
      The dynamic entries are right-aligned unlike other entries since it
      usually has numeric value.  But for the hierarchy mode, left alignment
      is more appropriate IMHO.  Also trim spaces on the left so that we can
      easily identify the hierarchy.
      
      Before:
      
        $ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none
        ...
        #
        #       Overhead                                        gfp_flags /                ptr /          bytes_req
        # ..............  .................................................................................................
        #
            91.67%                   GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
               37.50%        0xffff8803f7669400
                  37.50%                       448
                8.33%        0xffff8803f766be00
                   8.33%                        96
                4.17%        0xffff8800d156dc00
                   4.17%                       704
      
      After:
      
        #       Overhead  gfp_flags / ptr / bytes_req
        # ..............  ....................................
        #
            91.67%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
               37.50%        0xffff8803f7669400
                  37.50%        448
                8.33%        0xffff8803f766be00
                   8.33%        96
                4.17%        0xffff8800d156dc00
                   4.17%        704
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cb1fab91
    • N
      perf report: Fix indentation of dynamic entries in hierarchy · d3a72fd8
      Namhyung Kim 提交于
      When dynamic entries are used in the hierarchy mode with multiple
      events, the output might not be aligned properly.  In the hierarchy
      mode, the each sort column is indented using total number of sort keys.
      So it keeps track of number of sort keys when adding them.  However
      a dynamic sort key can be added more than once when multiple events have
      same field names.  This results in unnecessarily long indentation in the
      output.
      
      For example perf kmem records following events:
      
        $ perf evlist --trace-fields -i perf.data.kmem
        kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kfree: trace_fields: call_site,ptr
        kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kmem_cache_free: trace_fields: call_site,ptr
        kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
        kmem:mm_page_free: trace_fields: page,order
      
      As you can see, many field names shared between kmem events.  So adding
      'ptr' dynamic sort key alone will set nr_sort_keys to 6.  And this adds
      many unnecessary spaces between columns.
      
      Before:
      
        $ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
        ...
        #                Overhead                 ptr
        # .......................  ...................................
        #
            99.89%                 0xffff8803ffb79720
             0.06%                 0xffff8803d228a000
             0.03%                 0xffff8803f7678f00
             0.00%                 0xffff880401dc5280
             0.00%                 0xffff880406172380
             0.00%                 0xffff8803ffac3a00
             0.00%                 0xffff8803ffac1600
      
      After:
      
        # Overhead                 ptr
        # ........  ....................
        #
            99.89%  0xffff8803ffb79720
             0.06%  0xffff8803d228a000
             0.03%  0xffff8803f7678f00
             0.00%  0xffff880401dc5280
             0.00%  0xffff880406172380
             0.00%  0xffff8803ffac3a00
             0.00%  0xffff8803ffac1600
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d3a72fd8
  11. 26 2月, 2016 1 次提交
    • N
      perf report: Show message for percent limit on stdio · bd4abd39
      Namhyung Kim 提交于
      When the hierarchy mode is used, some entries might be omiited due to a
      percent limit or filter.  In this case the output hierarchy is different
      than other entries.  Add an informative message to users about this.
      
      For example, when 4% of percent limit is applied:
      
      Before:
        #       Overhead  Command / Shared Object / Symbol
        # ..............  ..........................................
        #
            49.09%        swapper
               48.67%        [kernel.vmlinux]
                  34.42%        [k] intel_idle
            11.51%        firefox
                8.87%        libpthread-2.22.so
                   6.60%        [.] __GI___libc_recvmsg
            10.49%        gnome-shell
                4.74%        libc-2.22.so
            10.08%        Xorg
                6.11%        libc-2.22.so
                   5.27%        [.] __memcpy_sse2_unaligned
             6.15%        perf
      
      Note that, gnome-shell/libc has no symbols and perf has no dso/symbols.
      With that patch the output will look like below:
      
      After:
      
        #       Overhead  Command / Shared Object / Symbol
        # ..............  ..........................................
        #
            49.09%        swapper
               48.67%        [kernel.vmlinux]
                  34.42%        [k] intel_idle
            11.51%        firefox
                8.87%        libpthread-2.22.so
                   6.60%        [.] __GI___libc_recvmsg
            10.49%        gnome-shell
                4.74%        libc-2.22.so
                                no entry >= 4.00%
            10.08%        Xorg
                6.11%        libc-2.22.so
                   5.27%        [.] __memcpy_sse2_unaligned
             6.15%        perf
                             no entry >= 4.00%
      Suggested-and-Tested-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bd4abd39
  12. 25 2月, 2016 2 次提交
  13. 12 2月, 2016 1 次提交
    • A
      perf hists: Do column alignment on the format iterator · 89fee709
      Arnaldo Carvalho de Melo 提交于
      We were doing column alignment in the format function for each cell,
      returning a string padded with spaces so that when the next column is
      printed the cursor is at its column alignment.
      
      This ends up needlessly printing trailing spaces, do it at the format
      iterator, that is where we know if it is needed, i.e. if there is more
      columns to be printed.
      
      This eliminates the need for triming lines when doing a dump using 'P'
      in the TUI browser and also produces far saner results with things like
      piping 'perf report' to 'less'.
      
      Right now only the formatters for sym->name and the 'locked' column
      (perf mem report), that are the ones that end up at the end of lines
      in the default 'perf report', 'perf top' and 'perf mem report' tools,
      the others will be done in a subsequent patch.
      
      In the end the 'width' parameter for the formatters now mean, in
      'printf' terms, the 'precision', where before it was the field 'width'.
      Reported-by: NDave Jones <davej@codemonkey.org.uk>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      89fee709
  14. 03 2月, 2016 2 次提交
  15. 02 2月, 2016 4 次提交
  16. 07 1月, 2016 1 次提交
    • N
      perf tools: Skip dynamic fields not defined for current event · 361459f1
      Namhyung Kim 提交于
      When there are multiple events, each dynamic sort key is defined just
      for one event.  In this case other events will always show "N/A" for
      those fields.  But they are meaningless and consume precious screen
      width.
      
      Let's skip those undefined dynamic fields.
      
        $ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1
      
        $ perf report -s 'comm,kmalloc.*' --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 20K of event 'kmem:kmalloc'
        # Event count (approx.): 20533
        #
        # Overhead  Command           call_site                 ptr  bytes_req  bytes_alloc            gfp_flags
        # ........  .......  ..................  ..................  .........  ...........  ...................
        #
            99.89%  perf       ffffffffa01d4396  0xffff8803ffb79720         96           96    GFP_NOFS|GFP_ZERO
             0.06%  sleep      ffffffff8114e1cd  0xffff8803d228a000       4096         4096           GFP_KERNEL
             0.03%  perf       ffffffff811d6ae6  0xffff8803f7678f00        240          256  GFP_KERNEL|GFP_ZERO
             0.00%  perf       ffffffff812263c1  0xffff880406172380        128          128           GFP_KERNEL
             0.00%  perf       ffffffff812264b9  0xffff8803ffac1600        504          512           GFP_KERNEL
             0.00%  perf       ffffffff81226634  0xffff880401dc5280         28           32           GFP_KERNEL
             0.00%  sleep      ffffffff81226da9  0xffff8803ffac3a00        392          512           GFP_KERNEL
      
        # Samples: 20K of event 'kmem:kfree'
        # Event count (approx.): 20597
        #
        # Overhead  Command
        # ........  ..............
        #
            99.63%  perf
             0.14%  sleep
             0.11%  irq/36-iwlwifi
             0.11%  kworker/u16:0
             0.01%  Xorg
             0.00%  firefox
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-12-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      361459f1
  17. 20 11月, 2015 3 次提交
  18. 19 11月, 2014 1 次提交