1. 01 3月, 2019 1 次提交
    • J
      perf time-utils: Refactor time range parsing code · 284c4e18
      Jin Yao 提交于
      Jiri points out that we don't need any time checking and time string
      parsing if the --time option is not set. That makes sense.
      
      This patch refactors the time range parsing code, move the duplicated
      code from perf report and perf script to time_utils and check if --time
      option is set before parsing the time string. This patch is no logic
      change expected. So the usage of --time is same as before.
      
      For example:
      
      Select the first and second 10% time slices:
        perf report --time 10%/1,10%/2
        perf script --time 10%/1,10%/2
      
      Select the slices from 0% to 10% and from 30% to 40%:
        perf report --time 0%-10%,30%-40%
        perf script --time 0%-10%,30%-40%
      
      Select the time slices from timestamp 3971 to 3973
        perf report --time 3971,3973
        perf script --time 3971,3973
      
      Committer testing:
      
      Using the above examples, check before and after to see if it remains
      the same:
      
        $ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ]
        $
        $ perf report --time 10%/1,10%/2 > /tmp/report.before.1
        $ perf script --time 10%/1,10%/2 > /tmp/script.before.1
        $ perf report --time 0%-10%,30%-40% > /tmp/report.before.2
        $ perf script --time 0%-10%,30%-40% > /tmp/script.before.2
        $ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3
        $ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3
      
      For example, the 3rd test produces this slice:
      
        $ cat /tmp/script.before.3
              cat  3147 180457.375844:   2143 cycles:uppp:      7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so)
              cat  3147 180457.375986:   2245 cycles:uppp:      558b70f3d86e [unknown] (/usr/bin/cat)
              cat  3147 180457.376012:   2164 cycles:uppp:      7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so)
              cat  3147 180457.376140:   2921 cycles:uppp:      558b70f3a554 [unknown] (/usr/bin/cat)
              cat  3147 180457.376296:   2844 cycles:uppp:      7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so)
              cat  3147 180457.376431:   2717 cycles:uppp:      558b70f3b0ca [unknown] (/usr/bin/cat)
              cat  3147 180457.376667:   2630 cycles:uppp:      558b70f3d86e [unknown] (/usr/bin/cat)
              cat  3147 180457.376795:   2442 cycles:uppp:      7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so)
              cat  3147 180457.376927:   2376 cycles:uppp:  ffffffff9aa00163 [unknown] ([unknown])
              cat  3147 180457.376954:   2307 cycles:uppp:      7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so)
              cat  3147 180457.377116:   3091 cycles:uppp:      7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so)
              cat  3147 180457.377362:   2945 cycles:uppp:      558b70f3a3b0 [unknown] (/usr/bin/cat)
              cat  3147 180457.377517:   2727 cycles:uppp:      558b70f3a9aa [unknown] (/usr/bin/cat)
        $
      
      Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the
      above chunk translates into this 'perf report' output:
      
        $ cat /tmp/report.before.3
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'cycles:uppp' (time slices: 180457.375844,180457.377717)
        # Event count (approx.): 33552
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ......................
        #
            17.69%  cat      libc-2.28.so      [.] malloc
            14.53%  cat      cat               [.] 0x000000000000586e
            13.33%  cat      libc-2.28.so      [.] _int_malloc
             8.78%  cat      cat               [.] 0x00000000000023b0
             8.71%  cat      cat               [.] 0x0000000000002554
             8.13%  cat      cat               [.] 0x00000000000029aa
             8.10%  cat      cat               [.] 0x00000000000030ca
             7.28%  cat      libc-2.28.so      [.] read
             7.08%  cat      [unknown]         [k] 0xffffffff9aa00163
             6.39%  cat      libc-2.28.so      [.] cfree@GLIBC_2.2.5
      
        #
        # (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
        #
        $
      
      Now lets see after applying this patch, nothing should change:
      
        $ perf report --time 10%/1,10%/2 > /tmp/report.after.1
        $ perf script --time 10%/1,10%/2 > /tmp/script.after.1
        $ perf report --time 0%-10%,30%-40% > /tmp/report.after.2
        $ perf script --time 0%-10%,30%-40% > /tmp/script.after.2
        $ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3
        $ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3
        $ diff -u /tmp/report.before.1 /tmp/report.after.1
        $ diff -u /tmp/script.before.1 /tmp/script.after.1
        $ diff -u /tmp/report.before.2 /tmp/report.after.2
        --- /tmp/report.before.2	2019-03-01 11:01:53.526094883 -0300
        +++ /tmp/report.after.2	2019-03-01 11:09:18.231770467 -0300
        @@ -352,5 +352,5 @@
      
         #
        -# (Tip: Generate a script for your data: perf script -g <lang>)
        +# (Tip: Treat branches as callchains: perf report --branch-history)
         #
        $ diff -u /tmp/script.before.2 /tmp/script.after.2
        $ diff -u /tmp/report.before.3 /tmp/report.after.3
        --- /tmp/report.before.3	2019-03-01 11:03:08.890045588 -0300
        +++ /tmp/report.after.3	2019-03-01 11:09:40.660224002 -0300
        @@ -22,5 +22,5 @@
      
         #
        -# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
        +# (Tip: List events using substring match: perf list <keyword>)
         #
        $ diff -u /tmp/script.before.3 /tmp/script.after.3
        $
      
      Cool, just the 'perf report' tips changed, QED.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      284c4e18
  2. 25 2月, 2019 1 次提交
  3. 23 2月, 2019 1 次提交
    • J
      perf data: Add global path holder · 2d4f2799
      Jiri Olsa 提交于
      Add a 'path' member to 'struct perf_data'. It will keep the configured
      path for the data (const char *). The path in struct perf_data_file is
      now dynamically allocated (duped) from it.
      
      This scheme is useful/used in following patches where struct
      perf_data::path holds the 'configure' directory path and struct
      perf_data_file::path holds the allocated path for specific files.
      
      Also it actually makes the code little simpler.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
      [ Fixup data-convert-bt.c missing conversion ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2d4f2799
  4. 21 2月, 2019 1 次提交
    • J
      perf script: Allow +- operator for type specific fields option · 6ef362fd
      Jiri Olsa 提交于
      Add support to add/remove fields for specific event types in -F option.
      It's now possible to use '+-' after event type, like:
      
        # cat > test.c
        #include <stdio.h>
      
        int main(void)
        {
           printf("Hello world\n");
           while(1) {}
        }
        ^D
        # gcc -g -o test test.c
        # perf probe -x test 'test.c:5'
        # perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
        ...
      
        # perf script -Ftrace:+period,-cpu
                  test  3859 396291.117343:      10275 cpu/cpu-cycles,period=10000/:      7f..
                  test  3859 396291.118234:      11041 cpu/cpu-cycles,period=10000/:  ffffff..
                  test  3859 396291.118234:          1              probe_test:main:
                  test  3859 396291.118248:       8668 cpu/cpu-cycles,period=10000/:  ffffff..
                  test  3859 396291.118263:      10139 cpu/cpu-cycles,period=10000/:  ffffff..
      
      Committer testing:
      
      Couldn't make the test above work, but tested it with:
      
        # perf probe -x hello main
        Added new event:
          probe_hello:main     (on main in /home/acme/c/hello)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_hello:main -aR sleep 1
      
        # perf record -e probe_hello:main ./hello
        hello, world
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
        # perf script
                 hello 21454 [002] 254116.874005: probe_hello:main: (401126)
        #
        # perf script -Ftrace:+period,-cpu
                 hello 21454 254116.874005:          1 probe_hello:main: (401126)
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6ef362fd
  5. 06 2月, 2019 1 次提交
  6. 21 1月, 2019 1 次提交
    • T
      perf script: Fix crash when processing recorded stat data · 8bf8c6da
      Tony Jones 提交于
      While updating perf to work with Python3 and Python2 I noticed that the
      stat-cpi script was dumping core.
      
      $ perf  stat -e cycles,instructions record -o /tmp/perf.data /bin/false
      
       Performance counter stats for '/bin/false':
      
                 802,148      cycles
      
                 604,622      instructions                                                       802,148      cycles
                 604,622      instructions
      
             0.001445842 seconds time elapsed
      
      $ perf script -i /tmp/perf.data -s scripts/python/stat-cpi.py
      Segmentation fault (core dumped)
      ...
      ...
          rblist=rblist@entry=0xb2a200 <rt_stat>,
          new_entry=new_entry@entry=0x7ffcb755c310) at util/rblist.c:33
          ctx=<optimized out>, type=<optimized out>, create=<optimized out>,
          cpu=<optimized out>, evsel=<optimized out>) at util/stat-shadow.c:118
          ctx=<optimized out>, type=<optimized out>, st=<optimized out>)
          at util/stat-shadow.c:196
          count=count@entry=727442, cpu=cpu@entry=0, st=0xb2a200 <rt_stat>)
          at util/stat-shadow.c:239
          config=config@entry=0xafeb40 <stat_config>,
          counter=counter@entry=0x133c6e0) at util/stat.c:372
      ...
      ...
      
      The issue is that since 1fcd0394 perf_stat__update_shadow_stats now calls
      update_runtime_stat passing rt_stat rather than calling update_stats but
      perf_stat__init_shadow_stats has never been called to initialize rt_stat in
      the script path processing recorded stat data.
      
      Since I can't see any reason why perf_stat__init_shadow_stats() is presently
      initialized like it is in builtin-script.c::perf_sample__fprint_metric()
      [4bd1bef8] I'm proposing it instead be initialized once in __cmd_script
      
      Committer testing:
      
      After applying the patch:
      
        # perf script -i /tmp/perf.data -s tools/perf/scripts/python/stat-cpi.py
             0.001970: cpu -1, thread -1 -> cpi 1.709079 (1075684/629394)
        #
      
      No segfault.
      Signed-off-by: NTony Jones <tonyj@suse.de>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Fixes: 1fcd0394 ("perf stat: Update per-thread shadow stats")
      Link: http://lkml.kernel.org/r/20190120191414.12925-1-tonyj@suse.deSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8bf8c6da
  7. 18 1月, 2019 1 次提交
    • A
      perf script: Fix crash with printing mixed trace point and other events · 96167167
      Andi Kleen 提交于
      'perf script' crashes currently when printing mixed trace points and
      other events because the trace format does not handle events without
      trace meta data. Add a simple check to avoid that.
      
        % cat > test.c
        main()
        {
            printf("Hello world\n");
        }
        ^D
        % gcc -g -o test test.c
        % sudo perf probe -x test 'test.c:3'
        % perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
        % perf script
        <segfault>
      
      Committer testing:
      
      Before:
      
        # perf probe -x /lib64/libc-2.28.so malloc
        Added new event:
          probe_libc:malloc    (on malloc in /usr/lib64/libc-2.28.so)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe_libc:malloc -aR sleep 1
      
        # perf probe -l
        probe_libc:malloc    (on __libc_malloc@malloc/malloc.c in /usr/lib64/libc-2.28.so)
        # perf record -e '{cpu/cpu-cycles,period=10000/,probe_libc:*}:S' sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.023 MB perf.data (40 samples) ]
        # perf script
        Segmentation fault (core dumped)
        ^C
        #
      
      After:
      
        # perf script | head -6
           sleep 2888 94796.944981: 16198 cpu/cpu-cycles,period=10000/: ffffffff925dc04f get_random_u32+0x1f (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944981: probe_libc:malloc:
           sleep 2888 94796.944983:  4713 cpu/cpu-cycles,period=10000/: ffffffff922763af change_protection+0xcf (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944983: probe_libc:malloc:
           sleep 2888 94796.944986:  9934 cpu/cpu-cycles,period=10000/: ffffffff922777e0 move_page_tables+0x0 (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944986: probe_libc:malloc:
        #
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190117194834.21940-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      96167167
  8. 02 1月, 2019 1 次提交
  9. 29 12月, 2018 1 次提交
    • A
      perf script: Fix LBR skid dump problems in brstackinsn · 61f61159
      Andi Kleen 提交于
      This is a fix for another instance of the skid problem Milian recently
      found [1]
      
      The LBRs don't freeze at the exact same time as the PMI is triggered.
      The perf script brstackinsn code that dumps LBR assembler assumes that
      the last branch in the LBR leads to the sample point.  But with skid
      it's possible that the CPU executes one or more branches before the
      sample, but which do not appear in the LBR.
      
      What happens then is either that the sample point is before the last LBR
      branch. In this case the dumper sees a negative length and ignores it.
      Or it the sample point is long after the last branch. Then the dumper
      sees a very long block and dumps it upto its block limit (16k bytes),
      which is noise in the output.
      
      On typical sample session this can happen regularly.
      
      This patch tries to detect and handle the situation. On the last block
      that is dumped by the LBR dumper we always stop on the first branch. If
      the block length is negative just scan forward to the first branch.
      Otherwise scan until a branch is found.
      
      The PT decoder already has a function that uses the instruction decoder
      to detect branches, so we can just reuse it here.
      
      Then when a terminating branch is found print an indication and stop
      dumping. This might miss a few instructions, but at least shows no
      runaway blocks.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Link: http://lkml.kernel.org/r/20181120050617.4119-1-andi@firstfloor.org
      [ Resolved conflict with dd2e18e9 ("perf tools: Support 'srccode' output") ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      61f61159
  10. 18 12月, 2018 2 次提交
    • A
      perf tools: Support 'srccode' output · dd2e18e9
      Andi Kleen 提交于
      When looking at PT or brstackinsn traces with 'perf script' it can be
      very useful to see the source code. This adds a simple facility to print
      them with 'perf script', if the information is available through dwarf
      
        % perf record ...
        % perf script -F insn,ip,sym,srccode
        ...
      
                  4004c6 main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004c6 main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004b3 main
        6                       v++;
      
        % perf record -b ...
        % perf script -F insn,ip,sym,srccode,brstackinsn
      
        ...
               main+22:
                0000000000400543        insn: e8 ca ff ff ff            # PRED
        |18                     f1();
                f1:
                0000000000400512        insn: 55
        |10       {
                0000000000400513        insn: 48 89 e5
                0000000000400516        insn: b8 00 00 00 00
        |11             f2();
                000000000040051b        insn: e8 d6 ff ff ff            # PRED
                f2:
                00000000004004f6        insn: 55
        |5        {
                00000000004004f7        insn: 48 89 e5
                00000000004004fa        insn: 8b 05 2c 0b 20 00
        |6              c = a / b;
                0000000000400500        insn: 8b 0d 2a 0b 20 00
                0000000000400506        insn: 99
                0000000000400507        insn: f7 f9
                0000000000400509        insn: 89 05 29 0b 20 00
                000000000040050f        insn: 90
        |7        }
                0000000000400510        insn: 5d
                0000000000400511        insn: c3                        # PRED
                f1+14:
                0000000000400520        insn: b8 00 00 00 00
        |12             f2();
                0000000000400525        insn: e8 cc ff ff ff            # PRED
                f2:
                00000000004004f6        insn: 55
        |5        {
                00000000004004f7        insn: 48 89 e5
                00000000004004fa        insn: 8b 05 2c 0b 20 00
        |6              c = a / b;
      
      Not supported for callchains currently, would need some layout changes
      there.
      
      Committer notes:
      
      Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
      warning:
      
        In file included from util/srccode.c:19:0:
        /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
         #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
          ^~~~~~~
        cc1: all warnings being treated as errors
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd2e18e9
    • A
      perf script: Use fallbacks for branch stacks · 692d0e63
      Adrian Hunter 提交于
      Branch stacks do not necessarily have the same cpumode as the 'ip'. Use
      the fallback functions in those cases.
      
      This patch depends on patch "perf tools: Add fallback functions for cases
      where cpumode is insufficient".
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: stable@vger.kernel.org # 4.19
      Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      692d0e63
  11. 21 11月, 2018 2 次提交
    • M
      perf script: Share code and output format for uregs and iregs output · 9add8fe8
      Milian Wolff 提交于
      The iregs output was missing the newline at end as well as the leading
      ABI output. This made it hard to compare the iregs and uregs values.
      Instead, use a single function to output the register values and use it
      for both, iregs and uregs, to ensure the output is consistent.
      
      Before:
      
        perf  7049 [-01]  1343.354347:          1 cycles:ppp:
              ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
          AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1  ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff
      
        perf  7049 [-01]  1343.354363:          1 cycles:ppp:
              ...
      
      After:
      
        perf  7049 [-01]  1343.354347:          1 cycles:ppp:
              ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
              ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux)
          ABI:2    AX:0x80000000    BX:0x0    CX:0x0    DX:0x7    SI:0xf    DI:0x286    BP:0xffff95bc8213a460    SP:0xffffacbf0ba97d18    IP:0xffffffffa7bc21cd FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x2    R9:0x21440   R10:0x33816fb3b8c   R11:0x1   R12:0xffff95bc8213a460   R13:0xffff95bc8213a400   R14:0xffff95bc8213a400   R15:0x1
          ABI:2    AX:0xffffffffffffffda    BX:0xffffffffffffffff    CX:0x7f84ad85798b    DX:0x560209699d50    SI:0x7ffe2c7a6820    DI:0x7ffe2c7a8c9b    BP:0x7ffe2c7a20d0    SP:0x7ffe2c7a2058    IP:0x7f84ad85798b FLAGS:0x206    CS:0x33    SS:0x2b    R8:0x7ffe2c7a2030    R9:0x7f84ae55f010   R10:0x8   R11:0x206   R12:0xffffffffffffffff   R13:0xffffffffffffffff   R14:0xffffffffffffffff   R15:0xffffffffffffffff
      
        perf  7049 [-01]  1343.354363:          1 cycles:ppp:
              ...
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181107223437.9071-1-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9add8fe8
    • M
      perf script: Add newline after uregs output · b07d16f7
      Milian Wolff 提交于
      This change makes it much easier to easily distinguish between
      consecutive samples by keeping the empty line between them, like we see
      when we do not enable uregs output.
      
      Before:
      
        cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
                    7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...
        cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
                    7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...
      
      After:
      
        cpp-inlining 28298 [-01] 54837.342780:    3068085 cycles:pp:
                    7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x0    BX:0x40f56cf6    CX:0x294a3ae7    ...
      
        cpp-inlining 28298 [-01] 54837.344493:    2881929 cycles:pp:
                    7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so)
                    ...
         ABI:2    AX:0x40d440c7    BX:0x40d440c7    CX:0x4d45e5da    ...
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181107093705.16346-1-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b07d16f7
  12. 25 10月, 2018 5 次提交
    • A
      perf script: Support total cycles count · fe57120e
      Andi Kleen 提交于
      For 'perf script' brstackinsn also print a running cycles count.  This
      makes it easier to calculate cycle deltas for code sections measured
      with LBRs.
      
      % perf record -b -a sleep 1
      % perf script -F +brstackinsn
      ...
              00007f73ecc41083        insn: 74 06                     # PRED 9 cycles [17] 1.11 IPC
              00007f73ecc4108b        insn: a8 10
              00007f73ecc4108d        insn: 74 71                     # PRED 1 cycles [18] 1.00 IPC
              00007f73ecc41100        insn: 48 8b 46 10
              00007f73ecc41104        insn: 4c 8b 38
              00007f73ecc41107        insn: 4d 85 ff
              00007f73ecc4110a        insn: 0f 84 b0 00 00 00
              00007f73ecc41110        insn: 83 43 58 01
              00007f73ecc41114        insn: 48 89 df
              00007f73ecc41117        insn: e8 94 73 04 00            # PRED 6 cycles [24] 1.00 IPC
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Link: http://lkml.kernel.org/r/20180924170732.GA28040@tassilo.jf.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fe57120e
    • A
      perf script: Implement --graph-function · 99f753f0
      Andi Kleen 提交于
      Add a ftrace style --graph-function argument to 'perf script' that
      allows to print itrace function calls only below a given function. This
      makes it easier to find the code of interest in a large trace.
      
      % perf record -e intel_pt//k -a sleep 1
      % perf script --graph-function group_sched_in --call-trace
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          calc_timer_values
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_cpu
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          arch_perf_update_userpage
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              __fentry__
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              using_native_sched_clock
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_stable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_enable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])          group_sched_in
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              event_sched_in.isra.107
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_event_set_state.part.71
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_time
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_pmu_disable
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_log_itrace_start
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_userpage
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          calc_timer_values
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_cpu
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          arch_perf_update_userpage
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              __fentry__
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              using_native_sched_clock
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_stable
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      99f753f0
    • A
      tools script: Add --call-trace and --call-ret-trace · d1b1552e
      Andi Kleen 提交于
      Add short cut options to print PT call trace and call-ret-trace, for
      calls and call and returns. Roughly corresponds to ftrace function
      tracer and function graph tracer.
      
      Just makes these common use cases nicer to use.
      
      % perf record -a -e intel_pt// sleep 1
      % perf script --call-trace
      	    perf   900 [000] 194167.205652203: ([kernel.kallsyms])          perf_pmu_enable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          event_filter_match
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
      
      % perf script --call-ret-trace
      	    perf   900 [000] 194167.205652203:   tr strt     ([unknown])        pt_config
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            pt_config
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            pt_event_add
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            perf_pmu_enable
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            perf_pmu_nop_void
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            event_sched_in.isra.107
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            perf_pmu_nop_int
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            group_sched_in
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            event_filter_match
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            event_filter_match
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            group_sched_in
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])                perf_pmu_nop_txn
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                event_sched_in.isra.107
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                    perf_event_set_state.part.71
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180920180540.14039-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d1b1552e
    • A
      perf script: Make itrace script default to all calls · 4eb06815
      Andi Kleen 提交于
      By default 'perf script' for itrace outputs sampled instructions or
      branches. In my experience this is confusing to users because it's hard
      to correlate with real program behavior. The sampling makes sense for
      tools like 'perf report' that actually sample to reduce the run time,
      but run time is normally not a problem for 'perf script'.  It's better
      to give an accurate representation of the program flow.
      
      Default 'perf script' to output all calls for itrace. That's a much saner
      default. The old behavior can be still requested with 'perf script'
      --itrace=ibxwpe100000
      
      v2: Fix ETM build failure
      v3: Really fix ETM build failure (Kim Phillips)
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Link: http://lkml.kernel.org/r/20180920180540.14039-3-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4eb06815
    • A
      perf script: Add --insn-trace for instruction decoding · b585ebdb
      Andi Kleen 提交于
      Add a --insn-trace short hand option for decoding and disassembling
      instruction streams for intel_pt. This automatically pipes the output
      into the xed disassembler to generate disassembled instructions.  This
      just makes this use model much nicer to use.
      
      Before
      
        % perf record -e intel_pt// ...
        % perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64
         swapper 0 [000] 17276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms])    nopl  %eax, (%rax,%rax,1)
         swapper 0 [000] 17276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms])    add $0x10, %rsp
         swapper 0 [000] 17276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms])    popq  %rbx
         swapper 0 [000] 17276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms])    popq  %rbp
         swapper 0 [000] 17276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms])    popq  %r12
         swapper 0 [000] 17276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms])    popq  %r13
         swapper 0 [000] 17276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms])    popq  %r14
         swapper 0 [000] 17276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms])    popq  %r15
         swapper 0 [000] 17276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms])    retq
         swapper 0 [000] 17276.429606186: ffffffff8101063e pt_event_add ([kernel.kallsyms])         cmpl  $0x1, 0x1b0(%rbx)
         swapper 0 [000] 17276.429606186: ffffffff81010645 pt_event_add ([kernel.kallsyms])         mov $0xffffffea, %eax
         swapper 0 [000] 17276.429606186: ffffffff8101064a pt_event_add ([kernel.kallsyms])         mov $0x0, %edx
         swapper 0 [000] 17276.429606186: ffffffff8101064f pt_event_add ([kernel.kallsyms])         popq  %rbx
         swapper 0 [000] 17276.429606186: ffffffff81010650 pt_event_add ([kernel.kallsyms])         cmovnz %edx, %eax
         swapper 0 [000] 17276.429606186: ffffffff81010653 pt_event_add ([kernel.kallsyms])         jmp 0xffffffff81010635
         swapper 0 [000] 17276.429606186: ffffffff81010635 pt_event_add ([kernel.kallsyms])         retq
         swapper 0 [000] 17276.429606186: ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])       test %eax, %eax
      
      Now:
      
        % perf record -e intel_pt// ...
        % perf script --insn-trace --xed
        ... same output ...
      
      XED needs to be installed with:
      
        $ git clone https://github.com/intelxed/mbuild.git mbuild
        $ git clone https://github.com/intelxed/xed
        $ cd xed
        $ ./mfile.py
        $ ./mfile.py examples
        $ sudo ./mfile.py --prefix=/usr/local install
        $ sudo cp obj/examples/xed /usr/local/bin
        $ xed | head -3
        ERROR: required argument(s) were missing
        Copyright (C) 2017, Intel Corporation. All rights reserved.
        XED version: [v10.0-328-g7d62c8c49b7b]
        $
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20180920180540.14039-2-andi@firstfloor.org
      [ Fixed up whitespace damage, added the 'mfile.py examples + cp obj/examples/xed ... ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b585ebdb
  13. 23 10月, 2018 1 次提交
    • M
      perf script: Flush output stream after events in verbose mode · 7ee40678
      Milian Wolff 提交于
      When the perf script output is written to a terminal stream, the normal
      output of `perf script` would get buffered, but its debug output would
      be written directly. This made it quite hard to figure out where a given
      debug output is coming from.
      
      We can improve on this by flushing the output buffer after processing an
      event. To see the value, compare the following output for a `perf script
      -v` run:
      
      Before this patch:
      ```
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      unwind: find_proc_info dso /usr/lib/ld-2.28.so
      unwind: reg 6, val 0
      unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      unwind: find_proc_info dso /usr/lib/ld-2.28.so
      unwind: reg 6, val 0
      unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      unwind: find_proc_info dso /usr/lib/ld-2.28.so
      unwind: reg 6, val 0
      unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      ... lots and lots of verbose debug output
      cpp-inlining 24617 90229.122036534:          1 cycles:uppp:
                  7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
      
      cpp-inlining 24617 90229.122043974:          1 cycles:uppp:
                  7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
      ...
      ```
      
      After this patch:
      ```
      ...
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      unwind: find_proc_info dso /usr/lib/ld-2.28.so
      unwind: reg 6, val 0
      unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
      cpp-inlining 24617 90229.122036534:          1 cycles:uppp:
                  7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
      
      unwind: reg 16, val 7faf7dfdc000
      unwind: reg 7, val 7ffc80811e30
      unwind: find_proc_info dso /usr/lib/ld-2.28.so
      unwind: reg 6, val 0
      unwind: _start:ip = 0x7faf7dfdc000 (0x2000)
      cpp-inlining 24617 90229.122043974:          1 cycles:uppp:
                  7faf7dfdc000 _start+0x0 (/usr/lib/ld-2.28.so)
      ...
      ```
      
      This new output format makes it much easier to use perf script output
      for debugging purposes, e.g. to investigate broken dwarf unwinding.
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181021191424.16183-2-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7ee40678
  14. 22 10月, 2018 1 次提交
  15. 20 9月, 2018 4 次提交
    • A
      perf script: Enhance sample flags for trace begin / end · 62cb1b88
      Adrian Hunter 提交于
      Allow for different combinations of sample flags with "trace begin" or
      "trace end".
      
      Previously, the Intel PT decoder would indicate begin / end by a branch
      from / to zero. That hides useful information, in particular when a
      trace ends with a call. Before remedying that, prepare 'perf script' to
      display sample flags with more combinations that include trace begin /
      end. In those cases display 'tr start' and 'tr end' separately.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20180920130048.31432-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      62cb1b88
    • A
      perf script: Print DSO for callindent · a78cdee6
      Andi Kleen 提交于
      Now that we don't need to print the IP/ADDR for callindent the DSO is
      also not printed. It's useful for some cases, so add an own DSO printout
      for callindent for the case when IP/ADDR is not enabled.
      
      Before:
      
      % perf script --itrace=cr -F +callindent,-ip,-sym,-symoff,-addr
               swapper     0 [000]  3377.917072:          1   branches: pt_config
               swapper     0 [000]  3377.917072:          1   branches:     pt_config
               swapper     0 [000]  3377.917072:          1   branches:     pt_event_add
               swapper     0 [000]  3377.917072:          1   branches:     perf_pmu_enable
               swapper     0 [000]  3377.917072:          1   branches:     perf_pmu_nop_void
               swapper     0 [000]  3377.917072:          1   branches:     event_sched_in.isra.107
               swapper     0 [000]  3377.917072:          1   branches:     __x86_indirect_thunk_rax
               swapper     0 [000]  3377.917072:          1   branches:     perf_pmu_nop_int
               swapper     0 [000]  3377.917072:          1   branches:     group_sched_in
               swapper     0 [000]  3377.917072:          1   branches:     event_filter_match
               swapper     0 [000]  3377.917072:          1   branches:     event_filter_match
               swapper     0 [000]  3377.917072:          1   branches:     group_sched_in
      
      After:
      
               swapper     0 [000]  3377.917072:          1   branches: ([unknown])   pt_config
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       pt_config
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       pt_event_add
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       perf_pmu_enable
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       perf_pmu_nop_void
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       event_sched_in.isra.107
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       __x86_indirect_thunk_rax
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       perf_pmu_nop_int
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       group_sched_in
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       event_filter_match
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       event_filter_match
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])       group_sched_in
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])           __x86_indirect_thunk_rax
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])           perf_pmu_nop_txn
               swapper     0 [000]  3377.917072:          1   branches: ([kernel.kallsyms])           event_sched_in.isra.107
      
      (in the kernel case of course it's not very useful, but it's important
      with user programs where symbols are not unique)
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180918123214.26728-6-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a78cdee6
    • A
      perf script: Allow sym and dso without ip, addr · 37fed3de
      Andi Kleen 提交于
      Currently sym and dso require printing ip and addr because the print
      function is tied to those outputs. With callindent it makes sense to
      print the symbol or dso without numerical IP or ADDR. So change the
      dependency check to only check the underlying attribute.
      
      Also the branch target output relies on the user_set flag to determine
      if the branch target should be implicitely printed. When modifying the
      fields with + or - also set user_set, so that ADDR can be removed. We
      also need to set wildcard_set to make the initial sanity check pass.
      
      This allows to remove a lot of noise in callindent output by dropping
      the numerical addresses, which are not all that useful.
      
      Before
      
      % perf script --itrace=cr -F +callindent
               swapper     0 [000] 156546.354971:          1   branches: pt_config                                       0 [unknown] ([unknown]) => ffffffff81010486 pt_config ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     pt_config                    ffffffff81010499 pt_config ([kernel.kallsyms]) => ffffffff8101063e pt_event_add ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     pt_event_add                 ffffffff81010635 pt_event_add ([kernel.kallsyms]) => ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     perf_pmu_enable              ffffffff8115e726 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff811579b0 perf_pmu_enable ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     perf_pmu_nop_void            ffffffff81151730 perf_pmu_nop_void ([kernel.kallsyms]) => ffffffff8115e72b event_sched_in.isra.107 ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     event_sched_in.isra.107      ffffffff8115e737 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff8115e7a5 group_sched_in ([kernel.kallsyms])
               swapper     0 [000] 156546.354971:          1   branches:     __x86_indirect_thunk_rax     ffffffff8115e7f6 group_sched_in ([kernel.kallsyms]) => ffffffff81a03000 __x86_indirect_thunk_rax ([kernel.kallsyms])
      
      After
      
      % perf script --itrace=cr -F +callindent,-ip,-sym,-symoff
             swapper     0 [000] 156546.354971:          1   branches:  pt_config
               swapper     0 [000] 156546.354971:          1   branches:      pt_config
               swapper     0 [000] 156546.354971:          1   branches:      pt_event_add
               swapper     0 [000] 156546.354971:          1   branches:       perf_pmu_enable
               swapper     0 [000] 156546.354971:          1   branches:       perf_pmu_nop_void
               swapper     0 [000] 156546.354971:          1   branches:       event_sched_in.isra.107
               swapper     0 [000] 156546.354971:          1   branches:       __x86_indirect_thunk_rax
               swapper     0 [000] 156546.354971:          1   branches:       perf_pmu_nop_int
               swapper     0 [000] 156546.354971:          1   branches:       group_sched_in
               swapper     0 [000] 156546.354971:          1   branches:       event_filter_match
               swapper     0 [000] 156546.354971:          1   branches:       event_filter_match
               swapper     0 [000] 156546.354971:          1   branches:       group_sched_in
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180918123214.26728-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      37fed3de
    • A
      perf tools: Report itrace options in help · c12e039d
      Andi Kleen 提交于
      I often forget all the options that --itrace accepts. Instead of burying
      them in the man page only report them in the normal command line help
      too to make them easier accessible.
      
      v2: Align
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180914031038.4160-2-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c12e039d
  16. 19 9月, 2018 1 次提交
  17. 31 8月, 2018 1 次提交
  18. 14 8月, 2018 1 次提交
  19. 25 6月, 2018 3 次提交
  20. 09 6月, 2018 1 次提交
    • S
      perf script: Show hw-cache events · fad76d43
      Seeteena Thoufeek 提交于
      'perf script' fails to report hardware cache events (PERF_TYPE_HW_CACHE)
      where as 'perf report' shows the samples. Fix it. Ex,
      
        # perf record -e L1-dcache-loads ./a.out
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.008 MB perf.data (11 samples)]
      
      Before patch:
      
        # perf script | wc -l
        0
      
      After patch:
      
        # perf script | wc -l
        11
      
      Committer testing:
      
        [root@jouet ~]# perf script | head -30 | tail
              Timer 9803 [2] 8.963330:  1554 L1-dcache-loads: 7ffef89baae4 __vdso_clock_gettime+0xf4 ([vdso])
            swapper    0 [2] 8.963343:  5626 L1-dcache-loads: ffffffffa66f4f6b cpuidle_not_av+0xb (/lib/modules/4.17.0-rc5/build/vmlinux)
            firefox 4853 [2] 8.964070: 18935 L1-dcache-loads: 7f0b9a00dc30 xcb_poll_for_event+0x0 (/usr/lib64/libxcb.so.1.1.0)
        Softwar~cTh 4928 [2] 8.964548: 15928 L1-dcache-loads: ffffffffa60d795c update_curr+0x10c (/lib/modules/4.17.0-rc5/build/vmlinux)
            firefox 4853 [2] 8.964675: 14978 L1-dcache-loads: ffffffffa6897018 mutex_unlock+0x18 (/lib/modules/4.17.0-rc5/build/vmlinux)
        gnome-shell 2026 [3] 8.964693: 50670 L1-dcache-loads: 7fa08854de6d g_source_iter_next+0x6d (/usr/lib64/libglib-2.0.so.0.5400.3)
         Compositor 4929 [1] 8.964784: 71772 L1-dcache-loads: 7f0b936bf078 [unknown] (/usr/lib64/firefox/libxul.so)
           Xwayland 2096 [2] 8.964919: 16799 L1-dcache-loads: 7f68ce2fcb8a glXGetCurrentContext+0x1a (/usr/lib64/libGLX.so.0.0.0)
        gnome-shell 2026 [3] 8.964997: 50670 L1-dcache-loads: 7fa08854de6d g_source_iter_next+0x6d (/usr/lib64/libglib-2.0.so.0.5400.3)
        [root@jouet ~]#
      Signed-off-by: NSeeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1528455748-20087-1-git-send-email-s1seetee@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fad76d43
  21. 05 6月, 2018 2 次提交
  22. 19 5月, 2018 1 次提交
    • S
      perf script: Show symbol offsets by default · 7903a708
      Sandipan Das 提交于
      Since the ip shown for a symbol is now always a virtual address, it
      becomes difficult to correlate this with objdump output and determine
      the exact instruction address. So, we always show the offset from the
      start of the symbol.
      
      This can be verified on a powerpc64le system running Fedora 27 as
      follows:
      
        # perf probe -a sys_write
        # perf record -e probe:sys_write -g ~/test
      
      Before applying this patch:
      
        # perf script
      
        test  9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0)
                c0000000004025b0 sys_write (/lib/modules/4.17.0-rc4+/build/vmlinux)
                c00000000000b9e0 system_call (/lib/modules/4.17.0-rc4+/build/vmlinux)
                    7fffb70d8234 __GI___libc_write (/usr/lib64/libc-2.26.so)
                    7fffb7052c74 _IO_file_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
                        5afc1818 [unknown] ([unknown])
                    7fffb7051a60 new_do_write (/usr/lib64/libc-2.26.so)
                    7fffb7054638 _IO_do_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
                    7fffb7054bbc _IO_file_overflow@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
                    7fffb7055a24 __overflow (/usr/lib64/libc-2.26.so)
                    7fffb7044548 _IO_puts (/usr/lib64/libc-2.26.so)
                        10000440 main (/home/sandipan/test)
                    7fffb6fe36a0 generic_start_main.isra.0 (/usr/lib64/libc-2.26.so)
                    7fffb6fe3898 __libc_start_main (/usr/lib64/libc-2.26.so)
                               0 [unknown] ([unknown])
        ...
      
      After applying this patch:
      
        # perf script
      
        test  9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0)
                c0000000004025b0 sys_write+0x10 (/lib/modules/4.17.0-rc4+/build/vmlinux)
                c00000000000b9e0 system_call+0x58 (/lib/modules/4.17.0-rc4+/build/vmlinux)
                    7fffb70d8234 __GI___libc_write+0x24 (/usr/lib64/libc-2.26.so)
                    7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 (/usr/lib64/libc-2.26.so)
                        5afc1818 [unknown] ([unknown])
                    7fffb7051a60 new_do_write+0x90 (/usr/lib64/libc-2.26.so)
                    7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 (/usr/lib64/libc-2.26.so)
                    7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c (/usr/lib64/libc-2.26.so)
                    7fffb7055a24 __overflow+0x64 (/usr/lib64/libc-2.26.so)
                    7fffb7044548 _IO_puts+0x218 (/usr/lib64/libc-2.26.so)
                        10000440 main+0x20 (/home/sandipan/test)
                    7fffb6fe36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
                    7fffb6fe3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
                               0 [unknown] ([unknown])
        ...
      Signed-off-by: NSandipan Das <sandipan@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20180517063326.6319-2-sandipan@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7903a708
  23. 27 4月, 2018 4 次提交
  24. 17 4月, 2018 1 次提交
    • A
      perf script: Extend misc field decoding with switch out event type · bf30cc18
      Alexey Budankov 提交于
      Append 'p' sign to 'S' tag designating the type of context switch out event so
      'Sp' means preemption context switch. Documentation is extended to cover
      new presentation changes.
      
        $ perf script --show-switch-events -F +misc -I -i perf.data:
      
                hdparm 4073 [004] U  762.198265:     380194 cycles:ppp:      7faf727f5a23 strchr (/usr/lib64/ld-2.26.so)
                hdparm 4073 [004] K  762.198366:     441572 cycles:ppp:  ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux)
                hdparm 4073 [004] S  762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
               swapper    0 [004]    762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4073/4073
               swapper    0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 4073/4073
                hdparm 4073 [004]    762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
               swapper    0 [007] K  762.198514:    2303073 cycles:ppp:  ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux)
               swapper    0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 1134/1134
        kworker/u16:18 1134 [007]    762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
        kworker/u16:18 1134 [007] S  762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5fc65ce7-8ca5-53ae-8858-8ddd27290575@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bf30cc18
  25. 12 4月, 2018 1 次提交