1. 21 6月, 2017 1 次提交
  2. 12 11月, 2016 1 次提交
  3. 30 3月, 2016 1 次提交
  4. 29 9月, 2015 3 次提交
    • A
      perf intel-pt: Add mispred-all config option to aid use with autofdo · ba11ba65
      Adrian Hunter 提交于
      autofdo incorrectly expects branch flags to include either mispred or
      predicted.  In fact mispred = predicted = 0 is valid and means the flags
      are not supported, which they aren't by Intel PT.
      
      To make autofdo work, add a config option which will cause Intel PT
      decoder to set the mispred flag on all branches.
      
      Below is an example of using Intel PT with autofdo.  The example is
      also added to the Intel PT documentation.  It requires autofdo
      (https://github.com/google/autofdo) and gcc version 5.  The bubble
      sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
      amended to take the number of elements as a parameter.
      
      	$ gcc-5 -O3 sort.c -o sort_optimized
      	$ ./sort_optimized 30000
      	Bubble sorting array of 30000 elements
      	2254 ms
      
      	$ cat ~/.perfconfig
      	[intel-pt]
      		mispred-all
      
      	$ perf record -e intel_pt//u ./sort 3000
      	Bubble sorting array of 3000 elements
      	58 ms
      	[ perf record: Woken up 2 times to write data ]
      	[ perf record: Captured and wrote 3.939 MB perf.data ]
      	$ perf inject -i perf.data -o inj --itrace=i100usle --strip
      	$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
      	$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
      	$ ./sort_autofdo 30000
      	Bubble sorting array of 30000 elements
      	2155 ms
      
      Note there is currently no advantage to using Intel PT instead of LBR,
      but that may change in the future if greater use is made of the data.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ba11ba65
    • A
      perf intel-pt: Support generating branch stack · f14445ee
      Adrian Hunter 提交于
      Add support for generating branch stack context for PT samples.  The
      decoder reports a configurable number of branches as branch context for
      each sample. Internally it keeps track of them by using a simple sliding
      window.  We also flush the last branch buffer on each sample to avoid
      overlapping intervals.
      
      This is useful for:
      
      - Reporting accurate basic block edge frequencies through the perf
        report branch view
      - Using with --branch-history to get the wider context of samples
      - Other users of LBRs
      
      Also the Documentation is updated.
      
      Examples:
      
      	Record with Intel PT:
      
      		perf record -e intel_pt//u ls
      
      	Branch stacks are used by default if synthesized so:
      
      		perf report --itrace=ile
      
      	is the same as:
      
      		perf report --itrace=ile -b
      
      	Branch history can be requested also:
      
      		perf report --itrace=igle --branch-history
      Based-on-patch-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-15-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f14445ee
    • A
      perf auxtrace: Fix 'instructions' period of zero · e1791347
      Adrian Hunter 提交于
      Instruction tracing options (i.e. --itrace) include an option for
      sampling instructions at an arbitrary period. e.g.
      
      	--itrace=i10us
      
      means make an 'instructions' sample for every 10us of trace.
      
      Currently the logic does not distinguish between a period of
      zero and no period being specified at all, so it gets treated
      as the default period which is 100000.  That doesn't really
      make sense.
      
      Fix it so that zero period is accepted and treated as meaning
      "as often as possible".
      
      In the case of Intel PT that is the same as a period of 1 and
      a unit of 'instructions' (i.e. --itrace=i1i).
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-2-git-send-email-adrian.hunter@intel.com
      [ Add a few lines describing this in the Documentation/intel-pt.txt file ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e1791347
  5. 25 9月, 2015 1 次提交
  6. 25 8月, 2015 1 次提交
  7. 17 8月, 2015 1 次提交
    • A
      perf tools: Take Intel PT into use · 5efb1d54
      Adrian Hunter 提交于
      To record an AUX area, the weak function auxtrace_record__init() must be
      implemented.
      
      Equally to decode an AUX area, the AUX area tracing type must be added
      to the perf_event__process_auxtrace_info() function.
      
      This patch makes those two changes plus hooks up default config for the
      intel_pt PMU.  Also some brief documentation is provided for using the
      tools with intel_pt.
      
      Commiter note:
      
      E.g:
      
        [root@perf4 ~]# dmesg
        451 [0.405807] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
        [root@perf4 ~]# perf --version
        perf version 4.1.g53874a
        [root@perf4 ~]#  perf record -e intel_pt//u -a sleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.383 MB perf.data ]
        [root@perf4 ~]# perf evlist
        intel_pt//u
        sched:sched_switch
        dummy:u
        [root@perf4 ~]# perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 0  of event 'intel_pt//u'
        # Event count (approx.): 0
        #
        # Overhead  Command  Shared Object  Symbol
        # ........  .......  .............  ......
        #
      
        # Samples: 393  of event 'sched:sched_switch'
        # Event count (approx.): 393
        #
        # Overhead  Command         Shared Object     Symbol
        # ........  ..............  ................  ..............
          49.62%  swapper         [kernel.vmlinux]  [k] __schedule
          10.69%  rcu_sched       [kernel.vmlinux]  [k] __schedule
           6.62%  rcuos/0         [kernel.vmlinux]  [k] __schedule
           5.60%  kworker/0:1     [kernel.vmlinux]  [k] __schedule
           3.56%  rcuos/3         [kernel.vmlinux]  [k] __schedule
           3.05%  kworker/u384:2  [kernel.vmlinux]  [k] __schedule
           2.54%  kworker/2:0     [kernel.vmlinux]  [k] __schedule
           2.54%  tuned           [kernel.vmlinux]  [k] __schedule
        <SNIP>
        # Samples: 0  of event 'dummy:u'
        # Event count (approx.): 0
        #
        # Overhead  Command  Shared Object  Symbol
        # ........  .......  .............  ......
      
        # Samples: 28  of event 'instructions:u'
        # Event count (approx.): 5030172
        #
        # Overhead  Command     Shared Object        Symbol
        # ........  ..........  ...................  ................................
        #
          21.43%  tuned       libpython2.7.so.1.0  [.] PyEval_EvalFrameEx
                       |
                       ---PyEval_EvalFrameEx
                          |
                          |--83.33%-- PyEval_EvalCodeEx
                          |          PyEval_EvalFrameEx
                          |          |
                          |          |--60.00%-- PyEval_EvalCodeEx
                          |          |          PyEval_EvalFrameEx
                          |          |          PyEval_EvalFrameEx
                          |          |
                          |           --40.00%-- PyEval_EvalFrameEx
                          |
                           --16.67%-- PyEval_EvalFrameEx
                                     PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalFrameEx
      
          14.29%  tuned       libpython2.7.so.1.0  [.] _PyType_Lookup
                       |
                       ---_PyType_Lookup
                          _PyObject_GenericGetAttrWithDict
                          PyEval_EvalFrameEx
                          PyEval_EvalCodeEx
                          PyEval_EvalFrameEx
                          PyEval_EvalCodeEx
                          PyEval_EvalFrameEx
                          |
                          |--75.00%-- PyEval_EvalFrameEx
                          |
                           --25.00%-- PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalFrameEx
      
           3.57%  irqbalance  irqbalance           [.] 0x0000000000004038
                  |
                  ---0x4038
                     0x4761
                     0x4761
                     0x4761
                     0x49f1
                     0x2295
      
           3.57%  irqbalance  libc-2.17.so         [.] __GI_____strtoull_l_internal
                  |
                  ---__GI_____strtoull_l_internal
                     0x6f49
                     0x229a
      
           3.57%  irqbalance  libc-2.17.so         [.] __strchrnul
                  |
                  ---__strchrnul
                     vfprintf
                     __vsprintf_chk
                     __sprintf_chk
                     0x2724
                     0x4038
                     0x2331
      
           3.57%  irqbalance  libc-2.17.so         [.] __strstr_sse42
                  |
                  ---__strstr_sse42
                     0x71e0
                     0x229f
      
        # And now to some userspace ftrace on uninstrumented binaries 8-) :
        # Hand edited to make it a bit more compact, replacing /home/acme/bin/perf
        # with /bin/perf:
      
        [root@perf4 ~]# perf script
           perf 8921 [3] 7.310889: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310890: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310893: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310956: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310961: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310968: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311040: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311046: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311050: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311050: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
      :
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-8-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5efb1d54