    perf script: Add 'brstackinsn' for branch stacks · 48d02a1d
    Andi Kleen committed
    Implement printing instruction sequences as hex dump for branch stacks.
    
    This relies on the x86 instruction decoder used by the PT decoder to
    find the lengths of instructions to dump them individually.
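As a rough illustration of the idea above (not the perf implementation, which is C code built on the Intel PT instruction decoder), the sketch below walks a straight-line byte buffer and emits one hex dump per instruction. The names `dump_insns` and `replay_lengths` are hypothetical, and the length decoder is a stand-in that simply replays the instruction lengths visible in the sample output of this commit message; a real tool would call an actual x86 length decoder there.

```python
from typing import Callable, Iterator, List, Tuple


def dump_insns(code: bytes, start: int,
               insn_len: Callable[[bytes], int]) -> Iterator[Tuple[int, str]]:
    """Yield (address, hex bytes) for each instruction in a straight-line block."""
    off = 0
    while off < len(code):
        n = insn_len(code[off:])
        if n <= 0 or off + n > len(code):
            break  # decoder failed, or the instruction runs past the buffer
        yield start + off, " ".join(f"{b:02x}" for b in code[off:off + n])
        off += n


def replay_lengths(lengths: List[int]) -> Callable[[bytes], int]:
    """Stand-in for a real length decoder: returns pre-recorded lengths in order."""
    it = iter(lengths)
    return lambda _buf: next(it, 0)


# Bytes and lengths taken from the read_hpet sample in this commit message.
SAMPLE = bytes.fromhex("85c9 7412 f390 488b0f 4889ca 48c1ea20 39f2 89d0 74ea")
LENGTHS = [2, 2, 2, 3, 3, 4, 2, 2, 2]

lines = list(dump_insns(SAMPLE, 0xffffffff9905b82f, replay_lengths(LENGTHS)))
for addr, hexbytes in lines:
    print(f"{addr:016x}        insn: {hexbytes}")
```

Only the lengths are needed to split the byte stream; no mnemonics or operands are decoded, which is why a full disassembler is not required for this kind of dump.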
    
    This is good enough for pattern matching.
    
    This allows studying hot paths for individual samples, together with
    branch misprediction and cycle count / IPC information if available (on
    Skylake systems).
    
      % perf record -b ...
      % perf script -F brstackinsn
      ...
        read_hpet+67:
              ffffffff9905b843        insn: 74 ea                     # PRED
              ffffffff9905b82f        insn: 85 c9
              ffffffff9905b831        insn: 74 12
              ffffffff9905b833        insn: f3 90
              ffffffff9905b835        insn: 48 8b 0f
              ffffffff9905b838        insn: 48 89 ca
              ffffffff9905b83b        insn: 48 c1 ea 20
              ffffffff9905b83f        insn: 39 f2
              ffffffff9905b841        insn: 89 d0
              ffffffff9905b843        insn: 74 ea                     # PRED
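To make the "pattern matching" use case concrete, here is a minimal sketch that parses lines in the format shown above and looks for a byte pattern. The helper name `parse_insn_line` is hypothetical, and the parser assumes the exact `address  insn: bytes  # flags` layout from this sample; output from other perf versions may differ.

```python
from typing import Optional, Tuple


def parse_insn_line(line: str) -> Optional[Tuple[int, bytes, str]]:
    """Parse one 'addr insn: xx xx ... # FLAGS' line; return None for other lines."""
    body, _, note = line.partition("#")
    addr, sep, hexbytes = body.partition("insn:")
    if not sep:
        return None  # symbol headers such as 'read_hpet+67:' carry no 'insn:' field
    return int(addr.strip(), 16), bytes.fromhex(hexbytes.strip()), note.strip()


sample = """\
read_hpet+67:
      ffffffff9905b843        insn: 74 ea                     # PRED
      ffffffff9905b82f        insn: 85 c9
      ffffffff9905b831        insn: 74 12
      ffffffff9905b833        insn: f3 90
"""

parsed = [p for p in map(parse_insn_line, sample.splitlines()) if p is not None]

# Example pattern: f3 90 is the x86 PAUSE instruction, typical of spin loops.
pause_addrs = [addr for addr, raw, _ in parsed if raw == b"\xf3\x90"]
print([hex(a) for a in pause_addrs])
```

The same loop can be pointed at the real output of `perf script -F brstackinsn` by reading from standard input instead of the embedded sample.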
    
    Only works when no special branch filters are specified.
    
    Occasionally the path does not reach up to the sample IP, as the LBRs
    may be frozen before executing a final jump. In this case we print a
    special message.
    
    The instruction dumper piggybacks on the existing infrastructure from
    the Intel PT decoder.
    
    An earlier iteration of this patch relied on a disassembler, but this
    version only uses the existing instruction decoder.
    
    Committer note:
    
    Added a hint about how to get suitable perf.data files for use with
    '-F brstackinsn':
    
      $ perf record usleep 1
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
      $
      $ perf script -F brstackinsn
      Display of branch stack assembler requested, but non all-branch filter set
      Hint: run 'perf record -b ...'
      $
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>