1. 27 Feb, 2020 4 commits
    • perf annotate: Simplify disasm_line allocation and freeing code · 2316f861
      Ravi Bangoria committed
      We are allocating disasm_line object in annotation_line__new() instead
      of disasm_line__new(). Similarly annotation_line__delete() is actually
      freeing disasm_line object as well. This complexity is because of
      privsize.  But we don't need privsize anymore so get rid of privsize and
      simplify disasm_line allocation and freeing code.
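      The simplified scheme described above can be sketched roughly as follows. This is a minimal illustration, not the perf code: the struct layouts are reduced stand-ins, and `insn_len`, `disasm_line_new()`, `disasm_line_delete()` and `disasm_line_from_al()` are hypothetical names. It assumes only that the generic annotation_line is embedded in disasm_line, so the outer object can be allocated and freed directly with no private-size prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-ins; the real perf structs carry many more fields. */
struct annotation_line {
	long offset;
};

struct disasm_line {
	int insn_len;              /* hypothetical payload field */
	struct annotation_line al; /* embedded generic part */
};

/* Recover the containing disasm_line from its embedded annotation_line. */
static struct disasm_line *disasm_line_from_al(struct annotation_line *al)
{
	assert(al != NULL);
	return (struct disasm_line *)((char *)al -
				      offsetof(struct disasm_line, al));
}

/* Allocate the whole object in one place, zero-initialised. */
static struct disasm_line *disasm_line_new(long offset)
{
	struct disasm_line *dl = calloc(1, sizeof(*dl));

	if (dl != NULL)
		dl->al.offset = offset;
	return dl;
}

/* Free exactly what disasm_line_new() allocated. */
static void disasm_line_delete(struct disasm_line *dl)
{
	free(dl);
}
```

      With no privsize in play, the pointer returned by the allocator is the pointer that gets freed, and the embedded member is reached by plain offset arithmetic.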
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: http://lore.kernel.org/lkml/20200204045233.474937-3-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Remove privsize from symbol__annotate() args · e0ad4d68
      Ravi Bangoria committed
      privsize is passed as 0 from all the symbol__annotate() callers.
      Remove it from argument list.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: http://lore.kernel.org/lkml/20200204045233.474937-2-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Make perf config effective · 7384083b
      Ravi Bangoria committed
      perf default config set by user in [annotate] section is totally ignored
      by annotate code. Fix it.
      
      Before:
      
        $ ./perf config
        annotate.hide_src_code=true
        annotate.show_nr_jumps=true
        annotate.show_nr_samples=true
      
        $ ./perf annotate shash
               │    unsigned h = 0;
               │      movl   $0x0,-0xc(%rbp)
               │    while (*s)
               │    ↓ jmp    44
               │    h = 65599 * h + *s++;
         11.33 │24:   mov    -0xc(%rbp),%eax
         43.50 │      imul   $0x1003f,%eax,%ecx
               │      mov    -0x18(%rbp),%rax
      
      After:
      
               │        movl   $0x0,-0xc(%rbp)
               │      ↓ jmp    44
             1 │1 24:   mov    -0xc(%rbp),%eax
             4 │        imul   $0x1003f,%eax,%ecx
               │        mov    -0x18(%rbp),%rax
      
      Note that we have removed show_nr_samples and show_total_period from
      annotation_options because they are not used. Instead of them we use
      symbol_conf.show_nr_samples and symbol_conf.show_total_period.
      
      Committer testing:
      
      Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:
      
        # perf config
        #
        # perf config annotate.hide_src_code=true
        # perf config
        annotate.hide_src_code=true
        #
        # perf config annotate.show_nr_jumps=true
        # perf config annotate.show_nr_samples=true
        # perf config
        annotate.hide_src_code=true
        annotate.show_nr_jumps=true
        annotate.show_nr_samples=true
        #
        #
      
      Before:
      
        # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
        Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
        ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
        Percent
                    00000000000609f0 <ObjectInstance::weak_pointer_was_finalized()@@base>:
                      endbr64
                      cmpq    $0x0,0x20(%rdi)
                    ↓ je      10
                      xor     %eax,%eax
                    ← retq
                      xchg    %ax,%ax
        100.00  10:   push    %rbp
                      cmpq    $0x0,0x18(%rdi)
                      mov     %rdi,%rbp
                    ↓ jne     20
                1b:   xor     %eax,%eax
                      pop     %rbp
                    ← retq
                      nop
                20:   lea     0x18(%rdi),%rdi
                    → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                      cmpq    $0x0,0x18(%rbp)
                    ↑ jne     1b
                      mov     %rbp,%rdi
                    → callq   ObjectBase::jsobj_addr() const@plt
                      mov     $0x1,%eax
                      pop     %rbp
                    ← retq
        #
      
      After:
      
        # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
        Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
        ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
        Samples       endbr64
                      cmpq    $0x0,0x20(%rdi)
                    ↓ je      10
                      xor     %eax,%eax
                    ← retq
                      xchg    %ax,%ax
           1  1 10:   push    %rbp
                      cmpq    $0x0,0x18(%rdi)
                      mov     %rdi,%rbp
                    ↓ jne     20
              1 1b:   xor     %eax,%eax
                      pop     %rbp
                    ← retq
                      nop
              1 20:   lea     0x18(%rdi),%rdi
                    → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                      cmpq    $0x0,0x18(%rbp)
                    ↑ jne     1b
                      mov     %rbp,%rdi
                    → callq   ObjectBase::jsobj_addr() const@plt
                      mov     $0x1,%eax
                      pop     %rbp
                    ← retq
        #
        # perf config annotate.show_nr_jumps
        annotate.show_nr_jumps=true
        # perf config annotate.show_nr_jumps=false
        # perf config annotate.show_nr_jumps
        annotate.show_nr_jumps=false
        #
        # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized 2> /dev/null
        Samples: 1  of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
        ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
        Samples       endbr64
                      cmpq    $0x0,0x20(%rdi)
                    ↓ je      10
                      xor     %eax,%eax
                    ← retq
                      xchg    %ax,%ax
             1  10:   push    %rbp
                      cmpq    $0x0,0x18(%rdi)
                      mov     %rdi,%rbp
                    ↓ jne     20
                1b:   xor     %eax,%eax
                      pop     %rbp
                    ← retq
                      nop
                20:   lea     0x18(%rdi),%rdi
                    → callq   JS_UpdateWeakPointerAfterGC(JS::Heap<JSObject*
                      cmpq    $0x0,0x18(%rbp)
                    ↑ jne     1b
                      mov     %rbp,%rdi
                    → callq   ObjectBase::jsobj_addr() const@plt
                      mov     $0x1,%eax
                      pop     %rbp
                    ← retq
        #
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Fix --show-total-period for tui/stdio2 · 68aac855
      Ravi Bangoria committed
      perf annotate --show-total-period does not really show total period.
      
      The reason is we have two separate variables for the same purpose.
      
      One is in symbol_conf.show_total_period and another is
      annotation_options.show_total_period.
      
      We save the command line option in symbol_conf.show_total_period but use
      annotation_options.show_total_period while rendering the tui/stdio2 browser.
      
      Though, we copy symbol_conf.show_total_period to
      annotation__default_options.show_total_period but that is not really
      effective as we don't use annotation__default_options once we copy
      default options to dynamic variable annotate.opts in cmd_annotate().
      
      Instead of all this complication, keep only one variable and use it all
      over. symbol_conf.show_total_period is used by perf report/top as well.
      So let's kill annotation_options.show_total_period.
      
      On a side note, I've kept annotation_options.show_total_period
      definition because it's still used by perf-config code. Follow up patch
      to fix perf-config for annotate will remove
      annotation_options.show_total_period.
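      The ordering problem described above can be reproduced in a few lines. This is a sketch under stated assumptions, not the perf source: the `*_sketch` structs and the three functions are hypothetical stand-ins that model only the one flag and the order of the copies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins modelling only the duplicated flag. */
struct symbol_conf_sketch {
	bool show_total_period;
};

struct annotation_options_sketch {
	bool show_total_period;
};

/* Before the fix: the renderer reads a copy taken before option parsing. */
static bool flag_seen_by_renderer_before(bool cmdline_flag)
{
	struct symbol_conf_sketch conf = { false };
	struct annotation_options_sketch defaults = { false };
	/* Defaults are copied into the live options first ... */
	struct annotation_options_sketch opts = defaults;

	/* ... then the command line is parsed into symbol_conf ... */
	conf.show_total_period = cmdline_flag;
	/* ... and mirrored into the defaults, which nobody reads again. */
	defaults.show_total_period = conf.show_total_period;

	return opts.show_total_period; /* stale value */
}

/* After the fix: a single variable, read directly by the renderer. */
static bool flag_seen_by_renderer_after(bool cmdline_flag)
{
	struct symbol_conf_sketch conf = { false };

	conf.show_total_period = cmdline_flag;
	return conf.show_total_period;
}

/* Self-check: the old flow loses the flag, the new flow keeps it. */
static void check_show_total_period(void)
{
	assert(flag_seen_by_renderer_before(true) == false);
	assert(flag_seen_by_renderer_after(true) == true);
}
```

      Keeping a single source of truth removes the need to reason about when each copy was taken.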
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yisheng Xie <xieyisheng1@huawei.com>
      Link: http://lore.kernel.org/lkml/20200213064306.160480-3-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 14 Jan, 2020 1 commit
    • perf tools: Support --prefix/--prefix-strip · 3b0b16bf
      Andi Kleen committed
      The objdump utility has useful --prefix / --prefix-strip options to
      allow changing source code file names hardcoded into executables' debug
      info. Add options to 'perf report', 'perf top' and 'perf annotate',
      which are then passed to objdump.
      
        $ mkdir foo
        $ echo 'main() { for (;;); }' > foo/foo.c
        $ gcc -g foo/foo.c
        foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
            1 | main() { for (;;); }
              | ^~~~
        $ perf record ./a.out
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
        $ mv foo bar
        $ perf annotate
        <does not show source code>
        $ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
        <does show source code>
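      To see why --prefix-strip=5 is the right value in the example above, the path rewrite can be approximated as below. This is a sketch of the documented objdump behaviour, not binutils code: `remap_source_path()` is a hypothetical helper, and the original build directory /home/ak/lsrc/git is an assumption inferred from the --prefix value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: drop `strip` leading directory names from an
 * absolute path, then prepend `prefix`.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
static int remap_source_path(const char *path, const char *prefix,
			     unsigned int strip, char *out, size_t outlen)
{
	const char *p = path;
	int n;

	assert(path != NULL && prefix != NULL && out != NULL);

	for (; strip > 0 && *p != '\0'; strip--) {
		while (*p == '/')                 /* skip separator(s) */
			p++;
		while (*p != '\0' && *p != '/')   /* skip one directory name */
			p++;
	}

	n = snprintf(out, outlen, "%s%s", prefix, p);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

      Under that assumption the debug info records /home/ak/lsrc/git/foo/foo.c. Stripping five components (home, ak, lsrc, git, foo) leaves /foo.c, and prepending the prefix yields /home/ak/lsrc/git/bar/foo.c, the file's location after the rename.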
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 12 Nov, 2019 1 commit
  4. 11 Oct, 2019 1 commit
    • perf diff: Report noisy for cycles diff · cebf7d51
      Jin Yao committed
      This patch prints the stddev and a histogram for the cycles diff of each
      program block. It can help us understand whether the cycles are noisy.
      
      This patch is inspired by Andi Kleen's patch:
      
        https://lwn.net/Articles/600471/
      
      We create new option '--cycles-hist'.
      
      Example:
      
        perf record -b ./div
        perf record -b ./div
        perf diff -c cycles
      
        # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
        # ........  .......................................................... ....  .................  ............................
        #
            46.72%                                      [div.c:40 -> div.c:40]    0  div                [.] main
            46.72%                                      [div.c:42 -> div.c:44]    0  div                [.] main
            46.72%                                      [div.c:42 -> div.c:39]    0  div                [.] main
            20.54%                          [random_r.c:357 -> random_r.c:394]    1  libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:357 -> random_r.c:380]    0  libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:388 -> random_r.c:388]    0  libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:388 -> random_r.c:391]    0  libc-2.27.so       [.] __random_r
            17.04%                              [random.c:288 -> random.c:291]    0  libc-2.27.so       [.] __random
            17.04%                              [random.c:291 -> random.c:291]    0  libc-2.27.so       [.] __random
            17.04%                              [random.c:293 -> random.c:293]    0  libc-2.27.so       [.] __random
            17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
            17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
            17.04%                              [random.c:298 -> random.c:298]    0  libc-2.27.so       [.] __random
             8.40%                                      [div.c:22 -> div.c:25]    0  div                [.] compute_flag
             8.40%                                      [div.c:27 -> div.c:28]    0  div                [.] compute_flag
             5.14%                                    [rand.c:26 -> rand.c:27]    0  libc-2.27.so       [.] rand
             5.14%                                    [rand.c:28 -> rand.c:28]    0  libc-2.27.so       [.] rand
             2.15%                                  [rand@plt+0 -> rand@plt+0]    0  div                [.] rand@plt
             0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
             0.00%                                [do_mmap+714 -> do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
             0.00%                                [do_mmap+737 -> do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
             0.00%                                [do_mmap+262 -> do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
             0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
             0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
             0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr
      
      When we enable the option '--cycles-hist', the output is
      
        perf diff -c cycles --cycles-hist
      
        # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
        # ........  .......................................................... ....  .................  .................  ............................
        #
            46.72%                                      [div.c:40 -> div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
            46.72%                                      [div.c:42 -> div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
            46.72%                                      [div.c:42 -> div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
            20.54%                          [random_r.c:357 -> random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:357 -> random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:388 -> random_r.c:388]    0                     libc-2.27.so       [.] __random_r
            20.54%                          [random_r.c:388 -> random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
            17.04%                              [random.c:288 -> random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
            17.04%                              [random.c:291 -> random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
            17.04%                              [random.c:293 -> random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
            17.04%                              [random.c:295 -> random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
            17.04%                              [random.c:295 -> random.c:295]    0                     libc-2.27.so       [.] __random
            17.04%                              [random.c:298 -> random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
             8.40%                                      [div.c:22 -> div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
             8.40%                                      [div.c:27 -> div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
             5.14%                                    [rand.c:26 -> rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
             5.14%                                    [rand.c:28 -> rand.c:28]    0                     libc-2.27.so       [.] rand
             2.15%                                  [rand@plt+0 -> rand@plt+0]    0                     div                [.] rand@plt
             0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
             0.00%                                [do_mmap+714 -> do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
             0.00%                                [do_mmap+737 -> do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
             0.00%                                [do_mmap+262 -> do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
             0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
             0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
             0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr
      
       v8:
       ---
       Rebase to perf/core branch
      
       v7:
       ---
       1. v6 got Jiri's ACK.
       2. Rebase to latest perf/core branch.
      
       v6:
       ---
       1. Jiri provides better code for using data__hpp_register() in ui_init().
          Use this code in v6.
      
       v5:
       ---
       1. Refine the use of data__hpp_register() in ui_init() according to
          Jiri's suggestion.
      
       v4:
       ---
       1. Rename the new option from '--noisy' to '--cycles-hist'
       2. Remove the option '-n'.
       3. Only update the spark value and stats when '--cycles-hist' is enabled.
       4. Remove the code of printing '..'.
      
       v3:
       ---
       1. Move the histogram to a separate column
       2. Move the svals[] out of struct stats
      
       v2:
       ---
       Jiri got a compile error,
      
        CC       builtin-diff.o
        builtin-diff.c: In function ‘compute_cycles_diff’:
        builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
        712 |          labs(pair->block_info->cycles_spark[i] -
            |          ^~~~
      
       Because the result of u64 - u64 is still u64. Now we change the type of
       cycles_spark[] to s64.
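      The v2 note above comes down to C's unsigned arithmetic. A minimal sketch follows; `diff_unsigned()`, `abs_diff_signed()` and `check_cycles_diff()` are hypothetical names, and the typedefs mirror the kernel-style u64/s64 aliases. Subtracting two u64 values wraps around instead of going negative, so taking an absolute value afterwards cannot help, whereas the same subtraction on s64 behaves as intended.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t u64;
typedef int64_t s64;

/* u64 - u64 is still u64: a "negative" result wraps to a huge value. */
static u64 diff_unsigned(u64 a, u64 b)
{
	return a - b;
}

/* With signed operands the difference can be negative, so llabs() works. */
static s64 abs_diff_signed(s64 a, s64 b)
{
	return llabs(a - b);
}

/* Self-check contrasting the two behaviours for 3 - 5. */
static void check_cycles_diff(void)
{
	assert(diff_unsigned(3, 5) == UINT64_MAX - 1); /* 2^64 - 2 */
	assert(abs_diff_signed(3, 5) == 2);
}
```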
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 01 Oct, 2019 2 commits
  6. 30 Jul, 2019 2 commits
  7. 21 Mar, 2019 1 commit
    • perf annotate: Enable annotation of BPF programs · 6987561c
      Song Liu committed
      In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
      a new function symbol__disassemble_bpf(), where annotation line
      information is filled based on the bpf_prog_info and btf data saved in
      given perf_env.
      
      symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
      programs.
      
      Committer testing:
      
      After fixing this:
      
        -               u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
        +               u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);
      
      Detected when crossbuilding to a 32-bit arch.
      
      And making all this dependent on HAVE_LIBBFD_SUPPORT and
      HAVE_LIBBPF_SUPPORT:
      
      1) Have a BPF program running, one that has BTF info, etc, I used
         the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
         by 'perf trace'.
      
        # grep -B1 augmented_raw ~/.perfconfig
        [trace]
      	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
        #
        # perf trace -e *mmsg
        dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
        NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
      
      2) Then do a 'perf record' system wide for a while:
      
        # perf record -a
        ^C[ perf record: Woken up 68 times to write data ]
        [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
        #
      
      3) Check that we captured BPF and BTF info in the perf.data file:
      
        # perf report --header-only | grep 'b[pt]f'
        # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
        # bpf_prog_info of id 13
        # bpf_prog_info of id 14
        # bpf_prog_info of id 15
        # bpf_prog_info of id 16
        # bpf_prog_info of id 17
        # bpf_prog_info of id 18
        # bpf_prog_info of id 21
        # bpf_prog_info of id 22
        # bpf_prog_info of id 41
        # bpf_prog_info of id 42
        # btf info of id 2
        #
      
      4) Check which programs got recorded:
      
         # perf report | grep bpf_prog | head
           0.16%  exe              bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.14%  exe              bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.08%  fuse-overlayfs   bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.07%  fuse-overlayfs   bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.01%  clang-4.0        bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.01%  clang-4.0        bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  clang            bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.00%  runc             bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  clang            bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  sh               bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
        #
      
        This was with the default --sort order for 'perf report', which is:
      
          --sort comm,dso,symbol
      
        If we just look for the symbol, for instance:
      
         # perf report --sort symbol | grep bpf_prog | head
           0.26%  [k] bpf_prog_819967866022f1e1_sys_enter                -      -
           0.24%  [k] bpf_prog_c1bd85c092d6e4aa_sys_exit                 -      -
         #
      
        or the DSO:
      
         # perf report --sort dso | grep bpf_prog | head
           0.26%  bpf_prog_819967866022f1e1_sys_enter
           0.24%  bpf_prog_c1bd85c092d6e4aa_sys_exit
        #
      
      We'll see the two BPF programs that augmented_raw_syscalls.o puts in
      place,  one attached to the raw_syscalls:sys_enter and another to the
      raw_syscalls:sys_exit tracepoints, as expected.
      
      Now we can finally do, from the command line, annotation for one of
      those two symbols, with the original BPF program source code intermixed
      with the disassembled JITed code:
      
        # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
      
        Samples: 950  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
        bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
        Percent      int sys_enter(struct syscall_enter_args *args)
         53.41         push   %rbp
      
          0.63         mov    %rsp,%rbp
          0.31         sub    $0x170,%rsp
          1.93         sub    $0x28,%rbp
          7.02         mov    %rbx,0x0(%rbp)
          3.20         mov    %r13,0x8(%rbp)
          1.07         mov    %r14,0x10(%rbp)
          0.61         mov    %r15,0x18(%rbp)
          0.11         xor    %eax,%eax
          1.29         mov    %rax,0x20(%rbp)
          0.11         mov    %rdi,%rbx
                     	return bpf_get_current_pid_tgid();
          2.02       → callq  *ffffffffda6776d9
          2.76         mov    %eax,-0x148(%rbp)
                       mov    %rbp,%rsi
                     int sys_enter(struct syscall_enter_args *args)
                       add    $0xfffffffffffffeb8,%rsi
                     	return bpf_map_lookup_elem(pids, &pid) != NULL;
                       movabs $0xffff975ac2607800,%rdi
      
          1.26       → callq  *ffffffffda6789e9
                       cmp    $0x0,%rax
          2.43       → je     0
                       add    $0x38,%rax
          0.21         xor    %r13d,%r13d
                     	if (pid_filter__has(&pids_filtered, getpid()))
          0.81         cmp    $0x0,%rax
                     → jne    0
                       mov    %rbp,%rdi
                     	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
          2.22         add    $0xfffffffffffffeb8,%rdi
          0.11         mov    $0x40,%esi
          0.32         mov    %rbx,%rdx
          2.74       → callq  *ffffffffda658409
                     	syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
          0.22         mov    %rbp,%rsi
          1.69         add    $0xfffffffffffffec0,%rsi
                     	syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
                       movabs $0xffff975bfcd36000,%rdi
      
                       add    $0xd0,%rdi
          0.21         mov    0x0(%rsi),%eax
          0.93         cmp    $0x200,%rax
                     → jae    0
          0.10         shl    $0x3,%rax
      
          0.11         add    %rdi,%rax
          0.11       → jmp    0
                       xor    %eax,%eax
                     	if (syscall == NULL || !syscall->enabled)
          1.07         cmp    $0x0,%rax
                     → je     0
                     	if (syscall == NULL || !syscall->enabled)
          6.57         movzbq 0x0(%rax),%rdi
      
                     	if (syscall == NULL || !syscall->enabled)
                       cmp    $0x0,%rdi
          0.95       → je     0
                       mov    $0x40,%r8d
                     	switch (augmented_args.args.syscall_nr) {
                       mov    -0x140(%rbp),%rdi
                     	switch (augmented_args.args.syscall_nr) {
                       cmp    $0x2,%rdi
                     → je     0
                       cmp    $0x101,%rdi
                     → je     0
                       cmp    $0x15,%rdi
                     → jne    0
                     	case SYS_OPEN:	 filename_arg = (const void *)args->args[0];
                       mov    0x10(%rbx),%rdx
                     → jmp    0
                     	case SYS_OPENAT: filename_arg = (const void *)args->args[1];
                       mov    0x18(%rbx),%rdx
                     	if (filename_arg != NULL) {
                       cmp    $0x0,%rdx
                     → je     0
                       xor    %edi,%edi
                     		augmented_args.filename.reserved = 0;
                       mov    %edi,-0x104(%rbp)
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %rbp,%rdi
                       add    $0xffffffffffffff00,%rdi
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    $0x100,%esi
                     → callq  *ffffffffda658499
                       mov    $0x148,%r8d
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %eax,-0x108(%rbp)
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %rax,%rdi
                       shl    $0x20,%rdi
      
                       shr    $0x20,%rdi
      
                     		if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
                       cmp    $0xff,%rdi
                     → ja     0
                     			len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
                       add    $0x48,%rax
                     			len &= sizeof(augmented_args.filename.value) - 1;
                       and    $0xff,%rax
                       mov    %rax,%r8
                       mov    %rbp,%rcx
                     	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
                       add    $0xfffffffffffffeb8,%rcx
                       mov    %rbx,%rdi
                       movabs $0xffff975fbd72d800,%rsi
      
                       mov    $0xffffffff,%edx
                     → callq  *ffffffffda658ad9
                       mov    %rax,%r13
                     }
                       mov    %r13,%rax
          0.72         mov    0x0(%rbp),%rbx
                       mov    0x8(%rbp),%r13
          1.16         mov    0x10(%rbp),%r14
          0.10         mov    0x18(%rbp),%r15
          0.42         add    $0x28,%rbp
          0.54         leaveq
          0.54       ← retq
        #
      
      Please see 'man perf-config' to see how to control what should be seen,
      via ~/.perfconfig [annotate] section, for instance, one can suppress the
      source code and see just the disassembly, etc.
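
      As a rough illustration, a ~/.perfconfig fragment along these lines
      would hide the intermixed source. This assumes annotate.hide_src_code
      (one of the keys 'perf config' prints for the [annotate] section) is
      the tunable in question and that your perf version honours it; check
      'man perf-config' before relying on it.

```
[annotate]
	hide_src_code = true
```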
      
      Alternatively, use the TUI by just running 'perf annotate', press
      '/bpf_prog' to see the bpf symbols, press enter and do the interactive
      annotation, which allows for dumping to a file after selecting the
      various output tunables, for instance, the above without source code
      intermixed, plus showing all the instruction offsets:
      
        # perf annotate bpf_prog_819967866022f1e1_sys_enter
      
      Then press: 's' to hide the source code + 'O' twice to show all
      instruction offsets, then 'P' to print to the
      bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
      
        # cat bpf_prog_819967866022f1e1_sys_enter.annotation
        bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
        Event: cycles:ppp
      
         53.41    0:   push   %rbp
      
          0.63    1:   mov    %rsp,%rbp
          0.31    4:   sub    $0x170,%rsp
          1.93    b:   sub    $0x28,%rbp
          7.02    f:   mov    %rbx,0x0(%rbp)
          3.20   13:   mov    %r13,0x8(%rbp)
          1.07   17:   mov    %r14,0x10(%rbp)
          0.61   1b:   mov    %r15,0x18(%rbp)
          0.11   1f:   xor    %eax,%eax
          1.29   21:   mov    %rax,0x20(%rbp)
          0.11   25:   mov    %rdi,%rbx
          2.02   28: → callq  *ffffffffda6776d9
          2.76   2d:   mov    %eax,-0x148(%rbp)
                 33:   mov    %rbp,%rsi
                 36:   add    $0xfffffffffffffeb8,%rsi
                 3d:   movabs $0xffff975ac2607800,%rdi
      
          1.26   47: → callq  *ffffffffda6789e9
                 4c:   cmp    $0x0,%rax
          2.43   50: → je     0
                 52:   add    $0x38,%rax
          0.21   56:   xor    %r13d,%r13d
          0.81   59:   cmp    $0x0,%rax
                 5d: → jne    0
                 63:   mov    %rbp,%rdi
          2.22   66:   add    $0xfffffffffffffeb8,%rdi
          0.11   6d:   mov    $0x40,%esi
          0.32   72:   mov    %rbx,%rdx
          2.74   75: → callq  *ffffffffda658409
          0.22   7a:   mov    %rbp,%rsi
          1.69   7d:   add    $0xfffffffffffffec0,%rsi
                 84:   movabs $0xffff975bfcd36000,%rdi
      
                 8e:   add    $0xd0,%rdi
          0.21   95:   mov    0x0(%rsi),%eax
          0.93   98:   cmp    $0x200,%rax
                 9f: → jae    0
          0.10   a1:   shl    $0x3,%rax
      
          0.11   a5:   add    %rdi,%rax
          0.11   a8: → jmp    0
                 aa:   xor    %eax,%eax
          1.07   ac:   cmp    $0x0,%rax
                 b0: → je     0
          6.57   b6:   movzbq 0x0(%rax),%rdi
      
                 bb:   cmp    $0x0,%rdi
          0.95   bf: → je     0
                 c5:   mov    $0x40,%r8d
                 cb:   mov    -0x140(%rbp),%rdi
                 d2:   cmp    $0x2,%rdi
                 d6: → je     0
                 d8:   cmp    $0x101,%rdi
                 df: → je     0
                 e1:   cmp    $0x15,%rdi
                 e5: → jne    0
                 e7:   mov    0x10(%rbx),%rdx
                 eb: → jmp    0
                 ed:   mov    0x18(%rbx),%rdx
                 f1:   cmp    $0x0,%rdx
                 f5: → je     0
                 f7:   xor    %edi,%edi
                 f9:   mov    %edi,-0x104(%rbp)
                 ff:   mov    %rbp,%rdi
                102:   add    $0xffffffffffffff00,%rdi
                109:   mov    $0x100,%esi
                10e: → callq  *ffffffffda658499
                113:   mov    $0x148,%r8d
                119:   mov    %eax,-0x108(%rbp)
                11f:   mov    %rax,%rdi
                122:   shl    $0x20,%rdi
      
                126:   shr    $0x20,%rdi
      
                12a:   cmp    $0xff,%rdi
                131: → ja     0
                133:   add    $0x48,%rax
                137:   and    $0xff,%rax
                13d:   mov    %rax,%r8
                140:   mov    %rbp,%rcx
                143:   add    $0xfffffffffffffeb8,%rcx
                14a:   mov    %rbx,%rdi
                14d:   movabs $0xffff975fbd72d800,%rsi
      
                157:   mov    $0xffffffff,%edx
                15c: → callq  *ffffffffda658ad9
                161:   mov    %rax,%r13
                164:   mov    %r13,%rax
          0.72  167:   mov    0x0(%rbp),%rbx
                16b:   mov    0x8(%rbp),%r13
          1.16  16f:   mov    0x10(%rbp),%r14
          0.10  173:   mov    0x18(%rbp),%r15
          0.42  177:   add    $0x28,%rbp
          0.54  17b:   leaveq
          0.54  17c: ← retq
      
      Another cool way to test all this is to simply use 'perf top', look for
      those symbols, go there and press enter, and annotate it live :-)
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6987561c
  8. 07 Mar, 2019 1 commit
    • A
      perf annotate: Calculate the max instruction name, align column to that · bc3bb795
      Arnaldo Carvalho de Melo committed
      We were hardcoding '6' as the max instruction name, and we have lots
      that are longer than that, see the diff from two 'P' printed TUI
      annotations for a libc function that uses instructions with long names,
      such as 'vpmovmskb' with its 9 chars:
      
        --- __strcmp_avx2.annotation.before	2019-03-06 16:31:39.368020425 -0300
        +++ __strcmp_avx2.annotation	2019-03-06 16:32:12.079450508 -0300
        @@ -2,284 +2,284 @@
         Event: cycles:ppp
      
         Percent        endbr64
        -  0.10         mov    %edi,%eax
        +  0.10         mov        %edi,%eax
        -               xor    %edx,%edx
        +               xor        %edx,%edx
        -  3.54         vpxor  %ymm7,%ymm7,%ymm7
        +  3.54         vpxor      %ymm7,%ymm7,%ymm7
        -               or     %esi,%eax
        +               or         %esi,%eax
        -               and    $0xfff,%eax
        +               and        $0xfff,%eax
        -               cmp    $0xf80,%eax
        +               cmp        $0xf80,%eax
        -             ↓ jg     370
        +             ↓ jg         370
        - 27.07         vmovdqu (%rdi),%ymm1
        + 27.07         vmovdqu    (%rdi),%ymm1
        -  7.97         vpcmpeqb (%rsi),%ymm1,%ymm0
        +  7.97         vpcmpeqb   (%rsi),%ymm1,%ymm0
        -  2.15         vpminub %ymm1,%ymm0,%ymm0
        +  2.15         vpminub    %ymm1,%ymm0,%ymm0
        -  4.09         vpcmpeqb %ymm7,%ymm0,%ymm0
        +  4.09         vpcmpeqb   %ymm7,%ymm0,%ymm0
        -  0.43         vpmovmskb %ymm0,%ecx
        +  0.43         vpmovmskb  %ymm0,%ecx
        -  1.53         test   %ecx,%ecx
        +  1.53         test       %ecx,%ecx
        -             ↓ je     b0
        +             ↓ je         b0
        -  5.26         tzcnt  %ecx,%edx
        +  5.26         tzcnt      %ecx,%edx
        - 18.40         movzbl (%rdi,%rdx,1),%eax
        + 18.40         movzbl     (%rdi,%rdx,1),%eax
        -  7.09         movzbl (%rsi,%rdx,1),%edx
        +  7.09         movzbl     (%rsi,%rdx,1),%edx
        -  3.34         sub    %edx,%eax
        +  3.34         sub        %edx,%eax
           2.37         vzeroupper
                      ← retq
                        nop
        -         50:   tzcnt  %ecx,%edx
        +         50:   tzcnt      %ecx,%edx
        -               movzbl 0x20(%rdi,%rdx,1),%eax
        +               movzbl     0x20(%rdi,%rdx,1),%eax
        -               movzbl 0x20(%rsi,%rdx,1),%edx
        +               movzbl     0x20(%rsi,%rdx,1),%edx
        -               sub    %edx,%eax
        +               sub        %edx,%eax
                        vzeroupper
                      ← retq
        -               data16 nopw %cs:0x0(%rax,%rax,1)
        +               data16     nopw %cs:0x0(%rax,%rax,1)
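
      The alignment shown in the diff above can be sketched roughly as
      follows: scan the instructions once for the longest mnemonic, then
      left-justify every mnemonic in a column of that width. The struct and
      function names here (insn, longest_mnemonic, format_insn) are
      hypothetical and simplified for illustration; they are not the
      identifiers used in the perf sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one disassembled line. */
struct insn {
	const char *name;     /* mnemonic, e.g. "vpmovmskb" */
	const char *operands; /* raw operand string */
};

/* Return the length of the longest mnemonic in the array. */
static int longest_mnemonic(const struct insn *insns, size_t nr)
{
	int max = 0;

	for (size_t i = 0; i < nr; i++) {
		int len = (int)strlen(insns[i].name);

		if (len > max)
			max = len;
	}
	return max;
}

/* Format the mnemonic left-justified in a column of 'width' characters. */
static int format_insn(char *buf, size_t size, const struct insn *in, int width)
{
	assert(buf != NULL && in != NULL);
	return snprintf(buf, size, "%-*s %s", width, in->name, in->operands);
}
```

      With a computed width instead of a hardcoded one, a short mnemonic
      such as 'mov' is padded to the same column as the longest one, so the
      operands line up.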
      Reported-by: Travis Downs <travis.downs@gmail.com>
      LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bc3bb795
  9. 25 Jan, 2019 1 commit
  10. 18 Dec, 2018 1 commit
    • J
      perf annotate: Compute average IPC and IPC coverage per symbol · ace4f8fa
      Jin Yao committed
      Add support to 'perf report' annotate view or 'perf annotate --stdio2'
      to aggregate the IPC derived from timed LBRs per symbol. We compute the
      average IPC and the IPC coverage percentage.
      
      For example:
      
        $ perf annotate --stdio2
      
        Percent  IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
      
                                Disassembly of section .text:
      
                                000000000003aac0 <random@@GLIBC_2.2.5>:
          8.32  3.28              sub    $0x18,%rsp
                3.28              mov    $0x1,%esi
                3.28              xor    %eax,%eax
                3.28              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
         11.57  3.28     1      ↓ je     20
                                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
                                ↓ jne    29
                                ↓ jmp    43
         11.57  1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
          0.00  1.10     1      ↓ je     43
                            29:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                                  sub    $0x80,%rsp
                                → callq  __lll_lock_wait_private
                                  add    $0x80,%rsp
          0.00  3.00        43:   lea    __ctype_b@GLIBC_2.2.5+0x38,%rdi
                3.00              lea    0xc(%rsp),%rsi
          8.49  3.00     1      → callq  __random_r
          7.91  1.94              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
          0.00  1.94     1      ↓ je     68
                                  lock   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
                                ↓ jne    70
                                ↓ jmp    8a
          0.00  2.00        68:   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
         21.56  2.00     1      ↓ je     8a
                            70:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                                  sub    $0x80,%rsp
                                → callq  __lll_unlock_wake_private
                                  add    $0x80,%rsp
         21.56  2.90        8a:   movslq 0xc(%rsp),%rax
                2.90              add    $0x18,%rsp
          9.03  2.90     1      ← retq
      
      It shows for this symbol the average IPC is 2.30 and the IPC coverage is
      54.8%.
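
      One plausible way to arrive at such per-symbol numbers is sketched
      below. It assumes that each sampled basic block contributes its
      instruction count times its number of executions, plus its summed
      cycles, and that coverage is the share of the symbol's instructions
      that fall in blocks with cycle data. The types and the summarize_ipc()
      helper are hypothetical names for illustration, not the perf
      implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-basic-block record; field names are illustrative. */
struct block_sample {
	unsigned int n_insn;       /* instructions in the block */
	unsigned long long cycles; /* cycles summed over all executions seen */
	unsigned int num;          /* number of executions seen */
};

struct ipc_summary {
	double avg_ipc;      /* instructions per cycle over covered blocks */
	double coverage_pct; /* percent of instructions with IPC data */
};

static struct ipc_summary summarize_ipc(const struct block_sample *blocks,
					size_t nr_blocks,
					unsigned int total_insn)
{
	unsigned long long hit_cycles = 0, hit_insn = 0, cover_insn = 0;
	struct ipc_summary s = { 0.0, 0.0 };

	assert(blocks != NULL || nr_blocks == 0);

	for (size_t i = 0; i < nr_blocks; i++) {
		const struct block_sample *b = &blocks[i];

		/* Blocks without cycle data do not count as covered. */
		if (!b->n_insn || !b->cycles || !b->num)
			continue;

		hit_cycles += b->cycles;
		hit_insn += (unsigned long long)b->n_insn * b->num;
		cover_insn += b->n_insn;
	}

	if (hit_cycles)
		s.avg_ipc = (double)hit_insn / (double)hit_cycles;
	if (total_insn)
		s.coverage_pct = 100.0 * (double)cover_insn / (double)total_insn;
	return s;
}
```

      Weighting by executions keeps a hot block from counting the same as a
      block that was only seen once.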
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ace4f8fa
  11. 31 Aug, 2018 1 commit
    • K
      perf annotate: Fix parsing aarch64 branch instructions after objdump update · 4e67b2a5
      Kim Phillips committed
      Starting with binutils 2.28, aarch64 objdump adds comments to the
      disassembly output to show the alternative names of a condition code
      [1].
      
      It is assumed that commas in objdump comments could occur in other
      arches now or in the future, so this fix is arch-independent.
      
      The fix could have been done with arm64 specific jump__parse and
      jump__scnprintf functions, but the jump__scnprintf instruction would
      have to have its comment character be a literal, since the scnprintf
      functions cannot receive a struct arch easily.
      
      This inconvenience also applies to the generic jump__scnprintf, which is
      why we add a raw_comment pointer to struct ins_operands, so the __parse
      function assigns it to be re-used by its corresponding __scnprintf
      function.
      
      Example differences in 'perf annotate --stdio2' output on an aarch64
      perf.data file:
      
      BEFORE: → b.cs   ffff200008133d1c <unwind_frame+0x18c>  // b.hs, dffff7ecc47b
      AFTER : ↓ b.cs   18c
      
      BEFORE: → b.cc   ffff200008d8d9cc <get_alloc_profile+0x31c>  // b.lo, b.ul, dffff727295b
      AFTER : ↓ b.cc   31c
      
      The branch target labels 18c and 31c also now appear in the output:
      
      BEFORE:        add    x26, x29, #0x80
      AFTER : 18c:   add    x26, x29, #0x80
      
      BEFORE:        add    x21, x21, #0x8
      AFTER : 31c:   add    x21, x21, #0x8
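
      The core idea, ignoring any comma that only appears inside the
      trailing objdump comment, can be sketched like this. operand_comma()
      is a hypothetical helper written for illustration, and passing '/' as
      the comment character for aarch64 is an assumption based on the '//'
      comments in the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first comma that separates operands, or NULL
 * if the only commas are inside the trailing comment (or there are none).
 */
static const char *operand_comma(const char *raw, char comment_char)
{
	const char *comment, *comma;

	assert(raw != NULL);

	comment = strchr(raw, comment_char);
	comma = strchr(raw, ',');

	/* A comma located after the comment start is not an operand separator. */
	if (comma && comment && comma > comment)
		return NULL;
	return comma;
}
```

      For the first BEFORE line above, the only comma sits after the '//',
      so the helper returns NULL and the branch target is parsed as a
      single operand.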
      
      The Fixes: tag below is added so stable branches will get the update; it
      doesn't necessarily mean that commit was broken at the time, rather it
      didn't withstand the aarch64 objdump update.
      
      Tested no difference in output for sample x86_64, power arch perf.data files.
      
      [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730

      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: b13bbeee ("perf annotate: Fix branch instruction with multiple operands")
      Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4e67b2a5
  12. 09 Aug, 2018 12 commits
  13. 04 Jun, 2018 11 commits
  14. 19 May, 2018 1 commit
    • J
      perf annotate: Create hotkey 'c' to show min/max cycles · 3e71fc03
      Jin Yao committed
      In the 'perf annotate' view, a new hotkey 'c' is created for showing the
      min/max cycles.
      
      For example, after pressing 'c', the annotate view is:
      
        Percent│ IPC     Cycle(min/max)
               │
               │
               │                             Disassembly of section .text:
               │
               │                             000000000003aab0 <random@@GLIBC_2.2.5>:
          8.22 │3.92                           sub    $0x18,%rsp
               │3.92                           mov    $0x1,%esi
               │3.92                           xor    %eax,%eax
               │3.92                           cmpl   $0x0,argp_program_version_hook@@G
               │3.92             1(2/1)      ↓ je     20
               │                               lock   cmpxchg %esi,__abort_msg@@GLIBC_P
               │                             ↓ jne    29
               │                             ↓ jmp    43
               │1.10                     20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+
          8.93 │1.10             1(5/1)      ↓ je     43
      
      After pressing 'c' again, the annotate view switches back:
      
        Percent│ IPC Cycle
               │
               │
               │                Disassembly of section .text:
               │
               │                000000000003aab0 <random@@GLIBC_2.2.5>:
          8.22 │3.92              sub    $0x18,%rsp
               │3.92              mov    $0x1,%esi
               │3.92              xor    %eax,%eax
               │3.92              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x
               │3.92     1      ↓ je     20
               │                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
               │                ↓ jne    29
               │                ↓ jmp    43
               │1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
          8.93 │1.10     1      ↓ je     43
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1526569118-14217-3-git-send-email-yao.jin@linux.intel.com
      [ Rename all maxmin to minmax ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3e71fc03