1. 18 December 2018, 3 commits
    • perf tools: Fix diverse comment typos · adba1634
      Committed by Ingo Molnar
      Go over the tools/ files that are maintained in Arnaldo's tree and
      fix common typos: half of them were in comments, the other half
      in JSON files.
      
      No change in functionality intended.
      
      Committer notes:
      
      This was split from a larger patch because some of this code is also
      maintained outside the kernel tree; splitting it into multiple patches
      eases cherry-picking and/or backporting.
      
      Just typos in comments, no need to backport, reducing the possibility of
      backporting artifacts.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20181203102200.GA104797@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Create an annotate2 flag in struct symbol · 246fda09
      Committed by Jin Yao
      We often use symbol__annotate2() to annotate a specified symbol.
      Annotating may take some time, so to avoid annotating the same symbol
      repeatedly, this patch adds a new flag that indicates the symbol has
      already been annotated.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1543586097-27632-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
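      A small standalone C sketch of the annotate-once idea in the commit above;
      the struct layout, field placement and call site are simplified
      illustrations, not the actual perf code:

        #include <stdio.h>

        /* A 1-bit flag records that the expensive annotation step has
         * already run, so repeated requests become cheap. */
        struct symbol {
                const char *name;
                unsigned char annotate2 : 1;    /* set after the first annotation */
        };

        static int annotate(struct symbol *sym)
        {
                printf("annotating %s (expensive)\n", sym->name);
                return 0;
        }

        static int annotate_once(struct symbol *sym)
        {
                if (sym->annotate2)             /* already annotated, skip the work */
                        return 0;
                if (annotate(sym))
                        return -1;
                sym->annotate2 = 1;
                return 0;
        }

        int main(void)
        {
                struct symbol sym = { .name = "random", .annotate2 = 0 };

                annotate_once(&sym);    /* does the work */
                annotate_once(&sym);    /* no-op: flag already set */
                return 0;
        }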
    • perf annotate: Compute average IPC and IPC coverage per symbol · ace4f8fa
      Committed by Jin Yao
      Add support to the 'perf report' annotate view and to 'perf annotate
      --stdio2' to aggregate, per symbol, the IPC derived from timed LBRs. We
      compute the average IPC and the IPC coverage percentage.
      
      For example:
      
        $ perf annotate --stdio2
      
        Percent  IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)
      
                                Disassembly of section .text:
      
                                000000000003aac0 <random@@GLIBC_2.2.5>:
          8.32  3.28              sub    $0x18,%rsp
                3.28              mov    $0x1,%esi
                3.28              xor    %eax,%eax
                3.28              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
         11.57  3.28     1      ↓ je     20
                                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
                                ↓ jne    29
                                ↓ jmp    43
         11.57  1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
          0.00  1.10     1      ↓ je     43
                            29:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                                  sub    $0x80,%rsp
                                → callq  __lll_lock_wait_private
                                  add    $0x80,%rsp
          0.00  3.00        43:   lea    __ctype_b@GLIBC_2.2.5+0x38,%rdi
                3.00              lea    0xc(%rsp),%rsi
          8.49  3.00     1      → callq  __random_r
          7.91  1.94              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
          0.00  1.94     1      ↓ je     68
                                  lock   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
                                ↓ jne    70
                                ↓ jmp    8a
          0.00  2.00        68:   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
         21.56  2.00     1      ↓ je     8a
                            70:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                                  sub    $0x80,%rsp
                                → callq  __lll_unlock_wake_private
                                  add    $0x80,%rsp
         21.56  2.90        8a:   movslq 0xc(%rsp),%rax
                2.90              add    $0x18,%rsp
          9.03  2.90     1      ← retq
      
      It shows for this symbol the average IPC is 2.30 and the IPC coverage is
      54.8%.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
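      A standalone C sketch of one plausible way to derive the two numbers in the
      header line above: average the IPC over instructions that received timed-LBR
      data and report the fraction of instructions covered. The real perf
      aggregation lives in the annotation code and differs in detail, and the
      sample values below are hypothetical:

        #include <stdio.h>

        struct insn_stat {
                double ipc;     /* 0.0 means no timed-LBR data hit this instruction */
        };

        static void symbol_ipc_summary(const struct insn_stat *insns, int n)
        {
                double ipc_sum = 0.0;
                int covered = 0;

                for (int i = 0; i < n; i++) {
                        if (insns[i].ipc > 0.0) {
                                ipc_sum += insns[i].ipc;
                                covered++;
                        }
                }

                if (covered)
                        printf("Average IPC: %.2f, IPC Coverage: %.1f%%\n",
                               ipc_sum / covered, 100.0 * covered / n);
        }

        int main(void)
        {
                /* Hypothetical per-instruction IPC values for one symbol. */
                struct insn_stat insns[] = {
                        { 3.28 }, { 1.10 }, { 3.00 }, { 1.94 }, { 2.00 }, { 2.90 },
                        { 0.0 }, { 0.0 }, { 0.0 }, { 0.0 }, { 0.0 },
                };

                symbol_ipc_summary(insns, (int)(sizeof(insns) / sizeof(insns[0])));
                return 0;
        }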
  2. 18 October 2018, 1 commit
    • perf annotate: Add Sparc support · 0ab41886
      Committed by David Miller
      E.g.:
      
        $ perf annotate --stdio2
        Samples: 7K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 3086733887
        __gettimeofday  /lib32/libc-2.27.so [Percent: local period]
        Percent│
               │
               │
               │    Disassembly of section .text:
               │
               │    000a6fa0 <__gettimeofday@@GLIBC_2.0>:
          0.47 │      save   %sp, -96, %sp
          0.73 │      sethi  %hi(0xe9000), %l7
               │    → call   __frame_state_for@@GLIBC_2.0+0x480
          0.30 │      add    %l7, 0x58, %l7     ! e9058 <nftw64@@GLIBC_2.3.3+0x818>
          1.33 │      mov    %i0, %o0
               │      mov    %i1, %o1
          0.43 │      mov    0x74, %g1
               │      ta     0x10
         88.92 │    ↓ bcc    30
          2.95 │      clr    %g1
               │      neg    %o0
               │      mov    1, %g1
          0.31 │30:   cmp    %g1, 0
               │      bne,pn %icc, a6fe4 <__gettimeofday@@GLIBC_2.0+0x44>
               │      mov    %o0, %i0
          1.96 │    ← return %i7 + 8
          2.62 │      nop
               │      sethi  %hi(0), %g1
               │      neg    %o0, %g2
               │      add    %g1, 0x160, %g1
               │      ld     [ %l7 + %g1 ], %g1
               │      st     %g2, [ %g7 + %g1 ]
               │    ← return %i7 + 8
               │      mov    -1, %o0
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20181016.205555.1070918198627611771.davem@davemloft.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 31 August 2018, 2 commits
    • perf annotate: Fix parsing aarch64 branch instructions after objdump update · 4e67b2a5
      Committed by Kim Phillips
      Starting with binutils 2.28, aarch64 objdump adds comments to the
      disassembly output to show the alternative names of a condition code
      [1].
      
      It is assumed that commas in objdump comments could occur in other
      arches now or in the future, so this fix is arch-independent.
      
      The fix could have been done with arm64-specific jump__parse and
      jump__scnprintf functions, but then jump__scnprintf would have to
      hard-code its comment character as a literal, since the scnprintf
      functions cannot easily receive a struct arch.
      
      This inconvenience also applies to the generic jump__scnprintf, which is
      why we add a raw_comment pointer to struct ins_operands: the __parse
      function assigns it so it can be re-used by its corresponding __scnprintf
      function.
      
      Example differences in 'perf annotate --stdio2' output on an aarch64
      perf.data file:
      
      BEFORE: → b.cs   ffff200008133d1c <unwind_frame+0x18c>  // b.hs, dffff7ecc47b
      AFTER : ↓ b.cs   18c
      
      BEFORE: → b.cc   ffff200008d8d9cc <get_alloc_profile+0x31c>  // b.lo, b.ul, dffff727295b
      AFTER : ↓ b.cc   31c
      
      The branch target labels 18c and 31c also now appear in the output:
      
      BEFORE:        add    x26, x29, #0x80
      AFTER : 18c:   add    x26, x29, #0x80
      
      BEFORE:        add    x21, x21, #0x8
      AFTER : 31c:   add    x21, x21, #0x8
      
      The Fixes: tag below is added so stable branches will get the update; it
      doesn't necessarily mean that commit was broken at the time, rather it
      didn't withstand the aarch64 objdump update.
      
      Tested that there is no difference in output for sample x86_64 and power
      arch perf.data files.
      
      [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: b13bbeee ("perf annotate: Fix branch instruction with multiple operands")
      Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
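      A standalone C sketch of the idea behind raw_comment in the commit above:
      split the operand at the comment marker before parsing the branch target,
      and keep the comment text around for later printing. The helper name and
      the hard-coded "//" marker are illustrative only; in perf the comment
      character is arch-dependent:

        #include <stdio.h>
        #include <string.h>
        #include <ctype.h>

        /* Split an objdump operand such as
         *   "ffff200008133d1c <unwind_frame+0x18c>  // b.hs, dffff7ecc47b"
         * into the part to parse (the branch target) and the raw comment, so
         * commas inside the comment cannot confuse the operand parser. */
        static char *split_comment(char *ops, const char *marker)
        {
                char *comment = strstr(ops, marker);

                if (!comment)
                        return NULL;
                *comment = '\0';                        /* terminate the parsable part */
                comment += strlen(marker);
                while (isspace((unsigned char)*comment))
                        comment++;                      /* raw comment starts here */
                return comment;
        }

        int main(void)
        {
                char ops[] = "ffff200008133d1c <unwind_frame+0x18c>  // b.hs, dffff7ecc47b";
                char *raw_comment = split_comment(ops, "//");

                printf("target : '%s'\n", ops);
                printf("comment: '%s'\n", raw_comment ? raw_comment : "");
                return 0;
        }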
    • perf annotate: Properly interpret indirect call · 1dc27f63
      Committed by Martin Liška
      The patch changes the parsing of:
      
      	callq  *0x8(%rbx)
      
      from:
      
        0.26 │     → callq  *8
      
      to:
      
        0.26 │     → callq  *0x8(%rbx)
      
      In this case the address is followed by a register, so one cannot parse
      only the address.
      
      Committer testing:
      
      1) run 'perf record sleep 10'
      2) before applying the patch, run:
      
           perf annotate --stdio2 > /tmp/before
      
      3) after applying the patch, run:
      
           perf annotate --stdio2 > /tmp/after
      
      4) diff /tmp/before /tmp/after:
        --- /tmp/before 2018-08-28 11:16:03.238384143 -0300
        +++ /tmp/after  2018-08-28 11:15:39.335341042 -0300
        @@ -13274,7 +13274,7 @@
                      ↓ jle    128
                        hash_value = hash_table->hash_func (key);
                        mov    0x8(%rsp),%rdi
        -  0.91       → callq  *30
        +  0.91       → callq  *0x30(%r12)
                        mov    $0x2,%r8d
                        cmp    $0x2,%eax
                        node_hash = hash_table->hashes[node_index];
        @@ -13848,7 +13848,7 @@
                         mov    %r14,%rdi
                         sub    %rbx,%r13
                         mov    %r13,%rdx
        -              → callq  *38
        +              → callq  *0x38(%r15)
                         cmp    %rax,%r13
           1.91        ↓ je     240
                  1b4:   mov    $0xffffffff,%r13d
        @@ -14026,7 +14026,7 @@
                         mov    %rcx,-0x500(%rbp)
                         mov    %r15,%rsi
                         mov    %r14,%rdi
        -              → callq  *38
        +              → callq  *0x38(%rax)
                         mov    -0x500(%rbp),%rcx
                         cmp    %rax,%rcx
                       ↓ jne    9b0
      <SNIP tons of other such cases>
      Signed-off-by: Martin Liška <mliska@suse.cz>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/bd1f3932-be2b-85f9-7582-111ee0a43b07@suse.cz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
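      A standalone C sketch of the distinction made in the commit above: when the
      call operand has the offset(register) form, the raw operand must be kept
      rather than only the parsed offset. The function and its heuristic are
      illustrative, not the actual perf parsing code:

        #include <stdio.h>
        #include <stdlib.h>

        /* For an indirect call like "callq *0x8(%rbx)" the operand is not a
         * plain address, so printing only the parsed number ("*8") loses the
         * register part.  If a '(' follows the offset, keep the raw operand. */
        static void print_call_target(const char *ops)
        {
                char *end;
                long addr = strtol(ops + 1, &end, 16);  /* skip the leading '*' */

                if (*end == '(')                        /* offset(reg) form: keep it all */
                        printf("→ callq  %s\n", ops);
                else
                        printf("→ callq  *%lx\n", addr);
        }

        int main(void)
        {
                print_call_target("*0x8(%rbx)");        /* prints the full operand */
                print_call_target("*0x30");             /* plain target, prints *30 */
                return 0;
        }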
  4. 20 August 2018, 1 commit
  5. 09 August 2018, 17 commits
  6. 06 June 2018, 1 commit
    • perf annotate: Make __symbol__inc_addr_samples handle src->histograms == NULL · 8d628d26
      Committed by Arnaldo Carvalho de Melo
      Make this a bit more robust. The problem took place when a sample appeared
      right after:
      
        ffffffff8a925000 D __nosave_end
      
      and before the next considered symbol, which, when using kallsyms, makes
      us over-guess the size of __nosave_end. Then the sequence:
      
        hist_entry__inc_addr_samples ->
          symbol__inc_addr_samples ->
            symbol__hists ->
              annotated_source__alloc_histograms
      
      ends up balking at allocating gigabytes of RAM for annotation...
      
      This will be alleviated by considering BSS symbols, which we should do but
      don't so far; those samples should then be investigated further.
      
      The test case was to run:
      
         perf top -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions
      
      for a while until it segfaulted trying to access the NULL notes->src->histograms.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ndfjtpiop3tdcnyjgp320ra8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
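      A standalone C sketch of the defensive check described above: when the
      lazily allocated histogram storage is NULL, drop the sample instead of
      dereferencing it. The structure and function names are simplified
      stand-ins for the perf code in the call chain above:

        #include <stddef.h>
        #include <stdio.h>

        struct annotated_source {
                unsigned int *histograms;       /* lazily allocated, may stay NULL */
                size_t nr_entries;
        };

        static int inc_addr_samples(struct annotated_source *src, size_t offset)
        {
                if (src == NULL || src->histograms == NULL)
                        return -1;              /* nowhere to account the sample */
                if (offset >= src->nr_entries)
                        return -1;              /* sample past the guessed symbol size */
                src->histograms[offset]++;
                return 0;
        }

        int main(void)
        {
                struct annotated_source src = { .histograms = NULL, .nr_entries = 0 };

                /* Allocation failed or was skipped: the sample is dropped, no crash. */
                if (inc_addr_samples(&src, 0x10) < 0)
                        printf("sample dropped: no histograms allocated\n");
                return 0;
        }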
  7. 04 June 2018, 15 commits