1. 18 11月, 2016 1 次提交
    • A
      perf annotate: Start supporting cross arch annotation · 786c1b51
      Arnaldo Carvalho de Melo 提交于
      Introduce a 'struct arch', where arch specific stuff will live, starting
      with objdump's choice of comment delimitation character, that is '#' in
      x86 while a ';' in arm.
      
      This has some bits and pieces from a patch submitted by Ravi.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-f337tzjjcl8vtapgvjxmhrbx@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      786c1b51
  2. 20 9月, 2016 1 次提交
    • A
      perf annotate: Pass the symbol's map/dso to the instruction parsers · bff5c306
      Arnaldo Carvalho de Melo 提交于
      So that things like:
      
             → callq  0xffffffff993e3230
      
      found while disassembling /proc/kcore can be beautified by later
      patches, that will resolve that address to a function, looking it up in
      /proc/kallsyms.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chris Riyder <chris.ryder@arm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Taeung Song <treeze.taeung@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-p76myuke4j7gplg54amaklxk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bff5c306
  3. 09 9月, 2016 1 次提交
    • P
      perf annotate: Add branch stack / basic block · 70fbe057
      Peter Zijlstra 提交于
      I wanted to know the hottest path through a function and figured the
      branch-stack (LBR) information should be able to help out with that.
      
      The below uses the branch-stack to create basic blocks and generate
      statistics from them.
      
              from    to              branch_i
              * ----> *
                      |
                      | block
                      v
                      * ----> *
                      from    to      branch_i+1
      
      The blocks are broken down into non-overlapping ranges, while tracking
      if the start of each range is an entry point and/or the end of a range
      is a branch.
      
      Each block iterates all ranges it covers (while splitting where required
      to exactly match the block) and increments the 'coverage' count.
      
      For the range including the branch we increment the taken counter, as
      well as the pred counter if flags.predicted.
      
      Using these number we can find if an instruction:
      
       - had coverage; given by:
      
              br->coverage / br->sym->max_coverage
      
         This metric ensures each symbol has a 100% spot, which reflects the
         observation that each symbol must have a most covered/hottest
         block.
      
       - is a branch target: br->is_target && br->start == add
      
       - for targets, how much of a branch's coverages comes from it:
      
      	target->entry / branch->coverage
      
       - is a branch: br->is_branch && br->end == addr
      
       - for branches, how often it was taken:
      
              br->taken / br->coverage
      
         after all, all execution that didn't take the branch would have
         incremented the coverage and continued onward to a later branch.
      
       - for branches, how often it was predicted:
      
              br->pred / br->taken
      
      The coverage percentage is used to color the address and asm sections;
      for low (<1%) coverage we use NORMAL (uncolored), indicating that these
      instructions are not 'important'. For high coverage (>75%) we color the
      address RED.
      
      For each branch, we add an asm comment after the instruction with
      information on how often it was taken and predicted.
      
      Output looks like (sans color, which does loose a lot of the
      information :/)
      
      $ perf record --branch-filter u,any -e cycles:p ./branches 27
      $ perf annotate branches
      
       Percent |	Source code & Disassembly of branches for cycles:pu (217 samples)
      ---------------------------------------------------------------------------------
               :	branches():
          0.00 :	  40057a:       push   %rbp
          0.00 :	  40057b:       mov    %rsp,%rbp
          0.00 :	  40057e:       sub    $0x20,%rsp
          0.00 :	  400582:       mov    %rdi,-0x18(%rbp)
          0.00 :	  400586:       mov    %rsi,-0x20(%rbp)
          0.00 :	  40058a:       mov    -0x18(%rbp),%rax
          0.00 :	  40058e:       mov    %rax,-0x10(%rbp)
          0.00 :	  400592:       movq   $0x0,-0x8(%rbp)
          0.00 :	  40059a:       jmpq   400656 <branches+0xdc>
          1.84 :	  40059f:       mov    -0x10(%rbp),%rax	# +100.00%
          3.23 :	  4005a3:       and    $0x1,%eax
          1.84 :	  4005a6:       test   %rax,%rax
          0.00 :	  4005a9:       je     4005bf <branches+0x45>	# -54.50% (p:42.00%)
          0.46 :	  4005ab:       mov    0x200bbe(%rip),%rax        # 601170 <acc>
         12.90 :	  4005b2:       add    $0x1,%rax
          2.30 :	  4005b6:       mov    %rax,0x200bb3(%rip)        # 601170 <acc>
          0.46 :	  4005bd:       jmp    4005d1 <branches+0x57>	# -100.00% (p:100.00%)
          0.92 :	  4005bf:       mov    0x200baa(%rip),%rax        # 601170 <acc>	# +49.54%
         13.82 :	  4005c6:       sub    $0x1,%rax
          0.46 :	  4005ca:       mov    %rax,0x200b9f(%rip)        # 601170 <acc>
          2.30 :	  4005d1:       mov    -0x10(%rbp),%rax	# +50.46%
          0.46 :	  4005d5:       mov    %rax,%rdi
          0.46 :	  4005d8:       callq  400526 <lfsr>	# -100.00% (p:100.00%)
          0.00 :	  4005dd:       mov    %rax,-0x10(%rbp)	# +100.00%
          0.92 :	  4005e1:       mov    -0x18(%rbp),%rax
          0.00 :	  4005e5:       and    $0x1,%eax
          0.00 :	  4005e8:       test   %rax,%rax
          0.00 :	  4005eb:       je     4005ff <branches+0x85>	# -100.00% (p:100.00%)
          0.00 :	  4005ed:       mov    0x200b7c(%rip),%rax        # 601170 <acc>
          0.00 :	  4005f4:       shr    $0x2,%rax
          0.00 :	  4005f8:       mov    %rax,0x200b71(%rip)        # 601170 <acc>
          0.00 :	  4005ff:       mov    -0x10(%rbp),%rax	# +100.00%
          7.37 :	  400603:       and    $0x1,%eax
          3.69 :	  400606:       test   %rax,%rax
          0.00 :	  400609:       jne    400612 <branches+0x98>	# -59.25% (p:42.99%)
          1.84 :	  40060b:       mov    $0x1,%eax
         14.29 :	  400610:       jmp    400617 <branches+0x9d>	# -100.00% (p:100.00%)
          1.38 :	  400612:       mov    $0x0,%eax	# +57.65%
         10.14 :	  400617:       test   %al,%al	# +42.35%
          0.00 :	  400619:       je     40062f <branches+0xb5>	# -57.65% (p:100.00%)
          0.46 :	  40061b:       mov    0x200b4e(%rip),%rax        # 601170 <acc>
          2.76 :	  400622:       sub    $0x1,%rax
          0.00 :	  400626:       mov    %rax,0x200b43(%rip)        # 601170 <acc>
          0.46 :	  40062d:       jmp    400641 <branches+0xc7>	# -100.00% (p:100.00%)
          0.92 :	  40062f:       mov    0x200b3a(%rip),%rax        # 601170 <acc>	# +56.13%
          2.30 :	  400636:       add    $0x1,%rax
          0.92 :	  40063a:       mov    %rax,0x200b2f(%rip)        # 601170 <acc>
          0.92 :	  400641:       mov    -0x10(%rbp),%rax	# +43.87%
          2.30 :	  400645:       mov    %rax,%rdi
          0.00 :	  400648:       callq  400526 <lfsr>	# -100.00% (p:100.00%)
          0.00 :	  40064d:       mov    %rax,-0x10(%rbp)	# +100.00%
          1.84 :	  400651:       addq   $0x1,-0x8(%rbp)
          0.92 :	  400656:       mov    -0x8(%rbp),%rax
          5.07 :	  40065a:       cmp    -0x20(%rbp),%rax
          0.00 :	  40065e:       jb     40059f <branches+0x25>	# -100.00% (p:100.00%)
          0.00 :	  400664:       nop
          0.00 :	  400665:       leaveq
          0.00 :	  400666:       retq
      
      (Note: the --branch-filter u,any was used to avoid spurious target and
      branch points due to interrupts/faults, they show up as very small -/+
      annotations on 'weird' locations)
      
      Committer note:
      
      Please take a look at:
      
        http://vger.kernel.org/~acme/perf/annotate_basic_blocks.png
      
      To see the colors.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      [ Moved sym->max_coverage to 'struct annotate', aka symbol__annotate(sym) ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      70fbe057
  4. 30 8月, 2016 1 次提交
  5. 02 8月, 2016 2 次提交
  6. 28 6月, 2016 1 次提交
  7. 27 6月, 2016 1 次提交
  8. 23 3月, 2016 1 次提交
  9. 06 10月, 2015 1 次提交
  10. 07 8月, 2015 2 次提交
  11. 20 6月, 2015 2 次提交
  12. 17 1月, 2015 1 次提交
    • N
      perf tools: Fix segfault for symbol annotation on TUI · 813ccd15
      Namhyung Kim 提交于
      Currently the symbol structure is allocated with symbol_conf.priv_size
      to carry sideband information like annotation, map browser on TUI and
      sort-by-name tree node.  So retrieving these information from symbol
      needs to care about the details of such placement.
      
      However the annotation code just assumes that the symbol is placed after
      the struct annotation.  But actually there's other info between them.
      So accessing those struct will lead to an undefined behavior (usually a
      crash) after they write their info to the same location.
      
      To reproduce the problem, please follow the steps below:
      
        1. run perf report (TUI of course) with -v option
        2. open map browser (by pressing right arrow key for any entry)
        3. search any function (by pressing '/' key and input whatever..)
        4. return to the hist browser (by pressing 'q' or left arrow key)
        5. open annotation window for the same entry (by pressing 'a' key)
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1421234288-22758-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      813ccd15
  13. 19 11月, 2014 1 次提交
    • A
      perf annotate: Support source line numbers in annotate · e592488c
      Andi Kleen 提交于
      With srcline key/sort'ing it's useful to have line numbers in the
      annotate window. This patch implements this.
      
      Use objdump -l to request the line numbers and save them in the line
      structure. Then the browser displays them for source lines.
      
      The line numbers are not displayed by default, but can be toggled on
      with 'k'
      
      There is one unfortunate problem with this setup. For lines not
      containing source and which are outside functions objdump -l reports
      line numbers off by a few: it always reports the first line number in
      the next function even for lines that are outside the function.
      
      I haven't found a nice way to detect/correct this. Probably objdump has
      to be fixed.
      
      See https://sourceware.org/bugzilla/show_bug.cgi?id=16433
      
      The line numbers are still useful even with these problems, as most are
      correct and the ones which are not are nearby.
      
      v2: Fix help text. Handle (discriminator...) output in objdump.
      Left align the line numbers.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-9-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e592488c
  14. 02 5月, 2014 1 次提交
  15. 24 2月, 2014 1 次提交
  16. 19 12月, 2013 3 次提交
  17. 10 10月, 2013 1 次提交
  18. 09 10月, 2013 1 次提交
  19. 01 4月, 2013 1 次提交
  20. 16 3月, 2013 4 次提交
  21. 15 2月, 2013 2 次提交
  22. 10 11月, 2012 1 次提交
    • N
      perf annotate: Merge same lines in summary view · 41127965
      Namhyung Kim 提交于
      The --print-line option of perf annotate command shows summary for
      each source line.  But it didn't merge same lines so that it can
      appear multiple times.
      
      * before:
      
        Sorted summary for file /home/namhyung/bin/mcol
        ----------------------------------------------
           21.71 /home/namhyung/tmp/mcol.c:26
           20.66 /home/namhyung/tmp/mcol.c:25
            9.53 /home/namhyung/tmp/mcol.c:24
            7.68 /home/namhyung/tmp/mcol.c:25
            7.67 /home/namhyung/tmp/mcol.c:25
            7.66 /home/namhyung/tmp/mcol.c:26
            7.49 /home/namhyung/tmp/mcol.c:26
            6.92 /home/namhyung/tmp/mcol.c:25
            6.81 /home/namhyung/tmp/mcol.c:25
            1.07 /home/namhyung/tmp/mcol.c:26
            0.52 /home/namhyung/tmp/mcol.c:25
            0.51 /home/namhyung/tmp/mcol.c:25
            0.51 /home/namhyung/tmp/mcol.c:24
      
      * after:
      
        Sorted summary for file /home/namhyung/bin/mcol
        ----------------------------------------------
           50.77 /home/namhyung/tmp/mcol.c:25
           37.94 /home/namhyung/tmp/mcol.c:26
           10.04 /home/namhyung/tmp/mcol.c:24
      
      To do that, introduce percent_sum field so that the normal
      line-by-line output doesn't get changed.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1352440729-21848-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      41127965
  23. 06 11月, 2012 1 次提交
  24. 25 10月, 2012 1 次提交
  25. 03 10月, 2012 1 次提交
  26. 11 9月, 2012 1 次提交
    • I
      perf tools: Use __maybe_used for unused variables · 1d037ca1
      Irina Tirdea 提交于
      perf defines both __used and __unused variables to use for marking
      unused variables. The variable __used is defined to
      __attribute__((__unused__)), which contradicts the kernel definition to
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is included everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has a variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
      Signed-off-by: NIrina Tirdea <irina.tirdea@intel.com>
      Acked-by: NPekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1d037ca1
  27. 08 9月, 2012 1 次提交
  28. 06 9月, 2012 1 次提交
  29. 13 5月, 2012 2 次提交
  30. 12 5月, 2012 1 次提交